- Automatic Classification Technique for Document Retrieval
- No.7, p.117-130
The concept of classification in library and information science is shifting from a storage-oriented to a retrieval-oriented view.
Classification is one of the approaches by which people analyze objects, draw inferences from them, and make decisions based on those inferences. Because classification is closely connected with human knowledge, there are many different viewpoints on and approaches to its concept and purpose. It is therefore necessary to examine the concept, purpose, and techniques of automatic classification in order to clarify how they differ from those in other fields, to consider whether automatic classification techniques developed in other fields can be adapted, and to identify the problems that arise when they are adapted. Only then can we determine what kind of classification technique is useful for the library and information science field.
First, this paper describes the concept of automatic classification and how its purpose in library and information science differs from its purpose in other fields.
Second, the paper describes the factors that affect automatic classification techniques: the characteristics of the objects, the processing steps that make up the classificatory operation, the criteria that determine the category to which each object is assigned, and the general problems of the techniques.
Lastly, it describes the approach or viewpoint needed to design classification algorithms in library and information science and, from this point of view, presents a new technique. The technique is based on the one developed by Bonner and has the following processing steps:
1) To select a number of characteristics that are representative of a group of documents, and to make a data matrix from the group of documents and the characteristics.
2) To make a similarity matrix by calculating a similarity measure between documents.
3) To form clumps, which serve as categories, and to distribute the documents into them on the basis of the similarity matrix.
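The three steps above can be illustrated with a minimal Python sketch. It is not the technique presented in the paper, nor Bonner's actual procedure; it only shows the general shape of such a pipeline under several assumptions. The data matrix is assumed to be binary (a characteristic is present or absent), the similarity measure is assumed to be the Jaccard (Tanimoto) coefficient, and a clump is approximated as a group of documents linked by similarities at or above a threshold, which yields non-overlapping groups. All function names, the sample documents, and the threshold value are hypothetical.

```python
from itertools import combinations


def build_data_matrix(documents, characteristics):
    """Step 1: binary document-by-characteristic matrix (assumed binary)."""
    return [[1 if c in doc else 0 for c in characteristics] for doc in documents]


def jaccard(row_a, row_b):
    """Shared characteristics divided by characteristics present in either row."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


def build_similarity_matrix(data):
    """Step 2: symmetric document-by-document similarity matrix."""
    n = len(data)
    sim = [[1.0] * n for _ in range(n)]  # each document is identical to itself
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = jaccard(data[i], data[j])
    return sim


def make_clumps(sim, threshold):
    """Step 3 (simplified): group documents connected by links >= threshold."""
    unvisited = set(range(len(sim)))
    clumps = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        clump, stack = [seed], [seed]
        while stack:
            i = stack.pop()
            linked = [j for j in unvisited if sim[i][j] >= threshold]
            for j in linked:
                unvisited.remove(j)
                clump.append(j)
                stack.append(j)
        clumps.append(sorted(clump))
    return clumps


# Hypothetical sample: each document is a set of index terms.
documents = [
    {"index", "retrieval", "query"},
    {"retrieval", "query", "ranking"},
    {"shelf", "catalog", "storage"},
    {"catalog", "storage", "binding"},
]
characteristics = ["index", "retrieval", "query", "ranking",
                   "shelf", "catalog", "storage", "binding"]

data = build_data_matrix(documents, characteristics)
sim = build_similarity_matrix(data)
clumps = make_clumps(sim, threshold=0.4)  # threshold chosen for illustration
print(clumps)  # [[0, 1], [2, 3]]
```

In this sample, documents 0 and 1 share two of four characteristics (similarity 0.5), as do documents 2 and 3, while the two pairs share nothing, so two clumps result. A clumping procedure that allows a document to belong to more than one category would need a different third step.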
(School of Library and Information Science)