- Automatic Conversion of Keywords in the Secondary Information File
- No.9, p.427-438
In information retrieval it is sometimes preferable to search with a vocabulary that differs from the one used for indexing. This raises the problem of automatically converting the vocabulary of a magnetic-tape file of documents indexed with keywords into the retrieval language.
First, a conversion dictionary is needed that gives the correspondence between the index language and the retrieval language. With this dictionary, the following two methods of conversion are possible:
1. Keyword-for-keyword method: individual keywords in the storage file are converted into those of the retrieval file. For this method, the conditions the dictionary must satisfy and the feasibility of preparing the dictionary automatically come into question.
2. Document-for-document method: the set of keywords assigned to a given document is converted as a whole into a set of retrieval keywords. In other words, this method is based on statistics of the frequency of occurrence of keywords in sample documents.
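The two methods can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure of the paper: the dictionary entries, the function names, and the `min_score` threshold are hypothetical, and the document-for-document part assumes a sample of documents indexed in both vocabularies, using a simple co-occurrence ratio as the frequency statistic.

```python
from collections import Counter, defaultdict

# Hypothetical conversion dictionary: index keyword -> retrieval keywords.
CONVERSION_DICT = {
    "magnetic tape": ["storage media"],
    "thesaurus": ["controlled vocabulary"],
    "keyword": ["index term"],
}


def convert_keyword_for_keyword(index_keywords, dictionary):
    """Convert each index keyword on its own; collect keywords with no entry."""
    converted, unmatched = [], []
    for kw in index_keywords:
        targets = dictionary.get(kw)
        if targets is None:
            unmatched.append(kw)
            continue
        for target in targets:
            if target not in converted:
                converted.append(target)
    return converted, unmatched


def build_cooccurrence(samples):
    """Count co-occurrences in sample documents.

    samples: list of (index_keywords, retrieval_keywords) pairs, i.e. the same
    documents indexed in both vocabularies (an assumption of this sketch).
    """
    index_count = Counter()
    pair_count = defaultdict(Counter)
    for index_kws, retrieval_kws in samples:
        for ikw in set(index_kws):
            index_count[ikw] += 1
            for rkw in set(retrieval_kws):
                pair_count[ikw][rkw] += 1
    return index_count, pair_count


def convert_document_for_document(index_keywords, index_count, pair_count,
                                  min_score=0.5):
    """Convert a document's keyword set as a whole.

    Each retrieval keyword is scored by the co-occurrence ratio
    count(index kw, retrieval kw) / count(index kw), averaged over the
    document's index keywords; keywords scoring >= min_score are kept.
    """
    doc_kws = set(index_keywords)
    if not doc_kws:
        return []
    scores = Counter()
    for ikw in doc_kws:
        n = index_count.get(ikw, 0)
        if n == 0:
            continue  # keyword never seen in the sample documents
        for rkw, c in pair_count.get(ikw, {}).items():
            scores[rkw] += c / n
    return sorted(r for r, s in scores.items() if s / len(doc_kws) >= min_score)
```

In this sketch the first function treats every keyword in isolation, so its quality depends entirely on the dictionary, whereas the second uses the whole keyword set of a document and depends on how representative the sample documents are.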
The storage file should preferably have a vocabulary controlled by a thesaurus, be indexed from various points of view, and be represented with descriptors that are as specific as possible. However, it is difficult to convert a storage file satisfactorily into a retrieval file by any single method of automatic conversion. Consequently, a combination of several conversion methods, a combination of automatic conversion and automatic indexing, human revision after conversion, etc. is required.
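One way such a combination might look is sketched below, assuming a dictionary lookup is tried first, a second converter is consulted for keywords the dictionary does not cover, and whatever remains is set aside for human revision. The function name `combine_conversions` and the `fallback` callable are hypothetical and stand in for whichever secondary method is available.

```python
def combine_conversions(index_keywords, dictionary, fallback):
    """Chain two conversion methods and flag leftovers for human revision.

    dictionary: index keyword -> list of retrieval keywords.
    fallback:   callable taking one index keyword and returning a list of
                retrieval keywords (possibly empty); hypothetical placeholder
                for a statistical converter or an automatic indexer.
    """
    converted, needs_review = set(), []
    for kw in index_keywords:
        # Prefer the dictionary; consult the fallback only when it has no entry.
        targets = dictionary.get(kw) or fallback(kw)
        if targets:
            converted.update(targets)
        else:
            needs_review.append(kw)  # left for a human reviser
    return sorted(converted), needs_review
```

For example, with `dictionary = {"a": ["X"]}` and a fallback that only knows `"b"`, converting `["a", "b", "c"]` would yield the converted keywords for `"a"` and `"b"` and leave `"c"` in the review list.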
Standardization of thesauri is also important for facilitating automatic conversion.