Mita Society for Library and Information Science journal article (Article ID LIS004017)

Problems Related to Automatic Abstracting Methods
No.4, p.17-27

Drawing on the results of two experiments in the automatic preparation of abstracts, conducted between July 1964 and February 1966, the author examines problems in the techniques used for automatic abstracting of Japanese texts.

For the selection of key words, in addition to counting the raw frequency of word occurrences, we tested the method proposed by Edmundson and Wyllys, namely the “relative frequency” approach, as applied to Japanese texts. The “relative frequency” adjusts the occurrence of words in a target document according to the occurrence of words in the document population that includes the target document. This method is meaningful for selecting key words for automatic indexing but cannot be applied to selecting key words for automatic abstracting. Based on this interpretation, we selected key words by adjusting the frequency of word occurrences in an individual document according to the occurrence of words within the document itself.
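The population-based adjustment described above can be illustrated with a small sketch. It assumes one common reading of "relative frequency": a word's share of the target document divided by its share of the whole document population. The function name `relative_frequency_scores`, the pre-tokenized input, and the ratio itself are illustrative assumptions, not the exact measure used in the experiments.

```python
from collections import Counter


def relative_frequency_scores(target_tokens, other_docs):
    """Score each word in the target document by an assumed ratio:
    (share of the word in the target) / (share of the word in the population).

    The population is built as the target plus the other documents, so it
    always includes the target, as the abstract requires.
    """
    target_counts = Counter(target_tokens)

    # Population counts: target document plus every other document.
    population_counts = Counter(target_tokens)
    for doc in other_docs:
        population_counts.update(doc)

    target_total = sum(target_counts.values())
    population_total = sum(population_counts.values())

    scores = {}
    for word, count in target_counts.items():
        share_in_target = count / target_total
        # Never zero: every target word also occurs in the population.
        share_in_population = population_counts[word] / population_total
        scores[word] = share_in_target / share_in_population
    return scores
```

Under this assumed ratio, two words with the same raw frequency in the target document can receive different scores: a word that is common across the population is pushed down, while a word concentrated in the target is pushed up.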

Sentences that serve as the constituent elements of an abstract can be extracted in two ways: by treating each document as a unit, or by treating each chapter and paragraph as a unit. The author discusses the merits and demerits of these two methods.
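The difference between the two extraction units can be sketched as follows. The scoring rule (counting key word occurrences in a sentence) and the function names are assumptions made for illustration; the abstract does not specify how sentences were scored in the experiments.

```python
def sentence_score(sentence, key_words):
    """Assumed scoring rule: number of tokens in the sentence that are key words."""
    return sum(1 for token in sentence if token in key_words)


def extract_by_document(paragraphs, key_words, k):
    """Document as the unit: take the k best sentences from the whole document."""
    sentences = [s for paragraph in paragraphs for s in paragraph]
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(sentences[i], key_words),
        reverse=True,
    )
    # Restore the original order of the chosen sentences.
    return [sentences[i] for i in sorted(ranked[:k])]


def extract_by_paragraph(paragraphs, key_words):
    """Paragraph as the unit: take the best sentence from each paragraph."""
    return [
        max(paragraph, key=lambda s: sentence_score(s, key_words))
        for paragraph in paragraphs
        if paragraph
    ]
```

With the document as the unit, all selected sentences may come from a single paragraph that is dense in key words; with the paragraph as the unit, every paragraph contributes a sentence even when its best sentence scores low.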

In the automatic processing of Japanese texts, the automatic recognition of the processing unit is always at the center of debate. This article introduces a method for automatically recognizing the processing unit, limited to what is required for the selection of key words in automatic abstracting and indexing, and evaluates the results.

(Institute of Behavioral Sciences. Division of Automatic Processing of Language and Document)