The purpose of this study is to develop an improved procedure for automatically identifying duplicate monographic records across two catalog record files. The procedure has three steps: (1) selecting the bibliographic elements used for identification; (2) converting these elements into a unified key form for matching; and (3) matching the resulting keys.
The test files are LC/MARC (109,430 records) and the University of Tsukuba Library catalog file (127,608 records). Author, title, publisher, and edition statement are chosen as identifiers, and eighteen conversion methods are examined. When the author, title, and publisher keys are used without conversion, the match rate is very low. Conversion (deleting delimiters, converting to capital letters, etc.) raises the match rate to between 86.9% and 96.5%. The author key has a lower match rate than the other keys. By combining the four converted keys, it is possible to identify nearly 80% of all duplicate records in the two files.
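The three steps can be illustrated with a minimal Python sketch. It is not the study's implementation: the field names, the helper functions `to_key`, `record_key`, and `find_duplicates`, and the choice of ASCII punctuation and whitespace as "delimiters" are all assumptions made for illustration, and only two of the conversions mentioned above (deleting delimiters and converting to capital letters) are shown.

```python
import string

# Hypothetical field names; the actual record layout used in the study is not given here.
KEY_FIELDS = ("author", "title", "publisher", "edition")

# Assumption: "delimiters" are taken to be ASCII punctuation and whitespace.
_DELIMITERS = str.maketrans("", "", string.punctuation + string.whitespace)


def to_key(value: str) -> str:
    """Convert one bibliographic element to a unified key form."""
    return value.translate(_DELIMITERS).upper()


def record_key(record: dict) -> tuple:
    """Combine the converted keys of the selected elements."""
    return tuple(to_key(record.get(field, "")) for field in KEY_FIELDS)


def find_duplicates(file_a: list, file_b: list) -> list:
    """Return (record_a, record_b) pairs whose combined keys match exactly."""
    index = {}
    for rec in file_a:
        index.setdefault(record_key(rec), []).append(rec)
    return [(a, b) for b in file_b for a in index.get(record_key(b), [])]
```

Indexing one file by its combined key keeps matching roughly linear in the total number of records, which matters for files of the size reported here. Matching on all four keys at once is the strictest option; matching on each key separately would be needed to compare per-key match rates.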
© 1984 Mita Society for Library and Information Science