Library and Information Science 28: 105-119 (1990)

Original Article

A term dependence model in information retrieval

Research Associate, University of Library and Information Science ◇ Kasuga 1‒2, Tsukuba-shi, Ibaraki 305‒8550, Japan

Received: February 23, 1991
Published: March 31, 1991

Most information retrieval systems and models assume that the index terms assigned to the documents of a collection occur independently of each other. To improve retrieval effectiveness, dependencies between certain pairs of index terms need to be taken into account.

Because the similarity measure between a query and a document is central to quantitative retrieval, this paper proposes two measures that directly reflect the relationships between index terms when those relationships are given as pairwise correlations. The first is an extension of the cosine function model. Whereas the conventional cosine measure is based on rectangular coordinates, this measure is based on oblique coordinates in which the angle between each pair of axes corresponds to the pairwise correlation between the index terms. The second is an extension of the extended Boolean model proposed by G. Salton et al. Neither measure requires the assumption of term independence.
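
The abstract does not state the formula for the oblique-coordinate cosine measure. One plausible reading, offered here only as a sketch, takes the cosine of the angle between each pair of axes to be the pairwise term correlation, so that the inner product of a query vector q and a document vector d becomes the bilinear form q^T R d, where R is the correlation matrix. The names `bilinear`, `oblique_cosine`, and `corr`, and the toy values, are assumptions for illustration and do not come from the paper.

```python
import math


def bilinear(x, y, corr):
    """Inner product of x and y in oblique coordinates.

    corr[i][j] is assumed to be the cosine of the angle between
    term axes i and j (1.0 on the diagonal).
    """
    return sum(
        x[i] * corr[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def oblique_cosine(query, doc, corr):
    """Cosine similarity generalized to non-orthogonal term axes."""
    num = bilinear(query, doc, corr)
    den = math.sqrt(bilinear(query, query, corr) * bilinear(doc, doc, corr))
    return num / den if den > 0 else 0.0


# Toy example: the query uses only term 0, the document only term 1.
query = [1.0, 0.0]
doc = [0.0, 1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]  # independent terms: conventional cosine
corr = [[1.0, 0.5], [0.5, 1.0]]      # hypothetical correlation of 0.5

print(oblique_cosine(query, doc, identity))  # 0.0
print(oblique_cosine(query, doc, corr))      # 0.5
```

With the identity matrix the measure reduces to the conventional cosine and the two vectors do not match at all. With correlated terms, the document receives a nonzero score even though it shares no term with the query.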

Retrieval experiments to evaluate the proposed measures were performed on a test collection of 623 document records and 5 queries, in both a weighted mode, in which the index terms assigned to each document record were weighted, and an unweighted mode. The experiments showed the following results: 1) it is useful to incorporate term dependencies into the similarity measures; and 2) the proposed measures, however, were not much more effective than conventional ones.

