The tree dependence model has been used successfully to incorporate dependencies between certain term pairs into the information retrieval process, while the Bahadur-Lazarsfeld expansion (BLE), which specifies dependencies among all subsets of terms, has been used to identify productive clusters of items in a clustered database environment. The successes of these models are unlikely to be accidental; it is therefore of interest to examine the similarities between the two models. The disadvantages of the BLE model are the exponential number of terms appearing in the full expression and the fact that a truncated BLE may produce negative probability values. The disadvantage of the tree dependence model is its restriction to dependencies between certain term pairs only, which excludes higher-order dependencies. This study introduces a generalized term dependence model that carries the disadvantages of neither the tree dependence model nor the BLE model. Sample evaluation results are included to demonstrate the usefulness of the generalized system.
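The two BLE drawbacks named above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `full_ble_term_count` and `ble_second_order`, the toy three-term joint distribution, and the choice to truncate after the pairwise correlation terms are all assumptions made for illustration. The sketch uses the standard form of the expansion, in which the independence product is multiplied by one plus a sum of correlation terms over standardized variables.

```python
from itertools import combinations
from math import prod, sqrt


def full_ble_term_count(n):
    """Correlation terms in the full expansion: one per subset of 2 or more terms."""
    return 2 ** n - n - 1


def ble_second_order(joint, n):
    """Return a function giving the BLE truncated after the pairwise terms.

    joint: dict mapping a tuple of n 0/1 values to its probability
           (patterns that are absent have probability 0).
    Assumes every marginal lies strictly between 0 and 1.
    """
    # Marginal probability that term i is present.
    p = [sum(pr for x, pr in joint.items() if x[i] == 1) for i in range(n)]

    def y(i, xi):
        # Standardized (zero-mean, unit-variance) version of term i.
        return (xi - p[i]) / sqrt(p[i] * (1 - p[i]))

    # Pairwise correlation coefficients r_ij = E[y_i * y_j].
    r = {
        (i, j): sum(pr * y(i, x[i]) * y(j, x[j]) for x, pr in joint.items())
        for i, j in combinations(range(n), 2)
    }

    def approx(x):
        independent = prod(p[i] if x[i] else 1 - p[i] for i in range(n))
        correction = 1 + sum(
            rij * y(i, x[i]) * y(j, x[j]) for (i, j), rij in r.items()
        )
        return independent * correction

    return approx


# Hypothetical toy data: three terms that always occur together (10% of items)
# or are all absent (90% of items).
toy_joint = {(1, 1, 1): 0.1, (0, 0, 0): 0.9}
toy_approx = ble_second_order(toy_joint, 3)

print(full_ble_term_count(3))    # 4 correlation terms for 3 index terms
print(full_ble_term_count(20))   # grows exponentially with the number of terms
print(toy_approx((1, 0, 0)))     # negative (about -0.072), so not a valid probability
```

In this toy case the true probability of the pattern (1, 0, 0) is zero, yet dropping the third-order term leaves a negative estimate, which is the failure mode the abstract attributes to a truncated BLE.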