DOI: 10.1145/1458082.1458140

Relating dependent indexes using Dempster-Shafer theory

Published: 26 October 2008

Abstract

Traditional information retrieval (IR) approaches assume that indexing terms are independent, which does not hold in practice. Although some previous studies have tried to account for term relationships, they had to make strong simplifications at the basic indexing step: dependent terms are assigned independent counts or probabilities.
In this study, we propose to model dependencies between terms using the Dempster-Shafer theory of evidence. An occurrence of a string in a document is considered to represent the set of all the terms implied in it, and probability is assigned to this set of terms rather than to individual terms. During the query evaluation phase, part of the probability of a set can be transferred to the related terms of the query, which allows us to integrate language-dependent relations into IR.
This approach has been tested on several Chinese IR collections. Our experimental results show that the model can outperform existing state-of-the-art approaches. The proposed method can serve as a general way to account for different types of relationships between terms, and it can be applied to other languages.
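
The idea of assigning probability to sets of terms can be illustrated with a minimal Python sketch. It is not the model from the paper: the relative-frequency mass assignment, the `transfer_to_term` rule, and its `share` parameter are assumptions made purely for illustration. Only the belief and plausibility definitions follow standard Dempster-Shafer theory.

```python
from collections import defaultdict


def mass_from_occurrences(occurrences):
    """Build a mass function from string occurrences.

    Each occurrence is the set of all terms implied by one string in the
    document. Assumption: mass is the relative frequency of each set.
    """
    counts = defaultdict(int)
    for terms in occurrences:
        counts[frozenset(terms)] += 1
    total = sum(counts.values())
    return {focal_set: c / total for focal_set, c in counts.items()}


def belief(mass, terms):
    """Bel(A): total mass of focal sets fully contained in A."""
    a = frozenset(terms)
    return sum(m for focal_set, m in mass.items() if focal_set <= a)


def plausibility(mass, terms):
    """Pl(A): total mass of focal sets that intersect A."""
    a = frozenset(terms)
    return sum(m for focal_set, m in mass.items() if focal_set & a)


def transfer_to_term(mass, term, share=0.5):
    """Hypothetical transfer rule (an assumption, not the paper's).

    A focal set equal to {term} contributes all of its mass; a larger
    focal set containing the term contributes the fraction `share`.
    """
    score = 0.0
    for focal_set, m in mass.items():
        if focal_set == frozenset([term]):
            score += m
        elif term in focal_set:
            score += share * m
    return score


# Toy document: the string "information retrieval" occurs twice and implies
# three terms each time; "information" and "system" each occur once alone.
phrase = {"information retrieval", "information", "retrieval"}
occurrences = [phrase, phrase, {"information"}, {"system"}]
mass = mass_from_occurrences(occurrences)

print(belief(mass, {"information"}))        # 0.25
print(plausibility(mass, {"information"}))  # 0.75
print(transfer_to_term(mass, "information"))  # 0.5
```

In this toy example the belief in "information" counts only the occurrence where it stands alone, the plausibility also counts the occurrences where it is implied by the longer string, and the transferred score lies between the two.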

Published In

CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
October 2008
1562 pages
ISBN:9781595939913
DOI:10.1145/1458082

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Chinese
  2. Dempster-Shafer theory
  3. indexing
  4. information retrieval
  5. term dependency

Qualifiers

  • Research-article

Conference

CIKM08: Conference on Information and Knowledge Management
October 26 - 30, 2008
Napa Valley, California, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 6
  • Downloads (last 6 weeks): 0
Reflects downloads up to 09 Jan 2025

Cited By

  • (2010) A subjective logic formalisation of the principle of polyrepresentation for information needs. Proceedings of the third symposium on Information interaction in context, pp. 125-134. DOI: 10.1145/1840784.1840804. Online publication date: 18-Aug-2010
  • (2010) Modeling Variable Dependencies between Characters in Chinese Information Retrieval. Information Retrieval Technology, pp. 539-551. DOI: 10.1007/978-3-642-17187-1_51. Online publication date: 2010
  • (2009) Integrating phrase inseparability in phrase-based model. Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 708-709. DOI: 10.1145/1571941.1572089. Online publication date: 19-Jul-2009
  • (2009) A Belief Model of Query Difficulty That Uses Subjective Logic. Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory, pp. 92-103. DOI: 10.1007/978-3-642-04417-5_9. Online publication date: 3-Sep-2009