Article

Discriminative models for information retrieval

Author:

Ramesh Nallapati

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

Pages 64 - 71

https://doi.org/10.1145/1008992.1009006

Published: 25 July 2004

Abstract

Discriminative models have been preferred over generative models in many machine learning problems in the recent past owing to some of their attractive theoretical properties. In this paper, we explore the applicability of discriminative classifiers for IR. We have compared the performance of two popular discriminative models, namely the maximum entropy model and support vector machines, with that of language modeling, the state-of-the-art generative model for IR. Our experiments on ad-hoc retrieval indicate that although maximum entropy is significantly worse than language models, support vector machines are on par with language models. We argue that the main reason to prefer SVMs over language models is their ability to learn arbitrary features automatically, as demonstrated by our experiments on the home-page finding task of TREC-10.
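To make the idea of retrieval as discriminative classification concrete, the following is a minimal sketch, not the paper's implementation. It assumes a binary logistic regression model (equivalent to a two-class maximum entropy model) trained on query-document pairs labeled relevant or non-relevant, and ranks documents by the predicted probability of relevance. The feature set, the toy training pairs, and the function names (`features`, `train`, `rank`) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(query, doc):
    # Hypothetical features for a (query, document) pair:
    # a bias term, log of total query-term matches, and query-term coverage.
    q_terms = query.lower().split()
    d_terms = doc.lower().split()
    unique_q = set(q_terms)
    matches = sum(d_terms.count(t) for t in q_terms)
    covered = sum(1 for t in unique_q if t in d_terms)
    return [1.0, math.log(1.0 + matches), covered / max(1, len(unique_q))]


def score(w, x):
    # Estimated probability that the pair is relevant.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def train(examples, epochs=500, lr=0.5):
    # Stochastic gradient ascent on the log-likelihood.
    # examples: list of (feature_vector, label), label 1 = relevant, 0 = not.
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            p = score(w, x)
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w


def rank(w, query, docs):
    # Sort documents by decreasing estimated probability of relevance.
    scored = [(score(w, features(query, d)), d) for d in docs]
    return sorted(scored, key=lambda s: s[0], reverse=True)


# Toy labeled pairs (illustrative only, not from any TREC collection).
train_pairs = [
    ("support vector machines", "support vector machines for text classification", 1),
    ("support vector machines", "cooking recipes for pasta", 0),
    ("language models", "smoothing methods for language models", 1),
    ("language models", "vector graphics editing software", 0),
]
w = train([(features(q, d), y) for q, d, y in train_pairs])

ranked = rank(w, "maximum entropy", [
    "home page of a pasta restaurant",
    "maximum entropy text classification",
])
for prob, doc in ranked:
    print(f"{prob:.3f}  {doc}")
```

Because the model scores a feature vector rather than generating text, further signals (for example link or URL evidence in a home-page finding setting) could be appended to the vector returned by `features` without changing the training or ranking code; which features actually help is an empirical question that this sketch does not address.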

Index Terms

Discriminative models for information retrieval
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

July 2004

624 pages

ISBN:1581138814

DOI:10.1145/1008992

General Chair:
Mark Sanderson, University of Sheffield (UK)

Program Chairs:
Kalervo Järvelin, University of Tampere (Finland)
James Allan, University of Massachusetts (USA)
Peter Bruza, Distributed Systems Technology Centre (Australia)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR04: The 27th ACM/SIGIR International Symposium on Information Retrieval 2004

July 25 - 29, 2004

Sheffield, United Kingdom

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

