DOI: 10.1145/1008992.1009006

Discriminative models for information retrieval

Published: 25 July 2004

Abstract

Discriminative models have recently been preferred over generative models in many machine learning problems, owing to some of their attractive theoretical properties. In this paper, we explore the applicability of discriminative classifiers to IR. We compare the performance of two popular discriminative models, the maximum entropy model and support vector machines (SVMs), with that of language modeling, the state-of-the-art generative model for IR. Our experiments on ad-hoc retrieval indicate that although maximum entropy is significantly worse than language models, SVMs are on par with them. We argue that the main reason to prefer SVMs over language models is their ability to learn arbitrary features automatically, as demonstrated by our experiments on the home-page finding task of TREC-10.

    Published In

    SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
    July 2004
    624 pages
    ISBN:1581138814
    DOI:10.1145/1008992

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. discriminative models
    2. machine learning
    3. maximum entropy
    4. pattern classification
    5. support vector machines

    Qualifiers

    • Article

    Conference

    SIGIR04
    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

