LETOR: A benchmark collection for research on learning to rank for information retrieval

Published: 01 August 2010

Abstract

LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR are selected, how the documents are sampled, how the learning features and meta information are extracted, and how the datasets are partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that can be supported by LETOR, in addition to algorithm comparison. We hope that this paper can help readers gain a deeper understanding of LETOR, and enable more interesting research projects on learning to rank and related topics.
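
The LETOR feature files are plain text, one query-document pair per line, in an SVMlight-style layout: a relevance label, a qid:<id> field, then <feature_id>:<value> pairs, with meta information after a # comment. As a minimal sketch of how one fold might be loaded and evaluated (assuming that layout and NDCG as the measure; the Fold1/test.txt path and the single-feature scoring rule below are hypothetical placeholders, not the paper's method):

    import math
    from collections import defaultdict

    def parse_letor_line(line):
        """Parse '<label> qid:<id> 1:<v1> 2:<v2> ... # meta' into (qid, label, features)."""
        body = line.split("#", 1)[0].split()
        label = int(body[0])
        qid = body[1].split(":", 1)[1]
        features = {int(f): float(v) for f, v in (tok.split(":", 1) for tok in body[2:])}
        return qid, label, features

    def ndcg_at_k(ranked_labels, k=10):
        """NDCG@k with gain 2^label - 1 and a log2 position discount."""
        def dcg(labels):
            return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels[:k]))
        ideal = dcg(sorted(ranked_labels, reverse=True))
        return dcg(ranked_labels) / ideal if ideal > 0 else 0.0

    # Group the judged documents of each query together.
    by_query = defaultdict(list)
    with open("Fold1/test.txt") as f:  # hypothetical path into one LETOR fold
        for line in f:
            qid, label, feats = parse_letor_line(line)
            by_query[qid].append((label, feats))

    # Placeholder ranker: score each document by a single feature value.
    def score(feats):
        return feats.get(1, 0.0)

    # Rank each query's documents by score and average NDCG@10 over queries.
    ndcgs = []
    for docs in by_query.values():
        ranked = sorted(docs, key=lambda d: score(d[1]), reverse=True)
        ndcgs.append(ndcg_at_k([label for label, _ in ranked], k=10))
    print(f"Mean NDCG@10 over {len(ndcgs)} queries: {sum(ndcgs) / len(ndcgs):.4f}")

Replacing the placeholder score with a learned model's output gives the standard query-level evaluation loop used with collections of this kind.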

Published In

Information Retrieval, Volume 13, Issue 4 (August 2010), 92 pages

Publisher

Kluwer Academic Publishers, United States

Publication History

Published: 01 August 2010
Accepted: 01 December 2009
Received: 29 April 2009

Author Tags

1. Learning to rank
2. Information retrieval
3. Benchmark datasets
4. Feature extraction

Qualifiers

• Research-article
