Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. Its problems have been studied intensively in recent years, and significant progress has been made. This lecture gives an introduction to the area, including the fundamental problems, major approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as two basic ranking tasks, namely ranking creation (or simply ranking) and ranking aggregation. In ranking creation, given a request, one wants to generate a ranking list of offerings based on the features derived from the request and the offerings. In ranking aggregation, given a request, as well as a number of ranking lists of offerings, one wants to generate a new ranking list of the offerings. Ranking creation (or ranking) is the major problem in learning to rank. It is usually formalized as a supervised learning task. The author gives detailed explanations of learning for ranking creation and ranking aggregation, including training and testing, evaluation, feature creation, and major approaches. Many methods have been proposed for ranking creation. The methods can be categorized as the pointwise, pairwise, and listwise approaches according to the loss functions they employ. They can also be categorized according to the techniques they employ, such as the SVM-based, Boosting-based, and Neural Network-based approaches. The author also introduces some popular learning to rank methods in detail. These include PRank, OC SVM, McRank, Ranking SVM, IR SVM, GBRank, RankNet, ListNet & ListMLE, AdaRank, SVM MAP, SoftRank, LambdaRank, LambdaMART, Borda Count, Markov Chain, and CRanking.
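As a rough illustration of how the pointwise, pairwise, and listwise categories differ in their loss functions, the following Python sketch computes one loss of each kind for a single request. It is not taken from the lecture: the function names and toy data are hypothetical, the pointwise loss is a plain squared error, the pairwise loss is a logistic loss on score differences (in the spirit of RankNet), and the listwise loss is a cross entropy between softmax distributions (in the spirit of ListNet's top-one formulation).

```python
import math

def pointwise_loss(scores, labels):
    """Pointwise: treat each offering independently (mean squared error)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_loss(scores, labels):
    """Pairwise: logistic loss over pairs (i, j) where i is more relevant than j."""
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                total += math.log(1.0 + math.exp(-(scores[i] - scores[j])))
                pairs += 1
    return total / pairs if pairs else 0.0

def softmax(xs):
    """Numerically stable softmax over a list of numbers."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def listwise_loss(scores, labels):
    """Listwise: cross entropy between softmax(labels) and softmax(scores)."""
    p = softmax(labels)
    q = softmax(scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Hypothetical toy data: relevance labels for three offerings of one request.
labels = [2, 1, 0]
good_scores = [3.0, 2.0, 1.0]  # orders the offerings correctly
bad_scores = [1.0, 2.0, 3.0]   # orders them in reverse

for name, loss in [("pointwise", pointwise_loss),
                   ("pairwise", pairwise_loss),
                   ("listwise", listwise_loss)]:
    print(name, round(loss(good_scores, labels), 3), round(loss(bad_scores, labels), 3))
```

In this toy case all three losses are lower for the correctly ordered scores. They differ in what they compare: individual offerings, pairs of offerings, or the whole list at once.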
The author explains several example applications of learning to rank including web search, collaborative filtering, definition search, keyphrase extraction, query dependent summarization, and re-ranking in machine translation. A formulation of learning for ranking creation is given in the statistical learning framework. Ongoing and future research directions for learning to rank are also discussed.
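The ranking aggregation task can be made concrete with Borda Count, one of the methods listed above. The following Python sketch is a minimal version written for illustration, not the lecture's own formulation: the function name and example data are hypothetical, and it assumes every input ranking list orders the same set of offerings from best to worst.

```python
from collections import defaultdict

def borda_count(rankings):
    """Aggregate ranking lists into one list.

    An offering at position p (0-based) in a list of n offerings
    receives n - 1 - p points; points are summed across lists.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            points[item] += n - 1 - position
    # Highest total first; ties are broken by item name for determinism.
    return sorted(points, key=lambda item: (-points[item], item))

# Hypothetical example: three ranking lists for the same request.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]
print(borda_count(rankings))  # ['a', 'b', 'c']
```

Here "a" collects 5 points, "b" 3, and "c" 1, so the aggregated list is a, b, c. This version needs no training data, so it is an unsupervised form of aggregation.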