DOI: 10.1145/3292500.3330676
Research article · Open access

Combining Decision Trees and Neural Networks for Learning-to-Rank in Personal Search

Published: 25 July 2019

Abstract

Decision Trees (DTs) like LambdaMART have been among the most effective learning-to-rank algorithms of the past decade. They typically work well with hand-crafted dense features (e.g., BM25 scores). Recently, Neural Networks (NNs) have shown impressive results in leveraging sparse and complex features (e.g., query and document keywords) directly when a large amount of training data is available. While there is a large body of work on using NNs for semantic matching between queries and documents, relatively little work has compared NNs with DTs on general learning-to-rank tasks, where dense features are also available and DTs can achieve state-of-the-art performance. In this paper, we study how to combine DTs and NNs to effectively bring together the benefits of both in the learning-to-rank setting. Specifically, we focus on personal search, where clicks serve as the primary labels (via unbiased learning-to-rank algorithms) and a very large amount of training data is readily available. Our combination methods are based on ensemble learning. We design 12 variants and compare them along two dimensions, ranking effectiveness and ease of deployment, using two of the largest personal search services: Gmail search and Google Drive search. We show that direct application of existing ensemble methods cannot achieve both. We thus design a novel method that uses NNs to compensate for DTs via boosting. We show that this method is not only easier to deploy but also gives comparable or better ranking accuracy.
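The boosting-style combination the abstract describes, in which an NN is trained to correct what the tree ensemble misses, can be sketched in a few lines. This is a toy pointwise-regression illustration under invented data and hyperparameters, not the paper's LambdaMART-based method: the "DT" stage below is a hand-rolled ensemble of regression stumps and the "NN" stage is a one-hidden-layer network fit to the trees' residuals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pointwise "ranking" data: dense features X, relevance-like labels y.
X = rng.normal(size=(200, 4))
y = X[:, 0] + 0.5 * np.tanh(3 * X[:, 1]) + 0.1 * rng.normal(size=200)

def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to residuals r; return a predictor."""
    best = None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= t
            if left.all() or (~left).all():
                continue
            lv, rv = r[left].mean(), r[~left].mean()
            sse = ((r - np.where(left, lv, rv)) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda Z, j=j, t=t, lv=lv, rv=rv: np.where(Z[:, j] <= t, lv, rv)

# Stage 1: a small GBDT-style ensemble of stumps (stand-in for LambdaMART).
lr = 0.5
dt_pred = np.zeros_like(y)
for _ in range(30):
    dt_pred += lr * fit_stump(X, y - dt_pred)(X)

# Stage 2: a tiny one-hidden-layer NN fit by full-batch gradient descent
# to the residuals the tree ensemble could not capture.
resid = y - dt_pred
W1 = rng.normal(scale=0.5, size=(4, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=16);      b2 = 0.0
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)            # hidden activations
    out = h @ W2 + b2                   # NN residual prediction
    g = 2 * (out - resid) / len(y)      # dLoss/dout for squared error
    gh = np.outer(g, W2) * (1 - h ** 2) # backprop through tanh
    W2 -= 0.1 * (h.T @ g); b2 -= 0.1 * g.sum()
    W1 -= 0.1 * (X.T @ gh); b1 -= 0.1 * gh.sum(axis=0)

# Final score: tree score plus the NN's correction (boosting).
nn_pred = np.tanh(X @ W1 + b1) @ W2 + b2
combined = dt_pred + nn_pred

mse_dt = ((y - dt_pred) ** 2).mean()
mse_comb = ((y - combined) ** 2).mean()
print("trees only:", mse_dt, "trees + NN correction:", mse_comb)
```

On this toy data the NN correction reduces training error relative to the trees alone; the paper's point is that this additive structure also keeps the tree-serving stack untouched, which is what makes the approach easy to deploy.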

Supplementary Material

MP4 File (p2032-li.mp4)



Published In

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019
3305 pages
ISBN:9781450362016
DOI:10.1145/3292500

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. decision trees
  2. ensemble methods
  3. learning to rank
  4. neural networks
  5. personal search

Conference

KDD '19

Acceptance Rates

KDD '19 Paper Acceptance Rate 110 of 1,200 submissions, 9%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

  • (2024) Horse race rank prediction using learning-to-rank approaches. Korean Journal of Applied Statistics 37(2), 239–253. DOI: 10.5351/KJAS.2024.37.2.239
  • (2024) Ensemble Modelling for Predicting Fish Mortality. Applied Sciences 14(15), 6540. DOI: 10.3390/app14156540
  • (2023) RD-Suite. In Proceedings of the 37th International Conference on Neural Information Processing Systems, 35748–35760. DOI: 10.5555/3666122.3667673
  • (2023) Tree based Progressive Regression Model for Watch-Time Prediction in Short-video Recommendation. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4497–4506. DOI: 10.1145/3580305.3599919
  • (2023) Residual Fusion Models with Neural Networks for CTR Prediction. In 2023 8th International Conference on Computer Science and Engineering (UBMK), 1–4. DOI: 10.1109/UBMK59864.2023.10286706
  • (2023) Making Machine Learning More Energy Efficient by Bringing It Closer to the Sensor. IEEE Micro 43(6), 11–18. DOI: 10.1109/MM.2023.3316348
  • (2023) Gradient-Boosted Based Structured and Unstructured Learning. In Artificial Neural Networks and Machine Learning – ICANN 2023, 439–451. DOI: 10.1007/978-3-031-44213-1_37
  • (2022) Scale Calibration of Deep Ranking Models. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4300–4309. DOI: 10.1145/3534678.3539072
  • (2022) On Optimizing Top-K Metrics for Neural Ranking Models. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2303–2307. DOI: 10.1145/3477495.3531849
  • (2022) Pushing the Envelope of Gradient Boosting Forests via Globally-Optimized Oblique Trees. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 285–294. DOI: 10.1109/CVPR52688.2022.00038
