DOI: 10.1145/3397271.3401256
Short paper

Query-level Early Exit for Additive Learning-to-Rank Ensembles

Published: 25 July 2020
    Abstract

    Search engine ranking pipelines are commonly based on large ensembles of machine-learned decision trees. Tight constraints on query response time have recently motivated researchers to investigate algorithms that speed up the traversal of the additive ensemble or terminate early the evaluation of documents unlikely to be ranked among the top-k. In this paper, we investigate the novel problem of query-level early exiting: deciding whether it is profitable to stop the traversal of the ranking ensemble early for all the candidate documents to be scored for a query, simply returning a ranking based on the additive scores computed by a limited portion of the ensemble. Besides the obvious advantage in query latency and throughput, we address the possible positive impact on ranking effectiveness. To this end, we study the actual contribution of incremental portions of the tree ensemble to the ranking of the top-k documents scored for a given query. Our main finding is that queries exhibit different behaviors as scores are accumulated during the traversal of the ensemble, and that query-level early stopping can remarkably improve ranking quality. We present a reproducible and comprehensive experimental evaluation, conducted on two public datasets, showing that query-level early exiting achieves an overall gain of up to 7.5% in NDCG@10 with a speedup of the scoring process of up to 2.2x.
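    The core mechanism the abstract describes can be sketched in a few lines: score every candidate document of a query with only a prefix of the additive ensemble, then rank by the partial scores. This is a minimal illustrative sketch, not the authors' implementation; all names (`score_documents_early_exit`, `exit_point`, the toy trees) are assumptions introduced here.

    ```python
    # Hypothetical sketch of query-level early exit over an additive tree
    # ensemble. Each "tree" is modeled as a callable doc -> float whose
    # outputs are summed, as in gradient-boosted ranking ensembles.

    def score_documents_early_exit(trees, documents, exit_point):
        """Score all candidate documents of a query using only the first
        `exit_point` trees of the ensemble, and rank by the partial scores."""
        scores = [0.0] * len(documents)
        for tree in trees[:exit_point]:       # traverse only a prefix of the ensemble
            for i, doc in enumerate(documents):
                scores[i] += tree(doc)        # accumulate the additive contribution
        # return document indices ranked by descending partial score
        return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)

    # Toy ensemble and candidate set: here an early exit after one tree
    # already yields the same ranking as the full traversal.
    trees = [lambda d: d * 0.5, lambda d: -d, lambda d: d * 2.0]
    docs = [1.0, 3.0, 2.0]
    full = score_documents_early_exit(trees, docs, exit_point=len(trees))
    early = score_documents_early_exit(trees, docs, exit_point=1)
    ```

    The query-level decision studied in the paper is when to pick a small `exit_point` for a whole query: the cost saving is proportional to the fraction of trees skipped, and for some queries the partial ranking is as good as, or better than, the full one.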

    Supplementary Material

    MOV File (3397271.3401256.mov)


    Cited By

    • (2023) Early Exit Strategies for Learning-to-Rank Cascades. IEEE Access, 11, 126691-126704. DOI: 10.1109/ACCESS.2023.3331088
    • (2022) Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs. Advances in Information Retrieval, 260-273. DOI: 10.1007/978-3-030-99736-6_18
    • (2021) Learning Early Exit Strategies for Additive Ranking Ensembles. In Proc. SIGIR '21, 2217-2221. DOI: 10.1145/3404835.3463088

    Published In

    SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2020
    2548 pages
    ISBN:9781450380164
    DOI:10.1145/3397271
    © 2020 Association for Computing Machinery.

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. additive regression trees
    2. efficiency/effectiveness trade-offs
    3. learning to rank
    4. query-level early exit

    Qualifiers

    • Short-paper

    Conference

    SIGIR '20

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

