Parallel boosted regression trees for web search ranking

Published: 28 March 2011

Abstract

Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine-learned web-search ranking, a domain notorious for very large data sets. In this paper, we propose a novel method for parallelizing the training of GBRT. Our technique parallelizes the construction of the individual regression trees and operates under the master-worker paradigm as follows. The data are partitioned among the workers. At each iteration, each worker summarizes its data partition using histograms. The master processor uses these histograms to build one layer of a regression tree, and then sends this layer to the workers, allowing them to build the histograms for the next layer. Our algorithm carefully orchestrates overlap between communication and computation to achieve good performance.
Since this approach is based on data partitioning and requires only a small amount of communication, it generalizes to distributed and shared-memory machines, as well as clouds. We present experimental results on both shared-memory machines and clusters for two large-scale web-search ranking data sets. We demonstrate that the loss in accuracy induced by the histogram approximation in the regression-tree construction can be compensated for through slightly deeper trees. As a result, we see no significant loss in accuracy on the Yahoo data sets and a very small reduction in accuracy for the Microsoft LETOR data. In addition, on shared-memory machines we obtain almost perfect linear speed-up with up to about 48 cores on the large data sets. On distributed-memory machines, we get a speedup of 25 with 32 processors. Due to data partitioning, our approach can scale to even larger data sets, on which one can reasonably expect even higher speedups.
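The worker/master exchange described in the abstract can be sketched in miniature: each worker compresses its partition into per-bin counts and gradient sums, and the master merges these histograms to pick the best split for a tree node. This is a minimal single-feature illustration, not the authors' implementation; the function names, the fixed bin edges, and the squared-loss gain formula are assumptions made for the example.

```python
import numpy as np

def worker_histogram(x, g, bins):
    # Worker side: summarize a data partition as per-bin example counts
    # and gradient (residual) sums over a fixed set of candidate bin edges.
    idx = np.digitize(x, bins)                      # bin index for each example
    n_bins = len(bins) + 1
    counts = np.bincount(idx, minlength=n_bins)
    grad_sums = np.bincount(idx, weights=g, minlength=n_bins)
    return counts, grad_sums

def master_best_split(histograms, bins):
    # Master side: merge the workers' histograms, then scan the bin edges
    # for the split that most reduces the squared error of the residuals.
    counts = sum(c for c, _ in histograms)
    grads = sum(s for _, s in histograms)
    total_n, total_g = counts.sum(), grads.sum()
    best_thr, best_gain = None, -np.inf
    left_n = left_g = 0.0
    for b in range(len(bins)):
        left_n += counts[b]
        left_g += grads[b]
        right_n, right_g = total_n - left_n, total_g - left_g
        if left_n == 0 or right_n == 0:
            continue
        # Gain of predicting each side's mean residual vs. the overall mean.
        gain = left_g**2 / left_n + right_g**2 / right_n - total_g**2 / total_n
        if gain > best_gain:
            best_thr, best_gain = bins[b], gain
    return best_thr, best_gain
```

In the paper's setting this exchange happens once per tree layer (and per feature), with the master broadcasting the new layer back to the workers; a single feature and one split suffice here to show the mechanics of the histogram approximation.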


Published In

WWW '11: Proceedings of the 20th International Conference on World Wide Web
March 2011
840 pages
ISBN:9781450306324
DOI:10.1145/1963405
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. boosted regression trees
  2. boosting
  3. distributed computing
  4. machine learning
  5. parallel computing
  6. ranking
  7. web search

Qualifiers

  • Research-article

Conference

WWW '11
WWW '11: 20th International World Wide Web Conference
March 28 - April 1, 2011
Hyderabad, India

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
