DOI: 10.1145/2970398.2970433

Fast Feature Selection for Learning to Rank

Published: 12 September 2016

Abstract

An emerging research area named Learning-to-Rank (LtR) has shown that effective solutions to the ranking problem can leverage machine learning techniques applied to a large set of features capturing the relevance of a candidate document for the user query. Large-scale search systems must however answer user queries very fast, and the computation of the features for candidate documents must comply with strict back-end latency constraints. The number of features cannot thus grow beyond a given limit, and Feature Selection (FS) techniques have to be exploited to find a subset of features that both meets latency requirements and leads to high effectiveness of the trained models. In this paper, we propose three new algorithms for FS specifically designed for the LtR context where hundreds of continuous or categorical features can be involved. We present a comprehensive experimental analysis conducted on publicly available LtR datasets and we show that the proposed strategies outperform a well-known state-of-the-art competitor.
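The paper's three algorithms are not described in the abstract, but the general filter-style approach to feature selection in LtR, scoring each feature's relevance to the relevance labels while penalizing redundancy with already-selected features, can be sketched as below. This is a hypothetical, generic greedy relevance-redundancy sketch using Pearson correlation (the function name `greedy_feature_selection` and all choices of measure are illustrative assumptions), not the algorithms proposed in the paper.

```python
import numpy as np

def greedy_feature_selection(X, y, k):
    """Select k feature indices by greedily maximizing relevance to the
    target while penalizing redundancy with already-selected features.
    Both relevance and redundancy are measured with absolute Pearson
    correlation, a common filter-style criterion."""
    n_features = X.shape[1]
    # Relevance: |corr(feature, label)| for each feature.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Pairwise feature redundancy: |corr(f_i, f_j)|.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    candidates = set(range(n_features))
    while len(selected) < k and candidates:
        if not selected:
            # First pick: the most relevant feature.
            best = max(candidates, key=lambda j: relevance[j])
        else:
            # Score = relevance minus mean redundancy w.r.t. selected set.
            best = max(
                candidates,
                key=lambda j: relevance[j] - redundancy[j, selected].mean(),
            )
        selected.append(best)
        candidates.remove(best)
    return selected
```

A filter of this kind is query-agnostic and cheap to compute, which is why such criteria are popular baselines when the selected subset must satisfy strict feature-computation latency budgets.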




Published In

ICTIR '16: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval
September 2016
318 pages
ISBN: 9781450344975
DOI: 10.1145/2970398


Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. feature selection
  2. learning to rank

Qualifiers

  • Short-paper

Funding Sources

  • European Commission

Conference

ICTIR '16

Acceptance Rates

ICTIR '16 Paper Acceptance Rate: 41 of 79 submissions, 52%
Overall Acceptance Rate: 235 of 527 submissions, 45%


Article Metrics

  • Downloads (last 12 months): 30
  • Downloads (last 6 weeks): 5
Reflects downloads up to 03 Oct 2024


Cited By

  • (2024) ReNeuIR at SIGIR 2024: The Third Workshop on Reaching Efficiency in Neural Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3051-3054. DOI: 10.1145/3626772.3657994. Online publication date: 10-Jul-2024.
  • (2024) Dimension Importance Estimation for Dense Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1318-1328. DOI: 10.1145/3626772.3657691. Online publication date: 10-Jul-2024.
  • (2024) Is Interpretable Machine Learning Effective at Feature Selection for Neural Learning-to-Rank? Advances in Information Retrieval, 384-402. DOI: 10.1007/978-3-031-56066-8_29. Online publication date: 24-Mar-2024.
  • (2023) Early Exit Strategies for Learning-to-Rank Cascades. IEEE Access, 11, 126691-126704. DOI: 10.1109/ACCESS.2023.3331088. Online publication date: 2023.
  • (2022) A graph-based feature selection method for learning to rank using spectral clustering for redundancy minimization and biased PageRank for relevance analysis. Computer Science and Information Systems, 19(1), 141-164. DOI: 10.2298/CSIS201220042Y. Online publication date: 2022.
  • (2022) Towards Feature Selection for Ranking and Classification Exploiting Quantum Annealers. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2814-2824. DOI: 10.1145/3477495.3531755. Online publication date: 6-Jul-2022.
  • (2022) The Istella22 Dataset. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3099-3107. DOI: 10.1145/3477495.3531740. Online publication date: 6-Jul-2022.
  • (2022) HiPerMovelets: high-performance movelet extraction for trajectory classification. International Journal of Geographical Information Science, 36(5), 1012-1036. DOI: 10.1080/13658816.2021.2018593. Online publication date: 3-Jan-2022.
  • (2021) Neural Feature Selection for Learning to Rank. Advances in Information Retrieval, 342-349. DOI: 10.1007/978-3-030-72240-1_34. Online publication date: 28-Mar-2021.
  • (2019) Risk-Sensitive Learning to Rank with Evolutionary Multi-Objective Feature Selection. ACM Transactions on Information Systems, 37(2), 1-34. DOI: 10.1145/3300196. Online publication date: 14-Feb-2019.
