Learning to Rank User Queries to Detect Search Tasks

Published: 12 September 2016 · DOI: 10.1145/2970398.2970407

Abstract

We present a framework for discovering sets of web queries that share a similar latent need, called search tasks, from the user queries stored in a search engine log. The framework consists of two main modules: Query Similarity Learning (QSL) and Graph-based Query Clustering (GQC). The former learns a query similarity function from a ground truth of manually labeled search tasks. The latter represents each user's search log as a graph whose nodes are queries, and uses the learned similarity function to weight the edges between query pairs. Search tasks are finally detected by clustering the queries connected by the strongest links, i.e., by extracting the connected components that remain once weak edges are pruned. To discriminate between "strong" and "weak" links, the GQC module also entails a learning phase, whose goal is to estimate the best threshold for pruning the edges of the graph. We discuss how the QSL module can be effectively implemented using Learning to Rank (L2R) techniques. Experiments on a real-world search engine log show that query similarity functions learned with L2R lead to better-performing GQC implementations than similarity functions induced by other state-of-the-art machine learning solutions, such as logistic regression and decision trees.
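
The GQC step described above reduces to a simple graph procedure once a similarity function is available. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the QSL module has already produced a pairwise similarity function (a toy Jaccard term overlap stands in for the learned L2R model) and that a pruning threshold has already been estimated. The function name detect_search_tasks and the demo data are hypothetical.

```python
# Minimal sketch of graph-based query clustering (GQC), assuming a
# learned pairwise similarity and a learned pruning threshold.
# Illustrative only: names and the toy similarity are not from the paper.
from itertools import combinations
from typing import Callable, Dict, List, Set


def detect_search_tasks(
    queries: List[str],                       # one user's queries, assumed distinct
    similarity: Callable[[str, str], float],  # learned query-pair similarity
    threshold: float,                         # learned edge-pruning threshold
) -> List[Set[str]]:
    """Return search tasks as connected components of the pruned query graph."""
    # Build the adjacency list, keeping only the "strong" links.
    adj: Dict[str, Set[str]] = {q: set() for q in queries}
    for q1, q2 in combinations(queries, 2):
        if similarity(q1, q2) >= threshold:
            adj[q1].add(q2)
            adj[q2].add(q1)

    # Extract connected components with an iterative depth-first search.
    tasks: List[Set[str]] = []
    visited: Set[str] = set()
    for q in queries:
        if q in visited:
            continue
        component, stack = set(), [q]
        while stack:
            node = stack.pop()
            if node not in visited:
                visited.add(node)
                component.add(node)
                stack.extend(adj[node] - visited)
        tasks.append(component)
    return tasks


if __name__ == "__main__":
    # Toy stand-in for the learned similarity: Jaccard overlap of query terms.
    def jaccard(a: str, b: str) -> float:
        ta, tb = set(a.split()), set(b.split())
        return len(ta & tb) / len(ta | tb)

    log = ["cheap flights rome", "rome flights deals",
           "python list sort", "sort list python example"]
    # Two tasks are recovered: the Rome trip and the Python question.
    print(detect_search_tasks(log, jaccard, threshold=0.3))
```

Connected components keep the clustering itself parameter-free; the only knob is the pruning threshold, which is why the GQC learning phase in the paper concentrates on estimating that single value from labeled tasks.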


Cited By

  • (2023) Recommending tasks based on search queries and missions. Natural Language Engineering, pp. 1-25. DOI: 10.1017/S1351324923000219. Online publication date: 17-May-2023.
  • (2017) Periodicity in User Engagement with a Search Engine and Its Application to Online Controlled Experiments. ACM Transactions on the Web, 11(2):1-35. DOI: 10.1145/2856822. Online publication date: 14-Apr-2017.


Published In

ICTIR '16: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval
September 2016
318 pages
ISBN:9781450344975
DOI:10.1145/2970398
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. query log mining
  2. search task discovery

Qualifiers

  • Research-article

Conference

ICTIR '16

Acceptance Rates

ICTIR '16 paper acceptance rate: 41 of 79 submissions, 52%.
Overall acceptance rate: 235 of 527 submissions, 45%.

