Evaluating aggregated search using interleaving

Published: 27 October 2013
DOI: 10.1145/2505515.2505698

Abstract

A result page of a modern web search engine is often much more complicated than a simple list of "ten blue links." In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a system is called an aggregated or federated search system.
Because search engines evolve over time, their results need to be constantly evaluated. However, one of the most efficient and widely used evaluation methods, interleaving, cannot be directly applied to aggregated search systems, as it ignores the need to group results originating from the same source (vertical results).
We propose an interleaving algorithm that allows comparisons of search engine result pages containing grouped vertical documents. We compare our algorithm to existing interleaving algorithms and other evaluation methods (such as A/B-testing), both on real-life click log data and in simulation experiments. We find that our algorithm allows us to perform unbiased and accurate interleaved comparisons that are comparable to conventional evaluation techniques. We also show that our interleaving algorithm produces a ranking that does not substantially alter the user experience, while being sensitive to changes in both the vertical result block and the non-vertical document rankings. All this makes our proposed interleaving algorithm an essential tool for comparing IR systems with complex aggregated pages.
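To make the idea of interleaving with grouped vertical results concrete, the sketch below adapts the well-known Team-Draft interleaving scheme so that when a ranker contributes a vertical document, the rest of that vertical block is placed with it instead of being scattered across the page. This is an illustration only, not the algorithm proposed in the paper; the function name team_draft_grouped, the (doc_id, vertical) input format, and the page-length parameter are assumptions made for the example.

    import random


    def team_draft_grouped(ranking_a, ranking_b, length=10):
        """Team-Draft-style interleaving that keeps vertical results contiguous.

        Each ranking is a list of (doc_id, vertical) tuples, where `vertical` is a
        label such as 'News' or None for an ordinary web document. When a team
        contributes a vertical document, the remaining documents of that vertical
        from the same ranking are appended too, so grouped results stay together.
        (Illustrative sketch; not the authors' exact algorithm.)
        """
        interleaved = []          # combined result page shown to the user
        team_of = {}              # doc_id -> 'A' or 'B', used to attribute clicks
        used = set()              # doc_ids already placed on the page
        count = {'A': 0, 'B': 0}  # how many picks each team has made so far

        def pick(team, ranking):
            for doc, vert in ranking:
                if doc in used:
                    continue
                block = [(doc, vert)]
                if vert is not None:
                    # Pull the remaining documents of this vertical so the
                    # block is not broken up on the interleaved page.
                    block += [(d, v) for d, v in ranking
                              if v == vert and d != doc and d not in used]
                for d, _ in block:
                    interleaved.append(d)
                    team_of[d] = team
                    used.add(d)
                count[team] += 1
                return True
            return False

        while len(interleaved) < length:
            # The team that has contributed less picks next; ties broken by coin flip.
            if count['A'] < count['B'] or (count['A'] == count['B'] and random.random() < 0.5):
                order = [('A', ranking_a), ('B', ranking_b)]
            else:
                order = [('B', ranking_b), ('A', ranking_a)]
            if not any(pick(team, ranking) for team, ranking in order):
                break  # both rankings are exhausted
        return interleaved[:length], team_of

In a Team-Draft-style comparison, clicks observed on the interleaved page are credited to the team that contributed the clicked document (team_of above); aggregated over many impressions, the ranker whose team attracts more clicks is inferred to be preferred by users.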

    Published In

    CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
    October 2013
    2612 pages
    ISBN:9781450322638
    DOI:10.1145/2505515

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. a/b-testing
    2. evaluation
    3. implicit feedback
    4. vertical search

    Qualifiers

    • Research-article

    Conference

CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
October 27 - November 1, 2013
San Francisco, California, USA

    Acceptance Rates

    CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
