
Theoretical Analysis on the Efficiency of Interleaved Comparisons

  • Conference paper
  • In: Advances in Information Retrieval (ECIR 2023)

Abstract

This study presents a theoretical analysis of the efficiency of interleaving, an online evaluation method for rankings. Although interleaving has already been deployed in production systems, the source of its high efficiency has not been clarified in the literature. We therefore begin by designing a simple interleaving method similar to ordinary interleaving methods. We then derive a condition under which this method is more efficient than A/B testing, and find that it holds when users leave the ranking depending on the item's relevance, a typical assumption in click models. Finally, experiments based on numerical analysis and user simulation demonstrate that the theoretical results are consistent with the empirical results.
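The abstract contrasts interleaving with A/B testing: an interleaving method merges two rankings into a single list shown to the user and credits each click to the ranking that contributed the clicked item. As a rough illustration only (this is not the specific method analyzed in the paper; the per-pick drafting rule and all names below are illustrative assumptions), a team-draft-style interleave and click-credit step might look like:

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng):
    """Build one interleaved list from two rankings of the same item set.

    Simplified per-pick team drafting: the team with fewer items so far
    picks next; ties are broken by a coin flip. Assumes both rankings are
    permutations of the same items, so an unplaced candidate always exists.
    """
    interleaved, teams = [], []
    counts = {"a": 0, "b": 0}
    while len(interleaved) < len(ranking_a):
        pick_a = counts["a"] < counts["b"] or (
            counts["a"] == counts["b"] and rng.random() < 0.5
        )
        team, source = ("a", ranking_a) if pick_a else ("b", ranking_b)
        # Take the highest-ranked item from the chosen team not yet placed.
        item = next(d for d in source if d not in interleaved)
        interleaved.append(item)
        teams.append(team)
        counts[team] += 1
    return interleaved, teams

def infer_preference(teams, clicks):
    """Credit each click to the team that contributed the clicked item."""
    wins_a = sum(1 for t, c in zip(teams, clicks) if c and t == "a")
    wins_b = sum(1 for t, c in zip(teams, clicks) if c and t == "b")
    if wins_a != wins_b:
        return "a" if wins_a > wins_b else "b"
    return "tie"
```

Because each impression yields a within-user, paired comparison of the two rankers, the click-credit counts tend to vary less across users than the between-group metric difference measured in A/B testing, which is the efficiency gap the paper analyzes theoretically.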


Notes

  1. https://github.com/mpkato/interleaving.


Author information

Correspondence to Kojiro Iizuka, Hajime Morita or Makoto P. Kato.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Iizuka, K., Morita, H., Kato, M.P. (2023). Theoretical Analysis on the Efficiency of Interleaved Comparisons. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_29

  • DOI: https://doi.org/10.1007/978-3-031-28244-7_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28243-0

  • Online ISBN: 978-3-031-28244-7

  • eBook Packages: Computer Science (R0)
