
Context-Driven Interactive Query Simulations Based on Generative Large Language Models

  • Conference paper
  • In: Advances in Information Retrieval (ECIR 2024)

Abstract

Simulating user interactions enables a more user-oriented evaluation of information retrieval (IR) systems. While user simulations are cost-efficient and reproducible, many approaches lack fidelity with respect to real user behavior. Most notably, current user models neglect the user’s context, which is the primary driver of perceived relevance and of the interactions with the search results. To this end, this work introduces the simulation of context-driven query reformulations. The proposed query generation methods build upon recent Large Language Model (LLM) approaches and consider the user’s context throughout the simulation of a search session. Compared to simple context-free query generation approaches, these methods show better effectiveness and allow the simulation of more efficient IR sessions. Likewise, our evaluations consider more interaction context than current session-based measures and reveal complementary insights beyond the established evaluation protocols. We conclude with directions for future work and provide an entirely open experimental setup.
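The session loop the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: `UserContext`, `reformulate`, and the toy `search`/`judge` functions are hypothetical stand-ins, and the reformulation step replaces the paper's LLM call with simple term appending so the sketch stays self-contained and runnable.

```python
from dataclasses import dataclass, field

@dataclass
class UserContext:
    """Minimal user context: the information need plus what was learned so far."""
    topic: str
    seen_docs: list = field(default_factory=list)
    relevant_terms: list = field(default_factory=list)

def reformulate(ctx: UserContext, turn: int) -> str:
    """Stand-in for an LLM-based reformulator. A real system would prompt a
    model with the topic and previously seen snippets; here we append terms
    learned from relevant documents to keep the sketch self-contained."""
    return " ".join([ctx.topic] + ctx.relevant_terms[:turn])

def simulate_session(topic, search, judge, max_queries=3):
    """One simulated session: issue a query, scan the results, update the
    context with terms from relevant documents, then reformulate."""
    ctx = UserContext(topic=topic)
    session = []
    for turn in range(max_queries):
        query = reformulate(ctx, turn)
        results = search(query)
        for doc in results:
            ctx.seen_docs.append(doc)
            if judge(doc):  # simulated relevance judgment
                for term in doc.split():
                    # naive substring check; a real simulator would weigh terms
                    if term not in ctx.relevant_terms and term not in ctx.topic:
                        ctx.relevant_terms.append(term)
        session.append((query, results))
    return session

# Toy corpus and simulated components (all hypothetical):
corpus = ["solar panel efficiency", "solar cell materials", "wind turbines"]
search = lambda q: [d for d in corpus if any(t in d for t in q.split())][:2]
judge = lambda d: "solar" in d

session = simulate_session("solar energy", search, judge, max_queries=2)
print([q for q, _ in session])  # the query evolves with the simulated context
```

The key design point this mirrors is that each reformulation is conditioned on the accumulated session context (topic plus examined documents), rather than being generated once from the topic alone.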



Notes

  1. https://github.com/irgroup/SUIR.
  2. https://platform.openai.com/docs/model-index-for-researchers.
  3. https://catalog.ldc.upenn.edu/LDC2008T19.
  4. https://trec.nist.gov/data/wapost/.
  5. https://github.com/terrierteam/pyterrier_doc2query.



Acknowledgements

This work was supported by Klaus Tschira Stiftung (JoIE - 00.003.2020) and Deutsche Forschungsgemeinschaft (RESIRE - 509543643).

Author information

Corresponding author

Correspondence to Björn Engelmann.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Engelmann, B., Breuer, T., Friese, J.I., Schaer, P., Fuhr, N. (2024). Context-Driven Interactive Query Simulations Based on Generative Large Language Models. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609. Springer, Cham. https://doi.org/10.1007/978-3-031-56060-6_12


  • DOI: https://doi.org/10.1007/978-3-031-56060-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56059-0

  • Online ISBN: 978-3-031-56060-6

  • eBook Packages: Computer Science, Computer Science (R0)
