Abstract
Simulating user interactions enables a more user-oriented evaluation of information retrieval (IR) systems. While user simulations are cost-efficient and reproducible, many approaches lack fidelity with respect to real user behavior. Most notably, current user models neglect the user's context, which is the primary driver of perceived relevance and of the interactions with the search results. To this end, this work introduces the simulation of context-driven query reformulations. The proposed query generation methods build upon recent Large Language Model (LLM) approaches and consider the user's context throughout the simulation of a search session. Compared to simple context-free query generation approaches, these methods show better effectiveness and allow the simulation of more efficient IR sessions. Similarly, our evaluations consider more interaction context than current session-based measures and reveal complementary insights beyond the established evaluation protocols. We conclude with directions for future work and provide an entirely open experimental setup.
Acknowledgements
This work was supported by Klaus Tschira Stiftung (JoIE - 00.003.2020) and Deutsche Forschungsgemeinschaft (RESIRE - 509543643).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Engelmann, B., Breuer, T., Friese, J.I., Schaer, P., Fuhr, N. (2024). Context-Driven Interactive Query Simulations Based on Generative Large Language Models. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609. Springer, Cham. https://doi.org/10.1007/978-3-031-56060-6_12
Print ISBN: 978-3-031-56059-0
Online ISBN: 978-3-031-56060-6