Explanations in Open User Models
Rully Agus Hendrawan, Peter Brusilovsky, Arun Balajiee Lekshmi Narayanan, and Jordan Barria-Pineda
UMAP Adjunct ’24, July 01–04, 2024, Cagliari, Italy
open user model could make the information exploration process more efficient [3, 31, 33]. However, it has also been shown that the ability to manipulate user models requires a good understanding of the model components [4]. If the meaning of the model components is not clear to users, their attempts to adjust the user model might negatively affect system performance [5].

A good example of a domain where the meaning of user model components might be hard for system users to understand is academic research, which has been explored in a range of search and recommender systems for academic papers [10] or researchers [31]. Information exploration systems in the academic domain have typically used research topics such as "reinforcement learning" or "relevance feedback" as elements of the user model to reflect the user’s interest in these topics [14, 31, 33]. The exact meaning of these terms might not always be clear to system users, especially students, who are the target users of many systems in this domain [31, 33]. Moreover, less experienced users might not always understand the differences between similar topics such as “user model” and “user profile”. This lack of understanding might lead to mistakes in all kinds of model adjustments: adding, removing, and re-weighting the topics in the model.

This paper investigates whether information exploration with open user models in a complex domain could be facilitated by topic explanations powered by modern Large Language Models. The starting point for our research was the Grapevine exploratory search system for finding research advisors [31], which used research topics as elements of its open user model. While originally developed for Ph.D. students looking for advisors, the system has been increasingly used by undergraduate students interested in research and even high school students selecting a college to apply to. These users have a relatively weak understanding of the research topics used in the system. To help less prepared users, we developed two types of topic explanation: "individual keyphrase explanation", which explains the nature of a specific topic to the user, and "relationship explanation between keyphrases", which highlights the relationship between a topic that the user considers adding to the model and topics already present in the model. Explanations are presented on demand when the user clicks the explanation button, at three levels of detail: keyphrase without explanation (no-exp), individual keyphrase explanation (ind-exp), and relationship explanation (rel-exp) [13]. Using the Grapevine system extended with these explanations, we performed a user study to understand to what extent the explanation affects, or is affected by, the exploration process.

This paper is structured as follows. First, we outline the background of this research and discuss related work. We then present the implementation of explanations in Grapevine. Afterward, we describe the user study, present the results, and discuss the findings. Finally, we summarize the work and its limitations.

2 BACKGROUND AND RELATED WORK

Previous research has shown that the inclusion of exploration-focused features, open profile building, and personalization can have a positive impact on user exploration behavior and goals [31, 44]. Concept graphs, which represent a network of keyphrases and their interrelationships, provide a natural method to navigate items [4]. However, managing the number of items in a user profile and maintaining simplicity in these features is crucial, as exceeding a certain threshold of complexity can diminish their benefits [19, 36]. Additionally, interpreting the relationships of entities within graphs remains challenging, as the complexity often obscures the rationale behind specific recommendations.

Recent advances in Large Language Models (LLMs) have significantly enhanced their capabilities in text adaptation, especially in generating definitions and explanations. Instruction-tuned LLMs are particularly adept in this regard [39]. These LLMs can be directed to provide definitions through in-context learning or prompting, and techniques like few-shot and chain-of-thought prompting lead to substantial improvements in generating specific explanations [12, 40]. The variety of approaches to prompt engineering has led researchers to categorize emerging patterns [41]. Incorporating external sources to enhance LLM generation has been identified as essential to address certain limitations, leading to the development of augmented language models [27]. Retrieval-Augmented Generation (RAG) has been particularly noted for its efficacy in knowledge-intensive tasks [21]. Using the Web as a readily available source of context represents a pragmatic approach [28]. In contrast, for domain-specific explanations, retrieving information from internal libraries is necessary [34]. Augmentation also plays a critical role in generating explanations relating two research documents [20]. Combining LLMs with knowledge graphs, a reliable source of structured context, further enhances their utility [17, 30]. This process can be further automated, allowing LLMs to suggest which context to use [35].

LLM explanations may not always resemble dictionary definitions, but they are often easier to understand [20]. Recent research demonstrates that LLM explanations are useful for learning new concepts in fields such as computer science [2] and medical physiology [1]. LLMs could also alleviate the tedious work required to build a concept model [26]. LLMs can provide contextual explanations, making the underlying logic of recommender systems comprehensible to users. In the work presented in this paper, our objective is to explore how explanations can facilitate an interactive exploration experience and help users connect their existing knowledge with the information they seek [31].

3 GRAPEVINE: EXPLORATION AND EXPLANATIONS WITH AN OPEN USER MODEL

3.1 Grapevine: An Exploratory Recommender System

The starting point of our research on explainable user models was Grapevine [31] - an exploratory recommender system with an open user model. Grapevine was designed to help students seek faculty advisors for a variety of academic projects, such as capstone projects, independent studies, master’s theses, and Ph.D. theses. The original version of Grapevine was built through a multistage iterative design process that started with interviews and observation to understand how students generally search for and locate advisors and was followed by the design and evaluation of multiple versions of the system [32]. Our studies demonstrated that the
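To make the mechanics of the two explanation types concrete, the following is a minimal sketch, in Python, of how the on-demand ind-exp and rel-exp requests described in the introduction might be phrased as LLM prompts. The function names and the prompt wording are our own illustrative assumptions, not the Grapevine implementation; in a real system the resulting prompt would be sent to an instruction-tuned LLM and the response shown when the user clicks the explanation button.

```python
# Illustrative sketch only: assumed names and prompt wording,
# not the paper's actual implementation.

def build_ind_exp_prompt(topic: str) -> str:
    """ind-exp: explain a single research topic to a novice user."""
    return (
        f"Explain the research topic '{topic}' in two or three sentences "
        "that a student with no background in the field can understand."
    )

def build_rel_exp_prompt(candidate: str, profile_topics: list[str]) -> str:
    """rel-exp: relate a candidate topic to topics already in the user model."""
    existing = ", ".join(profile_topics)
    return (
        f"The user's profile already contains these research topics: {existing}. "
        f"Explain in plain language how the topic '{candidate}' relates to them."
    )

# Example: a user considering "user model" with "user profile" already in the model
prompt = build_rel_exp_prompt("user model", ["user profile", "personalization"])
```

A design note: keeping prompt construction separate from the LLM call makes the three levels of detail (no-exp, ind-exp, rel-exp) easy to switch at the interface layer.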
confidence in their final advisor selections. However, it is not clear whether these users had more advanced or unusual needs, which caused them to work harder and engage with relationship explanations but also left them less satisfied, or whether they were simply more active and investigative users who expected more from the system.

We also analyzed how the use of explanations is related to the user’s context and task. Our observations suggest that high experience in a particular field, such as that accumulated by graduate students, is linked to exploring fewer basic explanations but engaging more with relationship explanations. In addition, we found no significant association between explanation usage and either the importance of selected topics or confidence in the choice of faculty.

7 LIMITATION AND FUTURE WORK

The purpose of this study was to investigate how the level of detail in explanations, building user profiles, and controlling recommendations are related. However, due to our limited sample size, we strongly suggest replicating or extending our model before applying it to real-world problems. While we did not discover a correlation between explanations and a user’s confidence in selecting advisors, it may be worth investigating user tasks further, such as drafting an introductory email. This study also lays the foundation for creating flexible information exploration systems that use multiple AI agents: search, exploration, and explanation. It provides a test ground for exploring the mechanics of AI systems that can help users understand complex symbolic representations within the system.

REFERENCES
[1] Mayank Agarwal, Ayan Goswami, and Priyanka Sharma. 2023. Evaluating ChatGPT-3.5 and Claude-2 in Answering and Explaining Conceptual Medical Physiology Multiple-Choice Questions. Cureus 15, 9 (Sept. 2023), e46222.
[2] Vibhor Agarwal, Nakul Thureja, Madhav Krishan Garg, Sahiti Dharmavaram, Meghna, and Dhruv Kumar. 2024. "Which LLM Should I Use?": Evaluating LLMs for Tasks Performed by Undergraduate Computer Science Students in India. arXiv:2402.01687 [cs]
[3] Jaewook Ahn and Peter Brusilovsky. 2013. Adaptive visualization for exploratory information retrieval. Information Processing and Management 49, 5 (2013), 1139–1164.
[4] Jae Ahn, Peter Brusilovsky, and Shuguang Han. 2015. Personalized Search: Reconsidering the Value of Open User Models. In Proceedings of the 20th International Conference on Intelligent User Interfaces. ACM, 202–212. doi:10.1145/2678025.2701410
[5] Jae-wook Ahn, Peter Brusilovsky, Jonathan Grady, Daqing He, and Sue Yeon Syn. 2007. Open user profiles for adaptive news systems: help or harm?. In the 16th international conference on World Wide Web, WWW ’07. ACM, 11–20.
[6] Jae-wook Ahn, Peter Brusilovsky, Daqing He, Jonathan Grady, and Qi Li. 2008. Personalized Web Exploration with Task Models. In the 17th international conference on World Wide Web, WWW ’08. ACM, 1–10.
[7] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. DBpedia: A Nucleus for a Web of Open Data. In Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC’07/ASWC’07). Springer-Verlag, Berlin, Heidelberg, 722–735.
[8] Fedor Bakalov, Birgitta König-Ries, Andreas Nauerz, and Martin Welsch. 2010. IntrospectiveViews: An Interface for Scrutinizing Semantic User Models. In 18th International Conference on User Modeling, Adaptation, and Personalization (UMAP 2010) (Lecture Notes in Computer Science, Vol. 6075), Paul De Bra, Alfred Kobsa, and David Chin (Eds.). Springer, 219–230. doi:10.1007/978-3-642-13470-8_21
[9] Randall Balestriero, Jerome Pesenti, and Yann LeCun. 2021. Learning in High Dimension Always Amounts to Extrapolation. arXiv:2110.09485 [cs]
[10] Joeran Beel, Bela Gipp, Stefan Langer, and Corinna Breitinger. 2016. Paper Recommender Systems: A Literature Survey. International Journal on Digital Libraries 17, 4 (2016), 305–338.
[11] Max Braun, Klaas Dellschaft, Thomas Franz, Dominik Hering, Peter Jungen, Hagen Metzler, Eugen Müller, Alexander Rostilov, and Carsten Saathoff. 2008. Personalized Search and Exploration with MyTag. In the 17th international conference on World Wide Web, WWW ’08. ACM, 1031–1032.
[12] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models Are Few-Shot Learners. arXiv:2005.14165 [cs]
[13] Mohamed Amine Chatti, Mouadh Guesmi, Laura Vorgerd, Thao Ngo, Shoeb Joarder, Qurat Ul Ain, and Arham Muslim. 2022. Is More Always Better? The Effects of Personal Characteristics and Level of Detail on the Perception of Explanations in a Recommender System. In Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization. ACM, Barcelona, Spain, 254–264.
[14] Dario De Nart, Felice Ferrara, and Carlo Tasso. 2013. Personalized Access to Scientific Publications: from Recommendation to Explanation. In 21st Conference on User Modeling, Adaptation and Personalization (UMAP 2013) (Lecture Notes in Computer Science), Sandra Carberry, Stephan Weibelzahl, Alessandro Micarelli, and Giovanni Semeraro (Eds.). 296–301. doi:10.1007/978-3-642-38844-6_2
[15] Cassius Dhelon, Jae-wook Ahn, V Kasireddy, and Nirmal Mukhi. 2019. Interactive Learning in a Conversational Intelligent Tutoring System Using Student Feedback, Concept Grouping and Text Linking. In Proceedings of the 13th International Technology, Education and Development Conference.
[16] Ayoub El Majjodi, Alain D. Starke, and Christoph Trattner. 2022. Nudging Towards Health? Examining the Merits of Nutrition Labels and Personalization in a Recipe Recommender System. In Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization. ACM, Barcelona, Spain, 48–56.
[17] Chao Feng, Xinyu Zhang, and Zichu Fei. 2023. Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs. arXiv:2309.03118 [cs]
[18] Yingqiang Ge, Wenyue Hua, Kai Mei, Jianchao Ji, Juntao Tan, Shuyuan Xu, Zelong Li, and Yongfeng Zhang. 2023. OpenAGI: When LLM Meets Domain Experts. In Advances in Neural Information Processing Systems, A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 5539–5568.
[19] Julio Guerra-Hollstein, Jordan Barria-Pineda, Christian D. Schunn, Susan Bull, and Peter Brusilovsky. 2017. Fine-Grained Open Learner Models: Complexity Versus Support. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization (UMAP ’17). Association for Computing Machinery, New York, NY, USA, 41–49.
[20] Jenny Kunz and Marco Kuhlmann. 2024. Properties and Challenges of LLM-Generated Explanations. arXiv:2402.10532 [cs]
[21] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2021. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv:2005.11401 [cs]
[22] Xianming Li and Jing Li. 2023. AnglE-optimized Text Embeddings. arXiv:2309.12871 [cs]
[23] Matteo Lissandrini and Davide Mottin. 2022. Knowledge Graph Exploration Systems: Are We Lost? In Proceedings of the 12th Conference on Innovative Data Systems Research.
[24] Benedikt Loepp. 2022. Recommender Systems Alone Are Not Everything: Towards a Broader Perspective in the Evaluation of Recommender Systems. In Proceedings of Perspectives on the Evaluation of Recommender Systems Workshop (PERSPECTIVES 2022) at 16th ACM Conference on Recommender Systems.
[25] Gary Marchionini. 2006. Exploratory search: From finding to understanding. Commun. ACM 49, 4 (2006), 41–46.
[26] Lars-Peter Meyer, Claus Stadler, Johannes Frey, Norman Radtke, Kurt Junghanns, Roy Meissner, Gordian Dziwis, Kirill Bulert, and Michael Martin. 2023. LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT. arXiv:2307.06917 [cs]
[27] Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, and Thomas Scialom. 2023. Augmented Language Models: A Survey. arXiv:2302.07842 [cs]
[28] Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, and John Schulman. 2022. WebGPT: Browser-assisted Question-Answering with Human Feedback. arXiv:2112.09332 [cs]
[29] OpenAI. 2022. OpenAI Platform - GPT Base.
[30] Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, and Xindong Wu. 2024. Unifying Large Language Models and Knowledge Graphs: A Roadmap. IEEE Transactions on Knowledge and Data Engineering (2024), 1–20. doi:10.1109/TKDE.2024.3352100
[31] Behnam Rahdari, Peter Brusilovsky, and Dmitriy Babichenko. 2020. Personalizing Information Exploration with an Open User Model. In Proceedings of the 31st ACM Conference on Hypertext and Social Media. ACM, Virtual Event, USA, 167–176.
[32] Behnam Rahdari, Peter Brusilovsky, Dmitriy Babichenko, Eliza Beth Littleton, Ravi Patel, Jaime Fawsett, and Zara Blum. 2020. Grapevine: A profile-based exploratory search and recommendation system for finding research advisors. In 83rd Annual Meeting of the Association for Information Science and Technology.
[33] Tuukka Ruotsalo, Kumaripaba Athukorala, Dorota Glowacka, Ksenia Konyushkova, Antti Oulasvirta, Samuli Kaipiainen, Samuel Kaski, and Giulio Jacucci. 2013. Supporting exploratory search tasks with interactive user modeling. In 2013 Annual Meeting of American Society for Information Science and Technology, Vol. 50. Wiley, 1–10.
[34] Jaromir Savelka, Kevin D. Ashley, Morgan A. Gray, Hannes Westermann, and Huihui Xu. 2023. Explaining Legal Concepts with Augmented Large Language Models (GPT-4). arXiv:2306.09525 [cs]
[35] Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language Models Can Teach Themselves to Use Tools. arXiv:2302.04761 [cs]
[36] Sergey Sosnovsky and Peter Brusilovsky. 2015. Evaluation of Topic-Based Adaptation and Student Modeling in QuizGuide. User Modeling and User-Adapted Interaction 25, 4 (Oct. 2015), 371–424.
[37] Pertti Vakkari. 2016. Searching as Learning: A Systematization Based on Literature. Journal of Information Science 42, 1 (Feb. 2016), 7–18. doi:10.1177/0165551515615833
[38] Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. arXiv:2002.10957 [cs]
[39] Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, and Quoc V. Le. 2022. Finetuned Language Models Are Zero-Shot Learners. arXiv:2109.01652 [cs]
[40] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, and Denny Zhou. 2023. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903 [cs]
[41] Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C. Schmidt. 2023. A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT. arXiv:2302.11382 [cs]
[42] Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. 2024. C-Pack: Packaged Resources To Advance General Chinese Embedding. arXiv:2309.07597 [cs]
[43] Ziqi Yin, Hao Wang, Kaito Horio, Daisuke Kawahara, and Satoshi Sekine. 2024. Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance. arXiv:2402.14531 [cs]
[44] Run Yu, Zach Pardos, Hung Chau, and Peter Brusilovsky. 2021. Orienting Students to Course Recommendations Using Three Types of Explanation. In Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization. ACM, Utrecht, Netherlands, 238–245.

A APPENDIX: POST-STUDY QUESTIONNAIRE

(1) For each topic you chose, please rate its novelty to you. Before using the system, how familiar were you with these topics?
(_) Not Familiar: I was not aware of it before.
(_) Slightly Familiar: I had minimal awareness or understanding.
(_) Moderately/Somewhat Familiar: I had heard about it previously.
(_) Familiar: I had a good understanding but not in-depth.
(_) Very Familiar: I have extensive knowledge or experience.
(2) For each topic you chose, rate how important it is to you. How important are these topics to your research or studies?
(_) Not Important: It has no relevance to my work.
(_) Slightly Important: It has minimal relevance.
(_) Moderately/Somewhat Important: It is somewhat relevant.
(_) Important: It is a key aspect of my work.
(_) Very Important: It is central to my research/studies.
(3) Compose a Message to one of the Faculty Members: Write an introductory message, pitch a project idea, and describe why they’re a good match for your project and how they fit into your project plans. Be as detailed as possible – this is your chance to reflect on your decisions. Below is the template you can use later to write your message.
(4) Choose one faculty member from the list below to be the recipient of your message.
(_) Person 1 (_) Person 2 (_) Person 3
(5) Your Message to the Faculty Member:
___________________
(6) Please give feedback to the system (Strongly Disagree / Disagree / Neutral / Agree / Strongly Agree):
(a) The system helps me feel confident in my choice of potential advisors.
(b) The system does NOT influence my confidence in my choice of potential advisors.
(c) The faculty member recommendations provided are highly relevant to my academic/research interests.
(d) The faculty member recommendations DO NOT seem related to my academic/research interests.
(e) The topic recommendations align well with my current academic interests.
(f) The recommended topics DO NOT seem related to my academic interests.
(g) The explanations for each topic enhance/aid my understanding.
(h) Overall, I find the explanations for each topic confusing and UNHELPFUL.
(i) The explanations of relationship between topics enhance/aid my understanding.
(j) Overall, I find the explanations of relationship between topics do NOT improve my understanding.
(k) The explanations of relationship between topics are helpful in guiding my topic selection or research direction.
(l) Overall, I find the explanations of relationship between topics UNHELPFUL in guiding my topic selection or research direction.
(7) Suggestions or Comments (Optional):
___________________
___________________
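The questionnaire's 5-point scales feed the kind of association analysis reported in the results (explanation usage versus topic importance and confidence). The following is a purely illustrative sketch of how such responses could be coded numerically and tested for rank correlation; the label-to-score mapping, the function names, and the choice of Spearman's rank correlation are assumptions for illustration, not the authors' analysis code.

```python
# Illustrative sketch only: assumed coding scheme and statistic,
# not the paper's actual analysis pipeline.

LIKERT = {
    "Not Important": 1, "Slightly Important": 2,
    "Moderately/Somewhat Important": 3, "Important": 4, "Very Important": 5,
}

def code_responses(labels):
    """Map questionnaire labels to numeric scores 1-5."""
    return [LIKERT[label] for label in labels]

def _ranks(xs):
    """Assign ranks (average rank for ties), 1-based."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

For example, correlating coded importance ratings against per-topic explanation click counts would quantify the (null) association the study reports between explanation usage and topic importance.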