short-paper

Open access

ExpScore: Learning Metrics for Recommendation Explanation

Authors:

Yongfeng Zhang,

Chirag Shah

WWW '22: Proceedings of the ACM Web Conference 2022

Pages 3740 - 3744

https://doi.org/10.1145/3485447.3512269

Published: 25 April 2022

Abstract

Many information access and machine learning systems, including recommender systems, lack transparency and accountability. High-quality recommendation explanations are important for enhancing the transparency and interpretability of such systems. However, evaluating the quality of recommendation explanations remains challenging because human-annotated data and benchmarks are scarce. In this paper, we present a large explanation dataset named RecoExp, which contains thousands of crowdsourced ratings of perceived quality in explaining recommendations. To measure explainability in a comprehensive and interpretable manner, we propose ExpScore, a novel machine learning-based metric that incorporates the definition of explainability from various perspectives (e.g., relevance, readability, subjectivity, and sentiment polarity). Experiments demonstrate that ExpScore not only vastly outperforms existing metrics but also remains explainable itself. Both the RecoExp dataset and the open-source implementation of ExpScore will be released to the whole community. These resources and our findings can serve as forces of public good for scholars as well as users of recommender systems.
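To make the idea of a metric built from several perspectives concrete, the following is a minimal sketch of a feature-based explanation score. It is not the authors' implementation of ExpScore: the feature definitions (Jaccard overlap for relevance, the standard Flesch Reading Ease formula with a rough syllable heuristic for readability, tiny word lists for subjectivity and polarity), the lexicons, and the weights are all assumptions chosen for illustration. In a learned metric, the weights would be fit to human ratings such as those in RecoExp rather than set by hand.

```python
import re

# Hypothetical toy lexicons; a real system would use a proper sentiment resource.
POSITIVE = {"great", "love", "excellent", "good", "comfortable"}
NEGATIVE = {"bad", "poor", "terrible", "noisy"}
SUBJECTIVE = POSITIVE | NEGATIVE | {"think", "feel", "believe"}

# Hypothetical weights; a learned metric would fit these to human ratings.
WEIGHTS = {"relevance": 2.0, "readability": 1.0, "subjectivity": 0.5, "polarity": 0.5}


def tokens(text):
    """Return lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def relevance(explanation, item_text):
    """Jaccard overlap between explanation tokens and item-description tokens."""
    a, b = set(tokens(explanation)), set(tokens(item_text))
    return len(a & b) / len(a | b) if a | b else 0.0


def readability(explanation):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    words = tokens(explanation)
    if not words:
        return 0.0
    sentences = max(1, len(re.findall(r"[.!?]+", explanation)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * len(words) / sentences - 84.6 * syllables / len(words)


def subjectivity(explanation):
    """Fraction of tokens found in the (toy) subjective lexicon."""
    words = tokens(explanation)
    return sum(w in SUBJECTIVE for w in words) / len(words) if words else 0.0


def polarity(explanation):
    """Sentiment polarity in [-1, 1] from positive and negative word counts."""
    words = tokens(explanation)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def features(explanation, item_text):
    """Collect the four perspective features; readability is rescaled by 1/100."""
    return {
        "relevance": relevance(explanation, item_text),
        "readability": readability(explanation) / 100.0,
        "subjectivity": subjectivity(explanation),
        "polarity": polarity(explanation),
    }


def score(explanation, item_text, weights=WEIGHTS):
    """Return the total score and the per-feature contributions behind it."""
    feats = features(explanation, item_text)
    contributions = {name: weights[name] * value for name, value in feats.items()}
    return sum(contributions.values()), contributions


total, parts = score("I think this quiet keyboard is great.", "A quiet mechanical keyboard.")
print(round(total, 3), parts)
```

Because the score is a weighted sum, the returned `contributions` dictionary shows how much each perspective added to or subtracted from the total, which is one simple way a metric can stay interpretable.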

Cited By

Li L, Zhang Y, Chen L (2023). Personalized Prompt Learning for Explainable Recommendation. ACM Transactions on Information Systems 41:4 (1-26). Online publication date: 23-Mar-2023.
https://dl.acm.org/doi/10.1145/3580488
May J, Poudel K (2023). A Brief Survey of Offline Explainability Metrics for Conversational Recommender Systems. 2023 IEEE Signal Processing in Medicine and Biology Symposium (SPMB) (1-9). Online publication date: 2-Dec-2023.
https://doi.org/10.1109/SPMB59478.2023.10372769

Index Terms

ExpScore: Learning Metrics for Recommendation Explanation
1. Human-centered computing
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. Information systems applications

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

WWW '22: Proceedings of the ACM Web Conference 2022

April 2022

3764 pages

ISBN:9781450390965

DOI:10.1145/3485447

Editors:
Frédérique Laforest (INSA Lyon, France),
Raphaël Troncy (EURECOM, France),
Elena Simperl (King’s College London, UK),
Deepak Agarwal (Pinterest, USA),
Aristides Gionis (KTH Royal Institute of Technology, Sweden),
Ivan Herman (W3C / retired),
Lionel Médini (Université Lyon 1, France)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Funding Sources

US National Science Foundation

Conference

WWW '22

Sponsor:

SIGWEB

WWW '22: The ACM Web Conference 2022

April 25 - 29, 2022

Virtual Event, Lyon, France

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Total Citations: 2
Total Downloads: 1,096
Downloads (Last 12 months): 318
Downloads (Last 6 weeks): 34

Reflects downloads up to 27 Jul 2024
