DOI: 10.1145/3485447.3512269
short-paper
Open access

ExpScore: Learning Metrics for Recommendation Explanation

Published: 25 April 2022
  • Abstract

    Many information access and machine learning systems, including recommender systems, lack transparency and accountability. High-quality recommendation explanations are important for enhancing the transparency and interpretability of such systems. However, evaluating the quality of recommendation explanations remains challenging because human-annotated data and benchmarks are lacking. In this paper, we present a large explanation dataset named RecoExp, which contains thousands of crowdsourced ratings of perceived quality in explaining recommendations. To measure explainability in a comprehensive and interpretable manner, we propose ExpScore, a novel machine learning-based metric that incorporates the definition of explainability from various perspectives (e.g., relevance, readability, subjectivity, and sentiment polarity). Experiments demonstrate that ExpScore not only vastly outperforms existing metrics but also remains explainable itself. Both the RecoExp dataset and an open-source implementation of ExpScore will be released for the whole community. These resources and our findings can serve as forces of public good for scholars as well as users of recommender systems.
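    The abstract does not describe the model or features behind ExpScore, so the following is only a minimal sketch of the general idea of a learned, inspectable metric: score an explanation from several perspectives, then fit a linear model to human ratings so that each learned weight can be read directly. Everything in it is an assumption made for illustration: the word lists, the feature formulas, the function names, and the three example ratings are invented here and are not taken from the paper or from RecoExp.

```python
import re

# Hypothetical mini-lexicons, invented for this sketch only.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"bad", "poor", "boring", "terrible"}
SUBJECTIVE = POSITIVE | NEGATIVE | {"i", "think", "feel"}

def tokenize(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def features(explanation, item_description):
    """Return toy scores: [relevance, readability, subjectivity, polarity]."""
    exp = tokenize(explanation)
    item = set(tokenize(item_description))
    n = max(len(exp), 1)
    relevance = sum(t in item for t in exp) / n          # share of tokens found in the item text
    avg_len = sum(len(t) for t in exp) / n
    readability = 1.0 / (1.0 + avg_len / 5.0)            # shorter words score higher
    subjectivity = sum(t in SUBJECTIVE for t in exp) / n
    polarity = (sum(t in POSITIVE for t in exp)
                - sum(t in NEGATIVE for t in exp)) / n
    return [relevance, readability, subjectivity, polarity]

def predict(w, b, x):
    """Linear score: bias plus weighted sum of features."""
    return b + sum(wi * xi for wi, xi in zip(w, x))

def mse(w, b, X, y):
    """Mean squared error of the linear model on (X, y)."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def fit_linear(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on mean squared error."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        errs = [predict(w, b, x) - t for x, t in zip(X, y)]
        for j in range(len(w)):
            w[j] -= lr * 2 / n * sum(e * x[j] for e, x in zip(errs, X))
        b -= lr * 2 / n * sum(errs)
    return w, b

# Made-up (explanation, item text, rating) triples; NOT data from RecoExp.
ITEM = "camera with zoom"
toy = [
    ("great camera with sharp zoom", ITEM, 4.0),
    ("i think it is bad", ITEM, 2.0),
    ("terrible boring thing", ITEM, 1.0),
]
X = [features(e, d) for e, d, _ in toy]
y = [r for _, _, r in toy]
w, b = fit_linear(X, y)

# The learned weights are directly inspectable, one per perspective.
for name, weight in zip(["relevance", "readability", "subjectivity", "polarity"], w):
    print(f"{name}: {weight:+.3f}")
```

    A linear model is used here only because its weights are easy to read; whether ExpScore itself is linear is not stated in the abstract.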


    Cited By

    • Personalized Prompt Learning for Explainable Recommendation. ACM Transactions on Information Systems 41, 4 (2023), 1–26. DOI: 10.1145/3580488. Online publication date: 23 March 2023.
    • A Brief Survey of Offline Explainability Metrics for Conversational Recommender Systems. 2023 IEEE Signal Processing in Medicine and Biology Symposium (SPMB) (2023), 1–9. DOI: 10.1109/SPMB59478.2023.10372769. Online publication date: 2 December 2023.


              Published In

              WWW '22: Proceedings of the ACM Web Conference 2022
              April 2022
              3764 pages
              ISBN:9781450390965
              DOI:10.1145/3485447

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Author Tags

              1. Evaluation
              2. Explainable Recommendation
              3. Metric

              Qualifiers

              • Short-paper
              • Research
              • Refereed limited

              Funding Sources

              • US National Science Foundation

              Conference

              WWW '22: The ACM Web Conference 2022
              April 25 - 29, 2022
              Virtual Event, Lyon, France

              Acceptance Rates

              Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

              Article Metrics

              • Downloads (last 12 months): 318
              • Downloads (last 6 weeks): 34
              Reflects downloads up to 27 July 2024
