DOI: 10.1145/3539618.3592003
Short paper

Improving News Recommendation via Bottlenecked Multi-task Pre-training

Published: 18 July 2023
    Abstract

    Recent years have witnessed the boom of deep neural networks in online news recommendation services. As news articles mainly consist of textual content, pre-trained language models (PLMs), e.g., BERT, have been widely adopted as the backbone to encode them into news embeddings, which are then used to generate user representations or perform semantic matching. However, existing PLMs are mostly pre-trained on large-scale general corpora and have not been specially adapted to capture the rich information within news articles. Therefore, the news embeddings they produce may not be informative enough to represent the news content or characterize the relations among news. To address this, we propose a bottlenecked multi-task pre-training approach, which relies on an information-bottleneck encoder-decoder architecture to compress the useful semantic information into the news embedding. Concretely, we design three pre-training tasks that enforce the news embedding to recover the contents of the news itself, its frequently co-occurring neighbours, and news with similar topics. We conduct experiments on the MIND dataset and show that our approach can outperform competitive pre-training methods.
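
    The abstract only sketches the approach at a high level. Below is a minimal, illustrative PyTorch sketch of what such bottlenecked multi-task pre-training could look like: a deep encoder compresses a news article into a single bottleneck embedding, and a shallow decoder must recover masked tokens of the news itself, a frequently co-occurring neighbour, and a same-topic article using only that embedding as context. All module names, sizes, and the use of nn.TransformerEncoder as a stand-in for a pre-trained BERT are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of bottlenecked multi-task pre-training (not the paper's code).
import torch
import torch.nn as nn

VOCAB, DIM, MAX_LEN = 30522, 256, 64

class BottleneckedMultiTaskPretrainer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB, DIM)
        self.pos_emb = nn.Embedding(MAX_LEN, DIM)
        # Deep encoder (stand-in for a PLM such as BERT).
        enc_layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=6)
        # Weak, shallow decoder: reconstruction must rely on the bottleneck embedding.
        dec_layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=1)
        self.lm_head = nn.Linear(DIM, VOCAB)
        self.loss = nn.CrossEntropyLoss(ignore_index=-100)

    def embed(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.tok_emb(ids) + self.pos_emb(pos)

    def encode(self, news_ids):
        # News embedding = first-token state of the deep encoder (the bottleneck).
        return self.encoder(self.embed(news_ids))[:, :1, :]

    def reconstruct(self, bottleneck, masked_ids, labels):
        # Prepend the bottleneck embedding to a masked target sequence and ask the
        # shallow decoder to recover the masked tokens (labels use -100 to ignore).
        h = torch.cat([bottleneck, self.embed(masked_ids)], dim=1)
        logits = self.lm_head(self.decoder(h))[:, 1:, :]
        return self.loss(logits.reshape(-1, VOCAB), labels.reshape(-1))

    def forward(self, news, self_masked, self_lbl,
                neigh_masked, neigh_lbl, topic_masked, topic_lbl):
        z = self.encode(news)
        # Multi-task objective: recover the news itself, a frequently co-occurring
        # neighbour, and a same-topic news article from the single bottleneck vector.
        return (self.reconstruct(z, self_masked, self_lbl)
                + self.reconstruct(z, neigh_masked, neigh_lbl)
                + self.reconstruct(z, topic_masked, topic_lbl))

if __name__ == "__main__":
    model = BottleneckedMultiTaskPretrainer()
    ids = lambda: torch.randint(0, VOCAB, (2, 16))   # toy token ids
    loss = model(ids(), ids(), ids(), ids(), ids(), ids(), ids())
    loss.backward()
    print(float(loss))
```

    Keeping the decoder shallow is what creates the information bottleneck: with little decoding capacity of its own, the reconstruction losses can only be lowered by packing more useful semantics into the single news embedding.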

    Supplemental Material

    MP4 File
    A video introduction to our paper: Improving News Recommendation via Bottlenecked Multi-task Pre-training



      Published In

      SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
      July 2023
      3567 pages
      ISBN: 9781450394086
      DOI: 10.1145/3539618
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. news recommendation
      2. pre-training language models

      Qualifiers

      • Short-paper

      Conference

      SIGIR '23

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%

