research-article

Déjà vu: A Contextualized Temporal Attention Mechanism for Sequential Recommendation

Authors:

Hongning Wang

WWW '20: Proceedings of The Web Conference 2020

Pages 2199 - 2209

https://doi.org/10.1145/3366423.3380285

Published: 20 April 2020

Abstract

Predicting users’ preferences from their historical sequential behaviors is both challenging and crucial for modern recommender systems. Most existing sequential recommendation algorithms focus on the transitional structure among sequential actions, but largely ignore temporal and context information when modeling the influence of a historical event on the current prediction.

In this paper, we argue that the influence of past events on a user’s current action should vary over time and across contexts. We therefore propose a Contextualized Temporal Attention Mechanism that learns to weigh each historical action’s influence based not only on what the action is, but also on when and how it took place. More specifically, to dynamically calibrate the relative input dependence from the self-attention mechanism, we deploy multiple parameterized kernel functions to learn various temporal dynamics, and then use the context information to determine which of these reweighting kernels to follow for each input. In empirical evaluations on two large public recommendation datasets, our model consistently outperformed an extensive set of state-of-the-art sequential recommendation methods.
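The reweighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the three kernel shapes (`exp_decay`, `log_decay`, `constant`), their parameters, the function name `contextual_temporal_weights`, and the way context is supplied as per-action kernel logits are all assumptions made for illustration. The sketch only shows one plausible way to combine content-based attention ("what"), temporal kernels ("when"), and a context-driven mixture over kernels ("how").

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical temporal kernels: map elapsed time (>= 0) to a positive weight.
def exp_decay(dt, rate=0.5):
    return math.exp(-rate * dt)

def log_decay(dt, scale=1.0):
    return 1.0 / (1.0 + scale * math.log1p(dt))

def constant(dt):
    return 1.0

KERNELS = [exp_decay, log_decay, constant]

def contextual_temporal_weights(content_scores, time_gaps, context_logits):
    """Combine content attention, temporal kernels, and context-based mixing.

    content_scores: one raw attention score per historical action ("what").
    time_gaps: elapsed time from each action to prediction time ("when").
    context_logits: per action, one logit per kernel ("how"); in a trained
        model these would come from a learned context encoder (assumed here).
    """
    alpha = softmax(content_scores)
    combined = []
    for a, dt, logits in zip(alpha, time_gaps, context_logits):
        beta = softmax(logits)  # which kernel this action should follow
        temporal = sum(b * k(dt) for b, k in zip(beta, KERNELS))
        combined.append(a * temporal)
    total = sum(combined)
    return [c / total for c in combined]  # renormalize to sum to 1

# Three actions with identical content scores but different timing and context.
weights = contextual_temporal_weights(
    content_scores=[1.0, 1.0, 1.0],
    time_gaps=[10.0, 1.0, 10.0],
    context_logits=[[5.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 5.0]],
)
```

In this toy input, the first two actions follow the fast-decaying kernel, so the more recent one receives more weight. The third action is as old as the first, but its context selects the constant kernel, so its influence does not fade.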

Cited By

Anzum F, Asha A, Dey L, Gavrilov A, Iffath F, Ohi A, Pond L, Shopon M, Gavrilova M. (2024). A Comprehensive Review of Trustworthy, Ethical, and Explainable Computer Vision Advancements in Online Social Media. Global Perspectives on the Applications of Computer Vision in Cybersecurity (1-46). Online publication date: 23-Feb-2024.
https://doi.org/10.4018/978-1-6684-8127-1.ch001
Celik E, Ilhan Omurca S. (2024). Skip-Gram and Transformer Model for Session-Based Recommendation. Applied Sciences 14:14 (6353). Online publication date: 21-Jul-2024.
https://doi.org/10.3390/app14146353
Sahebi S, Yao M, Zhao S, Feyzi Behnagh R. (2024). MoMENt: Marked Point Processes with Memory-Enhanced Neural Networks for User Activity Modeling. ACM Transactions on Knowledge Discovery from Data 18:6 (1-32). Online publication date: 29-Feb-2024.
https://dl.acm.org/doi/10.1145/3649504

Information & Contributors

Information

Published In

WWW '20: Proceedings of The Web Conference 2020

April 2020

3143 pages

ISBN:9781450370233

DOI:10.1145/3366423

Editors:
Yennun Huang (Academia Sinica, Taiwan)
Irwin King (The Chinese University of Hong Kong, Hong Kong)
Tie-Yan Liu (Microsoft Research Asia, China)
Maarten van Steen (University of Twente, Netherlands)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '20

Sponsor:

SIGWEB

WWW '20: The Web Conference 2020

April 20 - 24, 2020

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 42
Total Downloads: 1,027
Downloads (Last 12 months): 48
Downloads (Last 6 weeks): 5

Reflects downloads up to 25 Feb 2025

Citations

Liu Y, Walder C, Xie L, Liu Y, Baeza-Yates R, Bonchi F. (2024). Probabilistic Attention for Sequential Recommendation. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (1956-1967). Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671733
Jiao Y, Yang F, Chen Y, Gao Y, Liu J, Sun Y, Chua T, Ngo C, Kumar R, Lauw H, Ka-Wei Lee R. (2024). Rethinking Sequential Relationships: Improving Sequential Recommenders with Inter-Sequence Data Augmentation. Companion Proceedings of the ACM Web Conference 2024 (641-645). Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589335.3651552
Zhang X, Xu B, Ma F, Li C, Yang L, Lin H. (2024). Beyond Co-Occurrence: Multi-Modal Session-Based Recommendation. IEEE Transactions on Knowledge and Data Engineering 36:4 (1450-1462). Online publication date: Apr-2024.
https://doi.org/10.1109/TKDE.2023.3309995
Li Q, Li K, Gao X, Fu J, Zhang L. (2024). Anomaly Detection Based on Temporal Attention Network With Adaptive Threshold Adjustment for Electrical Submersible Pump. IEEE Transactions on Instrumentation and Measurement 73 (1-14). Online publication date: 2024.
https://doi.org/10.1109/TIM.2024.3436113
Wang T, Dai Y, Shao S. (2023). CFSeRec: A Contrastive Framework for Sequential Recommendation. 2023 42nd Chinese Control Conference (CCC) (8211-8216). Online publication date: 24-Jul-2023.
https://doi.org/10.23919/CCC58697.2023.10240619
Ma M, Ren P, Chen Z, Ren Z, Liang H, Ma J, De Rijke M. (2023). Improving Transformer-based Sequential Recommenders through Preference Editing. ACM Transactions on Information Systems 41:3 (1-24). Online publication date: 10-Apr-2023.
https://dl.acm.org/doi/10.1145/3564282
Chu Z, Wang H, Xiao Y, Long B, Wu L, Chua T, Lauw H, Si L, Terzi E, Tsaparas P. (2023). Meta Policy Learning for Cold-Start Conversational Recommendation. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (222-230). Online publication date: 27-Feb-2023.
https://dl.acm.org/doi/10.1145/3539597.3570443