DOI: 10.1145/3366423.3380285
research-article

Déjà vu: A Contextualized Temporal Attention Mechanism for Sequential Recommendation

Published: 20 April 2020

Abstract

Predicting users’ preferences from their historical sequential behaviors is both challenging and crucial for modern recommender systems. Most existing sequential recommendation algorithms focus on the transitional structure among sequential actions but largely ignore temporal and context information when modeling the influence of a historical event on the current prediction.
In this paper, we argue that the influence of past events on a user’s current action should vary over time and across contexts. We therefore propose a Contextualized Temporal Attention Mechanism that learns to weigh the influence of historical actions based not only on what each action is, but also on when and how it took place. More specifically, to dynamically calibrate the relative input dependence from the self-attention mechanism, we deploy multiple parameterized kernel functions to learn various temporal dynamics, and then use the context information to determine which of these reweighing kernels to follow for each input. In empirical evaluations on two large public recommendation datasets, our model consistently outperformed an extensive set of state-of-the-art sequential recommendation methods.
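
The reweighing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three kernel shapes, their fixed parameters, the final renormalization, and every function name below are assumptions chosen for illustration. In the paper the kernels are parameterized and learned, and the mixture over kernels is derived from context features rather than passed in by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical temporal kernels (assumed shapes, fixed parameters).
# Each maps a non-negative time gap to a weight in (0, 1].
def exponential_kernel(gap, rate=0.5):
    return math.exp(-rate * gap)


def inverse_kernel(gap, scale=1.0):
    return 1.0 / (1.0 + scale * gap)


def constant_kernel(gap):
    return 1.0


KERNELS = [exponential_kernel, inverse_kernel, constant_kernel]


def contextual_temporal_weights(content_scores, gaps, context_logits):
    """Reweigh content attention by a context-selected mixture of kernels.

    content_scores: one attention score per historical action ("what").
    gaps:           time elapsed since each action ("when").
    context_logits: per action, one logit per kernel ("how"); assumed to
                    come from some context encoder not shown here.
    """
    alpha = softmax(content_scores)
    combined = []
    for a, gap, logits in zip(alpha, gaps, context_logits):
        mix = softmax(logits)  # which kernel to follow for this input
        beta = sum(p * kernel(gap) for p, kernel in zip(mix, KERNELS))
        combined.append(a * beta)
    total = sum(combined)
    return [c / total for c in combined]  # renormalize (an assumption)


# Three actions with identical content scores but different recency.
weights = contextual_temporal_weights(
    content_scores=[1.0, 1.0, 1.0],
    gaps=[10.0, 2.0, 0.0],
    context_logits=[[0.0, 0.0, 0.0]] * 3,
)
print(weights)
```

With a uniform mixture over the kernels, the most recent action receives the largest weight. If the context logits strongly favor the constant kernel, the temporal factor is close to 1 for every action and the output falls back to plain content attention.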



Published In

WWW '20: Proceedings of The Web Conference 2020
April 2020
3143 pages
ISBN:9781450370233
DOI:10.1145/3366423
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Attention Mechanism
  2. Context
  3. Neural Recommender System
  4. Sequential Recommendation
  5. Temporal Dynamics

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '20
WWW '20: The Web Conference 2020
April 20 - 24, 2020
Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


Cited By

  • (2024) A Comprehensive Review of Trustworthy, Ethical, and Explainable Computer Vision Advancements in Online Social Media. Global Perspectives on the Applications of Computer Vision in Cybersecurity (1-46). DOI: 10.4018/978-1-6684-8127-1.ch001. Online publication date: 23-Feb-2024
  • (2024) Skip-Gram and Transformer Model for Session-Based Recommendation. Applied Sciences 14:14 (6353). DOI: 10.3390/app14146353. Online publication date: 21-Jul-2024
  • (2024) MoMENt: Marked Point Processes with Memory-Enhanced Neural Networks for User Activity Modeling. ACM Transactions on Knowledge Discovery from Data 18:6 (1-32). DOI: 10.1145/3649504. Online publication date: 29-Feb-2024
  • (2024) Probabilistic Attention for Sequential Recommendation. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (1956-1967). DOI: 10.1145/3637528.3671733. Online publication date: 25-Aug-2024
  • (2024) Rethinking Sequential Relationships: Improving Sequential Recommenders with Inter-Sequence Data Augmentation. Companion Proceedings of the ACM Web Conference 2024 (641-645). DOI: 10.1145/3589335.3651552. Online publication date: 13-May-2024
  • (2024) Beyond Co-Occurrence: Multi-Modal Session-Based Recommendation. IEEE Transactions on Knowledge and Data Engineering 36:4 (1450-1462). DOI: 10.1109/TKDE.2023.3309995. Online publication date: Apr-2024
  • (2024) Anomaly Detection Based on Temporal Attention Network With Adaptive Threshold Adjustment for Electrical Submersible Pump. IEEE Transactions on Instrumentation and Measurement 73 (1-14). DOI: 10.1109/TIM.2024.3436113. Online publication date: 2024
  • (2023) CFSeRec: A Contrastive Framework for Sequential Recommendation. 2023 42nd Chinese Control Conference (CCC) (8211-8216). DOI: 10.23919/CCC58697.2023.10240619. Online publication date: 24-Jul-2023
  • (2023) Improving Transformer-based Sequential Recommenders through Preference Editing. ACM Transactions on Information Systems 41:3 (1-24). DOI: 10.1145/3564282. Online publication date: 10-Apr-2023
  • (2023) Meta Policy Learning for Cold-Start Conversational Recommendation. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (222-230). DOI: 10.1145/3539597.3570443. Online publication date: 27-Feb-2023
