research-article

Efficient On-Device Session-Based Recommendation

Published: 22 March 2023

Abstract

On-device session-based recommendation systems have attracted increasing attention because they combine low energy and resource consumption with privacy protection while still delivering promising recommendation performance. To fit powerful neural session-based recommendation models onto resource-constrained mobile devices, tensor-train decomposition and its variants have been widely applied to reduce the memory footprint by decomposing the embedding table into smaller tensors, showing great potential for compressing recommendation models. However, these compression techniques significantly increase local inference time, because forming an item embedding requires a complex process of generating index lists followed by a series of tensor multiplications. The resulting on-device recommender cannot provide real-time responses and recommendations. To improve online recommendation efficiency, we propose to learn compact item representations based on compositional encoding. Specifically, each item is represented by a compositional code consisting of several codewords, and we learn an embedding vector for each codeword rather than for each item. The item embedding is then formed by composing the codeword embedding vectors drawn from different embedding matrices (i.e., codebooks). Because the codebooks can be extremely small, the recommender model fits on resource-constrained devices and can store the codebooks locally for fast inference. In addition, to prevent the loss of model capacity caused by compression, we propose a bidirectional self-supervised knowledge distillation framework. Extensive experiments on two benchmark datasets demonstrate that, compared with existing methods, the proposed on-device recommender not only achieves an 8× inference speedup with a large compression ratio but also delivers superior recommendation performance. The code is released at https://github.com/xiaxin1998/EODRec.
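
To make the idea of compositional codes concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that the codeword vectors are composed by summation and that codes are fixed integer tuples; the abstract specifies neither, and in the paper the codes are learned. All function names, sizes, and the example code tuple are hypothetical.

```python
import math
import random


def make_codebooks(num_codebooks, codewords_per_book, dim, seed=0):
    """Create small random codebooks (stand-ins for learned ones)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.1) for _ in range(dim)]
         for _ in range(codewords_per_book)]
        for _ in range(num_codebooks)
    ]


def compose_item_embedding(code, codebooks):
    """Build an item embedding from its compositional code.

    code[m] selects one codeword vector from codebook m. The selected
    vectors are summed here; summation is an assumption of this sketch.
    """
    dim = len(codebooks[0][0])
    embedding = [0.0] * dim
    for book, codeword in zip(codebooks, code):
        for j, value in enumerate(book[codeword]):
            embedding[j] += value
    return embedding


def storage_costs(num_items, num_codebooks, codewords_per_book, dim):
    """Return (full-table floats, codebook floats, bits for all item codes)."""
    full_table = num_items * dim
    codebook_floats = num_codebooks * codewords_per_book * dim
    bits_per_index = math.ceil(math.log2(codewords_per_book))
    code_bits = num_items * num_codebooks * bits_per_index
    return full_table, codebook_floats, code_bits


# Hypothetical sizes: 4 codebooks, 8 codewords each, 16-dimensional vectors.
codebooks = make_codebooks(num_codebooks=4, codewords_per_book=8, dim=16)
item_code = (3, 0, 7, 2)  # one codeword index per codebook
item_embedding = compose_item_embedding(item_code, codebooks)

full, compact, code_bits = storage_costs(
    num_items=40000, num_codebooks=4, codewords_per_book=8, dim=16
)
```

With these hypothetical sizes, a full embedding table would hold 640,000 floats, whereas the codebooks hold 512 floats plus 480,000 bits of integer codes. A lookup needs only a few table reads and additions, in contrast to the chain of tensor multiplications that the abstract attributes to tensor-train methods.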

    Published In

    ACM Transactions on Information Systems  Volume 41, Issue 4
    October 2023
    958 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3587261

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 March 2023
    Online AM: 13 January 2023
    Accepted: 05 January 2023
    Revised: 08 December 2022
    Received: 02 October 2022
    Published in TOIS Volume 41, Issue 4

    Author Tags

    1. Model compression
    2. on-device learning
    3. next-item recommendation
    4. self-supervised learning
    5. knowledge distillation

    Qualifiers

    • Research-article

    Funding Sources

    • Australian Research Council Future Fellowship
    • Discovery Project
    • Discovery Early Career Research Award
