research-article

Hierarchical User Intent Graph Network for Multimedia Recommendation

Authors:

Tat-Seng Chua

IEEE Transactions on Multimedia, Volume 24

Pages 2701 - 2712

https://doi.org/10.1109/TMM.2021.3088307

Published: 01 January 2022

Abstract

Understanding user preference on item context is key to achieving high-quality multimedia recommendation. Typically, pre-existing item features are derived from pre-trained models (e.g., visual features of micro-videos extracted by neural networks) and then fed into a recommendation framework (e.g., collaborative filtering) to capture user preference. However, we argue that this paradigm is insufficient to produce satisfactory user representations, because the resulting representations hardly profile personal interests well. The key reason is that existing works largely leave user intents untouched and thus fail to encode this informative signal into user representations. In this work, we aim to learn multi-level user intents from the co-interaction patterns of items, so as to obtain high-quality representations of users and items and further enhance recommendation performance. To this end, we develop a novel framework, Hierarchical User Intent Graph Network, which organizes user intents in a hierarchical graph structure, from fine-grained to coarse-grained intents. In particular, we obtain the multi-level user intents by recursively performing two operations: 1) intra-level aggregation, which distills the signal pertinent to user intents from co-interacted item graphs; and 2) inter-level aggregation, which constructs supernodes at higher levels to model coarser-grained user intents by gathering the representations of nodes at the lower levels. Then, we refine the user and item representations as distributions over the discovered intents, instead of simply using pre-existing features. To demonstrate the effectiveness of our model, we conducted extensive experiments on three public datasets. Our model achieves significant improvements over state-of-the-art methods, including MMGCN and DisenGCN. Furthermore, by visualizing the item representations, we illustrate the semantics of user intents.
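
The two recursive operations named in the abstract can be illustrated with a toy sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration only: the intra-level step is taken to be a weighted mean over co-interacted neighbours, the inter-level step is taken to be a softmax soft-assignment pooling, and the names `co_interaction_graph`, `intra_level`, `inter_level`, `user_intent_distribution`, and `assign_w` are hypothetical. It is not the authors' implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def co_interaction_graph(user_items, n_items):
    """Link two items whenever the same user interacted with both."""
    adj = [[0.0] * n_items for _ in range(n_items)]
    for items in user_items.values():
        for i in items:
            for j in items:
                if i != j:
                    adj[i][j] += 1.0
    return adj

def intra_level(adj, feats):
    """Intra-level step (assumed form): weighted mean over a node and its neighbours."""
    out = []
    for i, row in enumerate(adj):
        # Edge weights for neighbours, plus a self-loop of weight 1.
        weights = [w if j != i else 1.0 for j, w in enumerate(row)]
        total = sum(weights)
        out.append([sum(w * f[d] for w, f in zip(weights, feats)) / total
                    for d in range(len(feats[0]))])
    return out

def inter_level(adj, feats, assign_w):
    """Inter-level step (assumed form): soft-assign nodes to supernodes, then pool."""
    assign = [softmax(row) for row in matmul(feats, assign_w)]  # n_nodes x n_super
    assign_t = transpose(assign)
    super_feats = matmul(assign_t, feats)                        # n_super x dim
    super_adj = matmul(matmul(assign_t, adj), assign)            # n_super x n_super
    return assign, super_feats, super_adj

def user_intent_distribution(item_ids, assign):
    """Average the intent assignments of the items a user interacted with."""
    n_intents = len(assign[0])
    return [sum(assign[i][c] for i in item_ids) / len(item_ids)
            for c in range(n_intents)]

# Toy data: 3 users, 4 items with 2-d pre-existing features (all values made up).
user_items = {0: [0, 1], 1: [1, 2], 2: [2, 3]}
item_feats = [[1.0, 0.0], [0.8, 0.2], [0.2, 0.8], [0.0, 1.0]]
assign_w = [[2.0, -2.0], [-2.0, 2.0]]  # hypothetical projection onto 2 intents

adj = co_interaction_graph(user_items, n_items=4)
hidden = intra_level(adj, item_feats)
assign, intent_feats, intent_adj = inter_level(adj, hidden, assign_w)
user0 = user_intent_distribution(user_items[0], assign)
print([round(p, 3) for p in user0])
```

In this sketch, calling `intra_level` and `inter_level` again on `intent_adj` and `intent_feats` would yield the next, coarser level of the hierarchy, and `user0` is a probability vector over the two toy intents rather than a raw feature vector.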

Recommendations

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation
WWW '23: Proceedings of the ACM Web Conference 2023

Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendations normally can only recommend items existing in users’ historical sessions. ...

Entity-driven user intent inference for knowledge graph-based recommendation

It has been proven that a knowledge graph (KG) has the ability to improve the accuracy of recommendations, owing to its capability of storing the auxiliary information of items in a heterogeneous structure. Recently, intent inference methods have ...

User's Interests-Based Movie Recommendation in Heterogeneous Network
IIKI '15: Proceedings of the 2015 International Conference on Identification, Information, and Knowledge in the Internet of Things (IIKI)

For large data set of movies and various users' interests, recommender systems should pay attention to time consumption and personal interest. However, existing information filtering techniques rarely research users' interests in movies. In order to ...

Published In

IEEE Transactions on Multimedia Volume 24, Issue

2022

2475 pages

ISSN:1520-9210

1520-9210 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press

Bibliometrics

Total Citations: 27
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 08 Feb 2025

Cited By

Jannach D, Zanker M (2024). A Survey on Intent-aware Recommender Systems. ACM Transactions on Recommender Systems 3:2 (1-32). DOI: 10.1145/3700890. Online publication date: 16-Oct-2024
https://dl.acm.org/doi/10.1145/3700890

Jiang Y, Xia L, Wei W, Luo D, Lin K, Huang C, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). DiffMM: Multi-Modal Diffusion Model for Recommendation. Proceedings of the 32nd ACM International Conference on Multimedia (7591-7599). DOI: 10.1145/3664647.3681498. Online publication date: 28-Oct-2024
https://dl.acm.org/doi/10.1145/3664647.3681498

Hadizadeh Moghaddam A, Nayebi Kerdabadi M, Liu M, Yao Z, Serra E, Spezzano F (2024). Contrastive Learning on Medical Intents for Sequential Prescription Recommendation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (748-757). DOI: 10.1145/3627673.3679836. Online publication date: 21-Oct-2024
https://dl.acm.org/doi/10.1145/3627673.3679836

Zhang Y, Zhou X, Zhu F, Liu N, Guo W, Xu Y, Shen Z, Cui L, Serra E, Spezzano F (2024). Multi-modal Food Recommendation with Health-aware Knowledge Distillation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (3279-3289). DOI: 10.1145/3627673.3679580. Online publication date: 21-Oct-2024
https://dl.acm.org/doi/10.1145/3627673.3679580

Shang Y, Gao C, Chen J, Jin D, Li Y, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Improving Item-side Fairness of Multimodal Recommendation via Modality Debiasing. Proceedings of the ACM Web Conference 2024 (4697-4705). DOI: 10.1145/3589334.3648156. Online publication date: 13-May-2024
https://dl.acm.org/doi/10.1145/3589334.3648156

Jing P, Zhao X, Fan F, Yang F, Li Y, Su Y (2024). Multimodal Progressive Modulation Network for Micro-Video Multi-Label Classification. IEEE Transactions on Multimedia 26 (10134-10144). DOI: 10.1109/TMM.2024.3405724. Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TMM.2024.3405724

Guo J, Wen L, Zhou Y, Song B, Chi Y, Yu F (2024). SPACE: Self-Supervised Dual Preference Enhancing Network for Multimodal Recommendation. IEEE Transactions on Multimedia 26 (8849-8859). DOI: 10.1109/TMM.2024.3382889. Online publication date: 28-Mar-2024
https://dl.acm.org/doi/10.1109/TMM.2024.3382889

Li J, Wang Y, Li W (2024). MHRN: A Multimodal Hierarchical Reasoning Network for Topic Detection. IEEE Transactions on Multimedia 26 (6968-6980). DOI: 10.1109/TMM.2024.3358696. Online publication date: 25-Jan-2024
https://dl.acm.org/doi/10.1109/TMM.2024.3358696

Tang H, Zhao G, Gao J, Qian X (2024). Personalized Representation With Contrastive Loss for Recommendation Systems. IEEE Transactions on Multimedia 26 (2419-2429). DOI: 10.1109/TMM.2023.3295740. Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TMM.2023.3295740

Li S, Xue F, Liu K, Guo D, Hong R (2024). Multimodal Graph Causal Embedding for Multimedia-Based Recommendation. IEEE Transactions on Knowledge and Data Engineering 36:12 (8842-8858). DOI: 10.1109/TKDE.2024.3424268. Online publication date: 1-Dec-2024
https://dl.acm.org/doi/10.1109/TKDE.2024.3424268
