research-article

Hierarchical User Intent Graph Network for Multimedia Recommendation

Published: 01 January 2022

Abstract

Understanding user preference on item context is the key to achieving high-quality multimedia recommendation. Typically, the pre-existing features of items are derived from pre-trained models (e.g., visual features of micro-videos extracted by neural networks) and then introduced into a recommendation framework (e.g., collaborative filtering) to capture user preference. However, we argue that such a paradigm is insufficient to produce satisfactory user representations, because the resulting representations hardly profile personal interests well. The key reason is that present works largely leave user intents untouched and therefore fail to encode this informative aspect of users. In this work, we aim to learn multi-level user intents from the co-interaction patterns of items, so as to obtain high-quality representations of users and items and further enhance recommendation performance. Towards this end, we develop a novel framework, Hierarchical User Intent Graph Network, which organizes user intents in a hierarchical graph structure, from fine-grained to coarse-grained intents. In particular, we obtain the multi-level user intents by recursively performing two operations: 1) intra-level aggregation, which distills the signal pertinent to user intents from co-interacted item graphs; and 2) inter-level aggregation, which constitutes the supernodes in higher levels to model coarser-grained user intents by gathering the nodes' representations in the lower ones. Then, we refine the user and item representations as a distribution over the discovered intents, instead of simple pre-existing features. To demonstrate the effectiveness of our model, we conducted extensive experiments on three public datasets. Our model achieves significant improvements over state-of-the-art methods, including MMGCN and DisenGCN. Furthermore, by visualizing the item representations, we provide the semantics of user intents.
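The two recursive operations named in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: mean aggregation over neighbours stands in for intra-level aggregation, a softmax soft-assignment pooling (in the spirit of differentiable pooling) stands in for inter-level aggregation, and the toy graph, the features, and names such as `assign_weights` are hypothetical. It is not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def intra_level(adj, feats):
    """Assumed intra-level step: average each item with its co-interacted neighbours."""
    out = []
    for i, row in enumerate(adj):
        nbrs = [j for j, w in enumerate(row) if w > 0 or j == i]
        out.append([sum(feats[j][d] for j in nbrs) / len(nbrs)
                    for d in range(len(feats[0]))])
    return out


def inter_level(adj, feats, assign_weights):
    """Assumed inter-level step: softly assign nodes to supernodes and pool."""
    assign = [softmax(r) for r in matmul(feats, assign_weights)]  # nodes x supernodes
    assign_t = transpose(assign)
    super_feats = matmul(assign_t, feats)                # supernode features
    super_adj = matmul(matmul(assign_t, adj), assign)    # coarser graph
    return super_feats, super_adj, assign


# Toy co-interaction graph: items 0-1 and items 2-3 are co-interacted pairs.
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
feats = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
assign_weights = [[4.0, 0.0], [0.0, 4.0]]  # hypothetical, untrained weights

hidden = intra_level(adj, feats)
super_feats, super_adj, assign = inter_level(adj, hidden, assign_weights)

# A user who interacted with items 0 and 1, described as a distribution over
# the two discovered intents (the mean of those items' assignment rows).
user_items = [0, 1]
user_intents = [sum(assign[i][k] for i in user_items) / len(user_items)
                for k in range(len(assign[0]))]
print(user_intents)
```

Applying `intra_level` and `inter_level` again to `super_feats` and `super_adj` would yield a coarser level, which is one way to read "recursively performing two operations"; a trained model would learn the assignment weights rather than fix them by hand.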


Published In

IEEE Transactions on Multimedia  Volume 24, Issue
2022
2475 pages

Publisher

IEEE Press

Qualifiers

  • Research-article

Cited By

  • (2024) A Survey on Intent-aware Recommender Systems. ACM Transactions on Recommender Systems 3:2 (1-32). DOI: 10.1145/3700890. Online publication date: 16-Oct-2024
  • (2024) DiffMM: Multi-Modal Diffusion Model for Recommendation. Proceedings of the 32nd ACM International Conference on Multimedia (7591-7599). DOI: 10.1145/3664647.3681498. Online publication date: 28-Oct-2024
  • (2024) Contrastive Learning on Medical Intents for Sequential Prescription Recommendation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (748-757). DOI: 10.1145/3627673.3679836. Online publication date: 21-Oct-2024
  • (2024) Multi-modal Food Recommendation with Health-aware Knowledge Distillation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (3279-3289). DOI: 10.1145/3627673.3679580. Online publication date: 21-Oct-2024
  • (2024) Improving Item-side Fairness of Multimodal Recommendation via Modality Debiasing. Proceedings of the ACM Web Conference 2024 (4697-4705). DOI: 10.1145/3589334.3648156. Online publication date: 13-May-2024
  • (2024) Multimodal Progressive Modulation Network for Micro-Video Multi-Label Classification. IEEE Transactions on Multimedia 26 (10134-10144). DOI: 10.1109/TMM.2024.3405724. Online publication date: 1-Jan-2024
  • (2024) SPACE: Self-Supervised Dual Preference Enhancing Network for Multimodal Recommendation. IEEE Transactions on Multimedia 26 (8849-8859). DOI: 10.1109/TMM.2024.3382889. Online publication date: 28-Mar-2024
  • (2024) MHRN: A Multimodal Hierarchical Reasoning Network for Topic Detection. IEEE Transactions on Multimedia 26 (6968-6980). DOI: 10.1109/TMM.2024.3358696. Online publication date: 25-Jan-2024
  • (2024) Personalized Representation With Contrastive Loss for Recommendation Systems. IEEE Transactions on Multimedia 26 (2419-2429). DOI: 10.1109/TMM.2023.3295740. Online publication date: 1-Jan-2024
  • (2024) Multimodal Graph Causal Embedding for Multimedia-Based Recommendation. IEEE Transactions on Knowledge and Data Engineering 36:12 (8842-8858). DOI: 10.1109/TKDE.2024.3424268. Online publication date: 1-Dec-2024
