DOI: 10.1145/3534678.3539080
research-article

TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation

Published: 14 August 2022

Abstract

Social networks such as Twitter form a heterogeneous information network (HIN) in which nodes represent domain entities (e.g., users, content, advertisers) and edges represent one of many entity interactions (e.g., a user re-sharing content or "following" another user). Interactions from multiple relation types can encode valuable information about social network entities that a single relation does not fully capture; for instance, a user's preference for accounts to follow may depend on both user-content engagement interactions and the other users they follow. In this work, we investigate knowledge-graph embeddings for entities in the Twitter HIN (TwHIN). We show that these pretrained representations yield significant offline and online improvements across a diverse range of downstream recommendation and classification tasks: personalized ads ranking, account follow-recommendation, offensive content detection, and search ranking. We discuss design choices and practical challenges of deploying industry-scale HIN embeddings, including compressing them to reduce end-to-end model latency and handling parameter drift across versions.
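
To make the HIN framing concrete, here is a minimal sketch of how such a network might be represented and scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name a scoring function, so the translation-style score below is just one common knowledge-graph embedding choice. The entity names, relation names, and embedding width are hypothetical.

```python
import random

# Hypothetical toy HIN: (source entity, relation type, target entity) triples.
# Entity and relation names are invented for illustration only.
EDGES = [
    ("user:alice", "follows", "user:bob"),
    ("user:alice", "favorites", "tweet:1"),
    ("user:bob", "authors", "tweet:1"),
    ("user:bob", "clicks", "advertiser:acme"),
]

DIM = 8  # illustrative embedding width, not taken from the paper


def init_embeddings(edges, dim, seed=0):
    """Assign one small random vector per entity and one per relation type."""
    rng = random.Random(seed)

    def random_vector():
        return [rng.uniform(-0.1, 0.1) for _ in range(dim)]

    entities = sorted({node for src, _, dst in edges for node in (src, dst)})
    relations = sorted({rel for _, rel, _ in edges})
    entity_emb = {name: random_vector() for name in entities}
    relation_emb = {name: random_vector() for name in relations}
    return entity_emb, relation_emb


def score(entity_emb, relation_emb, source, relation, target):
    """Assumed translation-style score: dot product of (source + relation) and target."""
    src = entity_emb[source]
    rel = relation_emb[relation]
    dst = entity_emb[target]
    return sum((s + r) * t for s, r, t in zip(src, rel, dst))


if __name__ == "__main__":
    entity_emb, relation_emb = init_embeddings(EDGES, DIM)
    for src, rel, dst in EDGES:
        print(src, rel, dst, score(entity_emb, relation_emb, src, rel, dst))
```

Because every relation type has its own vector while entities share one table, the same user embedding is shaped by follows, engagements, and ad interactions at once, which is the multi-relation intuition the abstract describes. Training (for example, raising the score of observed edges relative to randomly sampled ones) is omitted here.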

      Published In

      KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
      August 2022
      5033 pages
      ISBN:9781450393850
      DOI:10.1145/3534678
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Twitter
      2. embedding
      3. graph embedding
      4. heterogeneous information network
      5. recommendation system
      6. social network

      Conference

      KDD '22

      Acceptance Rates

      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

      Article Metrics

      • Downloads (last 12 months): 178
      • Downloads (last 6 weeks): 4
      Reflects downloads up to 27 Aug 2024

      Cited By

      • (2024) Clustering on heterogeneous IoT information network based on meta path. Science Progress 107:2. DOI: 10.1177/00368504241257389. Online publication date: 17-Jun-2024
      • (2024) HiGPT: Heterogeneous Graph Language Model. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2842-2853). DOI: 10.1145/3637528.3671987. Online publication date: 25-Aug-2024
      • (2024) Towards Graph Foundation Models for Personalization. Companion Proceedings of the ACM Web Conference 2024 (1798-1802). DOI: 10.1145/3589335.3651980. Online publication date: 13-May-2024
      • (2024) Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta. Companion Proceedings of the ACM Web Conference 2024 (47-55). DOI: 10.1145/3589335.3648301. Online publication date: 13-May-2024
      • (2024) Personalized Elastic Embedding Learning for On-Device Recommendation. IEEE Transactions on Knowledge and Data Engineering 36:7 (3363-3375). DOI: 10.1109/TKDE.2024.3361562. Online publication date: Jul-2024
      • (2024) Heterogeneous Graph Contrastive Learning With Metapath-Based Augmentations. IEEE Transactions on Emerging Topics in Computational Intelligence 8:1 (1003-1014). DOI: 10.1109/TETCI.2023.3322341. Online publication date: Feb-2024
      • (2024) Time-aware multi-behavior graph network model for complex group behavior prediction. Information Processing and Management: an International Journal 61:3. DOI: 10.1016/j.ipm.2024.103666. Online publication date: 2-Jul-2024
      • (2024) C-privacy: a social relationship-driven image customization sharing method in cyber-physical networks. Digital Communications and Networks. DOI: 10.1016/j.dcan.2024.03.009. Online publication date: Mar-2024
      • (2024) Disentangling User Cognitive Intent with Causal Reasoning for Knowledge-Enhanced Recommendation. Cognitive Computation. DOI: 10.1007/s12559-024-10321-0. Online publication date: 18-Jul-2024
      • (2023) Leveraging Zero and Few-Shot Learning for Enhanced Model Generality in Hate Speech Detection in Spanish and English. Mathematics 11:24 (5004). DOI: 10.3390/math11245004. Online publication date: 18-Dec-2023
