Transfer Learning to Infer Social Ties across Heterogeneous Networks

Published: 13 April 2016

Abstract

Interpersonal ties are responsible for the structure of social networks and the transmission of information through these networks. Different types of social ties have essentially different influences on people. Awareness of the types of social ties can benefit many applications, such as recommendation and community detection. For example, our close friends tend to move in the same circles that we do, while our classmates may be distributed into different communities. Although a large body of research has focused on inferring particular types of relationships in a specific social network, few publications systematically study how to generalize the prediction of social ties across multiple heterogeneous networks.
In this work, we develop a framework, referred to as TranFG, for classifying the types of social relationships by learning across heterogeneous networks. The framework incorporates social theories into a factor graph model, which effectively improves the accuracy of predicting the types of social relationships in a target network by borrowing knowledge from a different source network. We also present several active learning strategies to further enhance inference performance. To scale the model up to very large networks, we design a distributed learning algorithm for it.
We evaluate TranFG on six different networks and compare it with several existing methods. TranFG clearly outperforms the existing methods on multiple metrics. For example, by leveraging information from a coauthor network with labeled advisor-advisee relationships, TranFG obtains an F1-score of 90% (an 8%--28% improvement over alternative methods) for predicting manager-subordinate relationships in an enterprise email network. The proposed model is also efficient: it takes only a few minutes to train the transfer model on large networks containing tens of thousands of nodes.
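For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1-score for one relationship type can be computed from true and predicted edge labels. It is not taken from the paper: the function name `f1_score`, the label strings, and the toy data are all hypothetical and serve only to illustrate the standard definition (the harmonic mean of precision and recall).

```python
def f1_score(true_labels, predicted_labels, positive):
    """Standard F1 for one class, treating `positive` as the target label."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six edges of an email network (illustrative only).
true_labels = ["manager", "other", "manager", "manager", "other", "other"]
predicted = ["manager", "other", "other", "manager", "manager", "other"]

score = f1_score(true_labels, predicted, positive="manager")
print(f"F1 for 'manager' edges: {score:.3f}")
```

In this toy example, precision and recall are both 2/3, so the F1-score is also 2/3.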

Published In

ACM Transactions on Information Systems, Volume 34, Issue 2
April 2016
220 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2891107
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 April 2016
Accepted: 01 March 2015
Revised: 01 January 2015
Received: 01 June 2014
Published in TOIS Volume 34, Issue 2

Author Tags

  1. Social ties
  2. predictive model
  3. social influence
  4. social network

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Yahoo Research Alliance Grant
  • Google Research Grant
  • NSF
  • National High-Tech R&D Program
  • National Basic Research Program of China
  • Natural Science Foundation of China
  • National Social Science Foundation of China
  • Huawei Research Grant
