Abstract
We study the extent to which social ties between people can be inferred in large social networks, in particular via active user interactions. In most online social networks, relationships lack meaningful labels (e.g., “colleague” and “intimate friend”) for various reasons. Understanding how different types of social relationships form can provide insights into the micro-level dynamics of the social network. In this work, we precisely define the problem of inferring social ties and propose a Partially-Labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social relationships. The model formalizes the problem of inferring social ties as a flexible semi-supervised framework. We test the model on three different genres of data sets and demonstrate its effectiveness. We further study how to leverage user interactions to improve inference accuracy. Two active learning algorithms are proposed to actively select relationships for which to query users for labels. Experimental results show that with only a few user corrections, the accuracy of inferring social ties can be significantly improved. Finally, to scale the model to real large networks, we develop a distributed learning algorithm.
Additional information
Responsible editor: Dimitrios Gunopulos, Donato Malerba and Michalis Vazirgiannis.
Cite this article
Zhuang, H., Tang, J., Tang, W. et al. Actively learning to infer social ties. Data Min Knowl Disc 25, 270–297 (2012). https://doi.org/10.1007/s10618-012-0274-x
DOI: https://doi.org/10.1007/s10618-012-0274-x