Abstract
A key problem in online social networks is the identification of user characteristics and the analysis of how these are reflected in the evolution of the graph structure. User similarity measures are the basis for tackling this problem. In this paper, we propose a novel user similarity measure for online social networks, which combines both network and profile similarity. Since user profile data may be missing, the proposed measure is complemented by a technique to infer missing items from the profiles of the user’s contacts. The second main contribution of this paper is an extensive performance evaluation of the proposed measures against some of the most relevant measures already proposed in the literature. The evaluation has been conducted on a variety of data sets (i.e., Facebook, YouTube, Epinions and DBLP data sets) to see how different scenarios and graph characteristics affect the measures’ performance.




















Notes
Facebook has enforced auto-completion and aggregation of user inputs only in recent years. Before this, users could enter unstructured text.
where |Sb| = 1 in case of non-structured items, like gender.
This is done by creating the graph using only edges established before time T.
In the table, precision is the number of correct inferences over the number of all inferences, i.e., precision = #correct inferences/#all inferences.
If there is no profile information, a generic list of celebrity accounts is offered.
The related videos list of a video cannot be determined by the user who uploaded the video: http://support.google.com/youtube/bin/answer.py?hl=en&answer=92651
In graph theory, triadic closure refers to predictions for graphs in which two pairs of nodes have strong ties and a weak tie among them is expected, i.e., the dashed line already exists or is expected to be formed in the future. Our experiments try to predict this edge.
Values are rounded to two decimal places.
This problem is known as the cold start problem in recommender systems.
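Two of the notes above describe simple computations: building the graph from edges established before time T, and computing precision as correct inferences over all inferences. The following is a minimal Python sketch of both, not the authors' implementation. The edge layout (user, user, creation time), the function names, the strict "before" comparison and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# Hypothetical edge layout: (user_a, user_b, creation_time).
Edge = Tuple[Hashable, Hashable, float]


def snapshot_edges(edges: Iterable[Edge], t: float) -> List[Tuple[Hashable, Hashable]]:
    """Keep only edges established before time t (strict comparison is an assumption)."""
    return [(a, b) for a, b, created in edges if created < t]


def precision(inferred: Dict[Hashable, str], truth: Dict[Hashable, str]) -> float:
    """Return #correct inferences / #all inferences, or 0.0 if nothing was inferred."""
    if not inferred:
        return 0.0
    correct = sum(1 for user, value in inferred.items() if truth.get(user) == value)
    return correct / len(inferred)


# Toy data, invented for this example only.
edges = [("u1", "u2", 5.0), ("u2", "u3", 12.0), ("u1", "u3", 20.0)]
past_graph = snapshot_edges(edges, t=15.0)

inferred = {"u1": "Milan", "u2": "Rome", "u3": "Turin"}
truth = {"u1": "Milan", "u2": "Rome", "u3": "Varese"}
score = round(precision(inferred, truth), 2)  # rounded to two decimal places
```

With this toy data the snapshot keeps the two edges created before time 15, and two of the three inferred values match the ground truth.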
Acknowledgments
The research presented in this paper was partially funded by a Google Research Award.
Cite this article
Akcora, C.G., Carminati, B. & Ferrari, E. User similarities on social networks. Soc. Netw. Anal. Min. 3, 475–495 (2013). https://doi.org/10.1007/s13278-012-0090-8
DOI: https://doi.org/10.1007/s13278-012-0090-8