Abstract
The prosperity of Web 2.0 and social media brings about many diverse social networks of unprecedented scales, which present new challenges for more effective graph-mining techniques. In this chapter, we present some graph patterns that are commonly observed in large-scale social networks. As most networks demonstrate strong community structures, one basic task in social network analysis is community detection, which uncovers the group membership of actors in a network. We categorize and survey representative graph mining approaches and evaluation strategies for community detection. We then present and discuss some research issues for future exploration.
Copyright information
© 2010 Springer-Verlag US
About this chapter
Cite this chapter
Tang, L., Liu, H. (2010). Graph Mining Applications to Social Network Analysis. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_16
DOI: https://doi.org/10.1007/978-1-4419-6045-0_16
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6044-3
Online ISBN: 978-1-4419-6045-0
eBook Packages: Computer Science, Computer Science (R0)