Abstract
We address the problem of efficiently estimating the value of a centrality measure for a node in a large social network using only a partial network generated by sampling nodes from the entire network. To this end, we propose a resampling-based framework to estimate the approximation error, defined as the difference between the true and the estimated values of the centrality. We experimentally evaluate the fundamental performance of the proposed framework using the closeness and betweenness centralities on three real-world networks. The results show that, compared with the traditionally used standard error, the framework estimates the approximation error more tightly and more precisely at a confidence level of 95%, even for a small partial network. They also suggest that the top nodes under a given centrality measure could potentially be identified, and possibly ranked, with a high confidence level from only a small partial network.
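To make the setting concrete, the following is a minimal sketch, not the authors' framework, whose details the abstract does not give. It assumes a closeness-style quantity (the mean shortest-path distance from one node, whose reciprocal is one common definition of closeness), estimates it from a uniform sample of nodes, and compares the traditional standard error with a generic bootstrap error at the 95% level. The ring graph, the sample size of 40, and all function names are hypothetical choices made for illustration.

```python
import random
import statistics
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def standard_error(values):
    """Traditional standard error of the sample mean."""
    return statistics.stdev(values) / len(values) ** 0.5


def bootstrap_error(values, rng, rounds=1000, level=0.95):
    """Half-width of the percentile interval of bootstrapped means.

    Assumption: a plain bootstrap stands in for the resampling step;
    the paper's actual procedure may differ.
    """
    n = len(values)
    means = sorted(
        statistics.mean(rng.choices(values, k=n)) for _ in range(rounds)
    )
    lo = means[int((1 - level) / 2 * rounds)]
    hi = means[int((1 + level) / 2 * rounds) - 1]
    return (hi - lo) / 2


# Toy network (hypothetical): a ring of 200 nodes.
n = 200
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

rng = random.Random(0)
dist = bfs_distances(adj, 0)
others = [v for v in adj if v != 0]

# True mean distance from node 0, computed on the entire network.
true_mean = statistics.mean([dist[v] for v in others])

# Estimate from a partial view: 40 uniformly sampled nodes.
sample = rng.sample(others, 40)
values = [dist[v] for v in sample]
estimate = statistics.mean(values)

print("true mean distance:", true_mean)
print("sampled estimate:  ", estimate)
print("approximation error:", abs(true_mean - estimate))
print("standard error:     ", standard_error(values))
print("bootstrap error:    ", bootstrap_error(values, rng))
```

In this toy the distances come from a BFS over the whole graph purely so that the true value is available for comparison; in the setting the abstract describes, only the partial network would be available.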
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ohara, K., Saito, K., Kimura, M., Motoda, H. (2014). Resampling-Based Framework for Estimating Node Centrality of Large Social Network. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_20
DOI: https://doi.org/10.1007/978-3-319-11812-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science (R0)