Abstract
Many real-world networks, such as complex social networks, are intrinsically weighted. Modeling them with traditional binary (unweighted) network models discards much of the information carried by the edge weights and yields a less realistic representation. In this paper, we argue that when a weighted network model is chosen, network measures such as degree centrality, clustering coefficient and eigenvector centrality must be redefined, and new network sampling algorithms must be designed, so that the edge weights are taken into account. We first present several network measures for weighted networks and then propose six sampling algorithms for weighted networks. The algorithms are evaluated through simulations on real and synthetic weighted networks in terms of relative error, skew divergence, Pearson’s correlation coefficient and the Kolmogorov–Smirnov statistic. Experiments compare the proposed sampling algorithms for weighted networks with their counterparts for unweighted networks. The results show that existing sampling algorithms for unweighted networks do not produce good results when used to sample weighted networks, compared with the algorithms proposed in this paper.
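To make the ideas in the abstract concrete, the following is a minimal sketch and not one of the six algorithms proposed in the paper, which the abstract does not specify. It assumes an undirected weighted graph stored as a dict of dicts, uses the common definition of weighted degree (node strength, the sum of incident edge weights), samples nodes with a generic random walk whose transition probabilities are proportional to edge weights, and compares the sampled and original strength distributions with the two-sample Kolmogorov–Smirnov statistic and a relative error. All function names and the toy graph are hypothetical.

```python
import random
from bisect import bisect_right


def strength(graph, node):
    """Weighted degree (node strength): sum of weights of edges incident to node."""
    return sum(graph[node].values())


def weighted_random_walk_sample(graph, start, sample_size, rng, max_steps=100_000):
    """Collect distinct nodes with a walk that moves to a neighbor with
    probability proportional to the connecting edge weight.
    Illustrative only; not the authors' method."""
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= sample_size:
            break
        neighbors = list(graph[current])
        if not neighbors:  # isolated node: the walk cannot continue
            break
        weights = [graph[current][n] for n in neighbors]
        current = rng.choices(neighbors, weights=weights, k=1)[0]
        visited.add(current)
    return visited


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest absolute gap
    between the two empirical cumulative distribution functions."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    return max(
        abs(bisect_right(xs, p) / len(xs) - bisect_right(ys, p) / len(ys))
        for p in points
    )


def relative_error(true_value, estimate):
    """Relative error of an estimate with respect to a nonzero true value."""
    return abs(estimate - true_value) / abs(true_value)


# Hypothetical toy graph: graph[u][v] is the weight of edge (u, v).
graph = {
    "a": {"b": 3.0, "c": 1.0},
    "b": {"a": 3.0, "c": 2.0, "d": 1.0},
    "c": {"a": 1.0, "b": 2.0, "d": 4.0},
    "d": {"b": 1.0, "c": 4.0, "e": 0.5},
    "e": {"d": 0.5},
}

sample = weighted_random_walk_sample(graph, "a", 3, random.Random(0))
full_strengths = [strength(graph, v) for v in graph]
sample_strengths = [strength(graph, v) for v in sample]

mean_full = sum(full_strengths) / len(full_strengths)
mean_sample = sum(sample_strengths) / len(sample_strengths)
print("KS statistic:", ks_statistic(full_strengths, sample_strengths))
print("Relative error of mean strength:", relative_error(mean_full, mean_sample))
```

In this sketch the strengths of sampled nodes are read from the original graph; a fuller evaluation would recompute the measures on the induced sampled subgraph, and would also cover the skew divergence and Pearson's correlation coefficient mentioned above.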
Acknowledgments
The authors would like to thank the anonymous reviewers of this paper for their useful comments.
Cite this article
Rezvanian, A., Meybodi, M.R. Sampling algorithms for weighted networks. Soc. Netw. Anal. Min. 6, 60 (2016). https://doi.org/10.1007/s13278-016-0371-8
DOI: https://doi.org/10.1007/s13278-016-0371-8