Abstract
We present a new framework for the analysis and visualization of large complex networks based on structural information retrieved from their distance k-graphs and B-matrices. The construction of B-matrices for graphs with more than 1 million edges requires massive BFS computations and is facilitated using Cassovary, an open-source in-memory graph processing engine. The approach described in this paper enables efficient generation of expressive, multi-dimensional descriptors useful in graph embedding and graph mining tasks. In the experimental section, we show how the developed tools helped in the analysis of real-world graphs from the Stanford Large Network Dataset Collection.
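The abstract does not define the B-matrix, so the following is only a minimal sketch under an assumed definition taken from the cited B-matrix literature: entry B[k][l] counts the vertices that have exactly l vertices at shortest-path distance k, which is the same as the degree histogram of the distance k-graph. The function names `bfs_distances` and `b_matrix` and the adjacency-dictionary input are hypothetical and are not the Cassovary API. The sketch illustrates why one BFS per vertex is needed, which is the cost the abstract refers to.

```python
from collections import defaultdict, deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def b_matrix(adj):
    """Assumed definition: B[k][l] = number of vertices with exactly
    l vertices at distance k. Empty shells (l = 0) are not recorded."""
    B = defaultdict(lambda: defaultdict(int))
    for s in adj:  # one full BFS per vertex
        shell_sizes = defaultdict(int)
        for d in bfs_distances(adj, s).values():
            if d > 0:
                shell_sizes[d] += 1
        for k, l in shell_sizes.items():
            B[k][l] += 1
    return {k: dict(row) for k, row in B.items()}


# Toy input: an undirected path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(b_matrix(path))
```

For the path graph, the two end vertices each have one neighbor at distance 1 and the two inner vertices have two, so row 1 of the result holds two vertices with l = 1 and two with l = 2. On graphs with millions of edges the per-vertex BFS loop dominates the running time, which is why an in-memory engine is attractive for this step.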
Notes
1. Diameter, efficiency, characteristic path length, vertex betweenness, vertex closeness, vertex eccentricity, transitivity, clustering coefficient, assortativity [3].
References
Avery, C.: Giraph: large-scale graph processing infrastructure on Hadoop. In: Proceedings of the Hadoop Summit. Santa Clara (2011)
Barabási, A., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509 (1999)
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.: Complex networks: structure and dynamics. Phys. Rep. 424(4–5), 175–308 (2006)
Borzeshi, E.Z., Piccardi, M., Riesen, K., Bunke, H.: Discriminative prototype selection methods for graph embedding. Pattern Recogn. 46(6), 1648–1657 (2013)
Brandes, U., Pfeffer, J., Mergel, I.: Studying Social Networks: A Guide to Empirical Research. Campus Verlag, Frankfurt (2012)
Bu, Y., Howe, B., Balazinska, M., Ernst, M.D.: HaLoop: efficient iterative data processing on large clusters. Proc. VLDB Endowment 3(1–2), 285–296 (2010)
Czech, W.: Graph descriptors from B-matrix representation. In: Jiang, X., Ferrer, M., Torsello, A. (eds.) GbRPR 2011. LNCS, vol. 6658, pp. 12–21. Springer, Heidelberg (2011)
Czech, W., Goryczka, S., Arodz, T., Dzwinel, W., Dudek, A.: Exploring complex networks with graph investigator research application. Comput. Inform. 30(2), 381–410 (2011)
Czech, W.: Invariants of distance k-graphs for graph embedding. Pattern Recogn. Lett. 33(15), 1968–1979 (2012)
D’Alberto, P., Nicolau, A.: R-kleene: a high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks. Algorithmica 47(2), 203–213 (2007)
Dzwinel, W., Wcisło, R.: Very fast interactive visualization of large sets of high-dimensional data. In: Proceedings of ICCS 2015, Reykjavik, Iceland, 1–3 June 2015, Procedia Computer Science (2015) (in print)
Ekanayake, J., Li, H., Zhang, B., Gunarathne, T., Bae, S.H., Qiu, J., Fox, G.: Twister: a runtime for iterative MapReduce. In: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 810–818. ACM (2010)
Emms, D., Wilson, R.C., Hancock, E.R.: Graph matching using the interference of continuous-time quantum walks. Pattern Recogn. 42(5), 985–1002 (2009)
Foggia, P., Percannella, G., Vento, M.: Graph matching and learning in pattern recognition in the last 10 years. Int. J. Pattern Recogn. Artif. Intell. 28(01), 1450001 (2014)
Gibert, J., Valveny, E., Bunke, H.: Dimensionality reduction for graph of words embedding. In: Jiang, X., Ferrer, M., Torsello, A. (eds.) GbRPR 2011. LNCS, vol. 6658, pp. 22–31. Springer, Heidelberg (2011)
Gupta, P., Goel, A., Lin, J., Sharma, A., Wang, D., Zadeh, R.: WTF: the who to follow service at Twitter. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 505–514. International World Wide Web Conferences Steering Committee (2013)
Han, M., Daudjee, K., Ammar, K., Ozsu, M.T., Wang, X., Jin, T.: An experimental comparison of Pregel-like graph processing systems. Proc. VLDB Endowment 7(12), 1047–1058 (2014)
Lee, W.-J., Duin, R.P.W.: A labelled graph based multiple classifier system. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds.) MCS 2009. LNCS, vol. 5519, pp. 201–210. Springer, Heidelberg (2009)
Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data
Leskovec, J., Sosič, R.: SNAP: A general purpose network analysis and graph mining library in C++. http://snap.stanford.edu/snap
Low, Y., Gonzalez, J.E., Kyrola, A., Bickson, D., Guestrin, C.E., Hellerstein, J.: GraphLab: a new framework for parallel machine learning (2014). arXiv:1408.2041
Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pp. 135–146. ACM (2010)
Qiu, H., Hancock, E.: Clustering and embedding using commute times. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1873–1890 (2007)
Salihoglu, S., Widom, J.: GPS: a graph processing system. In: Proceedings of the 25th International Conference on Scientific and Statistical Database Management, p. 22. ACM (2013)
Watts, D., Strogatz, S.: Collective dynamics of small-world networks. Nature 393(6684), 440–442 (1998)
Xiao, B., Hancock, E., Wilson, R.: A generative model for graph matching and embedding. Comput. Vis. Image Underst. 113(7), 777–789 (2009)
Zhang, Y., Gao, Q., Gao, L., Wang, C.: Priter: a distributed framework for prioritizing iterative computations. IEEE Trans. Parallel Distrib. Syst. 24(9), 1884–1893 (2013)
Acknowledgments
This research is supported by the National Science Centre, Poland (NCN), grant DEC-2013/09/B/ST6/01549.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Czech, W., Mielczarek, W., Dzwinel, W. (2016). Comparison of Large Graphs Using Distance Information. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2015. Lecture Notes in Computer Science, vol 9573. Springer, Cham. https://doi.org/10.1007/978-3-319-32149-3_19
DOI: https://doi.org/10.1007/978-3-319-32149-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32148-6
Online ISBN: 978-3-319-32149-3
eBook Packages: Computer Science, Computer Science (R0)