research-article

Fast hierarchy construction for dense subgraphs

Published: 01 November 2016

Abstract

Discovering dense subgraphs and understanding the relations among them is a fundamental problem in graph mining. We want not only to identify dense subgraphs, but also to build a hierarchy among them (e.g., larger but sparser subgraphs formed by two smaller dense subgraphs). Peeling algorithms (k-core, k-truss, and nucleus decomposition) have been effective at locating many dense subgraphs. However, constructing a hierarchical representation of the density structure, and even correctly computing the connected k-cores and k-trusses, has been mostly overlooked. Keeping track of connected components during peeling requires an additional traversal operation, which is as expensive as the peeling process itself. In this paper, we start with a thorough survey and point to nuances in problem formulations that lead to significant differences in runtimes. We then propose efficient and generic algorithms to construct the hierarchy of dense subgraphs for k-core, k-truss, or any nucleus decomposition. Our algorithms leverage the disjoint-set forest data structure to efficiently construct the hierarchy during traversal. Furthermore, we introduce a new idea to avoid traversal: we construct the subgraphs while visiting neighborhoods in the peeling process, and build the relations to previously constructed subgraphs. We also consider an existing idea for finding the k-core hierarchy and adapt it efficiently to our objectives. Experiments on different types of large-scale real-world networks show significant speedups over naive algorithms and existing alternatives. Our algorithms also outperform the hypothetical limits of any possible traversal-based solution.
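
To make the terms in the abstract concrete, the following is a minimal Python sketch of k-core peeling followed by a disjoint-set forest pass that groups vertices into connected k-cores, level by level. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the peeling step uses a naive quadratic scan, the components are re-collected at every level, and all names (`build_adj`, `core_numbers`, `DisjointSet`, `connected_k_cores`) are hypothetical.

```python
from collections import defaultdict


def build_adj(edges):
    """Build an undirected adjacency map {vertex: set of neighbors}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """Peel minimum-degree vertices; return {vertex: core number}.

    Naive O(n^2) selection for clarity (assumption: small inputs).
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)
        k = max(k, deg[v])          # core number never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core


class DisjointSet:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self):
        self.parent, self.rank = {}, {}

    def add(self, x):
        self.parent[x], self.rank[x] = x, 0

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:   # compress the path to the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1


def connected_k_cores(adj):
    """Return {k: set of frozensets}, the connected k-cores for each k >= 1."""
    core = core_numbers(adj)
    by_core = defaultdict(list)
    for v, k in core.items():
        by_core[k].append(v)
    dsu, levels = DisjointSet(), {}
    # Add vertices from the densest level down, so each level's components
    # are unions of the denser components found earlier.
    for k in range(max(core.values(), default=0), 0, -1):
        for v in by_core.get(k, []):
            dsu.add(v)
        for v in by_core.get(k, []):
            for u in adj[v]:
                if u in dsu.parent:     # neighbor already has core number >= k
                    dsu.union(u, v)
        groups = defaultdict(set)
        for v in dsu.parent:
            groups[dsu.find(v)].add(v)
        levels[k] = {frozenset(g) for g in groups.values()}
    return levels
```

For example, two 4-cliques joined through a single bridging vertex give two separate connected 3-cores that merge into one connected 2-core, which mirrors the "larger but sparser subgraph formed by two smaller dense subgraphs" described above.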

Published In

Proceedings of the VLDB Endowment  Volume 10, Issue 3
November 2016
216 pages
ISSN:2150-8097

Publisher

VLDB Endowment
