Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/2723372.2746478acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Efficient Enumeration of Maximal k-Plexes

Published: 27 May 2015 Publication History

Editorial Notes

Computationally Reproducible. The experimental results of this paper were reproduced by a SIGMOD Review Committee and were found to support the central results reported in the paper. Details of the review process are found here:
http://db-reproducibility.seas.harvard.edu/#process

Abstract

The problem of enumerating (i.e., generating) all maximal cliques in a graph has received extensive treatment, due to the plethora of applications in various areas such as data mining, bioinformatics, network analysis and community detection. However, requiring the enumerated subgraphs to be full cliques is too restrictive in common real-life scenarios where "almost cliques" are equally useful. Hence, the notion of a k-plex, a clique relaxation that allows every node to be "missing" k neighbors, has been introduced. But this seemingly minor relaxation casts existing algorithms for clique enumeration inapplicable, for inherent reasons. This paper presents the first provably efficient algorithms, both for enumerating the maximal k-plexes and for enumerating the maximal connected k-plexes. Our algorithms run in polynomial delay for a constant k and incremental FPT delay when k is a parameter. The importance of such algorithms is in the areas mentioned above, as well as in new applications. Extensive experimentation over both real and synthetic datasets shows the efficiency of our algorithms, and their scalability with respect to graph size, density and choice of k, as well as their clear superiority over the state-of-the-art.

References

[1]
E. A. Akkoyunlu. The enumeration of maximal cliques of large graphs. SIAM J. Comput., 2(1):1--6, 1973.
[2]
J. Bailey, T. Manoukian, and K. Ramamohanarao. A fast algorithm for computing hypergraph transversals and its application in mining emerging patterns. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 19--22 December 2003, Melbourne, Florida, USA, pages 485--488, 2003.
[3]
B. Balasundaram, S. Butenko, and I. V. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Oper. Res., 59(1):133--142, Jan. 2011.
[4]
H. Bernard, P. D. Killworth, and L. Sailer. Informant accuracy in social network data iv: a comparison of clique-level structure in behavioral and cognitive network data. Social Networks, 2(3):191 -- 218, 19791980.
[5]
K. Biswas, V. Muthukkumarasamy, and E. Sithirasenan. Maximal clique based clustering scheme for wireless sensor networks. In 2013 IEEE Eighth International Conference on Intelligent Sensors, Sensor Networks and Information Processing, Melbourne, Australia, April 2--5, 2013, pages 237--241, 2013.
[6]
V. Boginski, S. Butenko, and P. M. Pardalos. Statistical analysis of financial networks. Computational Statistics and Data Analysis, 48(2):431 -- 443, 2005.
[7]
E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. An efficient incremental algorithm for generating all maximal independent sets in hypergraphs of bounded dimension. Parallel Processing Letters, 10(4):253--266, 2000.
[8]
E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. Generating maximal independent sets for hypergraphs with bounded edge-intersections. In LATIN: Theoretical Informatics, 6th Latin American Symposium, Buenos Aires, Argentina, April 5--8, 2004, Proceedings, pages 488--498, 2004.
[9]
C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM, 16(9):575--577, Sept. 1973.
[10]
L. Chang, J. X. Yu, and L. Qin. Fast maximal cliques enumeration in sparse graphs. Algorithmica, 66(1):173--186, 2013.
[11]
J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. ACM Trans. Database Syst., 36(4):21:1--21:34, Dec. 2011.
[12]
S. Cohen, B. Kimelfeld, and Y. Sagiv. Generating all maximal induced subgraphs for hereditary and connected-hereditary graph properties. J. Comput. Syst. Sci., 74(7):1147--1159, 2008.
[13]
T. Eiter, G. Gottlob, and K. Makino. New results on monotone dualization and generating hypergraph transversals. SIAM J. Comput., 32(2):514--537, 2003.
[14]
D. Eppstein, M. Löffler, and D. Strash. Listing all maximal cliques in sparse graphs in near-optimal time. In O. Cheong, K.-Y. Chwa, and K. Park, editors, Algorithms and Computation, volume 6506 of Lecture Notes in Computer Science, pages 403--414. Springer Berlin Heidelberg, 2010.
[15]
D. Eppstein and D. Strash. Listing all maximal cliques in large sparse real-world graphs. In P. Pardalos and S. Rebennack, editors, Experimental Algorithms, volume 6630 of Lecture Notes in Computer Science, pages 364--375. Springer Berlin Heidelberg, 2011.
[16]
G. Gottlob. Deciding monotone duality and identifying frequent itemsets in quadratic logspace. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA - June 22 - 27, 2013, pages 25--36, 2013.
[17]
E. Harley, A. Bonner, and N. Goodman. Uniform integration of genome mapping data using intersection graphs. Bioinformatics, 17(6):487--494, 2001.
[18]
D. S. Johnson, C. H. Papadimitriou, and M. Yannakakis. On generating all maximal independent sets. Inf. Process. Lett., 27(3):119--123, 1988.
[19]
B. Kimelfeld and Y. Sagiv. Finding and approximating top-k answers in keyword proximity search. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26--28, 2006, Chicago, Illinois, USA, pages 173--182, 2006.
[20]
I. Koch. Enumerating all connected maximal common subgraphs in two graphs. Theor. Comput. Sci., 250(1--2):1--30, Jan. 2001.
[21]
E. L. Lawler. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18(7):401--405, 1972.
[22]
J. M. Lewis and M. Yannakakis. The node-deletion problem for hereditary properties is np-complete. J. Comput. Syst. Sci., 20(2):219--230, 1980.
[23]
D. Liben-Nowell and J. M. Kleinberg. The link-prediction problem for social networks. JASIST, 58(7):1019--1031, 2007.
[24]
S. Mohseni-Zadeh, P. Brézellec, and J.-L. Risler. Cluster-c, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Computational Biology and Chemistry, 28(3):211 -- 218, 2004.
[25]
J. Moon and L. Moser. On cliques in graphs. Israel Journal of Mathematics, 3(1):23--28, 1965.
[26]
H. Moser, R. Niedermeier, and M. Sorge. Algorithms and experiments for clique relaxations--finding maximum s-plexes. In J. Vahrenhold, editor, Experimental Algorithms, volume 5526 of Lecture Notes in Computer Science, pages 233--244. Springer Berlin Heidelberg, 2009.
[27]
K. Murakami and T. Uno. Efficient algorithms for dualizing large-scale hypergraphs. Discrete Applied Mathematics, 170:83--94, 2014.
[28]
K. G. Murty. An algorithm for ranking all the assignments in order of increasing cost. Operations Research, 16(3):682--687, 1968.
[29]
G. Palla, I. Derenyi, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.
[30]
J. Pattillo, N. Youssef, and S. Butenko. On clique relaxation models in network analysis. European Journal of Operational Research, 226(1):9--18, 2013.
[31]
J. L. Pattillo. Mathematical Foundations and Algorithms for Clique Relaxations in Networks. PhD thesis, College Station, TX, USA, 2011. AAI3500375.
[32]
S. Seidman and B. Foster. A graph theoretic generalization of the clique concept. Journal of Mathematical Sociology, 6:139--154, 1978.
[33]
SNAP: Stanford network analysis platform. htto://snap.stanford.edu.
[34]
E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci., 363(1):28--42, Oct. 2006.
[35]
S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.
[36]
B. Wu and X. Pei. A parallel algorithm for enumerating all the maximal phk -plexes. In Emerging Technologies in Knowledge Discovery and Data Mining, PAKDD 2007, International Workshops, Nanjing, China, May 22--25, 2007, Revised Selected Papers, pages 476--483, 2007.
[37]
B. Yan and S. Gregory. Detecting communities in networks by merging cliques. In IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009.
[38]
M. Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82--94, 1981.
[39]
J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, 17:712--716, 1971.
[40]
Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1--6, 2010, Long Beach, California, USA, pages 649--652, 2010.

Cited By

View all
  • (2024)Efficient Maximal Biplex Enumerations with Improved Worst-Case Time GuaranteeProceedings of the ACM on Management of Data10.1145/36549382:3(1-26)Online publication date: 30-May-2024
  • (2024)Maximum k-Plex Computation: Theory and PracticeProceedings of the ACM on Management of Data10.1145/36393182:1(1-26)Online publication date: 26-Mar-2024
  • (2024)Graph Contrastive Learning with Cohesive Subgraph AwarenessProceedings of the ACM Web Conference 202410.1145/3589334.3645470(629-640)Online publication date: 13-May-2024
  • Show More Cited By

Index Terms

  1. Efficient Enumeration of Maximal k-Plexes

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN:9781450327589
    DOI:10.1145/2723372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 May 2015

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. enumeration
    2. fixed-parameter tractability
    3. maximal graph clique
    4. maximal k-plex
    5. polynomial delay

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGMOD/PODS'15
    Sponsor:
    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Victoria, Melbourne, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)40
    • Downloads (Last 6 weeks)3
    Reflects downloads up to 25 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Efficient Maximal Biplex Enumerations with Improved Worst-Case Time GuaranteeProceedings of the ACM on Management of Data10.1145/36549382:3(1-26)Online publication date: 30-May-2024
    • (2024)Maximum k-Plex Computation: Theory and PracticeProceedings of the ACM on Management of Data10.1145/36393182:1(1-26)Online publication date: 26-Mar-2024
    • (2024)Graph Contrastive Learning with Cohesive Subgraph AwarenessProceedings of the ACM Web Conference 202410.1145/3589334.3645470(629-640)Online publication date: 13-May-2024
    • (2024)Efficient Maximal Temporal Plex Enumeration2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00240(3098-3110)Online publication date: 13-May-2024
    • (2023)Hereditary Cohesive Subgraphs Enumeration on Bipartite Graphs: The Power of Pivot-based ApproachesProceedings of the ACM on Management of Data10.1145/35892831:2(1-26)Online publication date: 20-Jun-2023
    • (2023)Maximal Defective Clique EnumerationProceedings of the ACM on Management of Data10.1145/35889311:1(1-26)Online publication date: 30-May-2023
    • (2023)Computing Significant Cliques in Large Labeled NetworksIEEE Transactions on Big Data10.1109/TBDATA.2022.32236449:3(904-917)Online publication date: 1-Jun-2023
    • (2023)A novel network core structure extraction algorithm utilized variational autoencoder for community detectionExpert Systems with Applications10.1016/j.eswa.2023.119775222(119775)Online publication date: Jul-2023
    • (2023)Recent Advances in Practical Data ReductionAlgorithms for Big Data10.1007/978-3-031-21534-6_6(97-133)Online publication date: 18-Jan-2023
    • (2022)Mining Bursting Core in Large Temporal GraphsProceedings of the VLDB Endowment10.14778/3565838.356584515:13(3911-3923)Online publication date: 1-Sep-2022
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media