Motif-driven Dense Subgraph Discovery in Directed and Labeled Networks

Author: Ahmet Erdem Sarıyüce

WWW '21: Proceedings of the Web Conference 2021

Pages 379 - 390

https://doi.org/10.1145/3442381.3450055

Published: 03 June 2021

Abstract

Dense regions in networks are an indicator of interesting and unusual information. However, most existing methods consider only simple, undirected, unweighted networks. Complex real-world networks often carry richer information: edges are asymmetric, and nodes and edges have categorical and numerical attributes. Finding dense subgraphs in such networks in accordance with this rich information is an important problem with many applications. Furthermore, most existing algorithms ignore the higher-order relationships (i.e., motifs) among the nodes. Motifs have been shown to be helpful for dense subgraph discovery, but their wide spectrum in heterogeneous networks makes it challenging to utilize them effectively. In this work, we propose the quark decomposition framework to locate dense subgraphs that are rich in a given motif. We focus on networks with directed edges and categorical attributes on nodes/edges. For a given motif, our framework builds subgraphs, called quarks, of varying quality and with hierarchical relations. Our framework is versatile, efficient, and extensible. We discuss the limitations and practical instantiations of our framework, as well as the role confusion problem that needs to be considered in directed networks. We give an extensive evaluation of our framework on directed, signed-directed, and node-labeled networks. We consider various motifs and evaluate the quark decomposition using several real-world networks. Results show that quark decomposition performs better than the state-of-the-art techniques. Our framework is also practical and scalable to networks with up to 101M edges.
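
To make the idea of motif-driven peeling concrete, the following is a minimal sketch and not the quark decomposition itself, whose definitions the abstract does not give. It assumes one particular motif (the directed 3-cycle u -> v -> w -> u) and a simple k-core-style rule: repeatedly remove the node that participates in the fewest surviving motif instances. The function names `cycle_triangles` and `motif_peel` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict
import heapq


def cycle_triangles(edges):
    """Return all directed 3-cycles u->v->w->u, each as a tuple
    rotated so that its smallest node comes first."""
    succ = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            succ[u].add(v)
    found = set()
    for u in list(succ):
        for v in succ[u]:
            for w in succ.get(v, ()):
                if w != u and u in succ.get(w, ()):
                    m = min(u, v, w)
                    if m == u:
                        found.add((u, v, w))
                    elif m == v:
                        found.add((v, w, u))
                    else:
                        found.add((w, u, v))
    return found


def motif_peel(edges):
    """Assign each node a peeling score: the largest k such that the
    node survives in a subgraph where every node lies in at least k
    directed 3-cycles. Node ids must be mutually comparable (e.g. ints)."""
    nodes = {x for e in edges for x in e}
    member = defaultdict(set)  # node -> surviving cycles containing it
    for t in cycle_triangles(edges):
        for x in t:
            member[x].add(t)
    deg = {x: len(member[x]) for x in nodes}
    heap = [(d, x) for x, d in deg.items()]
    heapq.heapify(heap)
    removed, score, k = set(), {}, 0
    while heap:
        d, x = heapq.heappop(heap)
        if x in removed or d != deg[x]:
            continue  # stale heap entry
        k = max(k, d)
        score[x] = k
        removed.add(x)
        # Every cycle through x disappears for its other members.
        for t in member[x]:
            for y in t:
                if y != x and y not in removed:
                    member[y].discard(t)
                    deg[y] -= 1
                    heapq.heappush(heap, (deg[y], y))
    return score


edges = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 0), (4, 0)]
scores = motif_peel(edges)
```

In this toy graph, nodes 0 to 3 all lie on at least one directed 3-cycle, while node 4 lies on none, so it is peeled first with the lowest score. Nodes with the same score can then be grouped into nested subgraphs, which gives a hierarchy in the same spirit as the one the abstract describes.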

Index Terms
1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory

Published In

WWW '21: Proceedings of the Web Conference 2021

April 2021

4054 pages

ISBN: 9781450383127

DOI: 10.1145/3442381

Editors: Jure Leskovec (Stanford), Marko Grobelnik (Jožef Stefan Institute), Marc Najork (Google), Jie Tang (Tsinghua University), Leila Zia (Wikimedia Foundation)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '21

Sponsor:

SIGWEB

WWW '21: The Web Conference 2021

April 19 - 23, 2021

Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
