
RECEIPT: refine coarse-grained independent tasks for parallel tip decomposition of bipartite graphs

Published: 01 November 2020


Tip decomposition is a crucial kernel for mining dense subgraphs in bipartite networks, with applications in spam detection and the analysis of affiliation networks, among others. It creates a hierarchy of vertex-induced subgraphs with varying densities determined by the participation of vertices in butterflies (2,2-bicliques). To build the hierarchy, existing algorithms iteratively follow a delete-update (peeling) process: deleting vertices with the minimum number of butterflies and correspondingly updating the butterfly counts of their 2-hop neighbors. The need to explore the 2-hop neighborhood renders tip decomposition computationally very expensive. Furthermore, the inherent sequentiality of peeling only minimum-butterfly vertices makes derived parallel algorithms prone to heavy synchronization.
In this paper, we propose a novel parallel tip-decomposition algorithm, REfine CoarsE-grained Independent Tasks (RECEIPT), that relaxes the peeling order restrictions by partitioning the vertices into multiple independent subsets that can be concurrently peeled. This enables RECEIPT to simultaneously achieve a high degree of parallelism and a dramatic reduction in synchronization. Further, RECEIPT employs a hybrid peeling strategy along with other optimizations that drastically reduce the amount of wedge exploration and the execution time.
We perform a detailed experimental evaluation of RECEIPT on a shared-memory multicore server. It can process some of the largest publicly available bipartite datasets orders of magnitude faster than the state-of-the-art algorithms, achieving up to 1100× and 64× reductions in the number of thread synchronizations and traversed wedges, respectively. Using 36 threads, RECEIPT provides up to 17.1× self-relative speedup.
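To make the delete-update (peeling) process described above concrete, the following is a minimal sequential sketch of the baseline approach, not of RECEIPT itself. It assumes the graph is given as a list of (u, v) edges and that one side (here called U) is peeled; the function names `wedge_counts` and `tip_decomposition` and the lazy-deletion heap are illustrative choices, not taken from the paper. Two U-vertices that share c common neighbors participate together in c(c-1)/2 butterflies.

```python
from collections import defaultdict
import heapq


def wedge_counts(u, adj_u, adj_v, alive):
    """Count common neighbors (wedges) between u and each alive 2-hop neighbor."""
    counts = defaultdict(int)
    for v in adj_u[u]:
        for w in adj_v[v]:
            if w != u and w in alive:
                counts[w] += 1
    return counts


def tip_decomposition(edges):
    """Sequential peeling of the U side; returns {u: tip number}."""
    adj_u, adj_v = defaultdict(set), defaultdict(set)
    for u, v in edges:
        adj_u[u].add(v)
        adj_v[v].add(u)

    alive = set(adj_u)
    # Initial butterfly count per U-vertex: sum of C(c, 2) over 2-hop neighbors.
    support = {
        u: sum(c * (c - 1) // 2
               for c in wedge_counts(u, adj_u, adj_v, alive).values())
        for u in alive
    }

    heap = [(s, u) for u, s in support.items()]
    heapq.heapify(heap)
    tip = {}
    while heap:
        s, u = heapq.heappop(heap)
        if u not in alive or s != support[u]:
            continue  # stale heap entry, skip it
        # Delete step: u currently has the minimum butterfly count.
        alive.remove(u)
        tip[u] = s
        # Update step: explore the 2-hop neighborhood of u.
        for w, c in wedge_counts(u, adj_u, adj_v, alive).items():
            shared = c * (c - 1) // 2
            new = max(s, support[w] - shared)  # never drop below current level
            if new != support[w]:
                support[w] = new
                heapq.heappush(heap, (new, w))
    return tip


# a, b, c form a complete 3x3 biclique with x, y, z; d attaches to x and y only.
edges = [(u, v) for u in "abc" for v in "xyz"] + [("d", "x"), ("d", "y")]
print(tip_decomposition(edges))  # tip numbers: a, b, c -> 6 and d -> 3
```

Every iteration of the loop removes a single minimum-count vertex and then walks its wedges, which is where both the sequential dependency and the 2-hop exploration cost mentioned in the abstract show up.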



Cited By

  • (2024) Parallel Algorithms for Hierarchical Nucleus Decomposition. Proceedings of the ACM on Management of Data 2:1 (1-27). DOI: 10.1145/3639287. Online publication date: 26-Mar-2024
  • (2023) Parallel Peeling of Bipartite Networks for Hierarchical Dense Subgraph Discovery. ACM Transactions on Parallel Computing 10:2 (1-35). DOI: 10.1145/3583084. Online publication date: 20-Jun-2023





Published In

Proceedings of the VLDB Endowment  Volume 14, Issue 3
November 2020
217 pages


VLDB Endowment



  • Research-article



