DOI: 10.1145/3670684.3673405
poster

Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)

Published: 26 July 2024

Abstract

We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ(m · (log(np/m + 1) + α(n, m/(np)))) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work complexity is optimal among a class of symmetric algorithms, which includes all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google's recently open-sourced graph-mining library, where they enable "parallel clustering algorithms which scale to graphs with tens of billions of edges" [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and found that our algorithms are consistently the fastest on both CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72-core machine; this is 3.1x faster than the state of the art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.
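For readers unfamiliar with the interface, the following is a minimal single-threaded sketch of a disjoint set union structure and its use for connected components. It is not the concurrent algorithm of the abstract: it only illustrates two standard sequential ingredients, linking by a random priority (in the spirit of the randomized linking studied in [4]) and path splitting during finds. The names `DisjointSets`, `unite`, `same_set`, and `count_components` are hypothetical and chosen for illustration; a concurrent version would, as an assumption about the general approach, perform the parent-pointer updates with atomic compare-and-swap rather than plain writes.

```python
import random


class DisjointSets:
    """Sequential union-find: linking by random priority, finds with path splitting."""

    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        self.parent = list(range(n))  # each element starts as its own root
        # Assumption for this sketch: a random permutation gives every element
        # a distinct priority that decides which root survives a link.
        self.priority = list(range(n))
        rng.shuffle(self.priority)

    def find(self, x):
        """Return the root of x's set, splitting the path along the way."""
        parent = self.parent
        while parent[x] != x:
            nxt = parent[x]
            parent[x] = parent[nxt]  # point x at its grandparent
            x = nxt                  # then step to the old parent
        return x

    def same_set(self, x, y):
        return self.find(x) == self.find(y)

    def unite(self, x, y):
        """Merge the sets containing x and y; return True if they were distinct."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # The root with the lower priority is linked under the other root.
        if self.priority[rx] < self.priority[ry]:
            self.parent[rx] = ry
        else:
            self.parent[ry] = rx
        return True


def count_components(n, edges):
    """Number of connected components of an undirected graph on vertices 0..n-1."""
    sets = DisjointSets(n)
    components = n
    for u, v in edges:
        if sets.unite(u, v):
            components -= 1  # each successful link merges two components
    return components
```

For example, `count_components(6, [(0, 1), (1, 2), (3, 4)])` returns 3, corresponding to the components {0, 1, 2}, {3, 4}, and {5}.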

References

[1] Bloemen, V. Strong Connectivity and Shortest Paths for Checking Models. PhD thesis, University of Twente, 2019.
[2] Dhulipala, L., Hong, C., and Shun, J. ConnectIt: A framework for static and incremental parallel graph connectivity algorithms. Proc. VLDB Endow. (2020).
[3] Fredman, M. L., and Saks, M. E. The cell probe complexity of dynamic data structures. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC) (1989).
[4] Goel, A., Khanna, S., Larkin, D. H., and Tarjan, R. E. Disjoint set union with randomized linking. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014).
[5] Google-Graph-Mining-Team. Google graph-mining. https://github.com/google/graph-mining, 2023.
[6] Hong, C., Dhulipala, L., and Shun, J. Exploring the design space of static and incremental graph connectivity algorithms on GPUs. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques (2020).
[7] Jayanti, P., Jayanti, S., Yavuz, U., and Hernandez, L. A universal, sound, and complete forward reasoning technique for machine-verified proofs of linearizability. Proc. ACM Program. Lang. 8, POPL (Jan. 2024).
[8] Jayanti, S., Tarjan, R. E., and Boix-Adserà, E. Randomized concurrent set union and generalized wake-up. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC) (2019).
[9] Jayanti, S. V. Generalized wake-up: Amortized shared memory lower bounds for linearizable data structures.
[10] Jayanti, S. V. Simple, Fast, Scalable, and Reliable Multiprocessor Algorithms. PhD thesis, Massachusetts Institute of Technology (MIT), Department of Electrical Engineering and Computer Science, November 2022. Code available at: https://github.com/visveswara/machine-certified-linearizability.
[11] Jayanti, S. V., and Tarjan, R. E. A randomized concurrent algorithm for disjoint set union. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC) (2016), ACM.
[12] Jayanti, S. V., and Tarjan, R. E. Concurrent disjoint set union. Distributed Computing (2021).
[13] Merz, S. Proofs and proof certification in the TLA+ proof system. In Proceedings of the International Workshop on Proof Exchange for Theorem Proving (PxTP) (2012), CEUR-WS.org.
[14] Shi, J., Dhulipala, L., and Shun, J. Parallel algorithms for hierarchical nucleus decomposition, 2023.
[15] Tseng, T., Dhulipala, L., and Shun, J. Parallel index-based structural graph clustering and its approximation. In SIGMOD '21: International Conference on Management of Data, Virtual Event, China, June 20–25, 2021 (2021), G. Li, Z. Li, S. Idreos, and D. Srivastava, Eds., ACM, pp. 1851–1864.
[16] Wang, Y., Gu, Y., and Shun, J. Theoretically-efficient and practical parallel DBSCAN. In Proceedings of the 2020 International Conference on Management of Data, SIGMOD Conference 2020, online conference [Portland, OR, USA], June 14–19, 2020 (2020), D. Maier, R. Pottinger, A. Doan, W. Tan, A. Alawini, and H. Q. Ngo, Eds., ACM, pp. 2555–2571.

Published In

HOPC'24: Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing
June 2024
47 pages
ISBN:9798400707001
DOI:10.1145/3670684
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concurrent
  2. disjoint set union
  3. fast
  4. inverse-ackermann
  5. machine-verified
  6. multicore
  7. parallel clustering
  8. union-find

Qualifiers

  • Poster

Conference

SPAA '24