DOI: 10.1145/3670684.3673405
poster

Fast, Scalable, and Machine-Verified Multicore Disjoint Set Union Data Structures and their Wide Deployment in Parallel Algorithms (Abstract)

Published: 26 July 2024

Abstract

We design a simple, fast, scalable, and reliable concurrent disjoint set union (a.k.a. union-find) data structure. Our algorithms are the first scalable algorithms for the problem. Our best algorithm provides almost-linear speed-up, performing just Θ(m · (log(np/m + 1) + α(n, m/np))) work when p processes execute a total of m operations on an instance with sets of total size n. We give a rigorous, machine-verified proof of correctness, and we prove that the work complexity is optimal among a class of symmetric algorithms, which includes all known concurrent set union algorithms. Our algorithms are fast in practice and have seen wide adoption. They are implemented in Google's recently open-sourced graph-mining library, where they enable "parallel clustering algorithms which scale to graphs with tens of billions of edges" [5]. An MIT research group (Dhulipala, Hong, and Shun) independently implemented hundreds of parallel algorithms for connected components and found that our algorithms are consistently the fastest on both CPUs [2] and GPUs [6]. As an illustration, our algorithms were used to compute the components of the Hyperlink2012 graph of 128 billion edges in just 8.2 seconds on a standard 72-core machine; this is 3.1x faster than the state of the art in any computational setting [2]. Several state-of-the-art algorithms for parallel clustering [5, 15, 16], graph analysis [2, 14], and model checking [1] rely on our data structures.
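To make the idea of a concurrent disjoint set union structure concrete, the following is a minimal sketch in the spirit of the randomized-linking approach of [11, 12]; it is not the authors' implementation. It assumes a random priority order on nodes, links the lower-priority root under the higher-priority one with a compare-and-swap (CAS), and compacts paths by splitting. Python has no hardware CAS, so the hypothetical helper `_cas` emulates one with a lock; the class name `ConcurrentDSU` and all method names are likewise illustrative assumptions.

```python
import random
import threading


class ConcurrentDSU:
    """Illustrative union-find with randomized linking and path splitting."""

    def __init__(self, n, seed=None):
        self.parent = list(range(n))      # every node starts as its own root
        self.priority = list(range(n))    # random total order used for linking
        random.Random(seed).shuffle(self.priority)
        self._lock = threading.Lock()     # stands in for a hardware CAS

    def _cas(self, i, expected, new):
        # Emulated compare-and-swap on parent[i] (assumption: a real
        # implementation would use an atomic instruction instead of a lock).
        with self._lock:
            if self.parent[i] == expected:
                self.parent[i] = new
                return True
            return False

    def find(self, u):
        # Walk toward the root, trying to point each visited node at its
        # grandparent (path splitting). A failed CAS is harmless.
        while True:
            v = self.parent[u]
            w = self.parent[v]
            if v == w:
                return v
            self._cas(u, v, w)
            u = v

    def unite(self, a, b):
        # Link the lower-priority root under the higher-priority one.
        # Retry if another thread changed the root in the meantime.
        while True:
            u, v = self.find(a), self.find(b)
            if u == v:
                return False
            if self.priority[u] > self.priority[v]:
                u, v = v, u
            if self._cas(u, u, v):
                return True

    def same_set(self, a, b):
        while True:
            u, v = self.find(a), self.find(b)
            if u == v:
                return True
            if self.parent[u] == u:   # u was still a root, so the sets differ
                return False


def demo():
    # Four threads unite i with i + 2, so evens and odds form two sets.
    dsu = ConcurrentDSU(100, seed=1)

    def worker(start):
        for i in range(start, 98, 4):
            dsu.unite(i, i + 2)

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dsu
```

Because parent pointers only ever move to nodes of higher priority, concurrent links cannot create a cycle, which is why a failed CAS can simply be retried. The lock-based emulation serializes updates, so this sketch shows the logic only and says nothing about the scalability results claimed in the abstract.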

References

[1] Bloemen, V. Strong Connectivity and Shortest Paths for Checking Models. PhD thesis, University of Twente, 2019.
[2] Dhulipala, L., Hong, C., and Shun, J. ConnectIt: A framework for static and incremental parallel graph connectivity algorithms. Proc. VLDB Endow. (2020).
[3] Fredman, M. L., and Saks, M. E. The cell probe complexity of dynamic data structures. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC) (1989).
[4] Goel, A., Khanna, S., Larkin, D. H., and Tarjan, R. E. Disjoint set union with randomized linking. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014).
[5] Google-Graph-Mining-Team. Google graph-mining. https://github.com/google/graph-mining, 2023.
[6] Hong, C., Dhulipala, L., and Shun, J. Exploring the design space of static and incremental graph connectivity algorithms on GPUs. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques (2020).
[7] Jayanti, P., Jayanti, S., Yavuz, U., and Hernandez, L. A universal, sound, and complete forward reasoning technique for machine-verified proofs of linearizability. Proc. ACM Program. Lang. 8, POPL (Jan. 2024).
[8] Jayanti, S., Tarjan, R. E., and Boix-Adserà, E. Randomized concurrent set union and generalized wake-up. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC) (2019).
[9] Jayanti, S. V. Generalized wake-up: Amortized shared memory lower bounds for linearizable data structures.
[10] Jayanti, S. V. Simple, Fast, Scalable, and Reliable Multiprocessor Algorithms. PhD thesis, Massachusetts Institute of Technology (MIT), Department of Electrical Engineering and Computer Science, November 2022. Code available at: https://github.com/visveswara/machine-certified-linearizability.
[11] Jayanti, S. V., and Tarjan, R. E. A randomized concurrent algorithm for disjoint set union. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC) (2016), ACM.
[12] Jayanti, S. V., and Tarjan, R. E. Concurrent disjoint set union. Distributed Computing (2021).
[13] Merz, S. Proofs and proof certification in the TLA proof system. In Proceedings of the International Workshop on Proof Exchange for Theorem Proving (PxTP) (2012), CEUR-WS.org.
[14] Shi, J., Dhulipala, L., and Shun, J. Parallel algorithms for hierarchical nucleus decomposition, 2023.
[15] Tseng, T., Dhulipala, L., and Shun, J. Parallel index-based structural graph clustering and its approximation. In SIGMOD '21: International Conference on Management of Data, Virtual Event, China, June 20--25, 2021 (2021), G. Li, Z. Li, S. Idreos, and D. Srivastava, Eds., ACM, pp. 1851--1864.
[16] Wang, Y., Gu, Y., and Shun, J. Theoretically-efficient and practical parallel DBSCAN. In Proceedings of the 2020 International Conference on Management of Data, SIGMOD Conference 2020, online conference [Portland, OR, USA], June 14--19, 2020 (2020), D. Maier, R. Pottinger, A. Doan, W. Tan, A. Alawini, and H. Q. Ngo, Eds., ACM, pp. 2555--2571.

Published In

HOPC'24: Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing
June 2024
47 pages
ISBN: 9798400707001
DOI: 10.1145/3670684

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. concurrent
2. disjoint set union
3. fast
4. inverse-ackermann
5. machine-verified
6. multicore
7. parallel clustering
8. union-find

Qualifiers

Poster

Conference

SPAA '24