poster

Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering (Abstract)

Published: 18 July 2023

Abstract

This paper presents new parallel algorithms for generating Euclidean minimum spanning trees (EMST) and spatial clustering hierarchies (known as HDBSCAN*). Our approach generates a well-separated pair decomposition and then applies Kruskal's minimum spanning tree algorithm together with bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN*. We also give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used to visualize clusters at different scales for both EMST and HDBSCAN*. We show that our algorithms are theoretically efficient: their work (number of operations) matches that of their sequential counterparts, and their depth (parallel time) is polylogarithmic. We implement our algorithms and propose a memory optimization that requires only a subset of the well-separated pairs to be computed and materialized, saving both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for these problems by 11.13x to 55.89x, and existing parallel algorithms by at least an order of magnitude.
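To make the two problems concrete, here is a minimal sequential sketch in Python. It is not the authors' method: it enumerates all point pairs by brute force, where the abstract uses a well-separated pair decomposition with bichromatic closest pair computations, and it runs Kruskal's algorithm on a single thread. The function names (`kruskal_mst`, `core_distances`, `emst`, `hdbscan_mst`) are hypothetical. The abstract does not define the HDBSCAN* edge weight; the sketch assumes the usual mutual reachability distance from the HDBSCAN* literature, max(core(p), core(q), d(p, q)), with core(p) taken as the distance from p to its `min_pts`-th nearest neighbor counting p itself.

```python
import math
from itertools import combinations


def kruskal_mst(n, weighted_edges):
    """Return MST edges (w, u, v) using Kruskal's algorithm with union-find."""
    parent = list(range(n))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so keep it
            parent[ru] = rv
            mst.append((w, u, v))
    return mst


def core_distances(points, min_pts):
    """Assumed definition: distance to the min_pts-th nearest neighbor,
    counting the point itself as its own first neighbor."""
    return [sorted(math.dist(p, q) for q in points)[min_pts - 1]
            for p in points]


def emst(points):
    """Brute-force EMST: Kruskal over all O(n^2) Euclidean edges."""
    edges = [(math.dist(points[i], points[j]), i, j)
             for i, j in combinations(range(len(points)), 2)]
    return kruskal_mst(len(points), edges)


def hdbscan_mst(points, min_pts):
    """MST under the (assumed) mutual reachability distance."""
    core = core_distances(points, min_pts)
    edges = [(max(core[i], core[j], math.dist(points[i], points[j])), i, j)
             for i, j in combinations(range(len(points)), 2)]
    return kruskal_mst(len(points), edges)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (3, 0), (3, 4)]
    print(emst(pts))
    print(hdbscan_mst(pts, min_pts=3))
```

With `min_pts=1` every core distance is zero, so `hdbscan_mst` reduces to `emst`. The quadratic number of candidate edges in this sketch is the cost that a well-separated pair decomposition is meant to avoid.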

Published In

HOPC '23: Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing
July 2023
33 pages
ISBN:9798400702181
DOI:10.1145/3597635
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. computational geometry
  3. parallel algorithm

Qualifiers

  • Poster

Funding Sources

  • NSF CAREER Award
  • Applications Driving Architectures (ADA) Research Center, a JUMP Center co-sponsored by SRC and DARPA
  • Google Faculty Research Award
  • DOE Early Career Award
  • DARPA SDH Award

Conference

SPAA '23