research-article

ARC 2014: A Multidimensional FPGA-Based Parallel DBSCAN Architecture

Published: 04 November 2015

Abstract

Clustering large numbers of data points is computationally demanding and often must be accelerated to be useful in practical applications. This work targets the FPGA acceleration of Density-Based Spatial Clustering of Applications with Noise (DBSCAN), one of the state-of-the-art clustering algorithms. The article presents an optimized, scalable, and parameterizable architecture that exploits the internal memory structure of modern FPGAs to deliver a high-performance clustering system. Post-synthesis simulation results show mean speedups of 31× in real-world tests and 202× in synthetic tests over state-of-the-art software counterparts running on a quad-core 3.4 GHz Intel i7-2600K. The implementation can also cluster data with any number of dimensions without impacting performance.
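For readers unfamiliar with the algorithm being accelerated, the following is a minimal software sketch of textbook DBSCAN in Python. It is not the FPGA architecture described in the article, and the names `dbscan`, `region_query`, `eps`, and `min_pts` are illustrative assumptions rather than identifiers from the paper. The brute-force neighbourhood search works for points of any dimension, which is the kind of distance workload a hardware design would parallelize.

```python
from math import dist  # Euclidean distance for points of any dimension (Python 3.8+)

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Illustrative brute-force version: every region query scans all points.
    """
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1           # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels


points_2d = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels_2d = dbscan(points_2d, eps=1.5, min_pts=3)
```

In this sketch `labels_2d` assigns the first three points to one cluster, the next three to a second, and marks the isolated point as noise. Each `region_query` call is independent of the others, so the distance comparisons are a natural candidate for parallel evaluation.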


    Published In

    ACM Transactions on Reconfigurable Technology and Systems  Volume 9, Issue 1
    Special Section on the 2014 International Symposium on Applied Reconfigurable Computing
    November 2015
    121 pages
    ISSN:1936-7406
    EISSN:1936-7414
    DOI:10.1145/2839314
    Editor: Steve Wilton
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 November 2015
    Accepted: 01 January 2015
    Revised: 01 November 2014
    Received: 01 June 2014
    Published in TRETS Volume 9, Issue 1

    Author Tags

    1. Clustering
    2. DBSCAN
    3. FPGA
    4. parallel hardware architectures

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 24
    • Downloads (last 6 weeks): 5
    Reflects downloads up to 02 Sep 2024.

    Citations

    Cited By

    • (2024) Grid-Based DBSCAN Clustering Accelerator for LiDAR’s Point Cloud. Electronics 13:17 (3395). DOI: 10.3390/electronics13173395. Online publication date: 26-Aug-2024
    • (2024) SpecHD: Hyperdimensional Computing Framework for FPGA-Based Mass Spectrometry Clustering. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). DOI: 10.23919/DATE58400.2024.10546776. Online publication date: 25-Mar-2024
    • (2023) Machine Learning. Design for Embedded Image Processing on FPGAs (403-439). DOI: 10.1002/9781119819820.ch14. Online publication date: 5-Sep-2023
    • (2022) GPS Receivers Spoofing Detection Based on Subtractive, FCM and DBSCAN Clustering Algorithms. Journal of Circuits, Systems and Computers 32:09. DOI: 10.1142/S0218126623501529. Online publication date: 8-Dec-2022
    • (2021) A survey on parallel clustering algorithms for Big Data. Artificial Intelligence Review 54:4 (2411-2443). DOI: 10.1007/s10462-020-09918-2. Online publication date: 1-Apr-2021
    • (2020) Adaptive Density-Based Spatial Clustering for Massive Data Analysis. IEEE Access 8 (23346-23358). DOI: 10.1109/ACCESS.2020.2969440. Online publication date: 2020
