DOI: 10.1145/3412841.3441915

VD-tree: how to build an efficient and fit metric access method using voronoi diagrams

Published: 22 April 2021
    Abstract

    Efficient similarity search is a core issue for retrieval operations on large amounts of complex data, often relying on Metric Access Methods (MAMs) to speed up range and k-NN queries. Among the most used MAMs are those based on a covering radius, which create balanced structures and enable efficient data retrieval and dynamic maintenance. MAMs typically suffer from node overlapping, which increases retrieval costs. Some strategies aim to reduce node overlapping by employing global pivots to improve the filtering process during queries, but they incur significant costs to maintain the pivots and still do not completely remove the overlaps, which impacts queries over large databases. Other strategies use hyperplane-based MAMs, which can eliminate overlaps but at a large cost to create and update the index. We propose VD-Tree, a MAM that combines a covering radius strategy with a Voronoi-like organization. VD-Tree retains index flexibility for updates while reducing node overlap through dynamic swapping of elements among nodes. The method relies only on the solid organization fostered by Voronoi, and does not require storing further information in the tree. Experimental analysis using five real-world image datasets and four feature extractors shows that VD-Tree reduced node overlaps by up to 43% and the average time needed to answer similarity queries by up to 28%, when compared to its closest competitor.
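To make the terms in the abstract concrete, the following is a minimal sketch of covering-radius filtering for a range query, together with a Voronoi-like assignment of elements to their nearest representative. It is not the VD-Tree algorithm from the paper: the flat list of nodes, the Euclidean metric, and all names (`Node`, `balls_overlap`, `range_query`, `voronoi_assign`) are assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass, field


def euclidean(a, b):
    """Example metric (assumption); any function obeying the metric axioms would do."""
    return math.dist(a, b)


@dataclass
class Node:
    rep: tuple                                   # representative of the node
    elements: list = field(default_factory=list)

    @property
    def radius(self):
        """Covering radius: distance from the representative to its farthest element."""
        return max((euclidean(self.rep, e) for e in self.elements), default=0.0)


def balls_overlap(n1, n2):
    """Covering balls intersect when the representatives are no farther apart than the sum of radii."""
    return euclidean(n1.rep, n2.rep) <= n1.radius + n2.radius


def range_query(nodes, q, r):
    """Return (matches, visited): elements within r of q, and the number of nodes visited."""
    matches, visited = [], 0
    for node in nodes:
        # Triangle inequality: if d(q, rep) > r + radius, no element of the node can qualify.
        if euclidean(q, node.rep) > r + node.radius:
            continue
        visited += 1
        matches.extend(e for e in node.elements if euclidean(q, e) <= r)
    return matches, visited


def voronoi_assign(reps, points):
    """Voronoi-like organization: each point is stored with its nearest representative."""
    nodes = [Node(rep) for rep in reps]
    for p in points:
        min(nodes, key=lambda n: euclidean(n.rep, p)).elements.append(p)
    return nodes


# Toy data (hypothetical): the same four points stored in two different ways.
reps = [(0, 0), (10, 0)]
points = [(1, 0), (2, 0), (8, 0), (9, 0)]
overlapping = [Node(reps[0], [(1, 0), (8, 0)]), Node(reps[1], [(2, 0), (9, 0)])]
voronoi_like = voronoi_assign(reps, points)

for label, nodes in (("overlapping", overlapping), ("voronoi-like", voronoi_like)):
    matches, visited = range_query(nodes, (1.5, 0), 1.0)
    print(label, "overlap:", balls_overlap(*nodes), "visited:", visited, "matches:", sorted(matches))
```

In this toy setting both layouts return the same answer, but the overlapping layout has to visit both nodes while the Voronoi-like layout prunes one of them. Nearest-representative assignment does not by itself guarantee disjoint covering balls in general; the example only illustrates why reducing overlap tends to reduce query cost.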


    Cited By

    • (2023) Tree-based indexing technique for efficient and real-time label retrieval in the object tracking system. The Journal of Supercomputing 79:18 (20562-20599). DOI: 10.1007/s11227-023-05478-8. Online publication date: 16-Jun-2023
    • (2021) Data-Driven Learned Metric Index: An Unsupervised Approach. Similarity Search and Applications (81-94). DOI: 10.1007/978-3-030-89657-7_7. Online publication date: 29-Sep-2021

      Published In

      SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing
      March 2021
      2075 pages
      ISBN: 9781450381048
      DOI: 10.1145/3412841

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. complex data
      2. index structure
      3. metric access method
      4. similarity queries
      5. voronoi diagram

      Qualifiers

      • Research-article

      Conference

      SAC '21
      SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing
      March 22 - 26, 2021
      Virtual Event, Republic of Korea

      Acceptance Rates

      Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
