DOI:
10.1007/978-3-031-46994-7
Similarity Search and Applications: 16th International Conference, SISAP 2023, A Coruña, Spain, October 9–11, 2023, Proceedings
2023 Proceeding
Publisher:
  • Springer-Verlag
  • Berlin, Heidelberg
Conference:
International Conference on Similarity Search and Applications, A Coruña, Spain, 9 October 2023
ISBN:
978-3-031-46993-0
Published:
15 November 2023

Abstract

No abstract available.

Front Matter
Pages i–xxi
Back Matter
Article
Front Matter
Page 1
Article
Finding HSP Neighbors via an Exact, Hierarchical Approach
Abstract

The Half Space Proximal (HSP) graph is a low out-degree monotonic graph with wide applications in various domains, including combinatorial optimization in strings, enhancing kNN classification, simplifying chemical networks, estimating local ...

Article
Approximate Similarity Search for Time Series Data Enhanced by Section Min-Hash
Abstract

Dynamic Time Warping (DTW) is a well-known similarity measure between time series data. Although DTW can calculate the similarity between time series with different lengths, it is computationally expensive. Therefore, fast algorithms that ...

Article
Mutual k-Nearest Neighbor Graph for Data Analysis: Application to Metric Space Clustering
Abstract

In this paper, we delve into the Mutual k-Nearest Neighbor Graph (mkNNG) and its significance in clustering and outlier detection. We present a rigorous mathematical framework elucidating its application and highlight its role in the success of ...

Article
An Alternating Optimization Scheme for Binary Sketches for Cosine Similarity Search
Abstract

Searching for similar objects in intrinsically high-dimensional data sets is a challenging task. Sketches have been proposed for faster similarity search using linear scans. Binary sketches are one such approach to find a good mapping from the ...

Article
Unbiased Similarity Estimators Using Samples
Abstract

Computing a similarity measure (or a distance) between two complex objects is a fundamental building block for a huge number of applications in a wide variety of domains. Since many tasks involve computing such similarities among many pairs of ...

Article
Retrieve-and-Rank End-to-End Summarization of Biomedical Studies
Abstract

An arduous biomedical task involves condensing evidence derived from multiple interrelated studies, given a context as input, to generate reviews or provide answers autonomously. We named this task context-aware multi-document summarization (CA-...

Article
Fine-Grained Categorization of Mobile Applications Through Semantic Similarity Techniques for Apps Classification
Abstract

The number of Android apps is constantly on the rise. Existing stores allow selecting apps from general named categories. To prevent miscategorization and facilitate user selection of the appropriate app, a closer examination of the categories’ ...

Article
Runs of Side-Sharing Tandems in Rectangular Arrays
Abstract

A side-sharing tandem is a rectangular array that is composed of two adjacent non-overlapping occurrences of the same rectangular block. Furthering our understanding of side-sharing tandems should facilitate the development of more efficient 2d ...

Article
Turbo Scan: Fast Sequential Nearest Neighbor Search in High Dimensions
Abstract

This paper introduces Turbo Scan (TS), a novel k-nearest neighbor search solution tailored for high-dimensional data and specific workloads where indexing can’t be efficiently amortized over time. There exist situations where the overhead of index ...

Article
Class Representatives Selection in Non-metric Spaces for Nearest Prototype Classification
Abstract

The nearest prototype classification is a less computationally intensive replacement for the k-NN method, especially when large datasets are considered. Centroids are often used as prototypes to represent whole classes in metric spaces. Selection ...

Article
The Dataset-Similarity-Based Approach to Select Datasets for Evaluation in Similarity Retrieval
Abstract

Most papers on similarity retrieval present experiments executed on an assortment of complex datasets. However, no work focuses on analyzing the selection of datasets to evaluate the techniques proposed in the related literature. Ideally, the ...

Article
Suitability of Nearest Neighbour Indexes for Multimedia Relevance Feedback
Abstract

User relevance feedback (URF) is emerging as an important component of the multimedia analytics toolbox. State-of-the-art URF systems employ high-dimensional vectors of semantic features and train linear-SVM classifiers in each round of ...

Article
Accelerating k-Means Clustering with Cover Trees
Abstract

The k-means clustering algorithm is a popular algorithm that partitions data into k clusters. There are many improvements to accelerate the standard algorithm. Most current research employs upper and lower bounds on point-to-cluster distances and ...

Article
Is Quantized ANN Search Cursed? Case Study of Quantifying Search and Index Quality
Abstract

Traditional evaluation of an approximate high-dimensional index typically consists of running a benchmark with known ground truth, analyzing the performance in terms of traditional result quality and latency measures, and then comparing those ...

Article
Minwise-Independent Permutations with Insertion and Deletion of Features
Abstract

The seminal work of Broder et al. [5] introduces the minHash algorithm that computes a low-dimensional sketch of high-dimensional binary data that closely approximates pairwise Jaccard similarity. Since its invention, minHash has been commonly ...

Article
SDOclust: Clustering with Sparse Data Observers
Abstract

Sparse Data Observers (SDO) is an unsupervised learning approach developed to cover the need for fast, highly interpretable and intuitively parameterizable anomaly detection. We present SDOclust, an extension that performs clustering while ...

Article
Solving k-Closest Pairs in High-Dimensional Data
Abstract

We investigate the k-closest pair problem in high dimensions, that is finding the k ≥ 1 closest pairs of points in a set S ⊆ X in a metric space (X, dist). This is a fundamental problem in computational geometry with a wide variety of applications, ...

Article
Vec2Doc: Transforming Dense Vectors into Sparse Representations for Efficient Information Retrieval
Abstract

The rapid development of deep learning and artificial intelligence has transformed our approach to solving scientific problems across various domains, including computer vision, natural language processing, and automatic content generation. ...

Article
Similarity Search with Multiple-Object Queries
Abstract

Within the topic of similarity search, all work we know assumes that search is based on a dissimilarity space, where a query comprises a single object in the space.

Here, we examine the possibility of a multiple-object query. There are at least ...

Article
Diversity Similarity Join for Big Data
Abstract

The Similarity Join (SJ) has become one of the most popular and valuable data processing operators in analyzing large amounts of data. Various types of similarity join operators have been effectively used in multiple scenarios. However, these ...

Article
Front Matter
Page 253
Article
Overview of the SISAP 2023 Indexing Challenge
Abstract

This manuscript presents the premiere SISAP 2023 Indexing Challenge, which seeks replicable and competitive solutions in the realm of approximate similarity search algorithms. Our aim is recall, all while optimizing build time, search time, and ...

    Article
    Enhancing Approximate Nearest Neighbor Search: Binary-Indexed LSH-Tries, Trie Rebuilding, and Batch Extraction
    Abstract

    Locality-Sensitive-Hashing (LSH) plays a crucial role in approximate nearest neighbour search and similarity-based queries. In this paper, we present a study on the performance of LSH for indexing and searching high-dimensional binary vectors ...

    Article
    General and Practical Tuning Method for Off-the-Shelf Graph-Based Index: SISAP Indexing Challenge Report by Team UTokyo
    Abstract

    Despite the efficacy of graph-based algorithms for Approximate Nearest Neighbor (ANN) searches, the optimal tuning of such systems remains unclear. This study introduces a method to tune the performance of off-the-shelf graph-based indexes, ...

    Article
    SISAP 2023 Indexing Challenge – Learned Metric Index
    Abstract

    This submission into the SISAP Indexing Challenge examines the experimental setup and performance of the Learned Metric Index, which uses an architecture of interconnected learned models to answer similarity queries. An inherent part of this ...

    Article
    Computational Enhancements of HNSW Targeted to Very Large Datasets
    Abstract

    The Hierarchical Navigable Small World (HNSW) Graph is a graph-based approximate similarity search algorithm that achieves fast and accurate search through a hierarchical structure providing long-range and short-range links. The HNSW remains as a ...

    Article
    CRANBERRY: Memory-Effective Search in 100M High-Dimensional CLIP Vectors
    Abstract

    Recent advances in cross-modal multimedia data analysis necessarily require efficient similarity search on the scales of hundreds of millions of high-dimensional vectors. We address this task by proposing the CRANBERRY algorithm that specifically ...

    Contributors
    • University of A Coruña
    • Pompeu Fabra University Barcelona
