Volume 33, Issue 2, Mar 2024
research-article
Tabular data synthesis with generative adversarial networks: design space and optimizations
Abstract

The proliferation of big data has brought an urgent demand for privacy-preserving data publishing. Traditional solutions to this demand have limitations on effectively balancing the trade-off between privacy and utility of the released data. To ...

research-article
MinJoin++: a fast algorithm for string similarity joins under edit distance
Abstract

We study the problem of computing similarity joins under edit distance on a set of strings. Edit similarity joins is a fundamental problem in databases, data mining and bioinformatics. It finds many applications in data cleaning and integration, ...

research-article
xDBTagger: explainable natural language interface to databases using keyword mappings and schema graph
Abstract

Recently, numerous studies have been proposed to attack the natural language interfaces to databases (NLIDB) problem by researchers either as a conventional pipeline-based or an end-to-end deep-learning-based solution. Although each approach has ...

research-article
Cardinality estimation using normalizing flow
Abstract

Cardinality estimation is one of the most important problems in query optimization. Recently, machine learning-based techniques have been proposed to effectively estimate cardinality, which can be broadly classified into query-driven and data-...

research-article
Optimizing RPQs over a compact graph representation
Abstract

We propose techniques to evaluate regular path queries (RPQs) over labeled graphs (e.g., RDF). We apply a bit-parallel simulation of a Glushkov automaton representing the query over a ring: a compact wavelet-tree-based index of the graph. To the ...

research-article
A quantitative evaluation of persistent memory hash indexes
Abstract

Persistent memory (PMem) is increasingly being leveraged to build hash-based indexing structures featuring cheap persistence, high performance, and instant recovery. Especially with the release of Intel Optane DC Persistent Memory Modules, we have ...

research-article
Eris: efficiently measuring discord in multidimensional sources
Abstract

Data integration is a classical problem in databases, typically decomposed into schema matching, entity matching and data fusion. To solve the latter, it is mostly assumed that ground truth can be determined. However, in general, the data ...

research-article
A systematic evaluation of machine learning on serverless infrastructure
Abstract

Recently, the serverless paradigm of computing has inspired research on its applicability to data-intensive tasks such as ETL, database query processing, and machine learning (ML) model training. Recent efforts have proposed multiple systems for ...

research-article
A survey on transactional stream processing
Abstract

Transactional stream processing (TSP) strives to create a cohesive model that merges the advantages of both transactional and stream-oriented guarantees. Over the past decade, numerous endeavors have contributed to the evolution of TSP solutions, ...

research-article
Efficient detection of multivariate correlations with different correlation measures
Abstract

Correlation analysis is an invaluable tool in many domains, for better understanding the data and extracting salient insights. Most works to date focus on detecting high pairwise correlations. A generalization of this problem with known ...

research-article
A survey on the evolution of stream processing systems
Abstract

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a ...

research-article
RCBench: an RDMA-enabled transaction framework for analyzing concurrency control algorithms
Abstract

Distributed transaction processing over the TCP/IP network suffers from the weak transaction scalability problem, i.e., its performance drops significantly when the number of involved data nodes per transaction increases. Although quite a few of ...
