PACMMOD: Vol 2, No 1

Volume 2, Issue 1, February 2024, SIGMOD

Editor:

Divyakant Agrawal
UC Santa Barbara, United States

Publisher:

Association for Computing Machinery
New York, NY, United States

EISSN: 2836-6573

editorial

Free

PACMMOD Volume 2 Issue 1: Editorial

Article No.: 1, Pages 1–2, https://doi.org/10.1145/3639256

Welcome to Issue 1 of Volume 2 of the Proceedings of the ACM on Management of Data, which has papers from the third round of submissions to the SIGMOD research track.

Out of 230 submissions in this round, whose submission deadline was July 15, 2023, a ...

research-article

Open Access

Optimizing Distributed Protocols with Query Rewrites

Article No.: 2, Pages 1–25, https://doi.org/10.1145/3639257

Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc and ...

research-article

Open Access

Grafite: Taming Adversarial Queries with Optimal Range Filters

Article No.: 3, Pages 1–23, https://doi.org/10.1145/3639258

Range filters allow checking whether a query range intersects a given set of keys with a chance of returning a false positive answer, thus generalising the functionality of Bloom filters from point to range queries. Existing practical range filters have ...

research-article

High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Multi-component Interpolation

Article No.: 4, Pages 1–27, https://doi.org/10.1145/3639259

Error-bounded lossy compression has been identified as a promising solution for significantly reducing scientific data volumes upon users' requirements on data distortion. For the existing scientific error-bounded lossy compressors, some of them (such as ...

research-article

MWP: Multi-Window Parallel Evaluation of Regular Path Queries on Streaming Graphs

Article No.: 5, Pages 1–26, https://doi.org/10.1145/3639260

A persistent Regular Path Query (RPQ) on a streaming graph is to continuously find every pair of vertices that are connected by a path in the graph within a sliding window, such that the edge label sequence of this path matches a given regular ...

research-article

Proximity Queries on Point Clouds using Rapid Construction Path Oracle

Article No.: 6, Pages 1–26, https://doi.org/10.1145/3639261

The prevalence of computer graphics technology boosts the developments of point clouds in recent years, which offer advantages over terrain surfaces (represented by Triangular Irregular Networks, i.e., TINs) in proximity queries, including the shortest ...

research-article

Open Access

Efficient k-Clique Listing: An Edge-Oriented Branching Strategy

Article No.: 7, Pages 1–26, https://doi.org/10.1145/3639262

k-clique listing is a vital graph mining operator with diverse applications in various networks. The state-of-the-art algorithms all adopt a branch-and-bound (BB) framework with a vertex-oriented branching strategy (called VBBkC), which forms a sub-...

research-article

Open Access

Relative Keys: Putting Feature Explanation into Context

Article No.: 8, Pages 1–28, https://doi.org/10.1145/3639263

Formal feature explanations strictly maintain perfect conformity but are intractable to compute, while heuristic methods are much faster but can lead to problematic explanations due to lack of conformity guarantees. We propose relative keys that have the ...

research-article

NOC-NOC: Towards Performance-optimal Distributed Transactions

Article No.: 9, Pages 1–25, https://doi.org/10.1145/3639264

Substantial research efforts have been devoted to studying the performance optimality problem for distributed database transactions. However, they focus just on optimizing transactional reads, and thus overlook crucial factors, such as the efficiency of ...

research-article

Robustness of Updatable Learning-based Index Advisors against Poisoning Attack

Article No.: 10, Pages 1–26, https://doi.org/10.1145/3639265

Despite the promising performance of recent learning-based Index Advisors (IAs), they exhibited the robustness issue when poisoning attacks polluted training data. This paper presents the first attempt to study the robustness of updatable learning-based ...

research-article

Open Access

FedKNN: Secure Federated k-Nearest Neighbor Search

Article No.: 11, Pages 1–26, https://doi.org/10.1145/3639266

Nearest neighbor search is a fundamental task in various domains, such as federated learning, data mining, information retrieval, and biomedicine. With the increasing need to utilize data from different organizations while respecting privacy regulations, ...

research-article

FineMon: An Innovative Adaptive Network Telemetry Scheme for Fine-Grained, Multi-Metric Data Monitoring with Dynamic Frequency Adjustment and Enhanced Data Recovery

Article No.: 12, Pages 1–26, https://doi.org/10.1145/3639267

Network telemetry, characterized by its efficient push model and high-performance communication protocol (gRPC), offers a new avenue for collecting fine-grained real-time data. Despite its advantages, existing network telemetry systems lack a theoretical ...

research-article

Open Access

PECJ: Stream Window Join on Disorder Data Streams with Proactive Error Compensation

Article No.: 13, Pages 1–24, https://doi.org/10.1145/3639268

Stream Window Join (SWJ), a vital operation in stream analytics, struggles with achieving a balance between accuracy and latency due to out-of-order data arrivals. Existing methods predominantly rely on adaptive buffering, but often fall short in ...

research-article

Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment

Article No.: 14, Pages 1–27, https://doi.org/10.1145/3639269

High-dimensional vector similarity search (HVSS) is gaining prominence as a powerful tool for various data science and AI applications. As vector data scales up, in-memory indexes pose a significant challenge due to the substantial increase in main ...

research-article

One Seed, Two Birds: A Unified Learned Structure for Exact and Approximate Counting

Article No.: 15, Pages 1–26, https://doi.org/10.1145/3639270

The modern database has many precise and approximate counting requirements. Nevertheless, a solitary multidimensional index or cardinality estimator is insufficient to cater to the escalating demands across all counting scenarios. Such approaches are ...

research-article

Open Access

Optimizing Nested Recursive Queries

Article No.: 16, Pages 1–27, https://doi.org/10.1145/3639271

Datalog is a declarative programming language that has gained popularity in various domains due to its simplicity, expressiveness, and efficiency. But "pure" Datalog is limited to monotone queries, and cannot be used in most practical applications. For ...

research-article

Open Access

Sub-optimal Join Order Identification with L1-error

Article No.: 17, Pages 1–24, https://doi.org/10.1145/3639272

Q-error -- the standard metric for quantifying the error of individual cardinality estimates -- has been widely adopted as a surrogate for query plan optimality in recent work on learning-based cardinality estimation. However, the only result connecting ...

research-article

Open Access

Efficient Algorithm for K-Multiple-Means

Article No.: 18, Pages 1–26, https://doi.org/10.1145/3639273

K-Multiple-Means is an extension of K-means for the clustering of multiple means used in many applications, such as image segmentation, load balancing, and blind-source separation. Since K-means uses only one mean to represent each cluster, it fails to ...

research-article

Predictive and Near-Optimal Sampling for View Materialization in Video Databases

Article No.: 19, Pages 1–27, https://doi.org/10.1145/3639274

Scalable video query optimization has re-emerged as an attractive research topic in recent years. The OTIF system, a video database with cutting-edge efficiency, has introduced a new paradigm of utilizing view materialization to facilitate online query ...

research-article

LIT: Lightning-fast In-memory Temporal Indexing

Article No.: 20, Pages 1–27, https://doi.org/10.1145/3639275

We study the problem of temporal database indexing, i.e., indexing versions of a database table in an evolving database. With the larger and cheaper memory chips nowadays, we can afford to keep track of all versions of an evolving table in memory. This ...

research-article

Open Access

Optimizing Dataflow Systems for Scalable Interactive Visualization

Article No.: 21, Pages 1–25, https://doi.org/10.1145/3639276

Supporting the interactive exploration of large datasets is a popular and challenging use case for data management systems. Traditionally, the interface and the back-end system are built and optimized separately, and interface design and system ...

research-article

Efficient Distributed Hop-Constrained Path Enumeration on Large-Scale Graphs

Article No.: 22, Pages 1–25, https://doi.org/10.1145/3639277

The enumeration of hop-constrained simple paths is a building block in many graph-based areas. Due to the enormous search spaces in large-scale graphs, a single machine can hardly satisfy the requirements of both efficiency and memory, which causes an ...

research-article

Efficient High-Quality Clustering for Large Bipartite Graphs

Article No.: 23, Pages 1–27, https://doi.org/10.1145/3639278

A bipartite graph contains inter-set edges between two disjoint vertex sets, and is widely used to model real-world data, such as user-item purchase records, author-article publications, and biological interactions between drugs and proteins. k-Bipartite ...

research-article

DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models

Article No.: 24, Pages 1–24, https://doi.org/10.1145/3639279

Many organizations rely on data from government and third-party sources, and those sources rarely follow the same data formatting. This introduces challenges in integrating data from multiple sources or aligning external sources with internal databases. ...

research-article

Open Access

Determining Exact Quantiles with Randomized Summaries

Article No.: 25, Pages 1–26, https://doi.org/10.1145/3639280

Quantiles are fundamental statistics in various data science tasks, but costly to compute, e.g., by loading the entire data in memory for ranking. With limited memory space, prevalent in end devices or databases with heavy loads, it needs to scan the ...

research-article

An LDP Compatible Sketch for Securely Approximating Set Intersection Cardinalities

Article No.: 26, Pages 1–27, https://doi.org/10.1145/3639281

Given two sets of elements held by two different parties separately, computing the cardinality (i.e., the number of distinct elements) of their intersection set is a fundamental task in applications such as network monitoring and database systems. To ...

research-article

Spruce: a Fast yet Space-saving Structure for Dynamic Graph Storage

Article No.: 27, Pages 1–26, https://doi.org/10.1145/3639282

Dynamic graphs have been gaining increasing popularity across various application domains. With the growing size of these graphs, the update performance as well as space occupancy is becoming a crucial aspect of dynamic graph storage. Although existing ...

research-article

Controllable Tabular Data Synthesis Using Diffusion Models

Article No.: 28, Pages 1–29, https://doi.org/10.1145/3639283

Controllable tabular data synthesis plays a crucial role in numerous applications by allowing users to generate synthetic data with specific conditions. These conditions can include synthesizing tuples with predefined attribute values or creating tuples ...

research-article

HERO: A Hierarchical Set Partitioning and Join Framework for Speeding up the Set Intersection Over Graphs

Article No.: 29, Pages 1–25, https://doi.org/10.1145/3639284

As one of the most primitive operators in graph algorithms, such as the triangle counting, maximal clique enumeration, and subgraph listing, a set intersection operator returns common vertices between any two given sets of vertices in data graphs. It is ...

research-article

Local Differentially Private Heavy Hitter Detection in Data Streams with Bounded Memory

Article No.: 30, Pages 1–27, https://doi.org/10.1145/3639285

Top-k frequent items detection is a fundamental task in data stream mining. Many promising solutions are proposed to improve memory efficiency while still maintaining high accuracy for detecting the Top-k items. Despite the memory efficiency concern, the ...
