Volume 2, Issue 1, February 2024, SIGMOD
editorial
Free
PACMMOD Volume 2 Issue 1: Editorial
Article No.: 1, Pages 1–2, https://doi.org/10.1145/3639256

Welcome to Issue 1 of Volume 2 of the Proceedings of the ACM on Management of Data, which has papers from the third round of submissions to the SIGMOD research track.

Out of 230 submissions in this round, whose submission deadline was July 15, 2023, a ...

research-article
Open Access
Optimizing Distributed Protocols with Query Rewrites
Article No.: 2, Pages 1–25, https://doi.org/10.1145/3639257

Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc and ...

research-article
Open Access
Grafite: Taming Adversarial Queries with Optimal Range Filters
Article No.: 3, Pages 1–23, https://doi.org/10.1145/3639258

Range filters allow checking whether a query range intersects a given set of keys with a chance of returning a false positive answer, thus generalising the functionality of Bloom filters from point to range queries. Existing practical range filters have ...

research-article
High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Multi-component Interpolation
Article No.: 4, Pages 1–27, https://doi.org/10.1145/3639259

Error-bounded lossy compression has been identified as a promising solution for significantly reducing scientific data volumes upon users' requirements on data distortion. For the existing scientific error-bounded lossy compressors, some of them (such as ...

research-article
MWP: Multi-Window Parallel Evaluation of Regular Path Queries on Streaming Graphs
Article No.: 5, Pages 1–26, https://doi.org/10.1145/3639260

A persistent Regular Path Query (RPQ) on a streaming graph is to continuously find every pair of vertices that are connected by a path in the graph within a sliding window, such that the edge label sequence of this path matches a given regular ...

research-article
Proximity Queries on Point Clouds using Rapid Construction Path Oracle
Article No.: 6, Pages 1–26, https://doi.org/10.1145/3639261

The prevalence of computer graphics technology boosts the developments of point clouds in recent years, which offer advantages over terrain surfaces (represented by Triangular Irregular Networks, i.e., TINs) in proximity queries, including the shortest ...

research-article
Open Access
Efficient k-Clique Listing: An Edge-Oriented Branching Strategy
Article No.: 7, Pages 1–26, https://doi.org/10.1145/3639262

k-clique listing is a vital graph mining operator with diverse applications in various networks. The state-of-the-art algorithms all adopt a branch-and-bound (BB) framework with a vertex-oriented branching strategy (called VBBkC), which forms a sub-...

research-article
Open Access
Relative Keys: Putting Feature Explanation into Context
Article No.: 8, Pages 1–28, https://doi.org/10.1145/3639263

Formal feature explanations strictly maintain perfect conformity but are intractable to compute, while heuristic methods are much faster but can lead to problematic explanations due to lack of conformity guarantees. We propose relative keys that have the ...

research-article
NOC-NOC: Towards Performance-optimal Distributed Transactions
Article No.: 9, Pages 1–25, https://doi.org/10.1145/3639264

Substantial research efforts have been devoted to studying the performance optimality problem for distributed database transactions. However, they focus just on optimizing transactional reads, and thus overlook crucial factors, such as the efficiency of ...

research-article
Robustness of Updatable Learning-based Index Advisors against Poisoning Attack
Article No.: 10, Pages 1–26, https://doi.org/10.1145/3639265

Despite the promising performance of recent learning-based Index Advisors (IAs), they exhibited the robustness issue when poisoning attacks polluted training data. This paper presents the first attempt to study the robustness of updatable learning-based ...

research-article
Open Access
FedKNN: Secure Federated k-Nearest Neighbor Search
Article No.: 11, Pages 1–26, https://doi.org/10.1145/3639266

Nearest neighbor search is a fundamental task in various domains, such as federated learning, data mining, information retrieval, and biomedicine. With the increasing need to utilize data from different organizations while respecting privacy regulations, ...

research-article
FineMon: An Innovative Adaptive Network Telemetry Scheme for Fine-Grained, Multi-Metric Data Monitoring with Dynamic Frequency Adjustment and Enhanced Data Recovery
Article No.: 12, Pages 1–26, https://doi.org/10.1145/3639267

Network telemetry, characterized by its efficient push model and high-performance communication protocol (gRPC), offers a new avenue for collecting fine-grained real-time data. Despite its advantages, existing network telemetry systems lack a theoretical ...

research-article
Open Access
PECJ: Stream Window Join on Disorder Data Streams with Proactive Error Compensation
Article No.: 13, Pages 1–24, https://doi.org/10.1145/3639268

Stream Window Join (SWJ), a vital operation in stream analytics, struggles with achieving a balance between accuracy and latency due to out-of-order data arrivals. Existing methods predominantly rely on adaptive buffering, but often fall short in ...

research-article
Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment
Article No.: 14, Pages 1–27, https://doi.org/10.1145/3639269

High-dimensional vector similarity search (HVSS) is gaining prominence as a powerful tool for various data science and AI applications. As vector data scales up, in-memory indexes pose a significant challenge due to the substantial increase in main ...

research-article
One Seed, Two Birds: A Unified Learned Structure for Exact and Approximate Counting
Article No.: 15, Pages 1–26, https://doi.org/10.1145/3639270

The modern database has many precise and approximate counting requirements. Nevertheless, a solitary multidimensional index or cardinality estimator is insufficient to cater to the escalating demands across all counting scenarios. Such approaches are ...

research-article
Open Access
Optimizing Nested Recursive Queries
Article No.: 16, Pages 1–27, https://doi.org/10.1145/3639271

Datalog is a declarative programming language that has gained popularity in various domains due to its simplicity, expressiveness, and efficiency. But "pure" Datalog is limited to monotone queries, and cannot be used in most practical applications. For ...

research-article
Open Access
Sub-optimal Join Order Identification with L1-error
Article No.: 17, Pages 1–24, https://doi.org/10.1145/3639272

Q-error -- the standard metric for quantifying the error of individual cardinality estimates -- has been widely adopted as a surrogate for query plan optimality in recent work on learning-based cardinality estimation. However, the only result connecting ...

research-article
Open Access
Efficient Algorithm for K-Multiple-Means
Article No.: 18, Pages 1–26, https://doi.org/10.1145/3639273

K-Multiple-Means is an extension of K-means for the clustering of multiple means used in many applications, such as image segmentation, load balancing, and blind-source separation. Since K-means uses only one mean to represent each cluster, it fails to ...

research-article
Predictive and Near-Optimal Sampling for View Materialization in Video Databases
Article No.: 19, Pages 1–27, https://doi.org/10.1145/3639274

Scalable video query optimization has re-emerged as an attractive research topic in recent years. The OTIF system, a video database with cutting-edge efficiency, has introduced a new paradigm of utilizing view materialization to facilitate online query ...

research-article
LIT: Lightning-fast In-memory Temporal Indexing
Article No.: 20, Pages 1–27, https://doi.org/10.1145/3639275

We study the problem of temporal database indexing, i.e., indexing versions of a database table in an evolving database. With the larger and cheaper memory chips nowadays, we can afford to keep track of all versions of an evolving table in memory. This ...

research-article
Open Access
Optimizing Dataflow Systems for Scalable Interactive Visualization
Article No.: 21, Pages 1–25, https://doi.org/10.1145/3639276

Supporting the interactive exploration of large datasets is a popular and challenging use case for data management systems. Traditionally, the interface and the back-end system are built and optimized separately, and interface design and system ...

research-article
Efficient Distributed Hop-Constrained Path Enumeration on Large-Scale Graphs
Article No.: 22, Pages 1–25, https://doi.org/10.1145/3639277

The enumeration of hop-constrained simple paths is a building block in many graph-based areas. Due to the enormous search spaces in large-scale graphs, a single machine can hardly satisfy the requirements of both efficiency and memory, which causes an ...

research-article
Efficient High-Quality Clustering for Large Bipartite Graphs
Article No.: 23, Pages 1–27, https://doi.org/10.1145/3639278

A bipartite graph contains inter-set edges between two disjoint vertex sets, and is widely used to model real-world data, such as user-item purchase records, author-article publications, and biological interactions between drugs and proteins. k-Bipartite ...

research-article
DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models
Article No.: 24, Pages 1–24, https://doi.org/10.1145/3639279

Many organizations rely on data from government and third-party sources, and those sources rarely follow the same data formatting. This introduces challenges in integrating data from multiple sources or aligning external sources with internal databases. ...

research-article
Open Access
Determining Exact Quantiles with Randomized Summaries
Article No.: 25, Pages 1–26, https://doi.org/10.1145/3639280

Quantiles are fundamental statistics in various data science tasks, but costly to compute, e.g., by loading the entire data in memory for ranking. With limited memory space, prevalent in end devices or databases with heavy loads, it needs to scan the ...

research-article
An LDP Compatible Sketch for Securely Approximating Set Intersection Cardinalities
Article No.: 26, Pages 1–27, https://doi.org/10.1145/3639281

Given two sets of elements held by two different parties separately, computing the cardinality (i.e., the number of distinct elements) of their intersection set is a fundamental task in applications such as network monitoring and database systems. To ...

research-article
Spruce: a Fast yet Space-saving Structure for Dynamic Graph Storage
Article No.: 27, Pages 1–26, https://doi.org/10.1145/3639282

Dynamic graphs have been gaining increasing popularity across various application domains. With the growing size of these graphs, the update performance as well as space occupancy is becoming a crucial aspect of dynamic graph storage. Although existing ...

research-article
Controllable Tabular Data Synthesis Using Diffusion Models
Article No.: 28, Pages 1–29, https://doi.org/10.1145/3639283

Controllable tabular data synthesis plays a crucial role in numerous applications by allowing users to generate synthetic data with specific conditions. These conditions can include synthesizing tuples with predefined attribute values or creating tuples ...

research-article
HERO: A Hierarchical Set Partitioning and Join Framework for Speeding up the Set Intersection Over Graphs
Article No.: 29, Pages 1–25, https://doi.org/10.1145/3639284

As one of the most primitive operators in graph algorithms, such as the triangle counting, maximal clique enumeration, and subgraph listing, a set intersection operator returns common vertices between any two given sets of vertices in data graphs. It is ...

research-article
Local Differentially Private Heavy Hitter Detection in Data Streams with Bounded Memory
Article No.: 30, Pages 1–27, https://doi.org/10.1145/3639285

Top-k frequent items detection is a fundamental task in data stream mining. Many promising solutions are proposed to improve memory efficiency while still maintaining high accuracy for detecting the Top-k items. Despite the memory efficiency concern, the ...
