PACMMOD, Volume 1, Issue 4, December 2023
editorial
Free
PACMMOD Volume 1 Issue 4: Editorial
Article No.: 222, Pages 1–2, https://doi.org/10.1145/3626709

Welcome to this issue of the Proceedings of the ACM on Management of Data (Volume 1, Issue 4 (SIGMOD)). While this issue has papers from the SIGMOD track, PACMMOD will soon also have issues with papers from the newly created PODS track. Out of 189 ...

research-article
GEqO: ML-Accelerated Semantic Equivalence Detection
Article No.: 223, Pages 1–25, https://doi.org/10.1145/3626710

Large scale analytics engines have become a core dependency for modern data-driven enterprises to derive business insights and drive actions. These engines support a large number of analytic jobs processing huge volumes of data on a daily basis, and ...

research-article
The Battleship Approach to the Low Resource Entity Matching Problem
Article No.: 224, Pages 1–25, https://doi.org/10.1145/3626711

Entity matching, a core data integration problem, is the task of deciding whether two data tuples refer to the same real-world entity. Recent advances in deep learning methods, using pre-trained language models, were proposed for resolving entity ...

research-article
Open Access
Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control
Article No.: 225, Pages 1–26, https://doi.org/10.1145/3626712

Many big data systems are written in languages such as C, C++, Java, and Scala to process large amounts of data efficiently, while data analysts often use Python to conduct data wrangling, statistical analysis, and machine learning. User-defined ...

research-article
ChainKV: A Semantics-Aware Key-Value Store for Ethereum System
Article No.: 226, Pages 1–23, https://doi.org/10.1145/3626713

The Log-Structured Merge-tree (LSM-tree) based key-value (KV) store has been widely adopted as the storage engine for blockchain systems, such as Ethereum, in which blockchain data are uniformly transformed into randomly distributed KV items for ...

research-article
Proving Query Equivalence Using Linear Integer Arithmetic
Article No.: 227, Pages 1–26, https://doi.org/10.1145/3626768

Proving the equivalence between SQL queries is a fundamental problem in database research. Existing solvers model queries using algebraic representations and convert such representations into first-order logic formulas so that query equivalence can be ...

research-article
Open Access
A Unified Approach for Resilience and Causal Responsibility with Integer Linear Programming (ILP) and LP Relaxations
Article No.: 228, Pages 1–27, https://doi.org/10.1145/3626715

What is a minimal set of tuples to delete from a database in order to eliminate all query answers? This problem is called "the resilience of a query" and is one of the key algorithmic problems underlying various forms of reverse data management, such as ...

research-article
ADGNN: Towards Scalable GNN Training with Aggregation-Difference Aware Sampling
Article No.: 229, Pages 1–26, https://doi.org/10.1145/3626716

Distributed computing is promising to enable large-scale graph neural network (GNN) model training. However, care is needed to avoid excessive computational and communication overheads. Sampling is promising in terms of enabling scalability, and sampling ...

research-article
Open Access
ALP: Adaptive Lossless floating-Point Compression
Article No.: 230, Pages 1–26, https://doi.org/10.1145/3626717

IEEE 754 doubles do not exactly represent most real values, introducing rounding errors in computations and [de]serialization to text. These rounding errors inhibit the use of existing lightweight compression schemes such as Delta and Frame Of Reference (...

research-article
Anchor: A Library for Building Secure Persistent Memory Systems
Article No.: 231, Pages 1–31, https://doi.org/10.1145/3626718

Cloud infrastructure is experiencing a shift towards disaggregated setups, especially with the introduction of the Compute Express Link (CXL) technology, where byte-addressable persistent memory (PM) is becoming prominent. To fully utilize the potential ...

research-article
AS-Parser: Log Parsing Based on Adaptive Segmentation
Article No.: 232, Pages 1–26, https://doi.org/10.1145/3626719

System logs have long been recognized as valuable data for analyzing and diagnosing system failures. One fundamental task of log processing is to convert unstructured logs into structured logs through log parsing. All previous log parsing approaches ...

research-article
Open Access
Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools
Article No.: 233, Pages 1–25, https://doi.org/10.1145/3626720

Analytical query workloads are prone to rapid fluctuations in resource demands. These rapid, hard to predict resource demand changes make provisioning a challenge. Users must either over provision at excessive cost or suffer poor query latency when ...

research-article
ChainedFilter: Combining Membership Filters by Chain Rule
Article No.: 234, Pages 1–27, https://doi.org/10.1145/3626721

Membership (membership query/membership testing) is a fundamental problem across databases, networks and security. However, previous research has primarily focused on either approximate solutions, such as Bloom Filters, or exact methods, like perfect ...

research-article
Open Access
Correlation Joins over Time Series Data Streams Utilizing Complementary Dimension Reduction and Transformation
Article No.: 235, Pages 1–26, https://doi.org/10.1145/3626722

A common analysis task over a stream of time series is to find all pairs of windows whose correlation is above a given threshold. For a large number of streams, doing so naively, i.e., checking the Cartesian product, is too expensive. In essence, finding ...

research-article
Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet
Article No.: 236, Pages 1–29, https://doi.org/10.1145/3626723

Video streaming applications (VSAs) are increasingly being deployed on large-scale edge platforms, which have the potential to significantly improve the quality of service (QoS) and end-user experience (QoE), ultimately maximizing business outcomes. ...

research-article
DGC: Training Dynamic Graphs with Spatio-Temporal Non-Uniformity using Graph Partitioning by Chunks
Article No.: 237, Pages 1–25, https://doi.org/10.1145/3626724

Dynamic Graph Neural Network (DGNN) has shown a strong capability of learning dynamic graphs by exploiting both spatial and temporal features. Although DGNN has recently received considerable attention from the AI community and various DGNN models have been ...

research-article
DP-starJ: A Differential Private Scheme towards Analytical Star-Join Queries
Article No.: 238, Pages 1–24, https://doi.org/10.1145/3626725

Star-join query is the fundamental task in data warehouse and has wide applications in On-line Analytical Processing (OLAP) scenarios. Due to the large number of foreign key constraints and the asymmetric effect in the neighboring instance between the ...

research-article
Open Access
Efficient Approximation Framework for Attribute Recommendation
Article No.: 239, Pages 1–26, https://doi.org/10.1145/3626726

Trend analysis is a fundamental type of analytical query in online analytical processing (OLAP) systems. In trend analysis, a key step is to identify k valuable attributes whose distributions in two subsets under different predicates significantly differ ...

research-article
Open Access
Equitable Top-k Results for Long Tail Data
Article No.: 240, Pages 1–24, https://doi.org/10.1145/3626727

For datasets exhibiting the long-tail phenomenon, we identify a fairness concern in existing top-k algorithms that return a "fixed" set of k results for a given query. This causes a handful of popular records (products, items, etc.) getting overexposed and ...

research-article
F3KM: Federated, Fair, and Fast k-means
Article No.: 241, Pages 1–25, https://doi.org/10.1145/3626728

This paper proposes a federated, fair, and fast k-means algorithm (F3KM) to solve the fair clustering problem efficiently in scenarios where data cannot be shared among different parties. The proposed algorithm decomposes the fair k-means problem into ...

research-article
FACET: Robust Counterfactual Explanation Analytics
Article No.: 242, Pages 1–27, https://doi.org/10.1145/3626729

Machine learning systems are deployed in domains such as hiring and healthcare, where undesired classifications can have serious ramifications for the user. Thus, there is a rising demand for explainable AI systems which provide actionable steps for lay ...

research-article
Generation of Training Examples for Tabular Natural Language Inference
Article No.: 243, Pages 1–27, https://doi.org/10.1145/3626730

Tabular data is becoming increasingly important in Natural Language Processing (NLP) tasks, such as Tabular Natural Language Inference (TNLI). Given a table and a hypothesis expressed in NL text, the goal is to assess if the former structured data ...

research-article
Open Access
Hierarchical Cut Labelling - Scaling Up Distance Queries on Road Networks
Article No.: 244, Pages 1–25, https://doi.org/10.1145/3626731

Answering the shortest-path distance between two arbitrary locations is a fundamental problem in road networks. Labelling-based solutions are the current state of the art for fast response times, and can generally be categorised into hub-based ...

research-article
High-Ratio Compression for Machine-Generated Data
Article No.: 245, Pages 1–27, https://doi.org/10.1145/3626732

Machine-generated data is rapidly growing and poses challenges for data-intensive systems, especially as the growth of data outpaces the growth of storage space. To cope with the storage issue, compression plays a critical role in storage engines, ...

research-article
Open Access
HongTu: Scalable Full-Graph GNN Training on Multiple GPUs
Article No.: 246, Pages 1–27, https://doi.org/10.1145/3626733

Full-graph training on graph neural networks (GNN) has emerged as a promising training method for its effectiveness. Full-graph training requires extensive memory and computation resources. To accelerate this training process, researchers have proposed ...

research-article
Open Access
Lemo: A Cache-Enhanced Learned Optimizer for Concurrent Queries
Article No.: 247, Pages 1–26, https://doi.org/10.1145/3626734

With the expansion of modern database services, multi-user access has become a crucial feature in various practical application scenarios, including enterprise applications and e-commerce platforms. However, if multiple users submit queries within a ...

research-article
Lightweight Materialization for Fast Dashboards Over Joins
Article No.: 248, Pages 1–27, https://doi.org/10.1145/3626735

Dashboards are vital in modern business intelligence tools, providing non-technical users with an interface to access comprehensive business data. With the rise of cloud technology, there is an increased number of data sources to provide enriched ...

research-article
Open Access
MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying
Article No.: 249, Pages 1–27, https://doi.org/10.1145/3626736

LSM-based key-value stores have been leveraged in many state-of-the-art data-intensive applications as storage engines. As data volume scales up, a cost-efficient approach is to deploy these applications on hybrid cloud storage with hot/cold separation, ...

research-article
Open Access
MOST: Model-Based Compression with Outlier Storage for Time Series Data
Article No.: 250, Pages 1–29, https://doi.org/10.1145/3626737

Time series data are used in a wide variety of applications. The explosive growth of the amount of time series data poses a significant challenge in efficient data storage and query processing. Unfortunately, existing compression techniques either show ...

research-article
Neural Attributed Community Search at Billion Scale
Article No.: 251, Pages 1–25, https://doi.org/10.1145/3626738

Community search has been extensively studied in the past decades. In recent years, there is a growing interest in attributed community search that aims to identify a community based on both the query nodes and query attributes. A set of techniques have ...
