Volume 2, Issue 6, December 2024, SIGMOD
Publisher: Association for Computing Machinery, New York, NY, United States
EISSN: 2836-6573
editorial
Free
PACMMOD V2, N6 (SIGMOD), December 2024: Editorial
Article No.: 223, Pages 1–2, https://doi.org/10.1145/3698798

The Proceedings of the ACM on Management of Data (PACMMOD) is concerned with the principles, algorithms, techniques, systems, and applications of database management systems, data management technology, and science and engineering of data. It includes ...

research-article
A Universal Sketch for Estimating Heavy Hitters and Per-Element Frequency Moments in Data Streams with Bounded Deletions
Article No.: 224, Pages 1–28, https://doi.org/10.1145/3698799

In the field of data stream processing, there are two prevalent models, i.e., insertion-only, and turnstile models. Most previous works were proposed for the insertion-only model, which assumes new elements arrive continuously as a stream, and neglects ...

research-article
An Efficient and Exact Algorithm for Locally h-Clique Densest Subgraph Discovery
Article No.: 225, Pages 1–26, https://doi.org/10.1145/3698800

Detecting locally, non-overlapping, near-clique densest subgraphs is a crucial problem for community search in social networks. As a vertex may be involved in multiple overlapped local cliques, detecting locally densest sub-structures considering h-...

research-article
Open Access
Buffered Persistence in B+ Trees
Article No.: 226, Pages 1–24, https://doi.org/10.1145/3698801

Non-volatile Memory (NVM) offers the opportunity to build large, durable B+ trees with markedly higher performance and faster post-crash recovery than is possible with traditional disk- or flash-based persistence. Unfortunately, cache flush and fence ...

research-article
Camel: Efficient Compression of Floating-Point Time Series
Article No.: 227, Pages 1–26, https://doi.org/10.1145/3698802

Time series compression encodes the information in a time-ordered sequence of data points into fewer bits, thereby reducing storage costs and possibly other costs. Compression methods are either general or XOR-based. General compression methods are time-...

research-article
Common Neighborhood Estimation over Bipartite Graphs under Local Differential Privacy
Article No.: 228, Pages 1–26, https://doi.org/10.1145/3698803

Bipartite graphs, formed by two vertex layers, arise as a natural fit for modeling the relationships between two groups of entities. In bipartite graphs, common neighborhood computation between two vertices on the same vertex layer is a basic operator, ...

research-article
Connectivity-Oriented Property Graph Partitioning for Distributed Graph Pattern Query Processing
Article No.: 229, Pages 1–26, https://doi.org/10.1145/3698804

Graph pattern query is a powerful tool for extracting crucial information from property graphs. With the exponential growth of sizes, property graphs are typically divided into multiple subgraphs (referred to as partitions) and stored across various ...

research-article
Constant-time Connectivity Querying in Dynamic Graphs
Article No.: 230, Pages 1–23, https://doi.org/10.1145/3698805

Connectivity query processing is a fundamental problem in graph processing. Given an undirected graph and two query vertices, the problem aims to identify whether they are connected via a path. Given frequent edge updates in real graph applications, in ...

research-article
CtxPipe: Context-aware Data Preparation Pipeline Construction for Machine Learning
Article No.: 231, Pages 1–27, https://doi.org/10.1145/3698831

Machine learning models are only as good as their training data. Simple models trained on well-chosen features extracted from the raw data often outperform complex models trained directly on the raw data. Data preparation pipelines, which clean and ...

research-article
Open Access
Directional Queries: Making Top-k Queries More Effective in Discovering Relevant Results
Article No.: 232, Pages 1–26, https://doi.org/10.1145/3698807

Top-k queries, in particular those based on a linear scoring function, are a common way to extract relevant results from large datasets. Their major advantage over alternative approaches, such as skyline queries (which return all the undominated objects ...

research-article
Open Access
Disclosure-Compliant Query Answering
Article No.: 233, Pages 1–28, https://doi.org/10.1145/3698808

In today's data-driven world, organizations face increasing pressure to comply with data disclosure policies, which require data masking measures and robust access control mechanisms. This paper presents Mascara, a middleware for specifying and enforcing ...

research-article
Open Access
DPconv: Super-Polynomially Faster Join Ordering
Article No.: 234, Pages 1–26, https://doi.org/10.1145/3698809

We revisit the join ordering problem in query optimization. The standard exact algorithm, DPccp, has a worst-case running time of O(3^n). This is prohibitively expensive for large queries, which are not that uncommon anymore. We develop a new algorithmic ...

research-article
Open Access
Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs
Article No.: 235, Pages 1–26, https://doi.org/10.1145/3698810

Spatial Database Management Systems (SDBMSs) aim to store, manipulate, and retrieve spatial data. SDBMSs are employed in various modern applications, such as geographic information systems, computer-aided design tools, and location-based services. ...

research-article
GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models
Article No.: 236, Pages 1–29, https://doi.org/10.1145/3698811

Data quality is critical across many applications. The utility of data is undermined by various errors, making rigorous data cleaning a necessity. Traditional data cleaning systems depend heavily on predefined rules and constraints, which necessitate ...

research-article
GOLAP: A GPU-in-Data-Path Architecture for High-Speed OLAP
Article No.: 237, Pages 1–26, https://doi.org/10.1145/3698812

In this paper, we suggest a novel GPU-in-data-path architecture that leverages a GPU to accelerate the I/O path and thus can achieve almost in-memory bandwidth using SSDs. In this architecture, the main idea is to stream data in heavy-weight compressed ...

research-article
Open Access
High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance
Article No.: 238, Pages 1–27, https://doi.org/10.1145/3698813

This paper aims to bridge the gap between fast in-memory query engines and slow but robust engines that can utilize external storage. We find that current systems have to choose between fast in-memory operators and slower out-of-memory operators. We ...

research-article
Open Access
iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search
Article No.: 239, Pages 1–26, https://doi.org/10.1145/3698814

Range-filtering approximate nearest neighbor (RFANN) search is attracting increasing attention in academia and industry. Given a set of data objects, each being a pair of a high-dimensional vector and a numeric value, an RFANN query with a vector and a ...

research-article
Live Patching for Distributed In-Memory Key-Value Stores
Article No.: 241, Pages 1–26, https://doi.org/10.1145/3698816

Providers of high-availability data stores need to roll out software updates without causing noticeable downtimes. For distributed data stores like Redis Cluster, the state-of-the-art is a rolling update, where the nodes are restarted in sequence. This ...

research-article
Open Access
Transforming RDF Graphs to Property Graphs using Standardized Schemas
Article No.: 242, Pages 1–25, https://doi.org/10.1145/3698817

Knowledge Graphs can be encoded using different data models. They are especially abundant using RDF and recently also as property graphs. While knowledge graphs in RDF adhere to the subject-predicate-object structure, property graphs utilize multi-...

research-article
LSMGraph: A High-Performance Dynamic Graph Storage System with Multi-Level CSR
Article No.: 243, Pages 1–28, https://doi.org/10.1145/3698818

The growing volume of graph data may exhaust the main memory. It is crucial to design a disk-based graph storage system to ingest updates and analyze graphs efficiently. However, existing dynamic graph storage systems suffer from read or write ...

research-article
Memento Filter: A Fast, Dynamic, and Robust Range Filter
Article No.: 244, Pages 1–27, https://doi.org/10.1145/3698820

Range filters are probabilistic data structures that answer approximate range emptiness queries. They aid in avoiding processing empty range queries and have use cases in many application domains such as key-value stores and social web analytics. However,...

research-article
Multivariate Time Series Cleaning under Speed Constraints
Article No.: 245, Pages 1–26, https://doi.org/10.1145/3698821

Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors ...

research-article
Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search
Article No.: 246, Pages 1–27, https://doi.org/10.1145/3698822

Given a query vector, approximate nearest neighbor search (ANNS) aims to retrieve similar vectors from a set of high-dimensional base vectors. However, many real-world applications jointly query both vector data and structured data, imposing label ...

research-article
Online Detection of Anomalies in Temporal Knowledge Graphs with Interpretability
Article No.: 247, Pages 1–26, https://doi.org/10.1145/3698823

Temporal knowledge graphs (TKGs) are valuable resources for capturing evolving relationships among entities, yet they are often plagued by noise, necessitating robust anomaly detection mechanisms. Existing dynamic graph anomaly detection approaches ...

research-article
Open Access
Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
Article No.: 248, Pages 1–26, https://doi.org/10.1145/3698832

Data analytics tasks are often formulated as data workflows represented as directed acyclic graphs (DAGs) of operators. The recent trend of adopting machine learning (ML) techniques in workflows results in increasingly complicated DAGs with many ...

research-article
Open Access
Personalized Truncation for Personalized Privacy
Article No.: 249, Pages 1–25, https://doi.org/10.1145/3698825

In the standard model of differential privacy (DP), every user's privacy is treated equally, which is captured by a single privacy parameter ε. However, in many real-world situations, users may have diverse privacy concerns and requirements, ...

research-article
Provenance-Enabled Explainable AI
Article No.: 250, Pages 1–27, https://doi.org/10.1145/3698826

Machine learning (ML) algorithms have advanced significantly in recent years, progressively evolving into artificial intelligence (AI) agents capable of solving complex, human-like intellectual challenges. Despite the advancements, the interpretability ...

research-article
Open Access
SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs
Article No.: 251, Pages 1–27, https://doi.org/10.1145/3698827

Recent advances in Dual In-line Memory Modules (DIMMs) allow DIMMs to support Processing-In-DIMM (PID) by placing In-DIMM Processors (IDPs) near their memory banks. Prior studies have shown that in-memory joins can benefit from PID by offloading their ...

research-article
Towards a Converged Relational-Graph Optimization Framework
Article No.: 252, Pages 1–27, https://doi.org/10.1145/3698828

The recent ISO SQL:2023 standard adopts SQL/PGQ (Property Graph Queries), facilitating graph-like querying within relational databases. This advancement, however, underscores a significant gap in how to effectively optimize SQL/PGQ queries within ...

research-article
Open Access
Understanding and Reusing Test Suites Across Database Systems
Article No.: 253, Pages 1–26, https://doi.org/10.1145/3698829

Database Management System (DBMS) developers have implemented extensive test suites to test their DBMSs. For example, the SQLite test suites contain over 92 million lines of code. Despite these extensive efforts, test suites are not systematically reused ...
