PVLDB: Vol 16, No 9

Volume 16, Issue 9, May 2023

Editors:

Georgia Koutrika, Athena Research Center
Jun Yang, Duke University

Publisher:

VLDB Endowment

ISSN: 2150-8097

Front matter (Cover, Contents, Organization, Letter from the editors in chief)

research-article

Neighborhood-Based Hypergraph Core Decomposition

Pages 2061–2074 https://doi.org/10.14778/3598581.3598582

We propose neighborhood-based core decomposition: a novel way of decomposing hypergraphs into hierarchical neighborhood-cohesive subhypergraphs. Alternative approaches to decomposing hypergraphs, e.g., reduction to clique or bipartite graphs, are not ...

research-article

Temporal SIR-GN: Efficient and Effective Structural Representation Learning for Temporal Graphs

Pages 2075–2089 https://doi.org/10.14778/3598581.3598583

Node representation learning (NRL) generates numerical vectors (embeddings) for the nodes of a graph. Structural NRL specifically assigns similar node embeddings for those nodes that exhibit similar structural roles. This is in contrast with its ...

research-article

What Modern NVMe Storage Can Do, and How to Exploit it: High-Performance I/O for High-Performance Storage Engines

Pages 2090–2102 https://doi.org/10.14778/3598581.3598584

NVMe SSDs based on flash are cheap and offer high throughput. Combining several of these devices into a single server enables 10 million I/O operations per second or more. Our experiments show that existing out-of-memory database systems and storage ...

research-article

WiscSort: External Sorting for Byte-Addressable Storage

Pages 2103–2116 https://doi.org/10.14778/3598581.3598585

We present WiscSort, a new approach to high-performance concurrent sorting for existing and future byte-addressable storage (BAS) devices. WiscSort carefully reduces writes, exploits random reads by splitting keys and values during sorting, and performs ...

research-article

Text Indexing for Long Patterns: Anchors are All you Need

Pages 2117–2131 https://doi.org/10.14778/3598581.3598586

In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such ...

research-article

The FastLanes Compression Layout: Decoding > 100 Billion Integers per Second with Scalar Code

Pages 2132–2144 https://doi.org/10.14778/3598581.3598587

The open-source FastLanes project aims to improve big data formats, such as Parquet, ORC and columnar database formats, in multiple ways. In this paper, we significantly accelerate decoding of all common Light-Weight Compression (LWC) schemes: DICT, FOR,...

research-article

VeriBench: Analyzing the Performance of Database Systems with Verifiability

Pages 2145–2157 https://doi.org/10.14778/3598581.3598588

Database systems are paying more attention to data security in recent years. Immutable systems such as blockchains, verifiable databases, and ledger databases are equipped with various verifiability mechanisms to protect data. Such systems often adopt ...

research-article

Towards Designing and Learning Piecewise Space-Filling Curves

Pages 2158–2171 https://doi.org/10.14778/3598581.3598589

To index multi-dimensional data, space-filling curves (SFCs) have been used to map the data to one dimension, and then a one-dimensional indexing method such as the B-tree is used to index the mapped data. The existing SFCs all adopt a single mapping ...

research-article

MiniGraph: Querying Big Graphs with a Single Machine

Pages 2172–2185 https://doi.org/10.14778/3598581.3598590

This paper presents MiniGraph, an out-of-core system for querying big graphs with a single machine. As opposed to previous single-machine graph systems, MiniGraph proposes a pipelined architecture to overlap I/O and CPU operations, and improves multi-...

research-article

BICE: Exploring Compact Search Space by Using Bipartite Matching and Cell-Wide Verification

Pages 2186–2198 https://doi.org/10.14778/3598581.3598591

Subgraph matching is the problem of searching for all embeddings of a query graph in a data graph, and subgraph query processing (also known as subgraph search) is to find all the data graphs that contain a query graph as subgraphs. Extensive research ...

research-article

Maximal D-Truss Search in Dynamic Directed Graphs

Pages 2199–2211 https://doi.org/10.14778/3598581.3598592

Community search (CS) aims at personalized subgraph discovery which is the key to understanding the organisation of many real-world networks. CS in undirected networks has attracted significant attention from researchers, including many solutions for ...

research-article

DILI: A Distribution-Driven Learned Index

Pages 2212–2224 https://doi.org/10.14778/3598581.3598593

Targeting in-memory one-dimensional search keys, we propose a novel DIstribution-driven Learned Index tree (DILI), where a concise and computation-efficient linear regression model is used for each node. An internal node's key range is equally divided ...

research-article

Pre-Trained Embeddings for Entity Resolution: An Experimental Analysis

Pages 2225–2238 https://doi.org/10.14778/3598581.3598594

Many recent works on Entity Resolution (ER) leverage Deep Learning techniques involving language models to improve effectiveness. This is applied to both main steps of ER, i.e., blocking and matching. Several pre-trained embeddings have been tested, ...

research-article

Decoupled Graph Neural Networks for Large Dynamic Graphs

Pages 2239–2247 https://doi.org/10.14778/3598581.3598595

Real-world graphs, such as social networks, financial transactions, and recommendation systems, often demonstrate dynamic behavior. This phenomenon, known as graph stream, involves the dynamic changes of nodes and the emergence and disappearance of ...

research-article

Adaptive Indexing of Objects with Spatial Extent

Pages 2248–2260 https://doi.org/10.14778/3598581.3598596

Can we quickly explore large multidimensional data in main memory? Adaptive indexing responds to this need by building an index incrementally, in response to queries; in its default form, it indexes a single attribute or, in the presence of several ...

research-article

LEON: A New Framework for ML-Aided Query Optimization

Pages 2261–2273 https://doi.org/10.14778/3598581.3598597

Query optimization has long been a fundamental yet challenging topic in the database field. With the prosperity of machine learning (ML), some recent works have shown the advantages of reinforcement learning (RL) based learned query optimizer. However, ...

research-article

TiQuE: Improving the Transactional Performance of Analytical Systems for True Hybrid Workloads

Pages 2274–2288 https://doi.org/10.14778/3598581.3598598

Transactions have been a key issue in database management for a long time and there are a plethora of architectures and algorithms to support and implement them. The current state-of-the-art is focused on storage management and is tightly coupled with ...

research-article

Seiden: Revisiting Query Processing in Video Database Systems

Pages 2289–2301 https://doi.org/10.14778/3598581.3598599

State-of-the-art video database management systems (VDBMSs) often use lightweight proxy models to accelerate object retrieval and aggregate queries. The key assumption underlying these systems is that the proxy model is an order of magnitude faster than ...

research-article

Extract-Transform-Load for Video Streams

Pages 2302–2315 https://doi.org/10.14778/3598581.3598600

Social media, self-driving cars, and traffic cameras produce video streams at large scales and cheap cost. However, storing and querying video at such scales is prohibitively expensive. We propose to treat large-scale video analytics as a data ...

research-article

Pando: Enhanced Data Skipping with Logical Data Partitioning

Pages 2316–2329 https://doi.org/10.14778/3598581.3598601

With enormous volumes of data, quickly retrieving data that is relevant to a query is essential for achieving high performance. Modern cloud-based database systems often partition the data into blocks and employ various techniques to skip irrelevant ...

research-article

Cracking-Like Join for Trusted Execution Environments

Pages 2330–2343 https://doi.org/10.14778/3598581.3598602

Data processing on non-trusted infrastructures, such as the public cloud, has become increasingly popular, despite posing risks to data privacy. However, the existing cloud DBMSs either lack sufficient privacy guarantees or underperform. In this paper, ...

research-article

Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules

Pages 2344–2353 https://doi.org/10.14778/3598581.3598603

The capabilities of quantum computers, such as the number of supported qubits and maximum circuit depth, have grown exponentially in recent years. Commercially relevant applications that take advantage of quantum computing are expected to be available ...

research-article

SDPipe: A Semi-Decentralized Framework for Heterogeneity-Aware Pipeline-parallel Training

Pages 2354–2363 https://doi.org/10.14778/3598581.3598604

The increasing size of both deep learning models and training data necessitates the ability to scale out model training through pipeline-parallel training, which combines pipelined model parallelism and data parallelism. However, most of them assume an ...

research-article

LRU-C: Parallelizing Database I/Os for Flash SSDs

Pages 2364–2376 https://doi.org/10.14778/3598581.3598605

The conventional database buffer managers have two inherent sources of I/O serialization: read stall and mutex conflict. The serialized I/O makes storage and CPU under-utilized, limiting transaction throughput and latency. Such harm stands out on flash ...

research-article

Why Not Yet: Fixing a Top-k Ranking that is Not Fair to Individuals

Pages 2377–2390 https://doi.org/10.14778/3598581.3598606

This work considers why-not questions in the context of top-k queries and score-based ranking functions. Following the popular linear scalarization approach for multi-objective optimization, we study rankings based on the weighted sum of multiple ...

Proceedings of the VLDB Endowment
