- research-article, August 2024
LeanStore: A High-Performance Storage Engine for NVMe SSDs
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4536–4545, https://doi.org/10.14778/3685800.3685915
Neither traditional disk-based database systems nor modern in-memory database systems are capable of fully exploiting modern servers with multiple NVMe SSDs. LeanStore is a high-performance OLTP storage engine specifically optimized for NVMe SSDs and ...
- research-article, August 2024
Vector Databases: What's Really New and What's Next? (VLDB 2024 Panel)
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4505–4506, https://doi.org/10.14778/3685800.3685911
Vector databases have recently emerged as a hot topic in the field of databases, especially in industry. This is due to the widespread interest in Large Language Models (LLMs), where vector databases provide the relevant context for LLMs to produce more ...
- research-article, August 2024
X-Stor: A Cloud-Native NoSQL Database Service with Multi-Model Support
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4025–4037, https://doi.org/10.14778/3685800.3685824
In recent years at Tencent, we have observed that the use of multiple NoSQL databases for storing business data with diverse models has led to increased programming and deployment costs, as well as inefficient maintenance and underutilized resources. In ...
- research-article, August 2024
TDSQL: Tencent Distributed Database System
- Yuxing Chen,
- Anqun Pan,
- Hailin Lei,
- Anda Ye,
- Shuo Han,
- Yan Tang,
- Wei Lu,
- Yunpeng Chai,
- Feng Zhang,
- Xiaoyong Du
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3869–3882, https://doi.org/10.14778/3685800.3685812
Distributed databases have become indispensable in contemporary computing and data processing, owing to their pivotal role in ensuring high availability and scalability. They effectively cater to the requirements of data management and high-concurrency ...
- research-article, August 2024
Db2une: Tuning Under Pressure via Deep Learning
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3855–3868, https://doi.org/10.14778/3685800.3685811
Modern database systems including IBM Db2 have numerous parameters, "knobs," that require precise configuration to achieve optimal workload performance. Even for experts, manually "tuning" these knobs is a challenging process. We present Db2une, an ...
- research-article, August 2024
An Examination of CXL Memory Use Cases for In-Memory Database Management Systems Using SAP HANA
- Minseon Ahn,
- Thomas Willhalm,
- Norman May,
- Donghun Lee,
- Suprasad Mutalik Desai,
- Daniel Booss,
- Jungmin Kim,
- Navneet Singh,
- Daniel Ritter,
- Oliver Rebholz
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3827–3840, https://doi.org/10.14778/3685800.3685809
CXL-based disaggregated memory systems offer options to expand the memory beyond the limits of a single server via cache-coherent memory expansion cards or memory pools. Especially, In-Memory Database Management Systems (IMDBMSs) can benefit from ...
Towards Resource Efficiency: Practical Insights into Large-Scale Spark Workloads at ByteDance
- Yixin Wu,
- Xiuqi Huang,
- Zhongjia Wei,
- Hang Cheng,
- Chaohui Xin,
- Zuzhi Chen,
- Binbin Chen,
- Yufei Wu,
- Hao Wang,
- Tieying Zhang,
- Rui Shi,
- Xiaofeng Gao,
- Yuming Liang,
- Pengwei Zhao,
- Guihai Chen
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3759–3771, https://doi.org/10.14778/3685800.3685804
At ByteDance, where we execute over a million Spark jobs and handle 500PB of shuffled data daily, ensuring resource efficiency is paramount for cost savings. However, achieving optimization of resource efficiency in large-scale production environments ...
OLAP on Modern Chiplet-Based Processors
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11, Pages 3428–3441, https://doi.org/10.14778/3681954.3682011
Chiplet-based CPUs, which combine multiple independent dies on a single package, allow hardware to scale to higher CPU core counts at the cost of more memory heterogeneity and performance variability. This introduces challenges when existing query ...
The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11, Pages 3373–3387, https://doi.org/10.14778/3681954.3682007
Existing machine learning (ML) approaches to automatically optimize database management systems (DBMSs) only target a single configuration space at a time (e.g., knobs, query hints, indexes). Simultaneously tuning multiple configuration spaces is ...
- research-article, July 2024
nsDB: Architecting the Next Generation Database by Integrating Neural and Symbolic Systems
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11, Pages 3283–3289, https://doi.org/10.14778/3681954.3682000
In this paper, we propose nsDB, a novel neuro-symbolic database system that integrates neural and symbolic system architectures natively to address the weaknesses of each, providing a strong database capable of data managing, model learning, and complex ...
Agile-Ant: Self-Managing Distributed Cache Management for Cost Optimization of Big Data Applications
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11, Pages 3151–3164, https://doi.org/10.14778/3681954.3681990
Distributed in-memory processing frameworks accelerate application runs by caching important datasets in memory. Allocating a suitable cluster configuration for caching these datasets plays a crucial role in achieving minimal cost. We present Agile-ant, ...
BonsaiKV: Towards Fast, Scalable, and Persistent Key-Value Stores with Tiered, Heterogeneous Memory System
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 4, Pages 726–739, https://doi.org/10.14778/3636218.3636228
Emerging NUMA/CXL-based tiered memory systems with heterogeneous memory devices such as DRAM and NVMM deliver ultrafast speed, large capacity, and data persistence all at once, offering great promise to high-performance in-memory key-value stores. To ...
GPU Database Systems Characterization and Optimization
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 3, Pages 441–454, https://doi.org/10.14778/3632093.3632107
GPUs offer massive parallelism and high-bandwidth memory access, making them an attractive option for accelerating data analytics in database systems. However, while modern GPUs possess more resources than ever before (e.g., higher DRAM bandwidth), ...
SmartLite: A DBMS-Based Serving System for DNN Inference in Resource-Constrained Environments
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 3, Pages 278–291, https://doi.org/10.14778/3632093.3632095
Many IoT applications require the use of multiple deep neural networks (DNNs) to perform various tasks on low-cost edge devices with limited computation resources. However, existing DNN model serving platforms, such as TensorFlow Serving and TorchServe, ...
Catalyst: Optimizing Cache Management for Large In-memory Key-value Systems
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4339–4352, https://doi.org/10.14778/3625054.3625068
In-memory key-value cache systems, such as Memcached and Redis, are essential in today's data centers. A key mission of such cache systems is to identify the most valuable data for caching. To achieve this, the current system design keeps track of each ...
AMNES: Accelerating the Computation of Data Correlation Using FPGAs
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4174–4187, https://doi.org/10.14778/3625054.3625056
A widely used approach to characterize input data in both databases and ML is computing the correlation between attributes. The operation is supported by all major database engines and ML platforms. However, it is an expensive operation as the number of ...
- research-article, July 2021
The art of balance: a RateupDB™ experience of building a CPU/GPU hybrid database product
Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 12, Pages 2999–3013, https://doi.org/10.14778/3476311.3476378
GPU-accelerated database systems have been studied for more than 10 years, ranging from prototyping development to industry products serving in multiple domains of data applications. Existing GPU database research solutions are often focused on specific ...
- research-article, July 2021
The end of Moore's law and the rise of the data processor
- Niv Dayan,
- Moshe Twitto,
- Yuval Rochman,
- Uri Beitler,
- Itai Ben Zion,
- Edward Bortnikov,
- Shmuel Dashevsky,
- Ofer Frishman,
- Evgeni Ginzburg,
- Igal Maly,
- Avraham (Poza) Meir,
- Mark Mokryn,
- Iddo Naiss,
- Noam Rabinovich
Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 12, Pages 2932–2944, https://doi.org/10.14778/3476311.3476373
With the end of Moore's Law, database architects are turning to hardware accelerators to offload computationally intensive tasks from the CPU. In this paper, we show that accelerators can facilitate far more than just computation: they enable algorithms ...
- research-article, July 2021
Robust voice querying with MUVE: optimally visualizing results of phonetically similar queries
Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 11, Pages 2397–2409, https://doi.org/10.14778/3476249.3476289
Recently proposed voice query interfaces translate voice input into SQL queries. Unreliable speech recognition on top of the intrinsic challenges of text-to-SQL translation makes it hard to reliably interpret user input. We present MUVE (Multiplots for ...
- research-article, July 2021
SKT: a one-pass multi-sketch data analytics accelerator
Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 11, Pages 2369–2382, https://doi.org/10.14778/3476249.3476287
Data analysts often need to characterize a data stream as a first step to its further processing. Some of the initial insights to be gained include, e.g., the cardinality of the data set and its frequency distribution. Such information is typically ...