- research-article, November 2024
Databases Unbound: Querying All of the World's Bytes with AI
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4546–4554, https://doi.org/10.14778/3685800.3685916
Over the past five decades, the relational database model has proven to be a scaleable and adaptable model for querying a variety of structured data, with use cases in analytics, transactions, graphs, streaming and more. However, most of the world's data ...
PrismX: A Single-Machine System for Querying Big Graphs
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4485–4488, https://doi.org/10.14778/3685800.3685906
We demonstrate PrismX (PRAM with SSDs as Memory eXtension), a single-machine system for graph analytics. PrismX allows users to make practical use of existing PRAM algorithms without any change. To cope with the limited DRAM capacity, it employs NVMe ...
- research-article, November 2024
Pyneapple-G: Scalable Spatial Grouping Queries
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4469–4472, https://doi.org/10.14778/3685800.3685902
This paper demonstrates Pyneapple-G, an open-source library for scalable spatial grouping queries based on Apache Sedona (formerly known as GeoSpark). We demonstrate two modules, namely, SGPAC and DDCEL, that support grouping points, grouping lines, and ...
Catcher: A Cache Analysis System for Top-k Pub/Sub Service
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4389–4392, https://doi.org/10.14778/3685800.3685882
Top-k Publish/Subscribe (TkPS) service is widely studied in spatial database, with various cache-based methods proposed to address its efficiency challenge in top-k result maintenance. These methods require in-depth exploration of relationships between ...
SEER: An End-to-End Toolkit for Benchmarking Time Series Database Systems in Monitoring Applications
- Luca Althaus,
- Mourad Khayati,
- Abdelouahab Khelifati,
- Anton Dignös,
- Djellel Difallah,
- Philippe Cudré-Mauroux
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4361–4364, https://doi.org/10.14778/3685800.3685875
Time series database systems (TSDBs) are prevalent in many applications ranging from monitoring and IoT devices to scientific research. Those systems are specifically designed to efficiently manage data indexed by time. Because of the variety of ...
UniView: A Unified Autonomous Materialized View Management System for Various Databases
- Zhenrong Xu,
- Pengfei Wang,
- Guoze Xue,
- Qitong Yan,
- Shenghao Gong,
- Yelan Jiang,
- Yuren Mao,
- Yunjun Gao,
- Shu Shen,
- Wei Zhang,
- Dan Luo,
- Lu Chen
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4353–4356, https://doi.org/10.14778/3685800.3685873
Materialized views (MVs) are critical for improving query performance of database systems, especially in online analytical processing (OLAP) databases. Typically, MVs are maintained by DBAs, which relies on prior knowledge and manual operations. Recently,...
- research-article, November 2024
QPJVis Demo: Quality-Boost Progressive Join Query Processing System
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4345–4348, https://doi.org/10.14778/3685800.3685871
Progressive query processing enables data scientists to efficiently analyze and explore large datasets. Data scientists can start further analyses earlier if the progressive result can represent the complete results well. Most progressive processing ...
Rodeo: Making Refinements for Diverse Top-K Queries
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4341–4344, https://doi.org/10.14778/3685800.3685870
Database queries are commonly used to select and rank items. With the increasing awareness of diversity, ensuring a diverse output (i.e., the representation of different groups in the top-k positions) becomes essential. To address this challenge, we ...
- research-article, November 2024
DBG-PT: A Large Language Model Assisted Query Performance Regression Debugger
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4337–4340, https://doi.org/10.14778/3685800.3685869
In this paper we explore the ability of Large Language Models (LLMs) in analyzing and comparing query plans, and resolving query performance regressions. We present DBG-PT, a query regression debugging framework powered by LLMs. DBG-PT keeps track of ...
- research-article, November 2024
Spatial Query Optimization With Learning
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4245–4248, https://doi.org/10.14778/3685800.3685846
Query optimization is a key component in database management systems (DBMS) and distributed data processing platforms. Recent research in the database community incorporated techniques from artificial intelligence to enhance query optimization. Various ...
- research-article, November 2024
Native Distributed Databases: Problems, Challenges and Opportunities
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4217–4220, https://doi.org/10.14778/3685800.3685839
Native distributed databases, crucial for scalable applications, offer transactional and analytical prowess but face data intricacies and network challenges. Under the CAP theorem's constraints, latency and replication issues necessitate creative ...
LLM for Data Management
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4213–4216, https://doi.org/10.14778/3685800.3685838
Machine learning techniques have been verified to be effective in optimizing data management systems and are widely researched in recent years. However, traditional small-sized ML models often struggle to generalize to new scenarios, and have limited ...
- research-article, November 2024
Grouping, Subsumption, and Disjunctive Join Optimizations in Oracle
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4200–4212, https://doi.org/10.14778/3685800.3685837
Query optimization must evolve with new workloads. As analytic and data warehouse workloads become more ubiquitous, optimization techniques that reduce the amount of data processed during query execution, enable shared computation and avoid expensive ...
Petabyte-Scale Row-Level Operations in Data Lakehouses
- Anton Okolnychyi,
- Chao Sun,
- Kazuyuki Tanimura,
- Russell Spitzer,
- Ryan Blue,
- Szehon Ho,
- Yufei Gu,
- Vishwanath Lakkundi,
- DB Tsai
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4159–4172, https://doi.org/10.14778/3685800.3685834
Data lakehouses combine the almost infinite scale and diverse tooling of a data lake with the reliability and functionality of a data warehouse. This paper presents extensions that enhance data lakehouses using Apache Iceberg and Apache Spark with ...
- research-article, November 2024
Lindorm-UWC: An Ultra-Wide-Column Database for Internet of Vehicles
- Qianyu Ouyang,
- Chunhui Shen,
- Wenlong Yang,
- Peng Yu,
- Qiang Xiao,
- Jianhui Lei,
- Yadong Chen,
- Qilu Zhong,
- Xiang Wang,
- Yong Lin,
- Qingyi Meng,
- Zhicheng Ji,
- Wei Meng,
- Cen Zheng,
- Sheng Wang,
- Dan Pei,
- Wei Zhang,
- Feifei Li,
- Jingren Zhou
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4117–4129, https://doi.org/10.14778/3685800.3685831
In the Internet of Vehicle (IoV) systems, intelligent vehicles generate huge amounts of data that supports diverse services and applications. In practice, database systems are deployed in the cloud to manage data uploaded from the vehicle side and ...
Presto's History-Based Query Optimizer
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4077–4089, https://doi.org/10.14778/3685800.3685828
An important feature of modern query optimizers is the ability to produce a query plan that is optimal for the underlying data set. This requires the ability to estimate cardinalities and computational costs of intermediate query plan nodes, which is ...
- research-article, November 2024
SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL
- Jeff Shute,
- Shannon Bales,
- Matthew Brown,
- Jean-Daniel Browne,
- Brandon Dolphin,
- Romit Kudtarkar,
- Andrey Litvinov,
- Jingchi Ma,
- John Morcos,
- Michael Shen,
- David Wilhite,
- Xi Wu,
- Lulan Yu
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4051–4063, https://doi.org/10.14778/3685800.3685826
SQL has been extremely successful as the de facto standard language for working with data. Virtually all mainstream database-like systems use SQL as their primary query language. But SQL is an old language with significant design problems, making it ...
- research-article, November 2024
Adaptive and Robust Query Execution for Lakehouses at Scale
- Maryann Xue,
- Steven Chen,
- Andy Lam,
- Yuanjian Li,
- Yingyi Bu,
- Herman van Hovell,
- Yunxiao Ma,
- Xiao Li,
- Sameer Paranjpye,
- Abhishek Somani,
- Bart Samwel,
- Vuk Ercegovac,
- Sriram Krishnamurthy,
- Reynold Xin,
- Wenchen Fan,
- Mostafa Mokhtar,
- Jiexing Li,
- Amit Shukla,
- Matei Zaharia,
- Ziqi Liu,
- RK Korlapati,
- Alexander Behm,
- Michalis Petropoulos
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3947–3959, https://doi.org/10.14778/3685800.3685818
Many organizations have embraced the "Lakehouse" data management paradigm, which involves constructing structured data warehouses on top of open, unstructured data lakes. This approach stands in stark contrast to traditional, closed, relational databases ...
- research-article, November 2024
Db2une: Tuning Under Pressure via Deep Learning
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3855–3868, https://doi.org/10.14778/3685800.3685811
Modern database systems including IBM Db2 have numerous parameters, "knobs," that require precise configuration to achieve optimal workload performance. Even for experts, manually "tuning" these knobs is a challenging process. We present Db2une, an ...
- research-article, November 2024
ClickHouse - Lightning Fast Analytics for Everyone
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 3731–3744, https://doi.org/10.14778/3685800.3685802
Over the past several decades, the amount of data being stored and analyzed has increased exponentially. Businesses across industries and sectors have begun relying on this data to improve products, evaluate performance, and make business-critical ...