TECHNICAL PERSPECTIVE: Ad Hoc Transactions: What They Are and Why We Should Care
Most database research papers are prescriptive. They identify a technical problem and show us how to solve it. They present new algorithms, theorems, and evaluations of prototypes. Other papers follow a different path: descriptive rather than ...
Ad Hoc Transactions: What They Are and Why We Should Care
Many transactions in web applications are constructed ad hoc in the application code. For example, developers might explicitly use locking primitives or validation procedures to coordinate critical code fragments. We refer to database operations ...
Technical Perspective: Sortledton: a Universal Graph Data Structure
Graph processing is becoming ubiquitous due to the proliferation of interconnected data in several domains, including life sciences, social networks, cybersecurity, finance and logistics, to name a few. In parallel with the growth of the underlying ...
Sortledton: a Universal Graph Data Structure
Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with single edge ...
Technical Perspective for Skeena: Efficient and Consistent Cross-Engine Transactions
The paper proposes a solution to the problem of inadequate support for transactions in multi-engine database systems. Multi-engine database systems are databases that integrate new (fast) memory-optimized storage engines with (slow) traditional engines, ...
Efficiently Making Cross-Engine Transactions Consistent
Database systems are becoming increasingly multi-engine. In particular, a main-memory engine may coexist with a traditional storage-centric engine in a system to support various applications. It is desirable to allow applications to access data in both ...
Technical Perspective: When is it safe to run a transactional workload under Read Committed?
A data management platform provides many capabilities to assist the data owner, application coder, or end-user. For example, it should support an expressive query language, schema definition, and sophisticated access control. Another way many platforms ...
When is it safe to run a transactional workload under Read Committed?
The popular isolation level multiversion Read Committed (RC) exchanges some of the strong guarantees of serializability for increased transaction throughput. Nevertheless, transaction workloads can sometimes be executed under RC while still guaranteeing ...
Technical Perspective for Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory
Separation of compute and storage has become the de facto standard for cloud database systems. First proposed in 2007 for database systems [2], it is now widely adopted by all major cloud providers such as Amazon Redshift, Google BigQuery, and Snowflake. ...
Building Write-Optimized Tree Indexes on Disaggregated Memory
Memory disaggregation architecture physically separates CPU and memory into independent components, which are connected via high-speed RDMA networks, greatly improving resource utilization of database systems. However, such an architecture poses unique ...
Technical Perspective: Conjunctive Queries with Comparisons
Query processing, the art of efficiently executing a relational query on a given database, is a foundational and core area in data management research. Established at the dawn of relational database systems in the 1970s, relational query processing ...
Conjunctive Queries with Comparisons
Conjunctive queries with predicates in the form of comparisons that span multiple relations have regained interest recently, due to their relevance in OLAP queries, spatiotemporal databases, and machine learning over relational data. The standard ...
Technical Perspective: Query Answers - Fewer is Faster
We often write queries using LIMIT k, indicating that only k answers are to be returned. This feature is present in most query languages, for different data models: SQL, SPARQL, Cypher etc. For example, in a repository of about 250M SPARQL queries, ...
Threshold Queries
- Angela Bonifati
- Stefania Dumbrava
- George Fletcher
- Jan Hidders
- Matthias Hofer
- Wim Martens
- Filip Murlak
- Joshua Shinavier
- Slawek Staworko
- Dominik Tomaszuk
Threshold queries are an important class of queries that only require computing or counting answers up to a specified threshold value. To the best of our knowledge, threshold queries have been largely disregarded in the research literature, which is ...
Technical Perspective: (Pre-) Semirings Come to the Recursion Party
(This article is an imagined conversation with my U. at Buffalo UG algorithms class students.)
Convergence of Datalog over (Pre-) Semirings
Recursive queries have been traditionally studied in the framework of datalog, a language that restricts recursion to monotone queries over sets, which is guaranteed to converge in polynomial time in the size of the input. But modern big data systems ...
Technical Perspective: Optimal Algorithms for Multiway Search on Partial Orders
Given a list of comparable items A = {a1, . . . , an} sorted so that a1 < a2 < . . . < an, a canonical problem is locating a target item q within A if it exists. The canonical algorithm for this problem, of course, is binary search, which locates q using ...
An Optimal Algorithm for Partial Order Multiway Search
Partial order multiway search (POMS) is an important problem that finds use in crowdsourcing, distributed file systems, software testing, etc. In this problem, a game is played between an algorithm A and an oracle, based on a directed acyclic graph G ...
Technical Perspective: Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs
Query engines are really good at choosing an efficient query plan. Users don't need to worry about how they write their query, since the optimizer makes all the right choices for executing the query, while taking into account all aspects of data, such ...
Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs
We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins: (i) optimistic estimators, which were defined in the context of graph database management systems, that make uniformity and ...
Technical Perspective: Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems
Query optimization is the process of finding an efficient query execution plan for a given SQL query. The runtime difference between a good and a bad plan can be tremendous. For example, in the case of TPC-H query 5, a query with 5 joins, the difference ...
Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems
Effective query optimization remains an open problem for Big Data Management Systems. In this work, we revisit an old idea, runtime dynamic optimization, and adapt it to a big data management system, AsterixDB. The approach runs in stages (re-...
Technical Perspective on 'R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys'
Increased use of data to inform decision making has brought with it a rising awareness of the importance of privacy, and the need for appropriate mitigations to be put in place to protect the interests of individuals whose data is being processed. From ...
R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys
Answering SPJA queries under differential privacy (DP), including graph pattern counting under node-DP as an important special case, has received considerable attention in recent years. The dual challenge of foreign-key constraints and self-joins is ...