Performance prediction for concurrent database workloads

J Duggan, U Cetintemel, O Papaemmanouil… - Proceedings of the 2011 …, 2011 - dl.acm.org
Proceedings of the 2011 ACM SIGMOD International Conference on Management of …, 2011 - dl.acm.org
Current trends in data management systems, such as cloud and multi-tenant databases, are leading to data processing environments that concurrently execute heterogeneous query workloads. At the same time, these systems need to satisfy diverse performance expectations. In these newly-emerging settings, avoiding potential Quality-of-Service (QoS) violations heavily relies on performance predictability, i.e., the ability to estimate the impact of concurrent query execution on the performance of individual queries in a continuously evolving workload.
This paper presents a modeling approach to estimate the impact of concurrency on query performance for analytical workloads. Our solution relies on the analysis of query behavior in isolation, pairwise query interactions and sampling techniques to predict resource contention under various query mixes and concurrency levels. We introduce a simple yet powerful metric that accurately captures the joint effects of disk and memory contention on query performance in a single value. We also discuss predicting the execution behavior of a time-varying query workload through query-interaction timelines, i.e., a fine-grained estimation of the time segments during which discrete mixes will be executed concurrently. Our experimental evaluation on top of PostgreSQL/TPC-H demonstrates that our models can provide query latency predictions within approximately 20% of the actual values in the average case.
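To make the idea of a query-interaction timeline concrete, here is a minimal sketch that splits a time axis into segments during which one fixed mix of queries runs. It is not the authors' estimation procedure: it assumes each query's start and end time is already known, whereas the paper predicts them. The names `mix_timeline` and `intervals` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Segment = Tuple[float, float, FrozenSet[str]]


def mix_timeline(intervals: Dict[str, Tuple[float, float]]) -> List[Segment]:
    """Split time into segments, each with a constant set of running queries.

    `intervals` maps a query name to an assumed (start, end) time pair.
    """
    # Every start or end time is a point where the running mix may change.
    boundaries = sorted({t for start, end in intervals.values() for t in (start, end)})

    segments: List[Segment] = []
    for left, right in zip(boundaries, boundaries[1:]):
        # A query belongs to the mix if it runs for the whole segment.
        active = frozenset(
            name for name, (start, end) in intervals.items()
            if start <= left and right <= end
        )
        if not active:
            continue  # idle gap: no query is running
        if segments and segments[-1][2] == active and segments[-1][1] == left:
            # Same mix as the previous adjacent segment: extend it instead.
            segments[-1] = (segments[-1][0], right, active)
        else:
            segments.append((left, right, active))
    return segments


# Three overlapping queries with made-up start and end times (seconds).
example = {"Q1": (0.0, 10.0), "Q2": (4.0, 12.0), "Q3": (6.0, 8.0)}
timeline = mix_timeline(example)
for left, right, mix in timeline:
    print(f"{left:5.1f} to {right:5.1f}: {sorted(mix)}")
```

In this toy input the mix changes at every start or end event, so the three queries yield five segments. A per-mix latency model, such as the one the abstract describes, would then be applied to each segment in turn.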