Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/2723372.2749454acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

CliffGuard: A Principled Framework for Finding Robust Database Designs

Published: 27 May 2015 Publication History

Abstract

A fundamental problem in database systems is choosing the best physical design, i.e., a small set of auxiliary structures that enable the fastest execution of future queries. Almost all commercial databases come with designer tools that create a number of indices or materialized views (together comprising the physical design) that they exploit during query processing. Existing designers are what we call nominal; that is, they assume that their input parameters are precisely known and equal to some nominal values. For instance, since future workload is often not known a priori, it is common for these tools to optimize for past workloads in hopes that future queries and data will be similar. In practice, however, these parameters are often noisy or missing. Since nominal designers do not take the influence of such uncertainties into account, they find designs that are sub-optimal and remarkably brittle. Often, as soon as the future workload deviates from the past, their overall performance falls off a cliff, leading to customer discontent and expensive redesigns. Thus, we propose a new type of database designer that is robust against parameter uncertainties, so that overall performance degrades more gracefully when future workloads deviate from the past. Users express their risk tolerance by deciding on how much nominal optimality they are willing to trade for attaining their desired level of robustness against uncertain situations. To the best of our knowledge, this paper is the first to adopt the recent breakthroughs in the theory of robust optimization to build a practical framework for solving some of the most fundamental problems in databases, replacing today's brittle designs with a principled world of robust designs that can guarantee predictable and consistent performance.

References

[1]
CliffGuard: A General Framework for Robust and Efficient Database Optimization. http://www.cliffguard.org.
[2]
S. Acharya, P. B. Gibbons, and V. Poosala. Aqua: A fast decision support system using approximate query answers. In VLDB, 1999.
[3]
S. Acharya, P. B. Gibbons, and V. Poosala. Congressional samples for approximate answering of group-by queries. In SIGMOD, May 2000.
[4]
S. Agarwal, H. Milner, A. Kleiner, A. Talwalkar, M. Jordan, S. Madden, B. Mozafari, and I. Stoica. Knowing when you're wrong: Building fast and reliable approximate query processing systems. In SIGMOD, 2014.
[5]
S. Agarwal, B. Mozafari, A. Panda, H. Milner, S. Madden, and I. Stoica. Blinkdb: queries with bounded errors and bounded response times on very large data. In EuroSys, 2013.
[6]
S. Agarwal, A. Panda, B. Mozafari, A. P. Iyer, S. Madden, and I. Stoica. Blink and it's done: Interactive queries on very large data. PVLDB, 2012.
[7]
S. Agrawal, S. Chaudhuri, and V. Narasayya. Materialized view and index selection tool for microsoft sql server 2000. 2001.
[8]
I. Alagiannis, D. Dash, K. Schnaitter, A. Ailamaki, and N. Polyzotis. An automated, yet interactive and portable db designer. In SIGMOD, 2010.
[9]
I. Alagiannis, S. Idreos, and A. Ailamaki. H2o: a hands-free adaptive store. In SIGMOD, 2014.
[10]
M. Armbrust and et. al. Piql: Success-tolerant query processing in the cloud. PVLDB, 5, 2011.
[11]
B. Babcock and S. Chaudhuri. Towards a robust query optimizer: A principled and practical approach. In SIGMOD, 2005.
[12]
B. Babcock, S. Chaudhuri, and G. Das. Dynamic sample selection for approximate query processing. In VLDB, 2003.
[13]
A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust optimization. Princeton University Press, 2009.
[14]
D. Bertsimas and D. B. Brown. Constrained stochastic lqc: a tractable approach. Automatic Control, IEEE Transactions on, 52(10), 2007.
[15]
D. Bertsimas and et. al. Theory and applications of robust optimization. SIAM, 53, 2011.
[16]
D. Bertsimas, O. Nohadani, and K. M. Teo. Robust nonconvex optimization for simulation-based problems. Operations Research, 2007.
[17]
D. Bertsimas and M. Sim. The price of robustness. Operations research, 52(1), 2004.
[18]
J. R. Birge and et. al. Improving thin-film manufacturing yield with robust optimization. Applied optics, 50, 2011.
[19]
S. P. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004.
[20]
D. P. Brown, J. Chaware, and M. Koppuravuri. Index selection in a database system, Mar. 3 2009. US Patent 7,499,907.
[21]
N. Bruno, S. Chaudhuri, A. C. König, V. R. Narasayya, R. Ramamurthy, and M. Syamala. Autoadmin project at microsoft research: Lessons learned. IEEE Data Eng. Bull., 34(4), 2011.
[22]
C. Chatfield. Time-series forecasting. CRC Press, 2002.
[23]
S. Chaudhuri, G. Das, and V. Narasayya. Optimized stratified sampling for approximate query processing. TODS, 2007.
[24]
S. Chaudhuri, A. K. Gupta, and V. Narasayya. Compressing sql workloads. In SIGMOD, 2002.
[25]
S. Chaudhuri, H. Lee, and V. R. Narasayya. Variance aware optimization of parameterized queries. In SIGMOD, 2010.
[26]
S. Chaudhuri and V. Narasayya. Self-tuning database systems: A decade of progress. In VLDB, 2007.
[27]
S. Chaudhuri, V. Narasayya, M. Datar, et al. Linear programming approach to assigning benefit to database physical design structures, Nov. 21 2006. US Patent 7,139,778.
[28]
A. N. K. Chen and et. al. Heuristics for selecting robust database structures with dynamic query patterns. EJOR, 168, 2006.
[29]
X. Chen, M. Sim, and P. Sun. A robust optimization perspective on stochastic programming. Operations Research, 55, 2007.
[30]
F. Chu, J. Y. Halpern, and P. Seshadri. Least expected cost query optimization: An exercise in utility. In PODS, 1999.
[31]
B. e. Dageville. Automatic sql tuning in oracle 10g. In VLDB, 2004.
[32]
K. Deb. Geneas: A robust optimal design technique for mechanical component design. 1997.
[33]
X. Ding and J. Le. Adaptive projection in column-stores. In FSKD, 2011.
[34]
I. Doltsinis and et. al. Robust design of non-linear structures using optimization methods. Computer methods in applied mechanics and engineering, 194, 2005.
[35]
D. Donjerkovic and R. Ramakrishnan. Probabilistic optimization of top n queries. In VLDB, 1999.
[36]
K. E. Gebaly and A. Aboulnaga. Robustness in automatic physical database design. In Advances in database technology, 2008.
[37]
G. Graefe and H. Kuno. Adaptive indexing for relational keys. In ICDEW, 2010.
[38]
G. Graefe and H. Kuno. Self-selecting, self-tuning, incrementally optimized indexes. In ICEDT, 2010.
[39]
F. Halim and et. al. Stochastic database cracking: Towards robust adaptive indexing in main-memory column-stores. PVLDB, 5, 2012.
[40]
M. Holze, A. Haschimi, and N. Ritter. Towards workload-aware self-management: Predicting significant workload shifts. In CDEW, 2010.
[41]
K.-L. Hsiung, S.-J. Kim, and S. Boyd. Power control in lognormal fading wireless channels with uptime probability specifications via robust geometric programming. In American Control Conference, 2005.
[42]
R. Huang, R. Chirkova, and Y. Fathi. Two-stage stochastic view selection for data-analysis queries. In Advances in Databases and Information Systems, 2013.
[43]
S. Idreos, M. L. Kersten, and S. Manegold. Database cracking. In CIDR, 2007.
[44]
S. Idreos, M. L. Kersten, and S. Manegold. Self-organizing tuple reconstruction in column-stores. In SIGMOD, 2009.
[45]
A. C. Konig and S. U. Nabar. Scalable exploration of physical database design. In ICDE, 2006.
[46]
M. Kormilitsin, R. Chirkova, Y. Fathi, and M. Stallmann. View and index selection for query-performance improvement: quality-centered algorithms and heuristics. In CIKM, 2008.
[47]
A. Lamb, M. Fuller, R. Varadarajan, N. Tran, B. Vandiver, L. Doshi, and C. Bear. The vertica analytic database: C-store 7 years later. PVLDB, 5(12), 2012.
[48]
K.-H. Lee and G.-J. Park. A global robust optimization using kriging based approximation model. JSME, 49, 2006.
[49]
J. LeFevre, J. Sankaranarayanan, H. Hacigumus, J. Tatemura, N. Polyzotis, and M. J. Carey. Opportunistic physical design for big data analytics. In SIGMOD, 2014.
[50]
J. Li, A. C. König, V. Narasayya, and S. Chaudhuri. Robust estimation of resource consumption for sql queries using statistical techniques. PVLDB, 5(11), 2012.
[51]
W. Liang, H. Wang, and M. E. Orlowska. Materialized view selection under the maintenance time constraint. Data & Knowledge Engineering, 37(2), 2001.
[52]
S. S. Lightstone, G. Lohman, and D. Zilio. Toward autonomic computing with db2 universal database. SIGMOD Record, 31(3), 2002.
[53]
R. G. Lorenz and S. P. Boyd. Robust minimum variance beamforming. Signal Processing, IEEE Transactions on, 53(5), 2005.
[54]
C. Maier and et. al. Parinda: an interactive physical designer for postgresql. In ICEDT, 2010.
[55]
I. Mami and Z. Bellahsene. A survey of view selection methods. SIGMOD, 41(1), 2012.
[56]
V. Markl and et. al. Robust query processing through progressive optimization. In SIGMOD, 2004.
[57]
B. Mozafari, C. Curino, A. Jindal, and S. Madden. Performance and resource modeling in highly-concurrent OLTP workloads. In SIGMOD, 2013.
[58]
B. Mozafari, C. Curino, and S. Madden. Dbseer: Resource and performance prediction for building a next generation database cloud. In CIDR, 2013.
[59]
B. Mozafari, E. Z. Y. Goh, and D. Y. Yoon. Cliffguard: An extended report. Technical report, University of Michigan, Ann Arbor, 2015.
[60]
G. Palermo, C. Silvano, and V. Zaccaria. Robust optimization of soc architectures: A multi-scenario approach. In Embedded Systems for Real-Time Multimedia, 2008.
[61]
S. Papadomanolakis and A. Ailamaki. An integer linear programming approach to database design. In ICDE Workshop, 2007.
[62]
D. Patil, S. Yun, S.-J. Kim, A. Cheung, M. Horowitz, and S. Boyd. A new method for design of robust digital circuits. In ISQED, 2005.
[63]
V. Raman and et. al. Constant-time query processing. In ICDE, 2008.
[64]
K. Schnaitter, S. Abiteboul, T. Milo, and N. Polyzotis. On-line index selection for shifting workloads. In ICDE Workshop, 2007.
[65]
A. Shukla, P. Deshpande, J. F. Naughton, et al. Materialized view selection for multidimensional datasets. In VLDB, volume 98, 1998.
[66]
Z. A. Talebi, R. Chirkova, and Y. Fathi. An integer programming approach for the view and index selection problem. Data & Knowledge Engineering, 83, 2013.
[67]
Q. T. Tran, I. Jimenez, R. Wang, N. Polyzotis, and A. Ailamaki. Rita: An index-tuning advisor for replicated databases. arXiv preprint arXiv:1304.1411, 2013.
[68]
J. Tu, K. K. Choi, and Y. H. Park. A new study on reliability-based design optimization. Journal of Mechanical Design, 121(4), 1999.
[69]
G. Valentin and et. al. Db2 advisor: An optimizer smart enough to recommend its own indexes. In ICDE, 2000.
[70]
A. van der Vaart. Asymptotic statistics, volume 3. Cambridge university press, 2000.
[71]
R. Varadarajan, V. Bharathan, A. Cary, J. Dave, and S. Bodagala. Dbdesigner: A customizable physical design tool for vertica analytic database. In ICDE, 2014.
[72]
J. X. Yu, X. Yao, C.-H. Choi, and G. Gou. Materialized view selection as constrained evolutionary optimization. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 33(4), 2003.
[73]
K. Zeng, S. Gao, J. Gu, B. Mozafari, and C. Zaniolo. Abs: a system for scalable approximate queries with accuracy guarantees. In SIGMOD, 2014.
[74]
K. Zeng, S. Gao, B. Mozafari, and C. Zaniolo. The analytical bootstrap: a new method for fast error estimation in approximate query processing. In SIGMOD, 2014.
[75]
Y. Zhang. General robust-optimization formulation for nonlinear programming. JOTA, 132, 2007.
[76]
D. C. Zilio and et. al. Recommending materialized views and indexes with the ibm db2 design advisor. In ICAC, 2004.

Cited By

View all
  • (2024)ML-Powered Index Tuning: An Overview of Recent Progress and Open ChallengesACM SIGMOD Record10.1145/3641832.364183652:4(19-30)Online publication date: 19-Jan-2024
  • (2024)Towards flexibility and robustness of LSM treesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00826-933:4(1105-1128)Online publication date: 1-Jul-2024
  • (2023)Consolidating Industrial Small Files Using Robust Graph ClusteringIEEE Transactions on Network Science and Engineering10.1109/TNSE.2022.319535010:5(2814-2831)Online publication date: 1-Sep-2023
  • Show More Cited By

Index Terms

  1. CliffGuard: A Principled Framework for Finding Robust Database Designs

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN:9781450327589
    DOI:10.1145/2723372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 May 2015

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. database performance
    2. physical design
    3. robust optimization
    4. workload resilience

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'15
    Sponsor:
    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Victoria, Melbourne, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)14
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 30 Aug 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)ML-Powered Index Tuning: An Overview of Recent Progress and Open ChallengesACM SIGMOD Record10.1145/3641832.364183652:4(19-30)Online publication date: 19-Jan-2024
    • (2024)Towards flexibility and robustness of LSM treesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00826-933:4(1105-1128)Online publication date: 1-Jul-2024
    • (2023)Consolidating Industrial Small Files Using Robust Graph ClusteringIEEE Transactions on Network Science and Engineering10.1109/TNSE.2022.319535010:5(2814-2831)Online publication date: 1-Sep-2023
    • (2023)Explainable AI for DBA: Bridging the DBA's experience and machine learning in tuning database systemsConcurrency and Computation: Practice and Experience10.1002/cpe.769835:21Online publication date: 31-Mar-2023
    • (2022)EndureProceedings of the VLDB Endowment10.14778/3529337.352934515:8(1605-1618)Online publication date: 22-Jun-2022
    • (2022)DIAMETRICSCommunications of the ACM10.1145/356746465:12(105-112)Online publication date: 22-Nov-2022
    • (2021)DIAMetricsACM SIGMOD Record10.1145/3471485.347149250:1(24-31)Online publication date: 17-Jun-2021
    • (2021)DeepCM: Deep neural networks to improve accuracy prediction of database cost modelsConcurrency and Computation: Practice and Experience10.1002/cpe.672434:10Online publication date: 8-Dec-2021
    • (2020)DIAMetricsProceedings of the VLDB Endowment10.14778/3415478.341555113:12(3285-3298)Online publication date: 14-Sep-2020
    • (2020)Self-driving database systems: a conceptual approachDistributed and Parallel Databases10.1007/s10619-020-07288-w38:4(795-817)Online publication date: 16-Mar-2020
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media