Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
review-article

Robust Query Optimization Methods With Respect to Estimation Errors: A Survey

Published: 03 December 2015 Publication History

Abstract

The quality of a query execution plan chosen by a Cost-Based Optimizer (CBO) depends greatly on the estimation accuracy of input parameter values. Many research results have been produced on improving the estimation accuracy, but they do not work for every situation. Therefore, "robust query optimization" was introduced, in an effort to minimize the sub-optimality risk by accepting the fact that estimates could be inaccurate. In this survey, we aim to provide an overview of robust query optimization methods by classifying them into different categories, explaining the essential ideas, listing their advantages and limitations, and comparing them with multiple criteria.

References

[1]
Aboulnaga, A. and Chaudhuri, S. 1999. Self-tuning Histograms: Building Histograms without Looking at Data. In SIGMOD. New York, USA, 181--192.
[2]
Amsaleg, L., al.1996. Scrambling Query Plans to Cope with Unexpected Delays. In PDIS. Miami, USA, 208--219.
[3]
Antonshenkov, G. 1993. Dynamic Query Optimization in Rdb/VMS. In ICDE. Vienna, Austria, 538--547.
[4]
Arcangeli, J.P., et al. 2004. Mobile Agent Based Self-Adaptive Join for Wide-Area Distributed Query Processing. Journal of Database Management, 15(4): 25--44.
[5]
Avnur, R. and Hellerstein, J.M. 2000. Eddies: Continuously Adaptive Query Processing. SIGMOD. Dallas, USA, 261--272.
[6]
Babu, S. and Bizarro, P. 2005. Adaptive Query Processing in the Looking Glass. In CIDR. Asilomar, USA, 238--249.
[7]
Babu, S., et al. 2005. Proactive Re-Optimization. In SIGMOD. Baltimore, USA, 107--118.
[8]
Babcock, B. and Chaudhuri, S. 2005. Towards a Robust Query Optimizer: A Principled and Practical Approach. In SIGMOD. Baltimore, USA, 119--130.
[9]
Bizarro, P., et al. 2005. Content-Based Routing: Different Plans for Different Data. In VLDB. Trondheim, Norway, 757--768.
[10]
Bizarro, P., et al. 2009. Progressive Parametric Query Optimization. KDE, 21(4): 582--594.
[11]
Bonneau, S. and Hameurlain, A. 1999. Hybrid Simultaneous Scheduling and Mapping in SQL Multi-Query Parallelization. In DEXA. Florence, Italy, 88--98.
[12]
Bouganim, L., et al. 2000. Dynamic Query Scheduling in Data Integration Systems. In ICDE. San Diego, USA, 425--434.
[13]
Bruno, N., et al. 2001. STHoles: a multidimensional workloadaware histogram. In SIGMOD. Santa Barbara, USA, 211--222.
[14]
Bruno, N., et al. 2011. AutoAdmin Project at Microsoft Research: Lessons Learned. IEEE Data Eng. Bull, 34(4): 12--19.
[15]
Bruno, N., et al. 2013. Continuous Cloud-Scale Query Optimization and Processing. PVLDB, 6(11): 961--972.
[16]
Cao, L. and Rundensteiner, E.A. 2013. High Performance Stream Query Processing With Correlation-Aware Partitioning. PVLDB, 7(4): 265--276.
[17]
Chaudhuri, S., et al. 2008. A Pay-As-You-Go Framework for Query Execution Feedback. PVLDB, 1(1): 1141--1152.
[18]
Chaudhuri, S. 2009. Query Optimizers: Time to Rethink the Contract? In SIGMOD. Providence, USA, 961--968.
[19]
Chaudhuri, S., et al. 2009. Exact Cardinality Query Optimization for Optimizer Testing. PVLDB, 2(1): 994--1005.
[20]
Chen, C.M. and Roussopoulos, N. 1994. Adaptive Selectivity Estimation Using Query Feedback. In SIGMOD. Minneapolis, USA, 161--172.
[21]
Chu, F., Halpern, J., Gehrke, J. 2002. Least expected cost query optimization: what can we expect? In PODS. Madison, USA, 293--302.
[22]
Cole, R.L. and Graefe, G. 1994. Optimization of Dynamic Query Evaluation Plans. In SIGMOD. Minneapolis, 150--160.
[23]
Deshpande, A. 2004. An Initial Study of Overheads of Eddies. SIGMOD Record, 33(1): 44--49.
[24]
Deshpande, A., et al. 2007. Adaptive Query Processing. Foundations and Trends in Databases, 1(1): 1--140.
[25]
Dutt, A. and Haritsa, J. 2014. Plan bouquets: query processing without selectivity estimation. In SIGMOD. Snowbird, USA, 1039--1050.
[26]
Ergenç, B., et al. 2007. Robust Placement of Mobile Relational Operators for Large Scale Distributed Query Optimization. In PDCAT. Adelaide, Australia, 227--235.
[27]
Evrendilek, C., et al. 1997. Multidatabase Query Optimization. Distributed and Parallel Databases, 5(1):77--114.
[28]
Ghazal, A., et al. 2012. Adaptive Optimizations of Recursive Queries in Teradata. In SIGMOD. Scottsdale, USA, 851--860.
[29]
Graefe, G. and Ward, K. 1989. Dynamic query evaluation plans. In SIGMOD. Portland, USA, 358--366.
[30]
Graefe, G., et al. 2009. Visualizing the Robustness of Query Execution. In CIDR. Asilomar, USA.
[31]
Graefe, G., et al. 2010. Robust Query Processing. Dagstuhl Workshop Summary 10381, Wadern, Germany.
[32]
Graefe, G. 2011. Robust Query Processing (Research Panel). In ICDE. Hannover, Germany, 1361.
[33]
Graefe, G., et al. 2012. Robust Query Processing. Dagstuhl Workshop Summary 12321, Wadern, Germany.
[34]
Gounaris, A., et al. 2002. Adaptive Query Processing: A Survey. In BNCOD. Sheffield, UK, 11--25.
[35]
Gounaris, A., et al. 2013. Adaptive Query Processing in Distributed Settings. Advanced Query Processing, Vol. 1: 211--236.
[36]
Hameurlain, A. and Morvan, F. 2002. CPU and Incremental Memory Allocation in Dynamic Parallelization of SQL Queries. Journal of Parallel Computing, 28(4): 525--556.
[37]
Han, W., et al. 2007. Progressive Optimization in a Shared-Nothing Parallel Database. In SIGMOD. Beijing, 809--820.
[38]
Harish, D., et al. 2007. On the Production of Anorexic Plan Diagrams. In VLDB. Vienna, Austria, 1081--1092.
[39]
Harish, D., et al. 2008. Identifying Robust Plans through Plan Diagram Reduction. PVLDB, 1(1): 1124--1140.
[40]
Herodotou, H. and Babu, S. Xplus. 2010. A SQL-Tuning-Aware Query Optimizer. PVLDB, 3(1): 1149--1160.
[41]
Hong, W. and Stonebraker, M. 1993. Optimization of Parallel Query Execution Plans in XPRS. Distributed and Parallel Databases, 1(1): 9--32.
[42]
Ioannidis, Y. and Christodoulakis, S. 1991. On the Propagation of Errors in the Size of Join Results. In SIGMOD. Denver, USA, 168--177.
[43]
Ioannidis, Y. 2003. The History of Histograms (abridged). In VLDB. Berlin, Germany, 19--30.
[44]
Ives, Z. G., et al. 1999. An Adaptive Query Execution System for Data Integration. In SIGMOD. Philadelphia, USA, 299--310.
[45]
Ives, Z.G., et al. 2004. Adapting to Source Properties in Processing Data Integration Queries. In SIGMOD. Paris, France, 395--406.
[46]
Kabra, N. and DeWitt, D. 1998. Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans. In SIGMOD. Seattle, USA, 106--117.
[47]
Larson, P., et al. 2007. Cardinality Estimation Using Sample Views with Quality Assurance. In SIGMOD. Beijing, 175--186.
[48]
Lipton, R.J., et al. 1990. Practical Selectivity Estimation through Adaptive Sampling. In SIGMOD. Atlantic City, 1--11.
[49]
Mannino, M., et al. 1988. Statistical profile estimation in database systems. ACM Computing Surveys, 20(3): 191--221.
[50]
Markl, V., et al. 2004. Robust Query Processing through Progressive Optimization. SIGMOD. Paris, France, 659--670.
[51]
Markl, V., et al. 2007. Consistent Selectivity Estimation via Maximum Entropy. VLDB Journal, 16(1): 55--76.
[52]
Morvan, F. and Hameurlain, A. 2009. Dynamic Query Optimization: Towards Decentralized Methods. International Journal of Intelligent Information and Database Systems, 4(3): 461--482.
[53]
Neumann, T. and Calindo-Legaria, C. 2013.Taking the Edge off Cardinality Estimation Errors using Incremental Execution. In BTW. Magdeburg, Germany, 73--92.
[54]
Nehme, R.V., et al. 2009. Query Mesh: Multi-Route Query Processing Technology (Demo). PVLDB, 2(2): 1530--1533.
[55]
Nehme, R.V., et al. 2013. Multi-Route Query Processing and Optimization. Journal of Computer and System Sciences, 79(3): 312--329.
[56]
Olken, F. and Rotem, D. 1986. Simple Random Sampling from Relational Databases. In VLDB. Kyoto, Japan, 160--169.
[57]
Picasso Database Query Optimizer Visualizer. http://dsl.serc.iisc.ernet.in/projects/PICASSO/
[58]
Polyzotis, N. 2005. Selectivity-based partitioning: a divideand-union paradigm for effective query optimization. In CIKM. Bremen, Germany, 720--727.
[59]
Raman, V., et al. 2003. Using State Modules for Adaptive Query Processing. In ICDE. Bangalore, India, 353--364.
[60]
Reddy, N. and Harista, J. 2005. Analyzing Plan Diagrams of Database Query Optimizers. In VLDB. Trondheim, Norway, 1228--1239.
[61]
Selinger, P.G., et al. 1979. Access Path Selection in a Relational DBMS. In SIGMOD. Boston, USA, 23--34.
[62]
Srivastava, Uet al. 2006. ISOMER: Consistent Histogram Construction Using Query Feedback. In ICDE. Atlanta, 39.
[63]
Stillger, M., et al.2001. LEO-DB2's Learning Optimizer. VLDB. Roma, Italy, 19--28.
[64]
Tian, F. and DeWitt, D.J. 2003. Tuple Routing Strategies for Distributed Eddies. In VLDB. Berlin, Germany, 333--344.
[65]
Tzoumas, K., et al. 2010. Sharing-Aware Horizontal Partitioning for Exploiting Correlations during Query Processing. PVLDB, 3(1): 542--553.
[66]
Tzoumas, K., et al. 2011. Lightweight Graphical Models for Selectivity Estimation without Independence Assumptions. PVLDB, 4(11): 852--863.
[67]
Tzoumas, K., et al. 2013. Efficient Adapting Graphical Models for Selectivity Estimation. VLDB Journal, 22(1): 3--27.
[68]
Wilschut, A. N. and Apers, P. M. G. 1991. Dataflow Query Execution in a Parallel Main-Memory Environment. In PDIS, Miami Beach, USA, 68--77.
[69]
Wiener, J.L., et al. 2009. Benchmarking Query Execution Robustness. In TPC Technology Conference on Performance Evaluation & Benchmarking. Lyon, France, 153--166.
[70]
Zhou, Y., et al. 2005. An Adaptable Distributed Query Processing Architecture. Knowledge and Data Engineering. 53(3): 283--309.

Cited By

View all
  • (2024)Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGsProceedings of the ACM on Management of Data10.1145/36988322:6(1-26)Online publication date: 20-Dec-2024
  • (2024)Robust Query Optimization in the Era of Machine Learning: State-of-the-Art and Future Directions2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00408(5371-5375)Online publication date: 13-May-2024
  • (2024)RelJoin: Relative-cost-based selection of distributed join methods for query plan optimizationInformation Sciences10.1016/j.ins.2023.120022658(120022)Online publication date: Feb-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGMOD Record
ACM SIGMOD Record  Volume 44, Issue 3
September 2015
57 pages
ISSN:0163-5808
DOI:10.1145/2854006
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 December 2015
Published in SIGMOD Volume 44, Issue 3

Check for updates

Qualifiers

  • Review-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)37
  • Downloads (Last 6 weeks)9
Reflects downloads up to 25 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGsProceedings of the ACM on Management of Data10.1145/36988322:6(1-26)Online publication date: 20-Dec-2024
  • (2024)Robust Query Optimization in the Era of Machine Learning: State-of-the-Art and Future Directions2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00408(5371-5375)Online publication date: 13-May-2024
  • (2024)RelJoin: Relative-cost-based selection of distributed join methods for query plan optimizationInformation Sciences10.1016/j.ins.2023.120022658(120022)Online publication date: Feb-2024
  • (2023)Rethink Query Optimization in HTAP DatabasesProceedings of the ACM on Management of Data10.1145/36267501:4(1-27)Online publication date: 12-Dec-2023
  • (2023)Tempura: a general cost-based optimizer framework for incremental data processing (Journal Version)The VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00785-132:6(1315-1342)Online publication date: 20-Mar-2023
  • (2022)Robust query processing for linked data fragmentsSemantic Web10.3233/SW-21288813:4(623-657)Online publication date: 31-May-2022
  • (2021)Robust Cardinality: a novel approach for cardinality prediction in SQL queriesJournal of the Brazilian Computer Society10.1186/s13173-021-00115-927:1Online publication date: 1-Sep-2021
  • (2020)TempuraProceedings of the VLDB Endowment10.14778/3421424.342142714:1(14-27)Online publication date: 1-Sep-2020
  • (2020)Cost- and Robustness-Based Query Optimization for Linked Data FragmentsThe Semantic Web – ISWC 202010.1007/978-3-030-62419-4_14(238-257)Online publication date: 2-Nov-2020
  • (2019)Deterministic and non‐deterministic query optimization techniques in the cloud computingConcurrency and Computation: Practice and Experience10.1002/cpe.524031:17Online publication date: 5-Mar-2019
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media