Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article
Open access

Efficiently Computing Join Orders with Heuristic Search

Published: 30 May 2023 Publication History
  • Get Citation Alerts
  • Abstract

    Join order optimization is one of the most fundamental problems in processing queries on relational data. It has been studied extensively for almost four decades now. Still, because of its NP hardness, no generally efficient solution exists and the problem remains an important topic of research. The scope of algorithms to compute join orders ranges from exhaustive enumeration, to combinatorics based on graph properties, to greedy search, to genetic algorithms, to recently investigated machine learning. A few works exist that use heuristic search to compute join orders. However, a theoretical argument why and how heuristic search is applicable to join order optimization is lacking.
    In this work, we investigate join order optimization via heuristic search. In particular, we provide a strong theoretical framework, in which we reduce join order optimization to the shortest path problem. We then thoroughly analyze the properties of this problem and the applicability of heuristic search. We devise crucial optimizations to make heuristic search tractable. We implement join ordering via heuristic search in a real DBMS and conduct an extensive empirical study. Our findings show that for star- and clique-shaped queries, heuristic search finds optimal plans an order of magnitude faster than current state of the art. Our suboptimal solutions further extend the cost/time Pareto frontier.

    Supplemental Material

    MP4 File
    Video Presentation of "Efficiently Computing Join Orders with Heuristic Search" for SIGMOD 2023
    PDF File
    Read me
    ZIP File
    Source Code

    References

    [1]
    Brian Babcock and Surajit Chaudhuri. 2005. Towards a robust query optimizer: a principled and practical approach. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data. 119--130.
    [2]
    Surajit Chaudhuri, Ravi Krishnamurthy, Spyros Potamianos, and Kyuseok Shim. 1995. Optimizing queries with materialized views. In Proceedings of the Eleventh International Conference on Data Engineering. IEEE, 190--200.
    [3]
    Sophie Cluet and Guido Moerkotte. 1995. On the complexity of generating optimal left-deep processing trees with cross products. In International Conference on Database Theory. Springer, 54--67.
    [4]
    Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2016. Introduction to Algorithms. The MIT Press.
    [5]
    Rina Dechter and Judea Pearl. 1985. Generalized best-first search strategies and the optimality of A. Journal of the ACM (JACM), Vol. 32, 3 (1985), 505--536.
    [6]
    David DeHaan and Frank Wm Tompa. 2007. Optimal top-down join enumeration. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data. 785--796.
    [7]
    Edsger W Dijkstra et al. 1959. A note on two problems in connexion with graphs. Numerische mathematik, Vol. 1, 1 (1959), 269--271.
    [8]
    Stefan Edelkamp and Stefan Schrodl. 2011. Heuristic search: theory and applications. Elsevier.
    [9]
    Leonidas Fegaras. 1998. A new heuristic for optimizing large queries. In International Conference on Database and Expert Systems Applications. Springer, 726--735.
    [10]
    Pit Fender. 2014. Algorithms for Efficient Top-Down Join Enumeration. (2014).
    [11]
    Pit Fender and Guido Moerkotte. 2011a. A new, highly efficient, and easy to implement top-down join enumeration algorithm. In 2011 IEEE 27th International Conference on Data Engineering. IEEE, 864--875.
    [12]
    Pit Fender and Guido Moerkotte. 2011b. Reassessing top-down join enumeration. IEEE Transactions on Knowledge and Data Engineering, Vol. 24, 10 (2011), 1803--1818.
    [13]
    Frank Gray. 1953. Pulse code communication. US Patent 2,632,058.
    [14]
    Immanuel Haffner, Marcel Maltry, Joris Nix, Jens Dittrich, and Luca Gretscher. 2023. normalfont μtable. https://mutable.uni-saarland.de
    [15]
    Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE transactions on Systems Science and Cybernetics, Vol. 4, 2 (1968), 100--107.
    [16]
    Toshihide Ibaraki and Tiko Kameda. 1984. On the optimal nesting order for computing n-relational joins. ACM Transactions on Database Systems (TODS), Vol. 9, 3 (1984), 482--502.
    [17]
    Navin Kabra and David J DeWitt. 1998. Efficient mid-query re-optimization of sub-optimal query execution plans. In Proceedings of the 1998 ACM SIGMOD international conference on Management of data. 106--117.
    [18]
    Ravi Krishnamurthy, Haran Boral, and Carlo Zaniolo. 1986. Optimization of Nonrecursive Queries. In VLDB, Vol. 86. 128--137.
    [19]
    Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann. 2015. How good are query optimizers, really? Proceedings of the VLDB Endowment, Vol. 9, 3 (2015), 204--215.
    [20]
    Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska. 2021. Bao: Making learned query optimization practical. In Proceedings of the 2021 International Conference on Management of Data. 1275--1288.
    [21]
    Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul23. 2019. Neo: A Learned Query Optimizer. Proceedings of the VLDB Endowment, Vol. 12, 11 (2019).
    [22]
    Guido Moerkotte and Thomas Neumann. 2006. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In Proceedings of the 32nd international conference on Very large data bases. Citeseer, 930--941.
    [23]
    Parimarjan Negi, Matteo Interlandi, Ryan Marcus, Mohammad Alizadeh, Tim Kraska, Marc Friedman, and Alekh Jindal. 2021. Steering query optimizers: A practical take on big data workloads. In Proceedings of the 2021 International Conference on Management of Data. 2557--2569.
    [24]
    Thomas Neumann. 2009. Query simplification: graceful degradation for join-order optimization. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. 403--414.
    [25]
    Thomas Neumann and Bernhard Radke. 2018. Adaptive optimization of very large join queries. In Proceedings of the 2018 International Conference on Management of Data. 677--692.
    [26]
    Kenneth W Ng, Zhenghao Wang, Richard R Muntz, and Silvia Nittel. 1999. Dynamic query re-optimization. In Proceedings. Eleventh International Conference on Scientific and Statistical Database Management. IEEE, 264--273.
    [27]
    Kiyoshi Ono and Guy M. Lohman. 1990. Measuring the Complexity of Join Enumeration in Query Optimization. In VLDB, Vol. 97. 314--325.
    [28]
    Matthew Perron, Zeyuan Shang, Tim Kraska, and Michael Stonebraker. 2019. How I learned to stop worrying and love re-optimization. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 1758--1761.
    [29]
    Stuart Russell and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach 4 ed.). Prentice Hall.
    [30]
    Sa"id Salhi. 2017. Heuristic search: The emerging science of problem solving. Springer.
    [31]
    Alexander Schrijver. 2004. Combinatorial optimization: Polyhedra and efficiency (algorithms and combinatorics). Journal-Operational Research Society, Vol. 55, 9 (2004), 1018--1018.
    [32]
    P. Griffiths Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, and Thomas G. Price. 1989. Access Path Selection in a Relational Database Management System. In Readings in Artificial Intelligence and Databases. Elsevier, 511--522.
    [33]
    Timos K Sellis. 1988. Multiple-query optimization. ACM Transactions on Database Systems (TODS), Vol. 13, 1 (1988), 23--52.
    [34]
    Michael Steinbrunn, Guido Moerkotte, and Alfons Kemper. 1997. Heuristic and randomized optimization for the join ordering problem. The VLDB Journal, Vol. 6, 3 (1997), 191--208.
    [35]
    Bennet Vance. 1998. Join-order optimization with Cartesian products. Oregon Graduate Institute of Science and Technology.
    [36]
    Bennet Vance and David Maier. 1996. Rapid bushy join-order optimization with cartesian products. ACM SIGMOD Record, Vol. 25, 2 (1996), 35--46.
    [37]
    Florian Waas and Arjan Pellenkoft. 2000. Join order selection (good enough is easy). In British National Conference on Databases. Springer, 51--67.
    [38]
    Wentao Wu, Jeffrey F Naughton, and Harneet Singh. 2016. Sampling-based query re-optimization. In Proceedings of the 2016 International Conference on Management of Data. 1721--1736.

    Cited By

    View all
    • (2024)POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least ResistanceProceedings of the VLDB Endowment10.14778/3648160.364817517:6(1350-1363)Online publication date: 3-May-2024

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the ACM on Management of Data
    Proceedings of the ACM on Management of Data  Volume 1, Issue 1
    PACMMOD
    May 2023
    2807 pages
    EISSN:2836-6573
    DOI:10.1145/3603164
    Issue’s Table of Contents
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 May 2023
    Published in PACMMOD Volume 1, Issue 1

    Badges

    Author Tags

    1. join ordering
    2. query optimization
    3. query planning

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)403
    • Downloads (Last 6 weeks)48
    Reflects downloads up to 10 Aug 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least ResistanceProceedings of the VLDB Endowment10.14778/3648160.364817517:6(1350-1363)Online publication date: 3-May-2024

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media