HYRAQ: optimizing large-scale analytical queries through dynamic hypergraphs

Published: 25 August 2020
    Abstract

    In critical situations, quick and precise decisions depend on the rapid execution of a large number of concurrent navigational and exploratory queries over data stored in repositories such as data warehouses. Satisfying the decision-maker's requirements calls for a deep understanding of the properties of these queries. In addition to being large-scale, they are ad hoc, dynamic, and highly interacting. A quick analysis of these properties shows that the first three are factual, whereas the last one is behavioral. The literature has widely reported that the interaction of analytical queries has a crucial impact on selecting optimization structures (e.g., materialized views) in data storage systems. With these four properties in mind, scalable and efficient data structures are needed to model them simultaneously for better optimization of large-scale queries. In this paper, we first show the crucial role of the interaction phenomenon in optimizing concurrent data and mining queries, while identifying its limited capacity to account for all factual properties. Second, we propose a dynamic hypergraph as a data structure to manage the four properties above, and we show its contribution to selecting materialized views. Finally, intensive experiments are conducted to evaluate the efficiency of our proposal and its connectivity with a commercial DBMS.
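
    The abstract does not say how the dynamic hypergraph is built, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that each vertex is a subexpression (for example a join) and that each query is a hyperedge over the subexpressions it uses. The class name `DynamicQueryHypergraph`, its methods, and the sample queries are all hypothetical. Queries can be added and removed as the workload changes, and subexpressions shared by several queries are surfaced as possible candidates for materialization.

```python
from collections import defaultdict


class DynamicQueryHypergraph:
    """Hypothetical sketch: vertices are subexpressions, hyperedges are queries."""

    def __init__(self):
        # query id -> frozenset of subexpressions (one hyperedge per query)
        self.edges = {}
        # subexpression -> ids of the queries that use it (incidence index)
        self.incidence = defaultdict(set)

    def add_query(self, query_id, subexpressions):
        """Insert a query; an existing query with the same id is replaced."""
        if query_id in self.edges:
            self.remove_query(query_id)
        self.edges[query_id] = frozenset(subexpressions)
        for sub in self.edges[query_id]:
            self.incidence[sub].add(query_id)

    def remove_query(self, query_id):
        """Delete a query and drop vertices that no query uses any more."""
        for sub in self.edges.pop(query_id, frozenset()):
            self.incidence[sub].discard(query_id)
            if not self.incidence[sub]:
                del self.incidence[sub]

    def shared_subexpressions(self, min_queries=2):
        """Return (subexpression, query count) pairs, most shared first."""
        shared = [
            (sub, len(queries))
            for sub, queries in self.incidence.items()
            if len(queries) >= min_queries
        ]
        return sorted(shared, key=lambda pair: (-pair[1], pair[0]))


# Assumed toy workload: three queries over a star schema.
graph = DynamicQueryHypergraph()
graph.add_query("q1", {"sales JOIN date", "sales JOIN store"})
graph.add_query("q2", {"sales JOIN date", "sales JOIN product"})
graph.add_query("q3", {"sales JOIN date", "sales JOIN store"})
candidates = graph.shared_subexpressions()

# The workload is dynamic: q3 leaves, and the sharing picture changes.
graph.remove_query("q3")
candidates_after = graph.shared_subexpressions()
```

    In this sketch, interaction between queries is reduced to a simple sharing count. A real selection procedure would also have to weigh storage and maintenance costs, which the sketch leaves out.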

    Cited By

    • (2022) ProRes: Proactive re-selection of materialized views. Computer Science and Information Systems, 19(2): 735-762. DOI: 10.2298/CSIS210606003M. Online publication date: 2022.
    • (2022) AutoView: An Autonomous Materialized View Management System with Encoder-Reducer. IEEE Transactions on Knowledge and Data Engineering, pages 1-1. DOI: 10.1109/TKDE.2022.3163195. Online publication date: 2022.
    • (2021) Selecting Subexpressions to Materialize for Dynamic Large-Scale Workloads. Big Data Analytics and Knowledge Discovery, pages 39-51. DOI: 10.1007/978-3-030-86534-4_4. Online publication date: 5 Sep 2021.

      Published In

      IDEAS '20: Proceedings of the 24th Symposium on International Database Engineering & Applications
      August 2020
      252 pages
      ISBN:9781450375030
      DOI:10.1145/3410566

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. dynamic hypergraphs & views selection
      2. query interaction
      3. query volume

      Qualifiers

      • Research-article

      Conference

      IDEAS 2020

      Acceptance Rates

      IDEAS '20 Paper Acceptance Rate 27 of 57 submissions, 47%;
      Overall Acceptance Rate 74 of 210 submissions, 35%
