DOI: 10.1145/3410566.3410582
research-article

HYRAQ: optimizing large-scale analytical queries through dynamic hypergraphs

Published: 25 August 2020

Abstract

In critical situations, making quick and precise decisions requires the rapid execution of a large number of concurrent navigational and exploratory queries over collected data stored in repositories such as data warehouses. To satisfy the decision-maker's requirements, a deep understanding of the properties of these queries is necessary. In addition to being large-scale, they are ad-hoc, dynamic, and highly interacting. A quick analysis of these properties shows that the first three are factual whereas the last one is behavioral. The literature has widely reported that the interaction of analytical queries has a crucial impact on selecting optimization structures (e.g., materialized views) in data storage systems. With these four properties in mind, it becomes necessary to find scalable and efficient data structures that model them simultaneously for better optimization of large-scale queries. In this paper, we first show the crucial role of the interaction phenomenon in optimizing concurrent data and mining queries, and we identify its limited capacity to account for all factual properties. Secondly, we propose a dynamic hypergraph as a data structure to manage the four properties above, and we show its contribution to selecting materialized views. Finally, intensive experiments are conducted to evaluate the efficiency of our proposal and its connectivity with a commercial DBMS.
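
The abstract does not spell out how the dynamic hypergraph is built, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that each vertex is an operation (for example a join) that several queries may share, that each hyperedge is one query, and that operations shared by many queries are candidates for materialized views. The class name `DynamicHypergraph`, its methods, and the operation labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class DynamicHypergraph:
    """Illustrative only: vertices are operations, each hyperedge is one query."""

    def __init__(self):
        self.edges = {}                    # query id -> frozenset of operation ids
        self.incidence = defaultdict(set)  # operation id -> ids of queries using it

    def add_query(self, query_id, operations):
        """Insert a hyperedge when a new ad-hoc query arrives."""
        if query_id in self.edges:
            self.remove_query(query_id)    # replace an existing definition
        ops = frozenset(operations)
        self.edges[query_id] = ops
        for op in ops:
            self.incidence[op].add(query_id)

    def remove_query(self, query_id):
        """Delete a hyperedge when a query leaves the workload."""
        for op in self.edges.pop(query_id, frozenset()):
            self.incidence[op].discard(query_id)
            if not self.incidence[op]:
                del self.incidence[op]     # drop vertices no query uses

    def shared_operations(self, min_queries=2):
        """Return (operation, query count) pairs, most shared first."""
        shared = [
            (op, len(queries))
            for op, queries in self.incidence.items()
            if len(queries) >= min_queries
        ]
        return sorted(shared, key=lambda item: (-item[1], item[0]))


# Hypothetical workload: three queries over a star schema.
workload = DynamicHypergraph()
workload.add_query("q1", {"sales_join_date", "sales_join_store"})
workload.add_query("q2", {"sales_join_date", "sales_join_product"})
workload.add_query("q3", {"sales_join_date", "sales_join_store"})
candidates = workload.shared_operations()

# The workload changes: q3 leaves, so the sharing counts are updated.
workload.remove_query("q3")
candidates_after_removal = workload.shared_operations()
```

In this sketch, query interaction shows up as vertices that belong to several hyperedges, and the dynamic aspect is limited to adding and removing hyperedges. A real selection procedure would also need cost estimates and storage constraints, which are not modeled here.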

Cited By

  • (2022) ProRes: Proactive re-selection of materialized views. Computer Science and Information Systems 19:2 (735-762). DOI: 10.2298/CSIS210606003M. Online publication date: 2022
  • (2022) AutoView: An Autonomous Materialized View Management System with Encoder-Reducer. IEEE Transactions on Knowledge and Data Engineering (1-1). DOI: 10.1109/TKDE.2022.3163195. Online publication date: 2022
  • (2021) Selecting Subexpressions to Materialize for Dynamic Large-Scale Workloads. Big Data Analytics and Knowledge Discovery (39-51). DOI: 10.1007/978-3-030-86534-4_4. Online publication date: 5-Sep-2021

    Published In

    IDEAS '20: Proceedings of the 24th Symposium on International Database Engineering & Applications
    August 2020
    252 pages
    ISBN:9781450375030
    DOI:10.1145/3410566
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. dynamic hypergraphs & views selection
    2. query interaction
    3. query volume

    Qualifiers

    • Research-article

    Conference

    IDEAS 2020

    Acceptance Rates

    IDEAS '20 Paper Acceptance Rate 27 of 57 submissions, 47%;
    Overall Acceptance Rate 74 of 210 submissions, 35%
