DOI: 10.1145/2065003.2065015
research-article

Aggregation strategies for columnar in-memory databases in a mixed workload

Published: 28 October 2011

    Abstract

    The recent trend towards analytics on operational data has led to the reunification of online transactional processing (OLTP) and online analytical processing (OLAP) in a single database. The advent of columnar in-memory databases makes this feasible, as expensive join and aggregation operations perform better than in traditional row-oriented databases. This has led to the radical proposal of abandoning materialized aggregate tables and calculating all aggregations on the fly.
    This PhD research project investigates the factors that influence aggregation performance in columnar in-memory databases. Based on the identified factors, we aim to evaluate different cost model approaches, which will be validated with real-life data from large industry customers and their mixed workloads. The goal of this project is to design and implement an aggregation engine that decides whether or not to materialize an aggregate. The decision is based on the data and application characteristics, the historic and current workload, and other cost-relevant factors, and it weighs query performance against aggregation view maintenance costs.
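
The materialize-or-not decision described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a cost model, so everything below is an assumption made for illustration: the `WorkloadStats` fields, the `should_materialize` function, and the simple linear per-hour cost comparison are all hypothetical and are not the project's actual engine.

```python
from dataclasses import dataclass


@dataclass
class WorkloadStats:
    """Hypothetical per-aggregate workload figures (all names are illustrative)."""
    reads_per_hour: float       # aggregate queries issued per hour
    writes_per_hour: float      # base-table writes per hour that affect the aggregate
    on_the_fly_cost_ms: float   # assumed cost of computing the aggregate from base data
    lookup_cost_ms: float       # assumed cost of reading a materialized aggregate
    maintenance_cost_ms: float  # assumed cost of updating the aggregate per write


def should_materialize(stats: WorkloadStats) -> bool:
    """Return True when materializing is cheaper per hour under this toy model."""
    # Without materialization, every read recomputes the aggregate.
    cost_on_the_fly = stats.reads_per_hour * stats.on_the_fly_cost_ms
    # With materialization, reads are cheap lookups but every write pays maintenance.
    cost_materialized = (
        stats.reads_per_hour * stats.lookup_cost_ms
        + stats.writes_per_hour * stats.maintenance_cost_ms
    )
    return cost_materialized < cost_on_the_fly


# Read-heavy (analytical) workload: maintenance is rare, so materializing pays off.
read_heavy = WorkloadStats(1000, 10, 50.0, 1.0, 5.0)
# Write-heavy (transactional) workload: maintenance dominates, so compute on the fly.
write_heavy = WorkloadStats(10, 10000, 50.0, 1.0, 5.0)

print(should_materialize(read_heavy))
print(should_materialize(write_heavy))
```

With these made-up figures the read-heavy workload favors a materialized aggregate and the write-heavy one favors on-the-fly computation, which is the trade-off in a mixed workload that the abstract sets out to study.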


    Cited By

    • (2015) New Research Directions in Knowledge Discovery and Allied Spheres. ACM SIGKDD Explorations Newsletter 16:2 (46-49). DOI: 10.1145/2783702.2783708. Online publication date: 21-May-2015
    • (2011) PIKM 2011. Proceedings of the 20th ACM international conference on Information and knowledge management (2633-2634). DOI: 10.1145/2063576.2064049. Online publication date: 24-Oct-2011


      Published In

      PIKM '11: Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
      October 2011
      100 pages
      ISBN:9781450309530
      DOI:10.1145/2065003

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. column store
      2. cost model
      3. data aggregation
      4. in-memory database
      5. materialized view
      6. olap
      7. oltp

      Qualifiers

      • Research-article

      Conference

      CIKM '11

      Acceptance Rates

      Overall Acceptance Rate 25 of 62 submissions, 40%
