DOI: 10.1145/543613.543637
Article

Fast algorithms for hierarchical range histogram construction

Published: 03 June 2002

Abstract

Data Warehousing and OLAP applications typically view data as having multiple logical dimensions (e.g., product, location) with natural hierarchies defined on each dimension. OLAP queries usually involve hierarchical selections on some of the dimensions, and often aggregate measure attributes (e.g., sales, volume). Accurately estimating the distribution of measure attributes, under hierarchical selections, is important in a variety of scenarios, including approximate query evaluation and cost-based optimization of queries. In this paper, we propose fast (near linear time) algorithms for the problem of approximating the distribution of measure attributes with hierarchies defined on them, using histograms. Our algorithms are based on dynamic programming and a novel notion of sparse intervals that we introduce, and are the first practical algorithms for this problem. They effectively trade space for construction time without compromising histogram accuracy. We complement our analytical contributions with an experimental evaluation using real data sets, demonstrating the superiority of our approach.
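
For orientation only, the sketch below shows the kind of dynamic program that histogram construction is commonly built on: the textbook V-optimal histogram, which splits a sequence into a fixed number of contiguous buckets so as to minimize the sum of squared errors. It is not the algorithm proposed in this paper. It ignores hierarchies and sparse intervals entirely and runs in O(n^2 * B) time rather than near linear time. The function name `v_optimal_histogram` and its parameters are assumptions made for this illustration.

```python
from itertools import accumulate


def v_optimal_histogram(values, num_buckets):
    """Textbook V-optimal histogram DP (illustrative baseline only).

    Returns (min_sse, buckets), where buckets is a list of half-open
    index ranges (start, end) covering values[0:len(values)].
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums let us compute the error of any bucket in O(1).
    prefix = [0.0] + list(accumulate(values))
    prefix_sq = [0.0] + list(accumulate(v * v for v in values))

    def sse(i, j):
        # Squared error of approximating values[i:j] by its mean.
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: minimum error covering values[:j] with exactly b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    split = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0

    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            # The last bucket is values[i:j]; try every feasible start i.
            for i in range(b - 1, j):
                candidate = cost[b - 1][i] + sse(i, j)
                if candidate < cost[b][j]:
                    cost[b][j] = candidate
                    split[b][j] = i

    # Walk the split table backwards to recover the bucket boundaries.
    buckets = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = split[b][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return cost[num_buckets][n], buckets


if __name__ == "__main__":
    error, buckets = v_optimal_histogram([1, 1, 1, 5, 5, 9, 9, 9], 3)
    print(error, buckets)
```

The triple loop is what makes this baseline expensive on large inputs. The abstract's claim is that comparable accuracy can be reached far faster for the hierarchical variant of the problem, at the price of extra space.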

Published In

PODS '02: Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2002
311 pages
ISBN: 1581135076
DOI: 10.1145/543613
Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGMOD/PODS02

Acceptance Rates

PODS '02 Paper Acceptance Rate 24 of 109 submissions, 22%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Cited By

  • (2020) QuickSel: Quick Selectivity Learning with Mixture Models. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 1017-1033. DOI: 10.1145/3318464.3389727. Online publication date: 11-Jun-2020
  • (2020) Approximate Query Processing for Data Exploration using Deep Generative Models. 2020 IEEE 36th International Conference on Data Engineering (ICDE), pages 1309-1320. DOI: 10.1109/ICDE48307.2020.00117. Online publication date: Apr-2020
  • (2011) Approximate dynamic programming using halfspace queries and multiscale Monge decomposition. Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete algorithms, pages 1675-1682. DOI: 10.5555/2133036.2133165. Online publication date: 23-Jan-2011
  • (2009) Histograms for OLAP and Data-Stream Queries. Encyclopedia of Data Warehousing and Mining, Second Edition, pages 976-981. DOI: 10.4018/978-1-60566-010-3.ch151. Online publication date: 2009
  • (2009) Consistent histograms in the presence of distinct value counts. Proceedings of the VLDB Endowment, 2:1, pages 850-861. DOI: 10.14778/1687627.1687723. Online publication date: 1-Aug-2009
  • (2009) Multiplicative synopses for relative-error metrics. Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pages 756-767. DOI: 10.1145/1516360.1516447. Online publication date: 24-Mar-2009
  • (2008) Hierarchical synopses with optimal error guarantees. ACM Transactions on Database Systems, 33:3, pages 1-53. DOI: 10.1145/1386118.1386124. Online publication date: 3-Sep-2008
  • (2008) Ad-hoc aggregations of ranked lists in the presence of hierarchies. Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 67-78. DOI: 10.1145/1376616.1376626. Online publication date: 9-Jun-2008
  • (2008) Extreme visualization. Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 3-12. DOI: 10.1145/1376616.1376618. Online publication date: 9-Jun-2008
  • (2008) On the space-time of optimal, approximate and streaming algorithms for synopsis construction problems. The VLDB Journal: The International Journal on Very Large Data Bases, 17:6, pages 1509-1535. DOI: 10.1007/s00778-007-0083-9. Online publication date: 1-Nov-2008
