DOI: 10.5555/1287369.1287436

Quotient cube: how to summarize the semantics of a data cube

Published: 20 August 2002

Abstract

Partitioning a data cube into sets of cells with "similar behavior" often better exposes the semantics in the cube. E.g., if we find that average boots sales in the West 10th store of Walmart was the same for winter as for the whole year, it signifies something interesting about the trend of boots sales in that location in that year. In this paper, we are interested in finding succinct summaries of the data cube, exploiting regularities present in the cube, with a clear basis. We would like the summary: (i) to be as concise as possible, (ii) to itself form a lattice preserving the rollup/drilldown semantics of the cube, and (iii) to allow the original cube to be fully recovered. We illustrate the utility of solving this problem and discuss the inherent challenges. We develop techniques for partitioning cube cells for obtaining succinct summaries, and introduce the quotient cube. We give efficient algorithms for computing it from a base table. For monotone aggregate functions (e.g., COUNT, MIN, MAX, SUM on non-negative measures, etc.), our solution is optimal (i.e., quotient cube of the least size). For nonmonotone functions (e.g., AVG), we obtain a locally optimal solution. We experimentally demonstrate the efficacy of our ideas and techniques and the scalability of our algorithms.
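
As a rough illustration of partitioning cube cells into classes with "similar behavior", the sketch below groups the non-empty cells of a tiny cube by the set of base-table rows each cell covers. This grouping criterion is an assumption chosen for the example, and the abstract does not spell out the partition it uses. The base table, the `*` marker for an aggregated dimension, and the function names `generalizations`, `cover_partition`, and `summarize` are all hypothetical. It is a brute-force sketch, not the efficient algorithm the paper refers to, and it does not check that the classes form a lattice.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # assumed marker for "aggregated over this dimension"

# Hypothetical base table: (store, product, season, sales)
BASE = [
    ("S1", "P1", "s", 6),
    ("S1", "P2", "s", 12),
    ("S2", "P1", "f", 9),
]


def generalizations(dims):
    """Yield every cell obtained by replacing any subset of values with ALL."""
    for mask in product([False, True], repeat=len(dims)):
        yield tuple(ALL if hide else value for value, hide in zip(dims, mask))


def cover_partition(base):
    """Group non-empty cube cells by the set of base rows they cover."""
    cover = defaultdict(set)
    for index, row in enumerate(base):
        for cell in generalizations(row[:-1]):
            cover[cell].add(index)
    classes = defaultdict(list)
    for cell, rows in cover.items():
        classes[frozenset(rows)].append(cell)
    return classes


def summarize(base):
    """Store one aggregate record per class instead of one per cell."""
    summary = {}
    for rows, cells in cover_partition(base).items():
        measures = [base[i][-1] for i in rows]
        summary[rows] = {
            "cells": sorted(cells),
            "count": len(measures),
            "sum": sum(measures),
            "avg": sum(measures) / len(measures),
        }
    return summary


if __name__ == "__main__":
    for rows, info in summarize(BASE).items():
        print(sorted(rows), info["count"], info["sum"], info["avg"], info["cells"])
```

On this three-row table the 18 non-empty cells fall into 6 classes. Every cell in a class aggregates exactly the same rows, so any aggregate of a cell can be read off its class record, which is the sense in which the original cube stays recoverable in this sketch.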



Published In

VLDB '02: Proceedings of the 28th international conference on Very Large Data Bases
August 2002
1110 pages

Publisher

VLDB Endowment
