Approximate Query Answering Using Data Warehouse Striping

Bernardino, Jorge; Furtado, Pedro; Madeira, Henrique

doi:10.1007/3-540-44801-2_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

This paper presents an approach to implementing large data warehouses on an arbitrary number of computers, achieving very high query execution performance and scalability. The data is distributed and processed across a potentially large number of autonomous computers using our technique called data warehouse striping (DWS). The major problem with the DWS technique is that it would require a very expensive cluster of computers with fault-tolerance capabilities to prevent a fault in a single computer from stopping the whole system. In this paper, we propose a radically different approach to the problem of the unavailability of one or more computers in the cluster, allowing the use of DWS with a very large number of inexpensive computers. The proposed approach is based on approximate query answering techniques that make it possible to deliver an approximate answer to the user even when one or more computers in the cluster are unavailable. The evaluation presented in the paper shows, both analytically and experimentally, that the approximate results obtained this way have a very small error that can be negligible in most cases.
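
As a rough illustration of the idea in the abstract, the sketch below estimates aggregates from whichever nodes respond. It is not the authors' implementation: it assumes that fact rows are spread evenly across the nodes, so that each node's partial result behaves like a sample of the whole, which the abstract does not spell out. The function names `estimate_sum` and `estimate_avg`, and the use of `None` to mark an unavailable node, are hypothetical.

```python
from typing import Optional, Sequence


def estimate_sum(partial_sums: Sequence[Optional[float]]) -> float:
    """Estimate a cluster-wide SUM from the nodes that responded.

    Each entry is one node's partial sum, or None if the node is unavailable.
    Assumption: rows are spread evenly over the nodes, so the total of the
    responding nodes can be scaled up by (all nodes / responding nodes).
    """
    available = [s for s in partial_sums if s is not None]
    if not available:
        raise ValueError("no node responded")
    return sum(available) * len(partial_sums) / len(available)


def estimate_avg(
    partial_sums: Sequence[Optional[float]],
    partial_counts: Sequence[Optional[int]],
) -> float:
    """Estimate a cluster-wide AVG using only the responding nodes.

    No scaling is needed: the ratio of the available sums to the available
    counts is itself the estimate.
    """
    pairs = [
        (s, c)
        for s, c in zip(partial_sums, partial_counts)
        if s is not None and c is not None
    ]
    total_count = sum(c for _, c in pairs)
    if total_count == 0:
        raise ValueError("no rows available")
    return sum(s for s, _ in pairs) / total_count


# Four nodes, the third one is down.
sums = [100.0, 110.0, None, 90.0]
counts = [10, 10, None, 10]
print(estimate_sum(sums))          # scaled-up estimate of the full SUM
print(estimate_avg(sums, counts))  # AVG over the rows that are reachable
```

With all nodes available, both functions return the exact answer; the estimate only departs from it when some entries are `None`.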

Author information

Authors and Affiliations

Institute Polytechnic of Coimbra, ISEC, DEIS, Apt. 10057, P, 3030-601, Coimbra, Portugal
Jorge Bernardino
University of Coimbra, DEI, Pólo II, P, 3030-290, Coimbra, Portugal
Pedro Furtado & Henrique Madeira

Editor information

Editors and Affiliations

Kyoto University, Kyoto, 606-8501, Japan
Yahiko Kambayashi
EC3, Siebensterngasse 21/3, 1070, Wien
Werner Winiwarter
Center for Spatial Information Science (CSIS), University of Tokyo, 4-6-1, Komaba Meguro-ku, Tokyo, 153-8904, Japan
Masatoshi Arikawa

About this paper

Cite this paper

Bernardino, J., Furtado, P., Madeira, H. (2001). Approximate Query Answering Using Data Warehouse Striping. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_34

DOI: https://doi.org/10.1007/3-540-44801-2_34
Published: 28 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42553-3
Online ISBN: 978-3-540-44801-3
eBook Packages: Springer Book Archive
