DOI: 10.1145/2513591.2513650

Near real-time with traditional data warehouse architectures: factors and how-to

Published: 09 October 2013

Abstract

Traditional data warehouses integrate new data during lengthy offline periods, with indexes dropped and rebuilt for efficiency. These and other factors are commonly assumed to make them unfit for real-time warehousing. We analyze how a set of factors influences near-real-time and frequent loading capabilities, and what can be done to improve near-real-time capacity within a traditional architecture. We examine how the query workload affects, and is affected by, the ETL process, and the influence of factors such as the load strategy, the size of the load data, indexing, integrity constraints, refresh activity over summary data, and fact table partitioning. We evaluate these factors experimentally and show that partitioning is an important factor in delivering near-real-time capacity.

Information & Contributors

Information

Published In

IDEAS '13: Proceedings of the 17th International Database Engineering & Applications Symposium
October 2013
222 pages
ISBN:9781450320252
DOI:10.1145/2513591

Sponsors

  • UPC: Technical University of Catalunya
  • BytePress
  • Concordia University

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ETLR
  2. data warehouse
  3. near real-time

Qualifiers

  • Research-article

Conference

IDEAS '13
Sponsor:
  • UPC
  • Concordia University

Acceptance Rates

IDEAS '13 Paper Acceptance Rate 9 of 51 submissions, 18%;
Overall Acceptance Rate 74 of 210 submissions, 35%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 10
  • Downloads (last 6 weeks): 2
Reflects downloads up to 09 Nov 2024

Cited By

  • (2024) Challenges and Solutions of Real-Time Data Integration Techniques by ETL Application. Big Data Analytics Techniques for Market Intelligence, pp. 348-371. DOI: 10.4018/979-8-3693-0413-6.ch014. Online publication date: 4-Jan-2024.
  • (2023) A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments. Heliyon, article e15728. DOI: 10.1016/j.heliyon.2023.e15728. Online publication date: Apr-2023.
  • (2019) Veri Ambarı Projelerinde ETL Performansını Etkileyen Faktörlerin Belirlenmesi (Determination Of Factors Affecting ETL Performance In Data Warehouse Projects). Selçuk Üniversitesi Sosyal Bilimler Meslek Yüksekokulu Dergisi 22:2, pp. 965-990. DOI: 10.29249/selcuksbmyd.580424. Online publication date: 30-Nov-2019.
  • (2018) Study of Meta-Data Enrichment Methods to Achieve Near Real Time ETL. Data Analytics and Learning, pp. 387-402. DOI: 10.1007/978-981-13-2514-4_32. Online publication date: 5-Nov-2018.
  • (2018) An Innovative Lambda-Architecture-Based Data Warehouse Maintenance Framework for Effective and Efficient Near-Real-Time OLAP over Big Data. Big Data – BigData 2018, pp. 149-165. DOI: 10.1007/978-3-319-94301-5_12. Online publication date: 21-Jun-2018.
  • (2017) Scalability and Realtime on Big Data, MapReduce, NoSQL and Spark. Business Intelligence, pp. 79-104. DOI: 10.1007/978-3-319-61164-8_4. Online publication date: 4-Jul-2017.
  • (n.d.) A Non-Intrusive and Reactive Architecture to Support Real-Time ETL Processes in Data Warehousing Environments. SSRN Electronic Journal. DOI: 10.2139/ssrn.3969703.
