Data Warehousing & Data Mining
• Real-time partition architecture
[Figure: real-time partition architecture. Labels: Data Sources (operational systems, flat files), Data Cleaning Pipeline, Warehouse (Real-Time Partition, "Static" Fact Table). Arrow labels: "On each transaction", "Periodically".]
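The real-time partition idea can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module, not the setup from the lecture: the table names (fact_static, fact_realtime), the view fact_all, and both helper functions are hypothetical. It assumes that each cleaned transaction is written to the partition at once, and that the partition is merged into the static fact table periodically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_static   (sale_id INTEGER PRIMARY KEY, product TEXT, amount REAL);
    CREATE TABLE fact_realtime (sale_id INTEGER PRIMARY KEY, product TEXT, amount REAL);
    -- Queries read both tables through one view (hypothetical name).
    CREATE VIEW fact_all AS
        SELECT * FROM fact_static
        UNION ALL
        SELECT * FROM fact_realtime;
""")

def on_transaction(sale_id, product, amount):
    """Load one cleaned transaction into the real-time partition."""
    with conn:  # commits on success
        conn.execute("INSERT INTO fact_realtime VALUES (?, ?, ?)",
                     (sale_id, product, amount))

def periodic_merge():
    """Move all real-time rows into the static fact table, then empty the partition."""
    with conn:
        conn.execute("INSERT INTO fact_static SELECT * FROM fact_realtime")
        conn.execute("DELETE FROM fact_realtime")

def total_sales():
    """Example OLAP-style query over static and real-time data together."""
    return conn.execute("SELECT SUM(amount) FROM fact_all").fetchall()[0][0]

on_transaction(1, "A", 10.0)
on_transaction(2, "B", 5.0)
before = total_sales()      # rows still sit in the real-time partition
periodic_merge()
after = total_sales()       # same total, rows now sit in the static table
realtime_rows = conn.execute("SELECT COUNT(*) FROM fact_realtime").fetchall()[0][0]
```

The query result is the same before and after the merge; only the location of the rows changes.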
• Architecture:
[Figure: Labels: Data Sources (operational systems), Warehouse (Fact Table, Fact Table Copy).]
Copying the fact table is slow.
But the OLAP queries can continue. Just the ETL is paused.
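A rough sketch of the copy step, using Python's built-in sqlite3 module. It assumes that the copy receives the pending updates and is then swapped in for the fact table by renaming, which the slide does not spell out. The table names and the function refresh_with_copy are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact (sale_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO fact VALUES (1, 10.0), (2, 5.0);
""")

def refresh_with_copy(new_rows):
    """Copy the fact table, load new rows into the copy, then swap (assumed step)."""
    # 1. The slow step: duplicate the fact table. Queries keep reading `fact`,
    #    while the ETL waits with its new rows.
    conn.execute("DROP TABLE IF EXISTS fact_copy")
    # Note: CREATE TABLE ... AS SELECT copies the rows, not the constraints.
    conn.execute("CREATE TABLE fact_copy AS SELECT * FROM fact")
    # 2. Apply the rows the ETL collected in the meantime to the copy.
    conn.executemany("INSERT INTO fact_copy VALUES (?, ?)", new_rows)
    # 3. Swap the copy in; renaming is cheap compared to copying.
    with conn:  # commits the inserts and the renames together
        conn.execute("ALTER TABLE fact RENAME TO fact_old")
        conn.execute("ALTER TABLE fact_copy RENAME TO fact")
        conn.execute("DROP TABLE fact_old")

refresh_with_copy([(3, 2.5)])
total = conn.execute("SELECT SUM(amount) FROM fact").fetchall()[0][0]
tables = {row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
```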
• Architecture:
[Figure: Labels: Data Sources (operational systems, flat files), Warehouse (Staging Table, Fact Table, Temp.), ERDC, Query Processor. Arrow labels: "Nightly", "Query".]
• Advantages of JIM
– Fewer scalability problems, as the real-time data is brought into the DW only on request
– Query contention is not a problem, as the temporary tables hold snapshots that do not change while queried
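The on-request merge can be illustrated as follows. This is a sketch under assumptions, not the system from the lecture: two in-memory sqlite3 databases stand in for the ERDC and the DW, and the table names (rt_sales, fact_sales, rt_snapshot) and the function query_total are hypothetical.

```python
import sqlite3

erdc = sqlite3.connect(":memory:")  # stands in for the ERDC
dw = sqlite3.connect(":memory:")    # stands in for the data warehouse
erdc.execute("CREATE TABLE rt_sales (sale_id INTEGER, amount REAL)")
dw.execute("CREATE TABLE fact_sales (sale_id INTEGER, amount REAL)")
dw.execute("INSERT INTO fact_sales VALUES (1, 10.0)")
erdc.execute("INSERT INTO rt_sales VALUES (2, 5.0)")
dw.commit()
erdc.commit()

def query_total(include_realtime):
    """Sum sales; pull real-time rows into the DW only if the query asks for them."""
    if not include_realtime:
        return dw.execute("SELECT SUM(amount) FROM fact_sales").fetchall()[0][0]
    # Take a snapshot of the real-time data; later changes in the ERDC
    # do not affect the query that runs on this snapshot.
    snapshot = erdc.execute("SELECT sale_id, amount FROM rt_sales").fetchall()
    dw.execute("CREATE TEMP TABLE rt_snapshot (sale_id INTEGER, amount REAL)")
    try:
        dw.executemany("INSERT INTO rt_snapshot VALUES (?, ?)", snapshot)
        rows = dw.execute("""
            SELECT SUM(amount) FROM (
                SELECT amount FROM fact_sales
                UNION ALL
                SELECT amount FROM rt_snapshot
            )
        """).fetchall()
        return rows[0][0]
    finally:
        # The temporary table lives only for the duration of the query.
        dw.execute("DROP TABLE rt_snapshot")
        dw.commit()

historical_only = query_total(include_realtime=False)
with_realtime = query_total(include_realtime=True)
```

Queries that do not need real-time data never touch the ERDC, which is where the scalability advantage comes from.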
• The memory:
– Retains data even without power
– Slower than DRAM solutions
– Wears down!
• The controller:
– Error correction, wear leveling, bad block
mapping, read and write caching, encryption,
garbage collection
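Of the controller tasks listed, wear leveling is easy to show in miniature. The toy model below is an assumption-laden sketch, far simpler than a real controller: the class ToyWearLeveler and its fields are hypothetical. Every write goes to the free physical block with the fewest erases so far, so that repeated writes to one logical block do not wear out a single physical block.

```python
class ToyWearLeveler:
    """Toy model: maps logical blocks to physical blocks and spreads erases."""

    def __init__(self, num_physical_blocks):
        self.erase_counts = [0] * num_physical_blocks
        self.contents = [None] * num_physical_blocks
        self.logical_to_physical = {}

    def write(self, logical_block, payload):
        in_use = set(self.logical_to_physical.values())
        free = [b for b in range(len(self.erase_counts)) if b not in in_use]
        if not free:
            raise RuntimeError("no free physical block")
        # Pick the least-worn free block.
        target = min(free, key=lambda b: self.erase_counts[b])
        # Flash has to be erased before it is rewritten; count the erase.
        self.erase_counts[target] += 1
        self.contents[target] = payload
        # Remapping releases the previously used physical block.
        self.logical_to_physical[logical_block] = target

    def read(self, logical_block):
        return self.contents[self.logical_to_physical[logical_block]]

flash = ToyWearLeveler(num_physical_blocks=4)
for value in range(8):
    flash.write(0, value)  # always the same logical block
spread = max(flash.erase_counts) - min(flash.erase_counts)
```

Although all eight writes target the same logical block, the erases are spread across the four physical blocks.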
• System goals:
– Process OLTP transactions at a rate of tens of thousands per second, and at the same time
– Process OLAP queries on up-to-date snapshots of the transactional data
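The two goals can be pictured with a toy store. This sketch only illustrates the idea of running OLAP on a snapshot while OLTP continues; it uses a plain eager copy as a stand-in, since these slides do not describe here how the snapshots are created. All names (ToyHybridStore, oltp_deposit, olap_total) are hypothetical.

```python
class ToyHybridStore:
    """Toy model: OLTP updates the live data, OLAP reads a frozen snapshot."""

    def __init__(self):
        self.accounts = {}

    def oltp_deposit(self, account, amount):
        # Short update transaction on the live data.
        self.accounts[account] = self.accounts.get(account, 0) + amount

    def snapshot(self):
        # Stand-in for snapshot creation: an eager copy of the current state.
        return dict(self.accounts)

def olap_total(snapshot):
    """Analytical query that runs on a snapshot, not on the live data."""
    return sum(snapshot.values())

store = ToyHybridStore()
store.oltp_deposit("a", 100)
store.oltp_deposit("b", 50)
snap = store.snapshot()              # OLAP session starts here
store.oltp_deposit("a", 25)          # OLTP continues in the meantime
old_total = olap_total(snap)                 # unaffected by the later deposit
fresh_total = olap_total(store.snapshot())   # a new snapshot sees it
```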
DW & DM – Wolf-Tilo Balke – Institut für Informationssysteme – TU 45
8.4 HyPer: OLTP Processing
[Figure: Labels: Column Store, Row Store.]