2. Data Warehouse and OLAP
Characteristics:
Subject-Oriented:
A data warehouse does not emphasize only current operations. Instead, it focuses on
modeling and analyzing data to support decision making. It also provides a simple
and concise view of a particular subject by excluding data that is not
required to make the decisions.
Integrated: A data warehouse combines data from various sources. These may
include cloud services, relational databases, flat files, structured and semi-structured data,
metadata, and master data. The sources are combined in a manner that’s consistent,
relatable, and ideally certifiable, providing a business with confidence in the data’s
quality.
Non-Volatile: As the name suggests, the data residing in a data warehouse is permanent.
Existing data is not erased or overwritten when new data is inserted. The warehouse
therefore accumulates a very large quantity of historical data that is loaded and read
rather than modified by routine business transactions, which keeps it stable for
analysis.
Note: In a star schema, each dimension is represented by only one dimension table,
and each table holds a set of attributes. For example, the location dimension table
contains the attribute set {location_key, street, city, province_or_state, country}.
This constraint may cause data redundancy. For example, "Vancouver" and "Victoria"
are both cities in the Canadian province of British Columbia, so the entries for
these cities repeat the same values in the attributes province_or_state
and country.
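The redundancy described in the note can be made visible with a small sketch. The rows below are illustrative only: the street names and key values are assumptions, while the attribute set and the two cities come from the note above.

```python
from collections import Counter

# Hypothetical rows of the location dimension table (street values are made up).
location_dim = [
    {"location_key": 1, "street": "1 Main St", "city": "Vancouver",
     "province_or_state": "British Columbia", "country": "Canada"},
    {"location_key": 2, "street": "2 Bay St", "city": "Victoria",
     "province_or_state": "British Columbia", "country": "Canada"},
]

# Count how often each (province_or_state, country) pair is stored.
pair_counts = Counter(
    (row["province_or_state"], row["country"]) for row in location_dim
)

# Any pair stored more than once is redundant data in the single table.
redundant_pairs = {pair: n for pair, n in pair_counts.items() if n > 1}
print(redundant_pairs)
```

Every additional city in the same province would repeat the same pair once more.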
b) Snowflake Schema
a) Drill down
In the drill-down operation, less detailed data is converted into more detailed data.
It can be done by:
Moving down in the concept hierarchy
Adding a new dimension
In the cube given in the overview section, the drill-down operation is performed by
moving down in the concept hierarchy of the Time dimension (Quarter -> Month).
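A drill-down from Quarter to Month might be sketched as follows. The cube from the overview section is not reproduced here, so the monthly figures and the helper names by_quarter and drill_down are assumptions made for illustration.

```python
from collections import defaultdict

# Made-up sales facts at month granularity: (month, units_sold).
facts = [("Jan", 10), ("Feb", 20), ("Mar", 30),
         ("Apr", 5), ("May", 15), ("Jun", 25)]

# Concept hierarchy of the Time dimension: month -> quarter.
month_to_quarter = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
                    "Apr": "Q2", "May": "Q2", "Jun": "Q2"}

def by_quarter(rows):
    """Less detailed view: one total per quarter."""
    totals = defaultdict(int)
    for month, units in rows:
        totals[month_to_quarter[month]] += units
    return dict(totals)

def drill_down(rows, quarter):
    """More detailed view: the monthly figures behind one quarter."""
    return {m: u for m, u in rows if month_to_quarter[m] == quarter}

quarter_view = by_quarter(facts)
q1_months = drill_down(facts, "Q1")
print(quarter_view, q1_months)
```

The monthly values returned for a quarter add up to that quarter's total, which is what moving down one level in the hierarchy means.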
b) Roll up
It is the opposite of the drill-down operation. It performs aggregation on the OLAP
cube. It can be done by:
Climbing up in the concept hierarchy
Reducing the dimensions
In the cube given in the overview section, the roll-up operation is performed by
climbing up in the concept hierarchy of the Location dimension (City -> Country).
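A roll-up from City to Country can be sketched in the same spirit. The sales figures, the two Canadian cities, and the function name roll_up are assumptions for illustration, not values from the overview cube.

```python
from collections import defaultdict

# Made-up measures at city level.
city_sales = {"Delhi": 100, "Kolkata": 80, "Toronto": 60, "Vancouver": 40}

# Concept hierarchy of the Location dimension: city -> country.
city_to_country = {"Delhi": "India", "Kolkata": "India",
                   "Toronto": "Canada", "Vancouver": "Canada"}

def roll_up(measures, hierarchy):
    """Aggregate each member's value into its parent in the hierarchy."""
    totals = defaultdict(int)
    for member, value in measures.items():
        totals[hierarchy[member]] += value
    return dict(totals)

country_sales = roll_up(city_sales, city_to_country)
print(country_sales)
```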
c) Dice
It selects a sub-cube from the OLAP cube by applying criteria on two or more
dimensions. In the cube given in the overview section, a sub-cube is selected using
the following dimensions and criteria:
Location = “Delhi” or “Kolkata”
Time = “Q1” or “Q2”
Item = “Car” or “Bus”
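The three criteria above can be applied to a toy cube as a minimal sketch. The extra members "Mumbai", "Q3" and "Bike", the cell values, and the function name dice are assumptions added only so that the selection has something to exclude.

```python
from itertools import product

locations = ["Delhi", "Kolkata", "Mumbai"]   # "Mumbai" is a hypothetical member
quarters = ["Q1", "Q2", "Q3"]                # "Q3" is a hypothetical member
items = ["Car", "Bus", "Bike"]               # "Bike" is a hypothetical member

# Toy cube: (location, quarter, item) -> units sold (placeholder values).
cube = {cell: n for n, cell in
        enumerate(product(locations, quarters, items), start=1)}

def dice(cube, locations, quarters, items):
    """Keep only the cells that satisfy the criteria on all three dimensions."""
    return {(loc, q, it): v for (loc, q, it), v in cube.items()
            if loc in locations and q in quarters and it in items}

sub_cube = dice(cube, {"Delhi", "Kolkata"}, {"Q1", "Q2"}, {"Car", "Bus"})
print(len(cube), len(sub_cube))
```

The result keeps all three dimensions but with fewer members on each, which is what distinguishes dice from slice.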
d) Slice
It fixes a single value on one dimension of the OLAP cube, which results in a new
sub-cube with one dimension fewer. In the cube given in the overview section, slice is
performed on the dimension Time = “Q1”.
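Slicing on Time = "Q1" might look like this on a toy cube. The cell values and the function name slice_time are assumptions for illustration.

```python
# Toy cube: (location, quarter, item) -> units sold (made-up values).
cube = {
    ("Delhi", "Q1", "Car"): 12,
    ("Delhi", "Q1", "Bus"): 7,
    ("Delhi", "Q2", "Car"): 9,
    ("Kolkata", "Q1", "Car"): 5,
    ("Kolkata", "Q2", "Bus"): 4,
}

def slice_time(cube, quarter):
    """Fix the Time dimension to one value and drop it from the keys."""
    return {(loc, item): v
            for (loc, q, item), v in cube.items() if q == quarter}

q1_slice = slice_time(cube, "Q1")
print(q1_slice)
```

The keys of the result contain only Location and Item, since Time has been fixed to a single value.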
e) Pivot
It is also known as the rotation operation because it rotates the data axes of the
current view to give a new view of the representation. In the sub-cube obtained after
the slice operation, performing the pivot operation gives a new view of it.
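For a two-dimensional view, a pivot amounts to swapping the row and column axes. The sketch below assumes a small Location-by-Item view with made-up values; the function name pivot is chosen here for illustration.

```python
# 2-D view: rows are locations, columns are items (made-up values).
view = {
    "Delhi": {"Car": 12, "Bus": 7},
    "Kolkata": {"Car": 5, "Bus": 3},
}

def pivot(table):
    """Rotate the view so that former columns become rows and vice versa."""
    rotated = {}
    for row_key, row in table.items():
        for col_key, value in row.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

rotated_view = pivot(view)
print(rotated_view)
```

No value changes during the rotation; pivoting twice returns the original view.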
2.6 OLAP Servers:
An Online Analytical Processing (OLAP) server is based on the multidimensional data model.
It allows managers and analysts to gain insight into information through fast,
consistent, and interactive access to it.
There are different types of OLAP Servers:
a) ROLAP
Relational On-Line Analytical Processing (ROLAP) is primarily used for data stored
in a relational database, where both the base data and dimension tables are stored as
relational tables. ROLAP servers are used to bridge the gap between the relational
back-end server and the client’s front-end tools. ROLAP servers store and manage
warehouse data using an RDBMS, and OLAP middleware fills in the gaps.
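As a rough sketch of the idea, the example below stores a fact table and a dimension table as ordinary relational tables and expresses a City -> Country roll-up as a SQL query. It uses Python's built-in sqlite3 module as a stand-in RDBMS; the table names, column names, and values are assumptions, and a real ROLAP server would generate such queries through its middleware.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location_dim (
        location_key INTEGER PRIMARY KEY,
        city TEXT,
        country TEXT
    );
    CREATE TABLE sales_fact (
        location_key INTEGER REFERENCES location_dim(location_key),
        units_sold INTEGER
    );
    INSERT INTO location_dim VALUES
        (1, 'Delhi', 'India'), (2, 'Kolkata', 'India'), (3, 'Toronto', 'Canada');
    INSERT INTO sales_fact VALUES (1, 100), (2, 80), (3, 60), (1, 20);
""")

# The roll-up is a join between fact and dimension tables plus a GROUP BY.
rows = conn.execute("""
    SELECT d.country, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN location_dim AS d ON d.location_key = f.location_key
    GROUP BY d.country
    ORDER BY d.country
""").fetchall()
conn.close()
print(rows)
```

Because the aggregation runs inside the database at query time, the amount of data handled is bounded by the RDBMS rather than by a separate cube store.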
Benefits:
It is compatible with data warehouses and OLTP systems.
Highly scalable.
The data size limitation of ROLAP technology is determined by the
underlying RDBMS. As a result, ROLAP does not limit the amount of
data that can be stored.
Limitations:
SQL functionality is constrained.
It’s difficult to keep aggregate tables up to date.
Requires experienced staff.
b) MOLAP
Benefits:
Very easy to use.
Suitable for slicing and dicing operations.
Information retrieval is fast.
Capable of performing complex calculations.
Limitations:
It is difficult to change the dimensions without re-aggregating.
DBMS facility is weak.
Since all calculations are performed when the cube is built, a large amount of
data cannot be stored in the cube itself.
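The last limitation can be illustrated with a minimal sketch that precomputes the total for every combination of dimensions when the cube is built. The facts, the "*" marker for an aggregated-away dimension, and the function name build_cube are assumptions; real MOLAP engines use specialized array storage rather than a dictionary.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("location", "time", "item")

# Made-up fact rows: (location, time, item, units_sold).
facts = [
    ("Delhi", "Q1", "Car", 12),
    ("Delhi", "Q2", "Car", 9),
    ("Kolkata", "Q1", "Bus", 5),
]

def build_cube(facts):
    """Precompute the total for every subset of dimensions at build time."""
    cube = defaultdict(int)
    n = len(DIMENSIONS)
    for *coords, units in facts:
        for size in range(n + 1):
            for kept in combinations(range(n), size):
                # "*" marks a dimension that has been aggregated away.
                key = tuple(coords[i] if i in kept else "*" for i in range(n))
                cube[key] += units
    return dict(cube)

cube = build_cube(facts)

# Queries become plain lookups, which is why retrieval is fast.
print(cube[("Delhi", "*", "*")], cube[("*", "Q1", "*")])
print(len(facts), len(cube))
```

The cube holds more cells than there are fact rows, and the count grows quickly with each added dimension, which is why storage becomes the limiting factor.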
c) HOLAP
Benefits:
HOLAP combines the benefits of MOLAP and ROLAP.
Provides quick access at all aggregation levels.
Limitations:
HOLAP architecture is extremely complex.
There is a greater likelihood of overlap between the two approaches, particularly in their functionality.
2.7 Data warehouse architecture:
A data warehouse architecture is a method of defining the overall architecture of
data communication, processing, and presentation that exists for end-client computing
within the enterprise. Each data warehouse is different, but all are characterized by
standard vital components.
Metadata is a set of data that defines and gives information about other data. Metadata
summarizes basic information about data, which can make finding and working
with particular instances of data easier. For example, author, date created,
date modified, and file size are examples of very basic document metadata.
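As a small illustration of basic document metadata, the sketch below reads the size and modification time of a file through the operating system. The file name report.txt and its contents are hypothetical, and the file is created in a temporary directory only so that the example is self-contained.

```python
import os
import tempfile
from datetime import datetime, timezone

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")   # hypothetical document
    with open(path, "w", encoding="utf-8") as f:
        f.write("quarterly sales")

    info = os.stat(path)
    # Metadata describes the file without containing the file's own data.
    metadata = {
        "file_name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }

print(metadata)
```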
Data warehouses and their architectures vary depending on the specifics of an
organization's situation.