02 DW
02 DW
02 DW
1
A High-Level Architecture of BI
Data Warehouse Business Analytics Performance and
Environment Environment Strategy
Data Technical staff Business users Managers / executives
Sources Built the data warehouse Access
Data
ü Organizing Warehouse BPM strategy
ü Summarizing Manipulation
ü Standardizing Results
User Interface
Future component - browser
intelligent systems - portal
- dashboard
2
The Architecture of BI
• A BI system has four major components
1. A data warehouse, with its source data
2. Business analytics, a collection of tools for manipulating,
mining, and analyzing the data in the data warehouse
3. Business performance management (BPM) for monitoring
and analyzing performance
4. A user interface (e.g., dashboard)
3
Components in a BI Architecture
1. Data Warehouse
• Cornerstone of any medium-to-large BI
system.
• Originally, only historical data
• Today access to current data as well, so
they can provide real-time decision
support
4
Components in a BI Architecture
2. Business analytics
tools and techniques that help the user transform
data into knowledge
1. Reports and Queries
» Static, dynamic, multidimensional, drill-down
2. Advanced Analytics
Financial data visualization , Web Analytics, GIS
3. Data, Text, Web mining and other sophisticated
mathematical/ statistical modeling
» Cross sell / up sell
» forensic audit in case of high profile wilful defaulters
5
6
Components in a BI Architecture
3. Business performance management (BPM) /CPM –
Corporate Performance Mgmt
An advanced performance measurement and
analysis approach that embraces planning and
strategy
– BPM extends the monitoring, measuring, and comparing
of sales, profit, cost, profitability, and other performance
indicators by introducing the concept of management and
feedback
– BPM provides a top-down enforcement of corporate-wide
strategy
7
Components in a BI Architecture
4. User Interface (i.e., dashboards)
• provide a comprehensive graphical/pictorial view of
corporate performance measures, trends, and
exceptions.
8
Data Warehousing
9
9
A Simple Definition
A data warehouse is a collection of
data created to support decision-
making applications.
10
10
Data warehouse
A physical repository where relational data are
specially organized to provide enterprise-wide,
cleansed data in a standardized format
11
11
Data Warehouse Characteristics
• Subject oriented
– data -organized around sales, products, employees , etc.
• Integrated
– data - integrated to provide a comprehensive view
• Time variant (think time series)
– historical data is maintained
• Nonvolatile
– data is not updated by users
12
12
13
13
14
14
Three Types of Data Warehouse
• Data Marts
• Operational data stores (ODSs)
• Enterprise data warehouses (EDWs).
15
15
Data Mart
A departmental small-scale “DW” that
stores only limited/relevant data
Dependent data mart
A subset that is created directly from a data
warehouse
Independent data mart
A small data warehouse designed for a
strategic business unit or a department
16
16
Types of Datawarehouses
Enterprise Data Warehouse -
Operational data stores (ODS) EDW
• A type of database often • A data warehouse for the
used as an interim staging enterprise.
area for a data warehouse • Large scale
• CRM,SCM,BPM
• Short term decision making
–mission critical
applications
17
Other DW Components
Metadata
Data about data. In a data warehouse, metadata
describe the contents of a data warehouse and the
manner of its acquisition and use
18
18
19
19
20
Source: Inmon
20
Data Warehouse Vs. Data
Warehousing
21
21
A Generic DW Framework
No data marts option
Data Applications
Sources (Visualization)
Access
Routine
ERP Business
ETL
Reporting
Process Data mart
(Marketing)
Select
/ Middleware
Legacy Metadata Data/text
Extract mining
Data mart
(Engineering)
Transform Enterprise
POS Data warehouse
OLAP,
Integrate
API
Data mart Dashboard,
(Finance) Web
Other Load
OLTP/wEB
Replication Data mart
(...) Custom built
External
applications
data
22
22
Components of the Data Warehousing
Process
1. Data Sources 4. Comprehensive
– Legacy, External Data database -EDW
Providers 5. Metadata
2. Data extraction and 6. Middleware Tools
transformation – Enable access to the DW
– custom-written or – e.g. BO
commercial software
called ETL
3. Data loading
23
The Extraction, Transformation, and Load
(ETL) Process
Packaged Transient
application data source
Data
warehouse
Legacy
Extract Transform Cleanse Load
system
Data mart
Other internal
applications
24
24
Corrected Data
Parsed Data First Name: Beth
First Name: Beth Middle Name: Christine
Middle Name: Christine Last Name: Parker
Last Name: Parker Title: SLS MGR
Title: SLS MGR Firm: Regional Port Authority
Firm: Regional Port Authority Location: Federal Building
Location: Federal Building Number: 12800
Number: 12800 Street: South Butler Drive
Street: Lake Calumet City: Chicago
City: Hedgewisch State: IL
State: IL Zip: 60633
Zip+Four: 2398
25
Standardisation
26
Matching
27
Consolidation
Account No.
83451234 Policy No.
ME309451-2
Transaction
B498/97
Account No.
83451234 Policy No.
ME309451-2
Transaction
B498/97 28
Data Warehousing Architecture
29
29
DW Architecture
• Three-tier architecture
1. Data acquisition software (back-end)
2. The data warehouse that contains the data & software
3. Client (front-end) software that allows users to access and analyze
data from the warehouse
4. Easy to create datamarts
5. Functionally separate
• Two-tier architecture
First two tiers in three-tier architecture is combined into one
… sometimes there is only one tier?
30
30
DW Architectures
Tier 1: Tier 2:
Client workstation Application & database server
31
31
REPRESENTATION OF DATA IN DW
X axis (periods 2001, 2003, and 2003), the Y axis (regions), and the 33
33
Z axis (products--front to back).
Data – Dimension
34
34
Dimensions
• What is a dimension ?
– Dimensions are perspectives with respect to
which an organization wants to keep record.
– products, salespeople, market segments, business
units, geographical locations, distribution
channels, country, or industry
• Time is usually an important dimension by default
35
Measures
• Measures are used to report the values of the
particular variable with respect to a given set
of dimensions.
money, sales volume, head count, inventory profit, actual versus
forecast
36
Multidimensionality
The ability to organize, present, and analyze data by
several dimensions, such as sales by region, by product,
by salesperson, and by time (four dimensions)
• Multidimensional presentation
– Dimensions: products, salespeople, market segments,
business units, geographical locations, distribution channels,
country, or industry
– Measures: money, sales volume, head count, inventory
profit, actual versus forecast
– Time: daily, weekly, monthly, quarterly, or yearly
Representation of Data in DW
• Dimensional Modeling
– A retrieval-based system that supports high-volume query
access
• Star schema
– The most commonly used and the simplest style of
dimensional modeling
– Contain a fact table surrounded by and connected to
several dimension tables
• Snowflakes schema
– An extension of star schema where the diagram resembles
a snowflake in shape
Example of Star Schema
time
time_key item
day item_key
day_of_the_week Sales Fact Table item_name
month brand
quarter time_key type
year supplier_type
item_key
branch_key
branch location
location_key
branch_key location_key
branch_name units_sold street
branch_type city
dollars_sold province_or_street
country
avg_sales
Measures
39
39
Han: Data Cubes 39
Example of Snowflake Schema
time
time_key item
day item_key supplier
day_of_the_week Sales Fact Table item_name supplier_key
month brand supplier_type
quarter time_key type
year item_key supplier_key
branch_key
branch location
location_key
location_key
branch_key
units_sold street
branch_name
city_key city
branch_type
dollars_sold
city_key
avg_sales city
province_or_street
Measures country
40
40
ACCESSING DATA IN DW
Why Data Warehouse ?
• Everybody stores data in a database..
42
42
Data and Decisions
• Database • Data warehouse
– Throughput high for – Analysis / Analytics
transactional data.. – Higher order decision making
– IRCTC – IRCTC
• Who has booked tickets in the • Which route can benefit using
tatkal pricing ?
last week ?
• Which route is optimum for freight
– ICICI Bank carriers
• Does the Kelambakkam ATM – MakeMyTrip
have enough cash ? • Who can I cross sell an
international holiday to
– ICICI Bank
• What is the frequency at which
ATMs in different locations need to
be loaded?
• Optimal Routes
• Fraud Protection
43
43
Data and Decisions
• Apollo hospitals
– Demographic Profile of patients in the south
Chennai hospitals
– Geography and Asthma relationship
• How can the Government of India use a
datawarehouse ?
44
44
45
OLAP vs. OLTP
OLAP Operations
47
47
1. Slice
Slice is a subset of a
multidimensional array
Usually 2D
48
48
A 3-dimensional
OLAP cube with Sales volumes of
OLAP
slicing
operations
a specific Product
on variable Time
and Region
Slicing
Operations on a
e
m
Ti
Simple Three- Product
Dimensional
Geography
Cells are filled
Sales volumes of
Data Cube
with numbers
representing a specific Region
sales volumes on variable Time
and Products
Sales volumes of
a specific Time on
variable Region
and Products
49
49
2. Dice
Slice on more
than two
dimensions of a
data cube
50
50
Drill down
51
51
Roll up
Summarize data by
climbing up hierarchy or
by dimension reduction
52
52
Pivot
It is used to
change the
dimensional
orientation of a
report or an ad
hoc
53
53
Business Performance
Management
54
A High-Level Architecture of BI
Data Warehouse Business Analytics Performance and
Environment Environment Strategy
Data Technical staff Business users Managers / executives
Sources Built the data warehouse Access
Data
ü Organizing Warehouse BPM strategy
ü Summarizing Manipulation
ü Standardizing Results
User Interface
Future component - browser
intelligent systems - portal
- dashboard
55
Business Performance Management
(BPM) Definition
• What is it ?
– Framework that helps organizations deliver critical
insight to improve financial and operational
performance
• What does the framework specify ?
– Framework for organizing, automating, and
analyzing business methodologies, metrics,
processes, and systems
56
Business Performance Management
(BPM) Overview
• BPM and BI Compared
– BPM is an outgrowth of BI and incorporates many
of its technologies, applications, and techniques
– BPM is an enterprisewide strategy that seeks to
prevent organizations from optimizing local
business at the expense of overall corporate
performance
– BPM is part of the daily work of managers
57
Major BPM Processes
• Strategize
• Setting goals and objectives
• Plan
– Establishing initiatives and plans to achieve those
goals
• Monitor
– Monitoring actual performance against the goals
and objectives
• Act and Adjust
– Taking corrective action
58
Business Performance Management
(BPM) Overview
Strategize:
Where Do We Want to Go?
• Strategic planning
– Tasks common to the strategic planning process:
1. Conduct a current situation analysis
2. Determine the planning horizon
3. Conduct an environment scan
4. Identify critical success factors
5. Complete a gap analysis
6. Create a strategic vision
7. Develop a business strategy
8. Identify strategic objectives and goals
Strategize:
Where Do We Want to Go?
• Strategic planning
– Strategic objective
A broad statement or general course of action
prescribing targeted directions for an organization
– Strategic goal
A quantified objective with a designated time
period
Strategize:
Where Do We Want to Go?
• Strategic planning
– Strategic vision
– A picture or mental image of what the
organization should look like in the future
– Critical success factors (CSF)
Key factors that delineate the things that an
organization must excel at to be successful in its
market space
Strategize:
Where Do We Want to Go?
• The strategy gap
– Four sources for the gap between strategy and
execution:
1. Vision
2. People
3. Management
4. Resources
Plan:
How Do We Get There?
• Operational planning
– Operational plan
Plan that translates an organization’s strategic
objectives and goals into a set of well-defined
tactics and initiatives, resources requirements,
and expected results
Plan:
How Do We Get There?
• Operational planning
– Tactic-centric plan—tactics are established to
meet the objectives and targets established in the
strategic plan (used by best practices
organizations
– Budget-centric plan—a financial plan or budget is
established that sums to the targeted financial
values
Plan:
How Do We Get There?
• Financial planning and budgeting
– An organization’s strategic objectives and key
metrics should serve as top-down drivers for the
allocation of an organization’s tangible and
intangible assets
– Resource allocations should be carefully aligned
with the organization’s strategic objectives and
tactics in order to achieve strategic success
Monitor:
How Are We Doing?
• A comprehensive framework for monitoring
performance should address two key issues:
– What to monitor
– How to monitor
Monitor:
How Are We Doing?
• Diagnostic control system
– Inputs, a process for transforming inputs into
outputs, a standard against which to compare
these outputs, and a feedback channel allowing
information on variances to be communicated so
that it can be acted upon.
• Management by exception
– Don’t have to constantly monitor
– Scheduled and Exception reports
68
Monitor:
How Are We Doing?
Business
Strategy
Strategic
Objective
Strategic
Goal
Feedback
82
BPM Methodologies
Strategy map
A visual display that delineates
the relationships among the key
organizational objectives for all four
BSC perspectives
83
BPM Methodologies
• Six Sigma
– More of a “process improvement” model
– The DMAIC performance model
A closed-loop business improvement model that
encompasses the steps of defining, measuring,
analyzing, improving, and controlling a process