Module 5_Dimensional Modeling

The document provides lecture notes on dimensional modeling in data warehousing, outlining learning outcomes, the importance of fact and dimension tables, and the process of converting E/R models to dimensional models. It emphasizes the advantages of dimensional modeling over relational modeling, such as understandability and performance, and details the structure and purpose of fact and dimension tables. Additionally, it introduces the Dimensional Normal Form methodology for designing dimensional models.

Uploaded by

Dom Balseen
Copyright
© All Rights Reserved

Module 5_Dimensional Modeling
Lecture Notes in Data Warehousing
Professorial Lecturer: Dr. Domingo T. Balse, Jr, LPT

Dimensional Modeling
1. Learning outcomes
• Explain the concept of dimensional modeling
• Discuss fact tables and dimensional tables
• Understand the conversion of the E/R model to a dimensional model using the
Dimensional Normal Form (DNF) methodology

2. Games
-related to the topic

3. Introduction
We have learned in Module 4 that the Data Track stream in the Kimball Lifecycle involves
dimensional modeling.
Dimensional modeling is a logical design technique for structuring data so that it is
intuitive for business users and delivers fast query performance.
In this module, we take a closer look at the process involved.

Relational (normalized E/R) modeling is widely used in databases nowadays. However, dimensional
modeling has two advantages over it: understandability and performance. The model must be
easily understood by business users while still representing the complexities of the business.
It must also respond quickly to queries that summarize millions of rows.

Dimensional models also have the following benefits:

1. Predictable, Standard Framework
2. Gracefully Extensible to Accommodate Change
3. Star Join Schema is Symmetrical
4. Has Standard Approaches for Common Modeling Situations
5. Aggregate Management

To design a dimensional model, we must perform the following steps:

1. Establish Naming Conventions
2. Do the Four-Step Dimensional Modeling Process
3. Document the High Level Data Model Diagram
4. Define the Data Sources
5. Document the Detailed Table Designs
6. Develop the Detailed Bus Matrix
7. Identify, Track, and Resolve Issues

Let us now dig deeper into dimensional modeling and discuss fact tables and dimension
tables.

4. Fact Tables
Let us first determine what makes up a “fact”. Measurements are numeric values called facts.
Examples are sales amount and count of attendance. Dimensions, meanwhile, describe the
“who, what, where, when, why, and how” of the facts. For example, dimensions for sales amount
would be quarter and product, which allow sales to be analyzed by quarter and by product.
A dimensional model consists of a fact table containing measurements surrounded by a
halo of dimension tables containing textual context. This structure is known as a star join, and
as a star schema when stored in a relational database.
Fact tables contain the numeric measurements needed to perform decision analysis and query
reporting in the star schema.
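To make the star schema concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product), the column names, and the sample rows are all hypothetical and chosen only for illustration; they do not come from the lecture notes.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    quarter   TEXT,
    year      INTEGER
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
    (1, "2024-02-10", "Q1", 2024),
    (2, "2024-05-03", "Q2", 2024),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Product X", "Widgets"),
    (2, "Product Y", "Gadgets"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    (1, 1, 100.0), (1, 2, 50.0), (2, 1, 75.0), (2, 1, 25.0),
])

# "Sales by quarter": join the fact table to the date dimension.
sales_by_quarter = conn.execute("""
    SELECT d.quarter, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON f.date_key = d.date_key
    GROUP BY d.quarter
    ORDER BY d.quarter
""").fetchall()

# "Sales by product": same fact table, different dimension.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON f.product_key = p.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

print(sales_by_quarter)
print(sales_by_product)
```

Both queries have the same shape: the fact table supplies the measurement, and the dimension table supplies the attribute used for grouping.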

Here are some more fact table facts:

1. A fact is a performance measure. For example, "Sales of Product X".
2. Fact values are not known in advance. They are only known when the measurement event
occurs.
3. Facts are numeric.
4. The most useful facts are numeric and additive.
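The point about additive facts can be illustrated with a small sketch. The rows below are made up, and the choice of unit price as an example of a numeric but non-additive value is an assumption added here for contrast; the notes themselves only state that additive facts are the most useful.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, quarter, sales_amount, unit_price)
fact_rows = [
    ("Product X", "Q1", 100.0, 10.0),
    ("Product X", "Q2", 60.0, 12.0),
    ("Product Y", "Q1", 40.0, 20.0),
]

# sales_amount is additive: it can be summed across any dimension
# (product, quarter, or both) and the result is still meaningful.
total_sales = sum(row[2] for row in fact_rows)

sales_by_product = defaultdict(float)
for product, _quarter, sales_amount, _unit_price in fact_rows:
    sales_by_product[product] += sales_amount
sales_by_product = dict(sales_by_product)

# unit_price is numeric but not additive: the sum below is a valid number,
# yet it has no business meaning.
summed_unit_prices = sum(row[3] for row in fact_rows)

print(total_sales)
print(sales_by_product)
print(summed_unit_prices)
```

Summing the per-product totals gives back the overall total, which is what makes an additive fact safe to aggregate at any level.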

Fact tables are usually the largest tables. A single fact table can contain either detail or
summarized data. They are primarily joined to dimension tables through foreign keys.
The business definition of the measurement event that produces the fact table is called the
fact table's grain. Declaring the grain means stating exactly what one fact table row represents
by filling in the blank in this statement: “A fact row is created when ____ occurs.”
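As an illustration of declaring a grain, the sketch below assumes the grain "a fact row is created when one product is sold in a sales transaction". The class SalesLineFact, its fields, and the sample rows are hypothetical. The helper then summarizes the detail rows to a coarser grain (one row per day per product), which mirrors the remark that a fact table can hold detail or summarized data.

```python
from collections import defaultdict
from dataclasses import dataclass


# Assumed grain: one row per product per sales transaction.
@dataclass(frozen=True)
class SalesLineFact:
    transaction_id: int
    date_key: int
    product_key: int
    sales_amount: float


line_facts = [
    SalesLineFact(1001, 1, 1, 100.0),
    SalesLineFact(1001, 1, 2, 50.0),
    SalesLineFact(1002, 1, 1, 75.0),
]


def summarize_by_day_and_product(rows):
    """Roll detail rows up to a coarser grain: one row per day per product."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row.date_key, row.product_key)] += row.sales_amount
    return dict(totals)


daily_facts = summarize_by_day_and_product(line_facts)
print(daily_facts)
```

The summarized rows can always be derived from the detail rows, but not the other way around, so the grain should be chosen deliberately before the table is designed.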

5. Dimension Tables
In a star schema, dimension tables contain classification and aggregation information
about the values in the fact table.
Dimension tables contain the parameters by which the fact table measures are analyzed.
For example, the amount sold can be analyzed by day, month, quarter, or year, or compared
between sunny days and rainy days, and so on.
Dimension tables provide the context for the fact table measures they describe. They also
contain descriptors of the business, expressed in business terminology. They have many,
often wide, columns, contain textual and discrete data, and are usually smaller than fact tables.
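A date dimension is a common way to support analysis by day, month, quarter, or year. The following is a minimal sketch, assuming a hypothetical build_date_dimension helper and column names that are not taken from the notes. Each calendar day becomes one row that carries every attribute a user might group by.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one dimension row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    key = 1  # simple sequential key, assumed for illustration
    while current <= end:
        rows.append({
            "date_key": key,
            "full_date": current.isoformat(),
            "day": current.day,
            "month": current.month,
            "quarter": f"Q{(current.month - 1) // 3 + 1}",
            "year": current.year,
            "is_weekend": current.weekday() >= 5,  # Saturday=5, Sunday=6
        })
        current += timedelta(days=1)
        key += 1
    return rows


date_dim = build_date_dimension(date(2024, 3, 30), date(2024, 4, 2))
for row in date_dim:
    print(row)
```

An attribute such as weather (sunny vs. rainy) would be handled the same way: as one more descriptive column that queries can group by.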

Dimension tables have a single-column surrogate primary key (called the warehouse dimension
key) and are joined to a fact table through a foreign key reference to that primary key. Dimension
tables can contain one or more hierarchies. These hierarchies are de-normalized into the
dimension tables.
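The sketch below shows one way a hierarchy might be de-normalized into a dimension table with a surrogate key. The source tables (products, categories, departments), their contents, and the helper build_product_dimension are all hypothetical; the product, category, department hierarchy is assumed only for illustration.

```python
# Hypothetical normalized source tables, keyed by source-system identifiers.
departments = {"D1": "Grocery", "D2": "Household"}
categories = {"C1": ("Snacks", "D1"), "C2": ("Cleaning", "D2")}
products = [
    ("SKU-001", "Potato Chips", "C1"),
    ("SKU-002", "Dish Soap", "C2"),
    ("SKU-003", "Pretzels", "C1"),
]


def build_product_dimension(products, categories, departments):
    """Flatten product -> category -> department into one row per product."""
    dimension = []
    # The surrogate key is a meaningless sequential integer assigned here,
    # independent of the source-system SKU.
    for surrogate_key, (sku, name, category_id) in enumerate(products, start=1):
        category_name, department_id = categories[category_id]
        dimension.append({
            "product_key": surrogate_key,
            "sku": sku,
            "product_name": name,
            "category": category_name,
            "department": departments[department_id],
        })
    return dimension


product_dim = build_product_dimension(products, categories, departments)
for row in product_dim:
    print(row)
```

The category and department names repeat across rows. That redundancy is the cost of de-normalization, accepted so that queries need only one join from the fact table to reach any level of the hierarchy.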
Dimension tables can be classified into the following:
1. Date Based
2. Time Based
3. Business Entities
4. Analytical Profiles
5. Correlated Entities
6. Versions of Business Entities
7. Flags and Indicators
8. Degenerate Dimensions

Now how do we generate dimensional models? The Dimensional Normal Form (DNF) is a
creative and practical approach originated by Mike Schmitz to design Dimension Table Families.
Here, fact tables are highly normalized for maintainability and flexibility.
Dimensions have their hierarchies de-normalized into them for usability and performance.
The schema is limited to two levels: a single first-level, highly normalized central table
called a fact table, and multiple second-level tables called dimension tables, which are linked
to the first-level table in primarily one-to-many relationships.
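The two-level restriction can be expressed as a simple structural check. This is only a sketch of that one constraint, not the DNF methodology itself; the function is_two_level_star and the dictionary format describing foreign keys are assumptions made for illustration.

```python
def is_two_level_star(fact_table, foreign_keys):
    """Check that a schema has exactly two levels.

    foreign_keys maps each table name to the list of tables it references.
    The fact table must reference at least one dimension table, and no
    dimension table may reference another table (that would add a third level).
    """
    dimensions = foreign_keys.get(fact_table, [])
    if not dimensions:
        return False
    for dimension in dimensions:
        if foreign_keys.get(dimension):
            return False
    return True


# Hypothetical schemas described only by their foreign key references.
star = {
    "fact_sales": ["dim_date", "dim_product"],
    "dim_date": [],
    "dim_product": [],
}
three_level = {
    "fact_sales": ["dim_date", "dim_product"],
    "dim_date": [],
    "dim_product": ["dim_category"],  # hierarchy kept in a separate table
    "dim_category": [],
}

print(is_two_level_star("fact_sales", star))
print(is_two_level_star("fact_sales", three_level))
```

In the second schema the category hierarchy lives in its own table instead of being de-normalized into dim_product, so the check fails.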

6. Quiz / Activity

References

Book References:
Corr, Lawrence & Jim Stagnitto (2011). Agile Data Warehouse Design: Collaborative Dimensional
Modeling, from Whiteboard to Star Schema.
Jarke, Matthias, Maurizio Lenzerini, Yannis Vassiliou & Panos Vassiliadis (2003). Fundamentals
of Data Warehouses. Springer Berlin Heidelberg Publishing. ISBNs 978-3-54-042089-7,
978-3-64-207564-3, 978-3-66-205153-5. DOI 10.1007/978-3-662-05153-5
Jukic, Nenad, Susan Vrbsky & Svetlozar Nestorov (2016). Database Systems: Introduction to
Databases and Data Warehouses.
Kimball, Ralph (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional
Modeling, 3rd Edition.

Linstedt, Daniel & Michael Olschimke (2015). Building a Scalable Data Warehouse with Data
Vault 2.0
Ponniah, Paulraj (2001). Data Warehousing Fundamentals: A Comprehensive Guide for IT
Professionals, 1st Edition. Wiley-Interscience Publishing
