Advanced Data Modeling (2)

Uploaded by

gigesa39

Power BI L200 Data Modeling

Instructor: Roshan Oganiya

Email: roganiya@dstrat.com
Making sense of your data
We Love Data
• 20 years of experience in analytics & business intelligence.
• Based in the GTA & serving clients worldwide.
• Award-winning Microsoft Partner with 70+ employees.
Our clients
Report Development Flow
Every component is equally important to produce a robust solution

Power Query (Prep data for Data Model) > Close & Apply > Data Modeling > DAX > Visualizations
COURSE OBJECTIVES

By the end of this course, you will be able to:

• Understand basic concepts of Data Modeling

• Understand the consequences of data model design decisions

• Understand concepts of calculated columns and measures

* Times are approximate and will be fluid with the class.

Agenda
9:00 - 9:15 Initial remarks and Introduction to the course
Section A
9:15 - 10:15 Intro to Data Preparation

Section B
10:15 - 11:00 Data Model Schemas, Normalization, Calculated Columns and Measures
11:00 - 11:15 Break
11:15 - 11:45 Lab 1
Section C
11:45 - 12:15 Data Storage in Power BI
12:15 - 12:30 Best Practices, Q&A
Section A
Intro to Data Preparation
Why Prepare our data?
• Power BI is powerful enough to compile and analyze data, but...
• If the data is not prepared properly, these compilations will be slower and the report’s analytical efficiency will suffer
• Data needs to suit the compression engine that PBI uses in order to produce a robust data model
What is a Data model?
The Technology behind Power BI
The VertiPaq Engine:
Columnar Database Engine - Columns & Segments
Example: to answer "How many distinct products were sold in 2017-Q1?", only the Product and Date columns are used

Compresses data to distinct values (Encoding)

VertiPaq Engine
In-Memory mode for tabular architecture

Pro Tip – Have your Queries/Tables be as “Narrow” as possible
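The "compresses data to distinct values" idea can be illustrated with a minimal Python sketch of dictionary encoding. This is not the VertiPaq implementation, only an illustration of the principle; the function name and the sample product values are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with an integer index into a list of distinct values."""
    dictionary = []   # distinct values, in first-seen order
    positions = {}    # value -> its index in `dictionary`
    encoded = []      # one small integer per original row
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(positions[value])
    return dictionary, encoded


# Hypothetical Product column with repeating values.
product_column = ["Bike", "Helmet", "Bike", "Bike", "Helmet", "Lock"]
dictionary, encoded = dictionary_encode(product_column)
print(dictionary)  # ['Bike', 'Helmet', 'Lock']
print(encoded)     # [0, 1, 0, 0, 1, 2]
```

The fewer distinct values a column has, the smaller its dictionary, which is one reason narrow tables with low-cardinality columns tend to compress well.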

Columnar Database

Row-based storage (each row stored separately):
John, Smith, $10
Jane, Doe, $25
Hardy, B, $35

Columnar storage (each column stored separately):
First Name: John, Jane, Hardy
Last Name: Smith, Doe, B
Sales: $10, $25, $35

• Columnar databases are well suited for analytics
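A small Python sketch, using the three sample rows from this slide, shows why a columnar layout suits analytics: an aggregate over Sales only has to read the Sales column. The variable names are illustrative and not part of any Power BI API.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"first_name": "John", "last_name": "Smith", "sales": 10},
    {"first_name": "Jane", "last_name": "Doe", "sales": 25},
    {"first_name": "Hardy", "last_name": "B", "sales": 35},
]

# Columnar layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row layout: every record is visited even though only one field is needed.
total_from_rows = sum(row["sales"] for row in rows)

# Columnar layout: only the "sales" list is read.
total_from_columns = sum(columns["sales"])

print(total_from_rows, total_from_columns)  # 70 70
```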

In-Memory Database
Data is stored in RAM (in memory) when the file is open

Source data:
Sales Fact 145.0 MB
Dimensions 7.0 MB
Int’l Sales 128.0 MB
Total Data 280.0 MB

In memory:
Data Model 13.0 MB
Query Metadata 14 KB

Roughly 21X compression (280.0 MB / 13.0 MB ≈ 21.5)
Entities
Dimension Table:
Contains descriptive information used to slice and dice data from fact tables (e.g. branch_name, branch_type)
Also holds relationship/key fields used to connect the dimension to the fact table (e.g. branch_key)
Wider tables with a small number of rows

Fact Table:
Contains facts/details, which are the fields used as values in a visualization (e.g. dollars_sold, units_sold)
Also holds relationship/key fields used to connect the fact table to its dimensions (e.g. time_key, item_key, branch_key, location_key)
Narrow tables with a large number of rows

Golden Rule:
Avoid using a single table that includes everything (both facts and dimensions)
Relationships
• A connection between two tables (usually a fact table and a dimension table), made using a column from each, is called a Relationship
• Once you have two tables connected, you can work with the data in both as if they were in a single table
• A Relationship is analogous to how an Excel VLOOKUP function brings two tables together
• Power BI automatically sets the Cardinality, Cross Filter Direction and Active relationship when you load queries into PBI.
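The VLOOKUP analogy can be sketched in Python: each fact row looks up its descriptive attributes in the dimension table through the shared key. The column names branch_key, branch_name, branch_type and dollars_sold come from the Entities slide; the data values and the function name are hypothetical.

```python
# Dimension table, keyed by branch_key (unique per branch).
branches = {
    1: {"branch_name": "Downtown", "branch_type": "Retail"},
    2: {"branch_name": "Airport", "branch_type": "Kiosk"},
}

# Fact table: branch_key repeats across rows.
sales = [
    {"branch_key": 1, "dollars_sold": 100.0},
    {"branch_key": 2, "dollars_sold": 40.0},
    {"branch_key": 1, "dollars_sold": 60.0},
]


def dollars_by_branch_name(fact_rows, dimension):
    """Total dollars_sold per branch_name, resolved through branch_key."""
    totals = {}
    for row in fact_rows:
        name = dimension[row["branch_key"]]["branch_name"]  # the "lookup"
        totals[name] = totals.get(name, 0.0) + row["dollars_sold"]
    return totals


print(dollars_by_branch_name(sales, branches))
# {'Downtown': 160.0, 'Airport': 40.0}
```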
Cardinality
One to One (1 : 1) relationship
- Takes place when the connected columns in both tables contain the same set of unique values.
- For such a relationship, you can merge the two tables together in Power Query Editor and disable loading of the original to avoid redundancy

One to Many (1 : *) relationship
- The most common type of cardinality used
- Takes place when you connect a field with unique values in one table to the same field with repeating values in another table

Many to Many ( * : * ) relationship
- Takes place when the joining field contains repeated values in both of the tables being joined.
- Considered to be a weak relationship that causes a lot of issues. It can be resolved by creating a shared dimension and creating one to many relationships from the shared dimension to each table.
- Avoid Many to Many relationships when possible, as they are laborious to maintain
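The three cardinalities above come down to whether the key column is unique on each side. A minimal Python sketch of that check follows; it is only an illustration of the rule, not how Power BI detects cardinality, and the function name is hypothetical.

```python
def classify_cardinality(left_keys, right_keys):
    """Classify a relationship from the key values found in each table."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    if left_unique and right_unique:
        return "1:1"
    if left_unique:
        return "1:*"
    if right_unique:
        return "*:1"
    return "*:*"


print(classify_cardinality([1, 2, 3], [1, 2, 3]))     # 1:1
print(classify_cardinality([1, 2, 3], [1, 1, 2, 3]))  # 1:*
print(classify_cardinality([1, 1, 2], [2, 2, 3]))     # *:*
```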
Cross Filter Direction
The direction of a relationship is called the cross-filter direction, as it sets the way a filter propagates through your data

Uni-Directional Relationship
- Used when a dimension table filters fact table data: the filter moves from the dimension to the fact table through the connecting field (e.g. ProductID)

Bi-Directional Relationship
- Allows you to pass filters in both directions
- This is different from Many to Many
- There is a significant performance penalty for Bi-Directional filtering
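Uni-directional filtering can be sketched in Python: a filter chosen on the dimension selects a set of ProductID values, and only fact rows with those keys remain. ProductID is the connecting field named on the slide; the Category and Amount columns, the data and the function name are hypothetical.

```python
products = [  # dimension table
    {"ProductID": 1, "Category": "Bikes"},
    {"ProductID": 2, "Category": "Accessories"},
    {"ProductID": 3, "Category": "Bikes"},
]
sales = [  # fact table
    {"ProductID": 1, "Amount": 500},
    {"ProductID": 2, "Amount": 20},
    {"ProductID": 3, "Amount": 700},
    {"ProductID": 2, "Amount": 35},
]


def filter_fact_by_dimension(dimension, fact, column, value, key="ProductID"):
    """Propagate a dimension filter to the fact table through the key column."""
    selected_keys = {row[key] for row in dimension if row[column] == value}
    return [row for row in fact if row[key] in selected_keys]


bike_sales = filter_fact_by_dimension(products, sales, "Category", "Bikes")
bike_total = sum(row["Amount"] for row in bike_sales)
print(bike_total)  # 1200
```

A bi-directional relationship would also let a filter on the fact table narrow the dimension, which means more relationships to traverse for each interaction.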
Section B
Data Model Schemas, Normalization, DAX Calculated
Columns and Measures
Phases in Building a Power BI Desktop File
Data Model Brings Facts and Dimensions Together

Data Models
• Flat or Denormalized Schema
• Star Schema
• Snowflake Schema

Flat or Denormalized Schema
• All attributes for the model exist in a single table
• Highly inefficient
• Model has extra copies of data, which slows performance
• Size of a flat table can blow up quickly as the data model becomes complex
Star Schema
• Simple, easy to understand, fewer joins
• Consists of a single Fact table in the middle, branching outwards to connect to various dimension tables
• Fact table is the “Many” side of the (one to many) relationship
• Consumes more space than the snowflake schema (not always a bad thing, as Power BI is powerful enough)
Snowflake Schema
• Dimension tables are normalized in a snowflake schema
• Dimensions “snowflake” off of other dimensions
• Dim or Fact tables can be the “Many” side of the relationship
Granularity & Multiple Fact Tables
• Grain (granularity) measures the level of detail in a table
Example:
One row per order or per item
Daily or monthly date grain

• If your facts have very different granularities, split them into multiple fact tables & connect them to shared dimensions at the lowest common granularity.
Example:
Sales (Daily by Product)
Budget (Monthly by Product Category & Product Segment)
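The Sales versus Budget example can be sketched in Python: daily product-level sales are rolled up to the month and category grain so they can be compared with a monthly budget. For brevity, this sketch assumes the budget is kept by Product Category only (Product Segment is omitted); all data values and names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Shared Product dimension: maps each product to its category.
product_category = {"Road Bike": "Bikes", "Helmet": "Accessories"}

# Sales fact: daily grain, by product.
daily_sales = [
    (date(2017, 1, 5), "Road Bike", 900),
    (date(2017, 1, 20), "Helmet", 40),
    (date(2017, 2, 3), "Road Bike", 950),
]

# Budget fact: monthly grain, by product category.
monthly_budget = {
    ((2017, 1), "Bikes"): 1000,
    ((2017, 1), "Accessories"): 50,
    ((2017, 2), "Bikes"): 900,
}


def roll_up_to_month_and_category(rows, category_of):
    """Aggregate daily product sales to the (year, month), category grain."""
    totals = defaultdict(int)
    for day, product, amount in rows:
        totals[((day.year, day.month), category_of[product])] += amount
    return dict(totals)


actuals = roll_up_to_month_and_category(daily_sales, product_category)
variance = {key: actuals.get(key, 0) - budget for key, budget in monthly_budget.items()}
print(variance[((2017, 1), "Bikes")])  # -100
```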
Normalization
• Process of organizing a database to make it more flexible by eliminating redundancy and inconsistent dependencies
• Involves creating separate tables for values that can apply to multiple records (dimension tables) and relating these tables with a foreign key
- Study the dataset to see which fields can be grouped together to form dimension tables that could be used by other fact tables
• Next, figure out how the new dimension tables can be related to the fact table with the help of a simple or compound key
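As a rough illustration of the steps above, the following Python sketch splits a flat table into a dimension table and a fact table linked by a generated key. The column names follow the Entities slide; the function, the surrogate-key scheme (1, 2, 3, ...) and the data are assumptions made for the example.

```python
flat_rows = [
    {"branch_name": "Downtown", "branch_type": "Retail", "dollars_sold": 100},
    {"branch_name": "Airport", "branch_type": "Kiosk", "dollars_sold": 40},
    {"branch_name": "Downtown", "branch_type": "Retail", "dollars_sold": 60},
]


def normalize(rows, dimension_columns, key_name):
    """Move the repeated descriptive columns into a dimension table."""
    keys = {}  # tuple of attribute values -> surrogate key
    fact = []
    for row in rows:
        attributes = tuple(row[column] for column in dimension_columns)
        key = keys.setdefault(attributes, len(keys) + 1)
        fact_row = {c: v for c, v in row.items() if c not in dimension_columns}
        fact_row[key_name] = key  # foreign key back to the dimension
        fact.append(fact_row)
    dimension = [
        {key_name: key, **dict(zip(dimension_columns, attributes))}
        for attributes, key in keys.items()
    ]
    return dimension, fact


branch_dim, sales_fact = normalize(flat_rows, ["branch_name", "branch_type"], "branch_key")
print(len(branch_dim), len(sales_fact))  # 2 3
```

Each branch description is now stored once in the dimension, while the fact table keeps only the key and the numeric value.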
DAX Foundations
Calculated Columns and Measures are both written in the DAX Language

A Calculated Column is evaluated as a new column in the table in which it resides and will not change value until the
underlying data is refreshed.

Measures are calculations which do not have a result until they are used in a visualization.
They may use sums, averages, minimum or maximum values, counts, or more advanced calculations; and they change
value in response to your interaction with your reports.

Calculated Column
What is a Calculated Column?

Calculated Column
Best Practices – Calculated Columns
What is a Measure?

Total Sales = SUM ( Sales[Sales Amount] )
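The difference between a calculated column and a measure can be mimicked in Python. This is an analogy only and not DAX semantics: the "calculated column" is computed once per row and stored, while the "measure" is a function evaluated on whichever rows the current filter leaves visible. Sales Amount comes from the measure above; Region, Quantity and Unit Price are hypothetical columns.

```python
sales = [
    {"Region": "East", "Quantity": 2, "Unit Price": 10.0},
    {"Region": "West", "Quantity": 1, "Unit Price": 25.0},
    {"Region": "East", "Quantity": 3, "Unit Price": 5.0},
]

# "Calculated column": evaluated row by row and stored with the table.
# It stays fixed until the data is refreshed.
for row in sales:
    row["Sales Amount"] = row["Quantity"] * row["Unit Price"]


# "Measure": has no value until it is evaluated under some filter.
def total_sales(rows, region=None):
    visible = [r for r in rows if region is None or r["Region"] == region]
    return sum(r["Sales Amount"] for r in visible)


print(total_sales(sales))          # 60.0
print(total_sales(sales, "East"))  # 35.0
```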

Calculated Column vs. Measure: When to Use What
Rule of Thumb
Calculated Column – Use in Page, Report & Visual Filters as well as Slicers, Rows and Columns
Measures – Use in the Values section

(Figure: field wells for Columns, Rows, Values and a Slicer)
Designing good data models
Key takeaways to design a good Power BI Desktop data model

• RAM is precious!

• If a fact table contains an ID field that is unique for each record (e.g. Transaction ID), remove it unless it is needed as a connector key

• Sort columns before bringing them into a Power BI data model

• The DateTime data type is usually not needed, unless you are specifically using the Time component
➢ If you really need Time, try splitting Date & Time into
Knowledge Check

1. What is a data model in the context of Power BI?

• A data model is a collection of tables and relationships

2. What are some advantages of a star schema over a flat or denormalized model?
• Dimension tables save space by reducing the amount of data that needs to be repeated over and
over in every row
• Relationships between tables can be leveraged for more complex measures

3. How might you improve the performance of a Power BI model?

• Try using a star schema instead of a flat or denormalized model
• Remove unnecessary columns
• Set appropriate data types
Break
Lab 1
Section C
Data Storage in Power BI
Data Model Types in Power BI
How can I tell what Data Model Type I have?
Connection: Live Connect
Choosing storage mode: LiveConnect
Connection: DirectQuery to Relational Source
Import Mode

- Most widely used connection and the default type when connecting to most sources.
- The connection will ingest/pull all the data from the source and make it a part of the PBI
Choosing storage mode: Import vs DirectQuery
Best Practices
Data Modeling

An inefficient model can completely slow down a report, even with very small data
volumes

GOALS:

• Make the model as small as possible

• Schema supports the analysis

• Relationships are built purposefully and thoughtfully

Move calculations to the source
Scenario
• Many DAX calculated columns with high cardinality

Why is it undesired?
• Calculated columns don’t compress as well as physical columns

Proposed Solution
• Perform the calculation in Power Query, ideally pushed down to the source
Remove unused tables and columns
Scenario
• Model contains tables/columns that are not used for reporting/analysis or
calculations

Why is it undesired?
• Increases model size
• Increases time to load into memory
• Increases refresh time
• May affect usability
Avoid high precision/cardinality columns
Scenario
• Model contains columns at a higher precision than needed for analysis e.g. datetime
in milliseconds, weight to 6 decimal places
• Model contains columns that are highly unique

Why is it undesired?
• Less compression with high precision/cardinality
• Increases time to load into memory
• Increases refresh time

Proposed Solution
• Remove if not needed
• Reduce precision
• Split datetime into date and time
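The effect of splitting a datetime column can be checked with a short Python sketch that counts distinct values before and after the split. The sample data (one timestamp per hour for 30 days) is hypothetical.

```python
from datetime import datetime, timedelta

start = datetime(2017, 1, 1)

# Hypothetical column: one timestamp per hour for 30 days.
timestamps = [
    start + timedelta(days=day, hours=hour)
    for day in range(30)
    for hour in range(24)
]

distinct_datetimes = len(set(timestamps))
distinct_dates = len({t.date() for t in timestamps})
distinct_times = len({t.time() for t in timestamps})

print(distinct_datetimes)  # 720
print(distinct_dates)      # 30
print(distinct_times)      # 24
```

One column with 720 distinct values becomes two columns with 30 and 24 distinct values, which gives the engine far smaller sets of distinct values to encode.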
Use integers instead of strings
Why is it undesired?
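• Strings always use dictionary (hash) encoding; integers can use value encoding, which needs no dictionary and is usually more efficient. Run length encoding can then be applied on top of either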

Proposed Solution
• Check data types and set to integer if known to be numerical
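Run length encoding, mentioned above, can be sketched in a few lines of Python. The sketch also hints at why sorted data tends to compress better: equal values end up next to each other and collapse into fewer runs. This is an illustration of the general technique and not the engine's actual implementation; the sample IDs are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


unsorted_ids = [3, 1, 3, 2, 1, 3, 2, 1]
sorted_ids = sorted(unsorted_ids)

print(len(run_length_encode(unsorted_ids)))  # 8
print(run_length_encode(sorted_ids))         # [(1, 3), (2, 2), (3, 3)]
```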
Be careful with bi-directional relationships
Scenario
• Most relationships in the model are set to bi-directional

Why is it undesired?
• Applying filters/slicers traverses many relationships and can be slower
• Some filter chains are unlikely to add business value

Proposed Solution
• Only use bi-directional relationships where the business scenario requires it
Set Default Summarization
Scenario
• Numeric columns in the model that are purely informational (e.g. Account ID)
• Default summarization is Sum

Why is it undesired?
• Power BI will try to sum the number when it is dropped into visuals.
• Detailed tables/matrixes can be slower

Proposed Solution
• Set the default summarization to None
Q&A
