E DAB 07 DataModeling

The document discusses data modeling and analysis techniques in Excel including VLOOKUP functions, Power Query, and Power Pivot. It provides examples of using each tool to gather and converge raw data from multiple sources into single normalized data sets that can be used for analysis and reporting.

Data Analysis & Business Intelligence Made Easy with Excel Power Tools

Excel Data Analysis Basics = E-DAB


Notes for Video:
E-DAB 07: Excel Data Analysis & BI Basics: Data Modeling: Excel Formulas, Power Query,
Power Pivot?
Outcomes for Video:
1. Data Modeling
2. VLOOKUP Function
3. Multiple Tables: Fact Tables and Dimension Tables (Lookup Tables)
4. Power Pivot is just one of Many Tools in Excel
5. Excel Power Pivot and Data Model PivotTables
    2) Show Power Pivot Ribbon Tab in Excel
    4) Excel Power Pivot provides 3 Data Tools
    5) Why the name Power Pivot?
    8) Basic Advantages of Excel Power Pivot
    9) Relationship feature works in versions of Excel 2013 or later
    10) DAX Formulas
6. Implicit vs. Explicit DAX Measures
7. Power Query Merge feature
8. Data Types in Power Query
9. Overview of Three Examples in Video
10. VLOOKUP Video Example
11. Power Query Video Example
12. Power Pivot Relationships feature & Implicit Measure feature

1. Data Modeling
1) Define Data Modeling:
1. Configuring Raw Data into Proper Data Sets that can be used for creating information easily with tools like PivotTables, Power Pivot,
Power BI Desktop and other tools.
2) Tools we use for Data Modeling:
1. Data Modeling can be accomplished with many different tools such as Excel Spreadsheet Formulas, Excel features such as Text To
Columns, Flash Fill, DAX Formulas and more, but the main tool we use in Excel and in Power BI is Power Query.
3) So far in this class, we have performed Data Modeling to convert Raw Data into a single Proper Data Set, such as:

1. In Video #6, we used Power Query to Split by Delimiter to create a single Proper Data Set that we used as the source data for a
PivotTable Report, as seen here:

2. In Video #6, we used Power Query to append multiple Text Files into a single Proper Data Set:

3.

4. In example #1 of this video, Video #7, we will use Spreadsheet Functions to gather the raw data from three different tables and converge it
into a single Proper Data Set that we can use to build a requested report, as seen below:

5. In example #2 of this video, Video #7, we will use Power Query to gather the raw data from two different tables in an Access database and
one table from an Excel Sheet and converge it into a single Proper Data Set that we can use to build a requested report, as seen below:

6. In example #3 of this video, Video #7, we will use Power Pivot’s Relationship feature and Implicit Measure feature to show three tables
(from an Excel Sheet) in the PivotTable Field List and create our desired report, as seen here:

2. VLOOKUP Function
1) Looking things up in Lookup Tables is a common task in business, accounting and other professions.
2) Almost all Lookup Tables are Vertical because the first column contains the item that we try to match, and items
are listed vertically.
i. Examples of Looking up items in a Vertical Lookup Table:
1. This is a Price Lookup Table:

2. This is a Commission Bonus % Lookup Table:

3. This is an Employee Lookup Table:

4. This is a Tax Lookup Table:

5. This is a Region Lookup Table:

6. This is a Commission Bonus $ Lookup Table:

7. This is a Sales Category Table:

3) What does VLOOKUP Function do?
i. VLOOKUP tries to find a match of an item in the first column of the Lookup Table and then retrieves
(goes and gets) something from one of the other columns in the table and brings it back to the cell or
formula.
ii. In VLOOKUP the V means Vertical.

iii. Example: VLOOKUP can find a match for the Sales Number 17,382 in the sorted first column of the
Lookup Table and retrieve the correct Bonus Commission %, 1.00%, from the 2nd column and bring it
back to the cell C30, like in this picture:

iv. Example: VLOOKUP can find a match for the Product “Quad” in the first column of the Lookup Table and
retrieve the Quad’s Price, 43.95, from the 3rd column and bring it back to the cell F23.

4) VLOOKUP Function arguments:

=VLOOKUP( lookup_value , table_array , col_index_num , [range_lookup] )

i. lookup_value = Item that you are trying to find in first column of lookup table.
ii. table_array = Vertical table = Lookup table. First Column contains items you want to “match” with the
lookup_value.
iii. col_index_num = Which column in the lookup table has the items that you want to go and get and bring
back to the cell? You have to count to determine which column contains the items you want to
retrieve: is it column 2, or column 3, or column 4, and so on.
iv. [range_lookup] = Because there are two different types of lookup, we must tell VLOOKUP which of the
two lookups we want it to do: either: Approximate Match Lookup or Exact Match Lookup. This argument
tells VLOOKUP how to find a match in the first column of the Lookup Table.
1. Approximate Match:
• For “Approximate Match” we must put TRUE or 1, or omit the argument.

• How Approximate Match works:


1. For Approximate Match the VLOOKUP table MUST be sorted on the first
column: Ascending, A to Z (Small to Big).
2. This is how Approximate Match Lookup works:
i. It will look through the first column:
1. If the lookup_value is smaller than the first value in the table,
VLOOKUP returns a Not Available Error: #N/A
2. Otherwise it looks through the first column until it bumps into the
first value bigger than the lookup_value and then jumps back one row.
That is the matching row, so it knows what row it should look in.
i. It actually does a “binary search”, which is the technical
computer term for the fast search method that Approximate Match
uses on a sorted column. “Binary Search” calculates quickly
compared to “Exact Match”.
3. If the lookup_value is bigger than the last value, it stops at the
last row.
2. Exact Match:
• For “Exact Match” we must put = FALSE or 0.

• How Exact Match works:


1. VLOOKUP will look through each item in the first column of the VLOOKUP table
and try to find a match. When it finds a match, it knows what row it should look
in.
2. If VLOOKUP cannot find a match, it will give an #N/A error that tells you it
did not find a match: “it is not available”.
• Note about Exact Match: If you have very large data sets, Exact Match Lookup may
cause formulas to calculate slowly because “Exact Match” Lookup must look through
every item, one-by-one, until it finds a match.
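
The two lookup types described above can be sketched in Python. This is only an illustration of the matching logic, not how Excel is implemented: the function name vlookup, the NA stand-in and both small lookup tables are assumptions made up for this sketch.

```python
from bisect import bisect_right

NA = "#N/A"  # stand-in for Excel's "not available" error (assumption)


def vlookup(lookup_value, table, col_index_num, range_lookup=True):
    """Mimic VLOOKUP on a list of row tuples; col_index_num counts from 1."""
    first_column = [row[0] for row in table]
    if range_lookup:
        # Approximate Match: the first column must be sorted ascending.
        # Binary search finds the first value bigger than lookup_value,
        # then we jump back one row.
        position = bisect_right(first_column, lookup_value) - 1
        if position < 0:
            return NA  # lookup_value is smaller than the first value
        return table[position][col_index_num - 1]
    # Exact Match: look at the items one by one until one matches.
    for row in table:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return NA


# Hypothetical lookup tables, for illustration only.
bonus_table = [(0, 0.0), (10000, 0.005), (15000, 0.01), (20000, 0.015)]
price_table = [("Aspen", "Beginner", 24.95), ("Quad", "Advanced", 43.95)]

print(vlookup(17382, bonus_table, 2))            # approximate match
print(vlookup("Quad", price_table, 3, False))    # exact match
print(vlookup("Missing", price_table, 3, False)) # no match
```

With these made-up tables, 17,382 falls in the row that starts at 15,000, and a lookup_value bigger than the last value stops at the last row.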
3. Multiple Tables: Fact Tables and Dimension Tables (Lookup Tables)
1) In much of Data Analysis, Business Intelligence and Data Warehousing, we usually refer to our Proper
Data Sets as either a Fact Table or a Dimension Table, as defined here:
1. Fact Table (Also known as Transaction Table or Sales Table)
i. A Fact Table is a table that has numbers we need to summarize (like Sales or Units or
dates or times). The word "Fact" equals a measurement of business activities (like
amount of sales, or how many units sold, or clicks on web links)
ii. A Fact Table has Foreign Key columns that we will use in relationships (like Date,
SalesRepKey, ProductKey)
2. Dimension Table (also known as Lookup Table or Entity Table)
i. A Dimension Table is a table that has the first column as a Primary Key (Unique
Identifier) for the Entity (Product, Sales Rep and so on) used in relationship with Fact
Table.
ii. Remaining columns are attributes that we can use as:
1. Criteria / Filters / Categories / Report Labels for our Reports & Dashboards.
These columns are often referred to as “Filtering Columns”.
2. Values we can lookup (like Price).
3. Helper Columns (like Sort Helper Columns or Intermediate calculations).
iii. Synonyms for Filtering columns: Attribute, Criteria, Filters, Category, Categorical
attributes, Constraints, Groupings, Report labels for the reports and analytics.
2) Fact Tables are usually large. Dimension Tables are comparatively small.
3) Example:
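
A minimal sketch of the Fact Table / Dimension Table idea in Python, using hypothetical table contents (the keys, product names and unit counts are assumptions, not the data from the video). The Fact Table carries a Foreign Key column, the Dimension Table is keyed by its Primary Key, and a report groups the facts by a Dimension Table attribute.

```python
# Dimension Table: the dict key plays the role of the Primary Key column.
products = {
    1: {"Product": "Quad", "Category": "Advanced"},
    2: {"Product": "Aspen", "Category": "Beginner"},
}

# Fact Table: one row per transaction, with a Foreign Key (ProductKey).
sales = [
    {"Date": "2024-01-05", "ProductKey": 1, "Units": 3},
    {"Date": "2024-01-05", "ProductKey": 2, "Units": 10},
    {"Date": "2024-01-06", "ProductKey": 1, "Units": 2},
]


def units_by(attribute):
    """Sum Units from the Fact Table, grouped by a Dimension Table attribute."""
    totals = {}
    for row in sales:
        dimension_row = products[row["ProductKey"]]  # follow the relationship
        label = dimension_row[attribute]
        totals[label] = totals.get(label, 0) + row["Units"]
    return totals


print(units_by("Category"))
print(units_by("Product"))
```

The same Fact Table answers both questions; only the Filtering Column taken from the Dimension Table changes.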

4. Power Pivot is just one of Many Tools in Excel
1) Excel is a program with many tools
2) Power Pivot is just one of the tools in Excel
3) The tool Power Pivot has two main parts:
i. Data Model
ii. Data Model PivotTables
4) The Data Model is made up of three main parts:
i. Columnar Database
ii. Relationships
iii. DAX Formulas
5) From the Data Model, we make Data Model PivotTables.
6) Picture:

5. Excel Power Pivot and Data Model PivotTables
1) Excel Power Pivot comes in Office 365.
2) Show Power Pivot Ribbon Tab in Excel
i. Click on the File Tab, then click on Options, then on the left, click on Add-ins, then in the Manage dropdown, select “COM Add-ins”,
click Go, then check the check box for Power Pivot.
ii. In Excel, the Power Pivot Ribbon Tab looks like this:

Click the “Manage Data Model” button to open Power Pivot Window to look at the Data Model

Use this button to add Excel Tables from an Excel Worksheet to the Data Model

iii. When you open the Power Pivot Window, and click on Diagram View, you can see the Data Model that you created with the Relationships
feature, as seen here:

[Picture: Power Pivot Window in Diagram View, showing tables and Relationships]

3) Excel Power Pivot allows us to build “Data Model” PivotTables, as opposed to “Standard PivotTables”
i. Reminder from E-DAB video #4:
Standard PivotTables:
• Use when you have about 50,000 rows of data or less.
• Use when you have one Proper Data Set with all your Data.
• You don’t mind applying Number Formatting every time you make a PivotTable
Calculation.
• PivotTable Calculations are sufficient.
Data Model PivotTables:
• Big Data. Good for two reasons when you have large data sets:
i. File size is reduced when your Data is in the Data Model.
ii. You can easily build reports from millions of rows of data (an Excel
worksheet only allows 1,048,576 rows)
• Relationships for Multiple Tables. Great when you have more than one Proper
Data Set as the source Data. Allows you to use the Relationship feature rather
than the VLOOKUP function when you need to connect tables.
• DAX Formulas.
i. Has more options for calculations than a Standard PivotTable.
ii. Allows you to add Number Formatting to Formulas.
iii. Can use the same formula over and over.
4) Excel Power Pivot provides 3 Data Tools
i. Columnar Database = a behind-the-scenes, in-memory (RAM), efficient database for Big Data analytics
ii. Relationships Between Tables = replace VLOOKUP and allow criteria and filters to affect reports
and visualizations from one table to another.
iii. DAX Formulas:
1. Efficiently Calculate Over Big Data.
2. Many More Calculations than in Standard PivotTable
3. Build One Formula that can work in many reports
4. Add Number Formatting to Formulas
5) Why the name Power Pivot?
i. Because Microsoft wanted to use the same amazing PivotTable user interface to drag and drop
fields to make reports but with more Power.
ii. The “Power” part of the name means:
1. We can make PivotTables from “Big Data”
2. We can make PivotTables from multiple Tables
3. We can use DAX Formulas, which can process over big data efficiently and which allows
us more varied calculations than in a Standard PivotTable.
iii. The “Pivot” part of the name means we can use a PivotTable user interface, that we all know
and love!
6) Data Model = Name for Power Pivot’s 3 Data Tools:
i. The Columnar Database, Relationships and DAX Formulas together are called the “Data Model”.
7) From the Data Model we make Data Model PivotTables.
i. Synonyms for Data Model PivotTable:
1. Power Pivot Report
2. Power Pivot PivotTable
ii. Data Model PivotTables create summary reports with one or more calculations based on
conditions / criteria / filters

8) Basic Advantages of Excel Power Pivot
i. Can work on Millions of rows of data
ii. Can Reduce file size on data sets with less than a million rows
iii. Can use Relationships and Multiple Tables rather than VLOOKUP and a single Flat Table.
iv. DAX formulas provide more variety than in a Standard PivotTable and can work efficiently on Big
Data that is stored in the Columnar Database.
9) Relationship feature works in versions of Excel 2013 or later
i. This means that if you have Excel 2013 or 2016, but you do not have the correct version with the
Power Pivot COM Add-in, you can still use the Relationships feature to add two or more tables to
a PivotTable field list and then make a PivotTable based on multiple tables. However, if you do
not have the correct version with the COM Add-in, you will not be able to work in the Power
Pivot Data Model Window.
10) DAX Formulas
i. DAX = Data Analysis Expressions
ii. Types of DAX Formulas:
1. Calculated Column = New columns added to tables in the Data Model. Video #8 will
demonstrate Calculated Columns.
2. Measures = Formulas used in Data Model PivotTables.
• Two types of Measures:
i. Implicit Measures = formulas automatically created by Power Pivot.
Video #7 will demonstrate Implicit Measures.
ii. Explicit Measures = formulas that Data Modeler creates. Video #8 will
demonstrate Explicit Measures.
• DAX Measures are different than the built-in calculations in a Standard
PivotTable, like “Summarize Values By” and “Show Values As”.
• When you create a DAX Measure you create a formula using DAX Functions like
SUM, SUMX, AVERAGEX, CALCULATE, RELATED and others.
3. Table Formulas = deliver a table of values. Video #9 will demonstrate Table Formulas.
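
The idea that a Measure is one formula that can be used over and over in different reports can be illustrated with a rough Python analogy. This is not DAX and not how Power Pivot works internally; the table contents and the names total_sales and report are assumptions made up for this sketch.

```python
# Hypothetical sales rows, for illustration only.
sales = [
    {"Region": "West", "Product": "Quad", "Amount": 100},
    {"Region": "West", "Product": "Aspen", "Amount": 50},
    {"Region": "East", "Product": "Quad", "Amount": 70},
]


def total_sales(rows):
    """The 'measure': one formula, written once."""
    return sum(row["Amount"] for row in rows)


def report(rows, field, measure):
    """Evaluate the measure once per label of the chosen field,
    roughly like dragging a field to the Rows area of a PivotTable."""
    labels = sorted({row[field] for row in rows})
    return {
        label: measure([r for r in rows if r[field] == label])
        for label in labels
    }


print(report(sales, "Region", total_sales))
print(report(sales, "Product", total_sales))
```

The same total_sales function produces both reports; only the rows it is handed change, which is loosely how one Measure can serve many PivotTable layouts.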

6. Implicit vs. Explicit DAX Measures:
1) To show Implicit Measures:
i. In Excel, go to the Power Pivot Ribbon Tab, then in the Data Model group, click Manage button.
This opens up the Power Pivot Window.
ii. Then in the Power Pivot Window, go to the Advanced Ribbon Tab, then click on the Show
Implicit Measures Button. This will show the Implicit Measures in the Measure Grid (area below
tables in Power Pivot Data Model).
iii. Here is a picture of the Implicit Measures in the Measure Grid in the Power Pivot Data Model:

[Picture: Power Pivot Window with callouts for the Show Implicit Measures button, the Implicit Measures and the Measure Grid]

2) In general, it is okay to use Implicit Measures when you have a small data set (about 50,000 rows) and
the built-in calculations in a Standard PivotTable are sufficient.
3) Compare and contrast Implicit and Explicit Measures picture is on next page.

7. Power Query Merge feature
1) Power Query Has Six Types of Merges / Joins. This picture summarizes pictorially the six types of merges /
joins in Power Query:

2) What is a Merge / Join?


i. What does a Merge accomplish in Power Query?
• The Left Outer Merge is a substitute for VLOOKUP. Merge can help us to add a column to a
table by pulling matching values from a second table into the first table.
ii. Merge / Join Terminology:
• Merge is the word that we use in Power Query.
• Join is the word that is used in the SQL (Structured Query Language) and in other database
languages.
• Merge and Join will be synonyms for us.
3) Power Query Merges are similar to using VLOOKUP or Relationships.
i. VLOOKUP in Excel requires that you have two related columns if you want to look up a value. We will
see how to do this kind of lookup using a Left Outer Merge in Power Query.
ii. Relationships in the Excel Power Pivot or Power BI Desktop Data Model require that you have two
Related Columns. Relationships in Data Models allow us to accomplish many tasks, one of which is like
a Left Outer Merge.
4) Requirements for a Merge:
i. To Merge one or more queries, you must have the data imported into Power Query as a query.
ii. The Merge Feature is for Table Objects.
iii. Merges require Related Columns in one or more Table Objects.
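
As a rough illustration of what a Left Outer Merge does, here is a Python sketch. It is not Power Query's M code, and the function name, column names and rows are hypothetical. Every row of the first (left) table is kept, matching values are pulled in from the second table through the Related Columns, and rows without a match get an empty value.

```python
def left_outer_merge(left_rows, right_rows, left_key, right_key, new_columns):
    """Keep every left row; add new_columns from matching right rows."""
    # Index the right table by its related column.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    merged = []
    for row in left_rows:
        # No match: keep the left row once, with empty (None) values.
        matches = index.get(row[left_key], [None])
        for match in matches:
            extra = {
                column: (match[column] if match is not None else None)
                for column in new_columns
            }
            merged.append({**row, **extra})
    return merged


# Hypothetical tables, for illustration only.
sales = [
    {"Product": "Quad", "Units": 3},
    {"Product": "Aspen", "Units": 10},
    {"Product": "Unknown", "Units": 1},
]
prices = [
    {"Product": "Quad", "Price": 43.95},
    {"Product": "Aspen", "Price": 24.95},
]

for row in left_outer_merge(sales, prices, "Product", "Product", ["Price"]):
    print(row)
```

Unlike VLOOKUP, which returns only the first match, a left row with several matches in the second table produces one output row per match in this sketch.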
8. Data Types in Power Query
1) Unlike in Excel, in Power Query we must properly define each Field with a Data Type. If we do not define the correct Data
Type, for example a dollar amount as Currency, then some of the calculations in Power Query, Excel, Power
Pivot and Power BI Desktop will not work correctly.
2) Here is a list of the Data Types in Power Query:
Data Types in Power Query     Short Definition
Decimal Number                Max 15 digits
Fixed Decimal Number          Max 4 decimals to right of decimal
Whole Number                  No digits to right of decimal
Date/Time                     Date and Time together
Date                          Just Date
Time                          Just Time
Date/Time/Timezone            Same as Date/Time once loaded into the model
Duration                      Length of Time
Text                          Text - max length 268,435,456 Unicode characters
True/False                    True/False

Data Types in Power Query - with Long Definition


Decimal Number – Represents a 64 bit (eight-byte) floating point number. It’s the most common number type and corresponds to
numbers as you usually think of them. Although designed to handle numbers with fractional values, it also handles whole numbers.
The Decimal Number type can handle negative values from -1.79E+308 through -2.23E-308, 0, and positive values from 2.23E-308
through 1.79E+308. For example, numbers like 34, 34.01, and 34.000367063 are valid decimal numbers. The largest precision
that can be represented in a Decimal Number type is 15 digits long. The decimal separator can occur anywhere in the number. The
Decimal Number type corresponds to how Excel stores its numbers.

Fixed Decimal Number – Has a fixed location for the decimal separator. The decimal separator always has four digits to its right and
allows for 19 digits of significance. The largest value it can represent is 922,337,203,685,477.5807 (positive or negative). The Fixed
Decimal Number type is useful in cases where rounding might introduce errors. When you work with many numbers that have
small fractional values, they can sometimes accumulate and force a number to be slightly off. Since the values past the four digits to
the right of decimal separator are truncated, the Fixed Decimal type can help you avoid these kinds of errors. If you’re familiar with
SQL Server, this data type corresponds to SQL Server’s Decimal (19,4), or the Currency Data type in Power Pivot.
Whole Number – Represents a 64 bit (eight-byte) integer value. Because it’s an integer, it has no digits to the right of the decimal
place. It allows for 19 digits; positive or negative whole numbers between -9,223,372,036,854,775,808 (-2^63) and
9,223,372,036,854,775,807 (2^63-1). It can represent the largest possible precision of the various numeric data types. As with the
Fixed Decimal type, the Whole Number type can be useful in cases where you need to control rounding.
Date/Time  – Represents both a date and time value. Underneath the covers, the Date/Time value is stored as a Decimal Number
Type. So you can actually convert between the two. The time portion of a date is stored as a fraction to whole multiples of 1/300
seconds (3.33 ms). Dates between years 1900 and 9999 are supported.
Date  – Represents just a Date (no time portion). When converted into the model, a Date is the same as a Date/Time value with zero
for the fractional value.
Time  – Represents just Time (no Date portion). When converted into the model, a Time value is the same as a Date/Time value with
no digits to the left of the decimal place.
Date/Time/Timezone – Represents a UTC Date/Time. Currently, it’s converted into Date/Time when loaded into the model.
Duration – Represents a length of time. It’s converted into a Decimal Number Type when loaded into the model. As a Decimal
Number type it can be added or subtracted from a Date/Time field with correct results. As a Decimal Number type, you can easily
use it in visualizations that show magnitude.
Text - A Unicode character data string. Can be strings, numbers, or dates represented in a text format. Maximum string length is
268,435,456 Unicode characters (256 mega characters) or 536,870,912 bytes.
True/False – A Boolean value of either a True or False.
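
The numeric limits and the rounding behavior described above can be checked with a short Python sketch. It uses Python's own int, float and Decimal types as stand-ins for the Power Query types, which is an assumption of this illustration; the helper name to_fixed_decimal is hypothetical. The largest Fixed Decimal Number quoted above equals (2^63-1) divided by 10,000, and the sketch confirms that arithmetic.

```python
from decimal import Decimal, ROUND_DOWN

# Whole Number: limits of a 64-bit signed integer.
WHOLE_MIN = -2**63
WHOLE_MAX = 2**63 - 1

# Largest Fixed Decimal Number: the Whole Number maximum with the
# decimal separator moved four places to the left.
FIXED_MAX = Decimal(WHOLE_MAX) / Decimal(10**4)


def to_fixed_decimal(value):
    """Truncate a value to four digits right of the decimal separator."""
    return Decimal(str(value)).quantize(Decimal("0.0001"), rounding=ROUND_DOWN)


# Floating point (like Decimal Number): small errors can accumulate.
float_total = sum([0.1] * 10)

# Fixed four-decimal arithmetic: the same sum stays exact.
fixed_total = sum(to_fixed_decimal(0.1) for _ in range(10))

print(WHOLE_MAX)
print(FIXED_MAX)
print(float_total == 1.0)            # False: the float sum is slightly off
print(fixed_total == Decimal("1"))   # True: no accumulated error
print(to_fixed_decimal(2.34567))     # digits past the fourth are truncated
```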

9. Overview of Three Examples in Video

10. VLOOKUP Video Example

11. Power Query Video Example

12. Power Pivot Relationships feature & Implicit Measure feature
