An introductory session to DAX and common analytic patterns that we've built and used in enterprise environments. This session was originally presented at SQL Saturday Silicon Valley 2016.
4. Data Analysis Expressions (DAX)
What is DAX?
DAX is a language that allows us to write dynamic expressions for relational
constructs, using familiar functions
Powerful dynamic data analysis tool for relational data
Expressions can traverse relationships!
Available in PowerPivot, PowerBI, and SSAS Tabular
Classic “Import Mode”
What isn’t DAX?
NOT a programming language
6. Measures
A measure is a formula/expression comprising functions applied to
columns and tables
Reusable aggregation evaluated differently, depending on how you use it
Measures can be nested
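As a minimal sketch of a measure and a nested measure, assuming a 'Sales' table with a [SalesAmount] column (the measure names here are illustrative only):

```dax
-- Base measure: an aggregation applied to a column
Total Sales = SUM ( 'Sales'[SalesAmount] )

-- Nested measure: reuses [Total Sales] instead of repeating the SUM
Average Sale per Row = DIVIDE ( [Total Sales], COUNTROWS ( 'Sales' ) )
```

Both measures are recalculated for whatever rows, columns, and slicers are in play, which is what makes them reusable.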
7. Functions
Logical
IF( <logical test>, <value if true>, <value if false> ) – checks a condition and returns one value if it is true, another if it is false
SWITCH( <expression>, <value>, <result>, … ) – evaluates an expression against a list of values, and returns the result corresponding to the first matching value
TRUE() – returns logical true
Aggregate
SUM( <column> ) – adds all the numbers in a column
DIVIDE( <numerator>, <denominator> [, <alternateresult>] ) – division that returns the optional <alternateresult> (BLANK by default) when the denominator is zero
Statistical
MAX( <column> ) – returns the largest numeric value in a <column>
MIN( <column> ) – returns the smallest value in a <column>
Text
BLANK() – returns a blank
Filter
FILTER(<table>, <filter>) – returns a table representing a subset of another table or expression
ALL( <table> | <column>) – returns all the rows in a table, ignoring any filter context
VALUES(<table or column>) – returns one column of the distinct values from the specified column or table
CALCULATE( <expression>, <filter1>, <filter2>, … ) – evaluates an expression in a context that is modified by the specified filters
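To show how CALCULATE and ALL combine, here is a hedged sketch of a percent-of-total measure. It assumes a 'Sales' table with a [SalesAmount] column; the names are illustrative, not taken from the demo model.

```dax
-- Assumed base measure
Total Sales = SUM ( 'Sales'[SalesAmount] )

-- ALL ( 'Sales' ) ignores the current filter context on 'Sales',
-- so the denominator is the grand total
Sales % of Total =
DIVIDE (
    [Total Sales],
    CALCULATE ( [Total Sales], ALL ( 'Sales' ) )
)
```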
8. Functions (cont’d)
Date and Time
TODAY() – returns the current date
NOW() – returns the current date and time in datetime format
Time Intelligence*
DATESBETWEEN(<dates>, <start date>, <end date>) – returns a table of <dates> starting with the <start date> and continuing until the <end date>
NEXTDAY(<dates>) – returns a table that contains a column with the next dates following each of the <dates> passed
FIRSTDATE(<dates>) – returns the first date in the context of the specified column of dates
LASTDATE(<dates>) – returns the last date in the context of the specified column of dates
SAMEPERIODLASTYEAR(<dates>) – returns a table with a column of dates shifted one year back for each of the <dates> specified
LASTNONBLANK( <column>, <expression> ) – returns the last value in the <column> where the <expression> is not blank
FIRSTNONBLANK( <column>, <expression> ) – returns the first value in the <column> where the <expression> is not blank
*Requires Date Table
9. Evaluation Contexts
Evaluation Contexts:
Filter Context
Four types of filter context:
1. Row Selection
2. Column Selection
3. Slicer Selection
4. Filter Selection
Defines the subset of data a measure is calculated over, i.e. “Which rows are selected based on which attribute values?”
Applied before anything else
Row Context
All the columns in the Current Row
“DAX is simple, it’s not easy, but it’s
simple”
- Alberto Ferrari
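The two contexts can be contrasted with a small sketch. It assumes a hypothetical 'Sales' table with [Quantity] and [UnitPrice] columns, none of which come from the demo model.

```dax
-- Calculated column on 'Sales': evaluated once per row (row context)
Line Amount = 'Sales'[Quantity] * 'Sales'[UnitPrice]

-- Measure: evaluated over whichever rows the filter context leaves visible
Total Line Amount = SUM ( 'Sales'[Line Amount] )

-- Iterator: SUMX creates a row context for each visible row of 'Sales'
Total Line Amount (SUMX) = SUMX ( 'Sales', 'Sales'[Quantity] * 'Sales'[UnitPrice] )
```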
16. Cumulative Total Measures
Aggregates values of a column for the currently selected date and all previous dates within
the specified range
Can be used to derive balances from transactions eg.
Inventory Stock
Balances
Cumulative Balances
Does not require use of Time Intelligence functions
17. Cumulative Measure Demo - PowerBI
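Cumulative Energy Generated (Checked) =
IF (
    -- Only return a value for dates up to the last date with data in 'Output'
    MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
    CALCULATE (
        SUM ( 'Output'[Energy Generated] ),
        -- All dates up to and including the last date in the current context
        FILTER ( ALL ( 'Date'[Date] ), 'Date'[Date] <= MAX ( 'Date'[Date] ) )
    )
)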
18. Year-to-Date Total
TOTALYTD function applies the expression for all data from the start of the year to the
currently selected date in the filter context
Year To Date =
TOTALYTD( <expression>, <dates> [, <filter>] [, <year end date>] )
19. Year-to-Date Total Demo - PowerBI
Total Energy Generated YTD =
TOTALYTD ( SUM ( 'Output'[Energy Generated] ), 'Date'[Date] )
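The optional <year end date> argument can be sketched as follows. A fiscal year ending June 30 is assumed purely for illustration.

```dax
-- Assumption: the fiscal year ends on June 30.
-- The string literal is read as the <year end date> argument.
Total Energy Generated Fiscal YTD =
TOTALYTD ( SUM ( 'Output'[Energy Generated] ), 'Date'[Date], "6/30" )
```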
20. Year Over Year
Use time intelligence to calculate an aggregate for the same period last year
“Last Year” measure is used to compare to “Current Year” measure, and/or to derive a
measure of the change year-over-year
21. Year-Over-Year Demo – PowerBI
Total Energy Generated Last Year =
CALCULATE ( [Total Energy Generated], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
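A hedged sketch of the derived year-over-year measures follows. The definition of [Total Energy Generated] is assumed to be a plain SUM, and the two YoY measure names are illustrative.

```dax
-- Assumed definition of the base measure referenced above
Total Energy Generated = SUM ( 'Output'[Energy Generated] )

-- Absolute change against the same period last year
Energy Generated YoY Change =
[Total Energy Generated] - [Total Energy Generated Last Year]

-- DIVIDE returns BLANK instead of an error when last year's value is zero or blank
Energy Generated YoY % =
DIVIDE ( [Energy Generated YoY Change], [Total Energy Generated Last Year] )
```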
22. Semi-Additive Measures
Snapshot Fact Table with balance values, such as Inventory or
Account Balances
In these scenarios we cannot simply sum across time
The solution is to sum across all attributes except for time by
filtering for only a single point in time (eg. last date in the period)
Several functions allow you to adjust filter context to a single
point in time, within the original context period
FIRSTDATE / LASTDATE
FIRSTNONBLANK / LASTNONBLANK
OPENING… / CLOSING…
23. Semi-Additive Measure Demo - PowerPivot
Total On Hand Quantity LASTNONBLANK =
CALCULATE (
    SUM ( Inventory[OnHandQuantity] ),
    LASTNONBLANK (
        'Date'[Date],
        CALCULATE ( SUM ( Inventory[OnHandQuantity] ) )
    )
)
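For comparison, the other function families listed for semi-additive measures can be sketched against the same Inventory[OnHandQuantity] column. The measure names are illustrative.

```dax
-- LASTDATE: balance on the last date in the current filter context
-- (blank when there is no snapshot row on that date)
Total On Hand Quantity LASTDATE =
CALCULATE (
    SUM ( Inventory[OnHandQuantity] ),
    LASTDATE ( 'Date'[Date] )
)

-- CLOSINGBALANCEMONTH: balance on the last date of the month
Total On Hand Quantity Month Close =
CLOSINGBALANCEMONTH ( SUM ( Inventory[OnHandQuantity] ), 'Date'[Date] )
```

LASTNONBLANK is the more forgiving choice when snapshots are not taken on every date.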
24. Disconnected Slicers
Allows you to use a slicer to modify measures
Measure Switching
Used to switch between a set of measure values in a container measure
25. Disconnected Slicers Demo - PowerBI
Setup Steps:
1. Create/identify your target measures (eg. [Energy Exported] & [Energy
Generated])
2. Create disconnected table to use in slicer selection
3. Create background “value selection measure” using MAX()
• Hide this!
4. Create SWITCH measure to use in visualizations
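The setup steps above can be sketched as follows. The table name 'disconnected_table', its [id] and [name] columns, the id values, and the measure names are assumptions for illustration; [Energy Exported] and [Energy Generated] are the target measures from step 1.

```dax
-- Step 2: disconnected table (no relationships to the rest of the model)
disconnected_table =
DATATABLE (
    "id", INTEGER,
    "name", STRING,
    { { 1, "Energy Exported" }, { 2, "Energy Generated" } }
)

-- Step 3: value selection measure (hide this from report authors)
Selected Measure ID = MAX ( 'disconnected_table'[id] )

-- Step 4: switching measure to use in visualizations
Selected Energy Measure =
SWITCH (
    [Selected Measure ID],
    1, [Energy Exported],
    2, [Energy Generated],
    BLANK ()
)
```

Put 'disconnected_table'[name] in the slicer. With nothing selected, MAX returns the highest id, so the measure falls back to [Energy Generated].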
27. Value Binning
Used to group similar values, or to bin values for
analysis aka Histograms
Can bin values based on equality, or inequality
comparisons with the SWITCH() function
Use Cases:
Age groups
Product Groups
Any kind of frequency distributions
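A minimal sketch of inequality-based binning with SWITCH, written as a calculated column. The 'Customer' table, its [Age] column, and the bin boundaries are hypothetical.

```dax
-- Calculated column on 'Customer': SWITCH ( TRUE (), ... ) returns the
-- result for the first condition that evaluates to TRUE
Age Group =
SWITCH (
    TRUE (),
    'Customer'[Age] < 18, "Under 18",
    'Customer'[Age] < 35, "18-34",
    'Customer'[Age] < 65, "35-64",
    "65+"
)

-- Frequency measure: place 'Customer'[Age Group] on rows to get a histogram
Customer Count = COUNTROWS ( 'Customer' )
```

Order matters here: the conditions are tested top to bottom, so each bin only needs its upper bound.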
29. Summary
DAX is dynamic because you can write measures that correctly evaluate under
their current Evaluation Context
Filter Context
Row Context
Functions are the building blocks of our measures and perform a myriad of
tasks eg. altering Context, aggregating, logical operations, time intelligence,
etc
Time Intelligence functions require a Date table to operate
Understanding Tabular Data Modeling will go a long way towards helping your
understanding of DAX
How many people have worked with Excel formulas?
How many people have worked with PowerPivot?
Story:
Introduced to data by my Dad (scary DBA types)
Began learning Relational Databases and attending SQL Saturdays
Indexing internals – Kalen Delaney
T-SQL – Kevin Boles
Merge Operators – Ami Levin
Developer/DBA
Really interested in BI (DW)
Realized my data analysis tool belt was lacking
So I decided to present on DAX
Introduction to DAX
Introduction to Measure and Analysis concepts
Introduction to Evaluation Context
Introduction to Measure and Calculated Column Patterns (which happen to often alter Evaluation Context)
“DAX is a language that allows us to write dynamic expressions for relational constructs, using familiar functions”
What is DAX?
though it shares functions with Excel formula language, it differs by being intended for analysis of relational data
There are some minor differences between PowerBI and PowerPivot eg. the colon (PowerPivot measure definitions use := where PowerBI uses =)
Review CALCULATE in detail
We need to understand Evaluation Context so that we can begin altering it.
Filter Context
Row Context
Iterator Functions… not going to cover these much
Point out the parts of the measure syntax
What sales does the measure sum?
Sum of all sales
Why do the numbers change when we add color to the rows?
The measure still evaluates sum of all sales
However now, it is operating under the context of a color, and can only sum sales for that color!
The value of a formula depends on its context.
What we put in a context filters the subset of data we can measure, and so we call it the Filter Context!
Diagram View:
Already setup data model ( imported tables )
Created relationships between tables; DAX can traverse these without an explicit join
Data Grid:
Created simple SUM
Shows SUM of all Sales
Add Continent to Rows; filter context changes
Add ‘Product’[Product Class] to Columns; filter context changes
Diagram View:
Create SUM that adds Asia sales | CALCULATE ( [Total Sales], 'Geography'[Continent] = "Asia" ), where [Total Sales] = SUM ( 'Sales'[SalesAmount] )
Show results
Create SUM that only adds Asia sales; returns blank for all others | CALCULATE ( [Total Sales], FILTER ( 'Geography', 'Geography'[Continent] = "Asia" ) )
Demo: show basic measures in PowerPivot then PowerBI
Demo: CALCULATE with a simple argument against Geography
To avoid calculating values for dates greater than the max date in the transactions table ('Output'), add a check that the minimum 'Date'[Date] <= maximum 'Output'[Date]
Can also replace the MAX ( 'Output'[Date] ) check with TODAY(); however, this is not always valid as it assumes all data is “current”
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
TOTALYTD function applies the expression for all data from the start of the year to the currently selected date in the filter context
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
DIVIDE ( [Sales] - [Last Year Sales], [Last Year Sales] )
Paste to exclude future dates:
IF ( MIN ( 'Date'[Date] ) <= CALCULATE ( MAX ( 'Output'[Date] ), ALL ( 'Output' ) ),
How can we alter the filter context to only a single point in time?
Using CALCULATE to override the context
Measures using FIRSTDATE / LASTDATE will not return values if the first or last date of the period has no data
* Time intelligence functions in PowerBI require the DateKey to be a Date data type
* The expression argument in LASTNONBLANK must be wrapped in a CALCULATE, otherwise it will use the original filter context, and not the LASTNONBLANK column argument context
Setup Target Measures ( [Energy Export] & [Energy Generated] )
User defined table ( disconnected )
Value selection measure ( MAX ( 'disconnected_table'[id] ) )
Switching measure
Demo: Inventory Aging
In this case, I am using the Inventory Semi-Additive Measure as well