Mastering DAX - Color 1 Slide Per Page
We write books. We teach courses. We provide consulting. We are recognized as BI experts.
Remote consulting, Power BI/SSAS optimization, BI architectural review, on-site consulting, custom training & mentoring
www.sqlbi.com
The DAX language
o Language of
• Power BI
• Analysis Services Tabular
• Power Pivot
o DAX is simple, but it is not easy
o New programming concepts and patterns
Agenda (1)
• Evaluation Contexts
Hands-on 3
• CALCULATE
Introduction to DAX
What is DAX?
o Programming language
• Power BI
• Analysis Services Tabular
• Power Pivot
o Resembles Excel
• Because it was born with PowerPivot
• Important differences
• No concept of «row» and «column»
• Different type system
o Many new functions
o Designed for data models and business
Functional language
DAX is a functional language: execution flows through function calls. Here is an example of a
DAX formula
=SUMX (
FILTER (
VALUES ( 'Date'[Year] ),
'Date'[Year] < 2005
),
IF (
'Date'[Year] >= 2000,
[Sales Amount] * 100,
[Sales Amount] * 90
)
)
If it is not formatted, it is not DAX.
Code formatting is of paramount importance in DAX.
=SUMX(FILTER(VALUES('Date'[Year]),'Date'[Year]<2005),IF('Date'[Year]>=2000,[Sales Amount]*100,[Sales Amount]*90))
=SUMX (
FILTER (
VALUES ( 'Date'[Year] ),
'Date'[Year] < 2005
),
IF (
'Date'[Year] >= 2000,
[Sales Amount] * 100,
[Sales Amount] * 90
)
)
www.daxformatter.com
DAX types
o Numeric types
• Integer (64 bit)
• Decimal (floating point)
• Currency (money)
• Date (DateTime)
• TRUE / FALSE (Boolean)
o Other types
• String
• Binary Objects
Format strings are not data types!
DAX type handling
o Operator Overloading
• Operators are not strongly typed
• The result depends on the inputs
o Example:
• "5" + "4" = 9
• 5 & 4 = "54"
o Conversion happens when needed
• That is: when you don’t want it to happen
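The two conversions above can be checked with a one-row query. This is a minimal sketch; the column names "Sum" and "Concatenation" are only illustrative labels.

```dax
EVALUATE
ROW (
    "Sum", "5" + "4",          -- the strings are converted to numbers: 9
    "Concatenation", 5 & 4     -- the numbers are converted to strings: "54"
)
```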
DateTime
o Floating point value
o Integer part
• Number of days after December 30, 1899
o Decimal part
• Seconds: 1 / ( 24 * 60 * 60 )
o DateTime Expressions
• Date + 1 = The day after
• Date - 1 = The day before
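Because a DateTime is stored as a number of days, date arithmetic is plain arithmetic. A minimal sketch, using an arbitrary date; the labels are illustrative only.

```dax
EVALUATE
ROW (
    "The day after", DATE ( 2015, 5, 15 ) + 1,
    "The day before", DATE ( 2015, 5, 15 ) - 1,
    "Twelve hours later", DATE ( 2015, 5, 15 ) + 0.5    -- 0.5 days = 12 hours
)
```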
Column references
o The general format to reference a column
• 'TableName'[ColumnName]
o Quotes can be omitted
• If TableName does not contain spaces
• Do it: don’t use spaces, omit quotes
o TableName can be omitted
• Current table is searched for ColumnName
• Don’t do it, harder to understand
Column values / references
You use the same syntax to reference a column or its value. The semantics depend on the
context, that is, on the function that uses the column reference
--
-- Here you want the value of the columns for a specific row
--
IF (
Sales[Quantity] > 3,
Sales[Net Price] * Sales[Quantity]
)
--
-- Here you reference the entire column
--
DISTINCTCOUNT ( Sales[Quantity] )
Calculated columns
o Columns computed using DAX
o Always computed row by row
o Product[Price] means
• The value of the Price column (explicit)
• In the Product table (explicit, optional)
• For the current row (implicit)
• Different for each row
Calculated columns and measures
o GrossMargin = SalesAmount - ProductCost
• Calculated column
o GrossMargin% = GrossMargin / SalesAmount
• Cannot be computed row by row
o Measures needed
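A percentage has to be computed on the aggregated values, not row by row. The following is a minimal sketch of the two measures, assuming a Sales table with SalesAmount and ProductCost columns (the column names follow the slide, the table name is an assumption):

```dax
-- Gross margin as a measure: iterate Sales and sum the row-level difference
GrossMargin :=
SUMX ( Sales, Sales[SalesAmount] - Sales[ProductCost] )

-- Percentage: ratio of the two aggregated values, never the sum of row-level ratios
GrossMargin % :=
DIVIDE (
    SUMX ( Sales, Sales[SalesAmount] - Sales[ProductCost] ),
    SUM ( Sales[SalesAmount] )
)
```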
Measures
o Written using DAX
o Do not work row by row
o Instead, use tables and aggregators
o Do not have the «current row» concept
o Examples
• GrossMargin
• is a calculated column
• but can be a measure too
• GrossMargin %
• needs to be a measure
Naming convention
o Measures should not belong to a table
• Avoid table name
• [Margin%] instead of Sales[Margin%]
• Easier to move to another table
• Easier to identify as a measure
o Thus
• Calculated columns Table[Column]
• Measures [Measure]
Measures vs calculated columns
o Use a column when
• Need to slice or filter on the value
o Use a measure
• Calculate percentages
• Calculate ratios
• Need complex aggregations
o Space and CPU usage
• Columns consume memory
• Measures consume CPU
Aggregation functions
o Useful to aggregate values
• SUM
• AVERAGE
• MIN
• MAX
o Aggregate only one column
• SUM ( Orders[Price] ) is valid
• SUM ( Orders[Price] * Orders[Quantity] ) is not valid: SUM accepts a column reference, not an expression
The «X» aggregation functions
o Iterators: useful to aggregate formulas
• SUMX
• AVERAGEX
• MINX
• MAXX
o Iterate over the table and evaluate the expression for
each row
o Always receive two parameters
• Table to iterate
• Formula to evaluate for each row
Example of SUMX
For each row in the Sales table, evaluates the formula, then sums up all the results
Inside the formula, there is a «current row»
SUMX (
Sales,
Sales[Price] * Sales[Quantity]
)
Handling errors
o SUMX ( Sales, Sales[Quantity] )
o fails if Sales[Quantity] is a string and cannot be converted
o Causes of errors
• Conversion errors
• Arithmetical operations
• Empty or missing values
ISERROR (Expression)
Evaluates Expression
Returns TRUE or FALSE, depending on the presence of an error during evaluation
ISERROR (
Sales[GrossMargin] / Sales[Amount]
)
IFERROR (Expression, Alternative)
In case of error returns Alternative
Useful to avoid writing expression twice
IFERROR (
Sales[GrossMargin] / Sales[Amount],
BLANK ()
)
Counting values
o Useful to count values
• COUNT/COUNTA (counts anything but blanks)
• COUNTBLANK (counts blanks)
• COUNTROWS (rows in a table)
• DISTINCTCOUNT (performs distinct count)
o COUNTROWS ( Sales )
= COUNT ( Sales[SalesID] )
+ COUNTBLANK ( Sales[SalesID] )
Logical functions
o Provide Boolean logic
• AND
• OR
• NOT
• IF
• IFERROR
o AND / OR can be expressed with operators:
AND ( A, B ) = A && B
OR ( A, B ) = A || B
IN operator
o Verify if the result of an expression is included in a list of
values:
Customer[State] IN { "WA", "NY", "CA" }
SizeDesc =
SWITCH (
Product[Size],
"S", "Small",
"M", "Medium",
"L", "Large",
"XL", "Extra Large",
"Other"
)
Switch to perform CASE WHEN
Creative usage of SWITCH might be very useful
DiscountPct =
SWITCH (
TRUE (),
Product[Size] = "S", 0.5,
AND ( Product[Size] = "L", Product[Price] < 100 ), 0.2,
Product[Size] = "L", 0.35,
0
)
Information functions
o Provide information about expressions
• ISBLANK
• ISNUMBER
• ISTEXT
• ISNONTEXT
• ISERROR
o Not very useful, you should know the type of the
expressions you are using
MAX and MIN 2015
--
-- Computes the maximum of sales amount
--
MAX ( Sales[SalesAmount] )
-- is equivalent to
MAXX ( Sales, Sales[SalesAmount] )
--
-- Computes the maximum between amount and listprice
--
MAX ( Sales[Amount], Sales[ListPrice] )
-- is equivalent to
IF (
Sales[Amount] > Sales[ListPrice],
Sales[Amount],
Sales[ListPrice]
)
Mathematical functions
o Provide… math
• ABS, EXP
• FACT, LN
• LOG, LOG10
• MOD, PI
• POWER, QUOTIENT
• SIGN, SQRT
o Work exactly as you would expect
The DIVIDE function
DIVIDE is useful to avoid using IF inside an expression to check for zero denominators
IF (
Sales[SalesAmount] <> 0,
Sales[GrossMargin] / Sales[SalesAmount],
0
)
DIVIDE (
Sales[GrossMargin],
Sales[SalesAmount],
0
)
Using variables 2015
VAR
TotalQuantity = SUM ( Sales[Quantity] )
RETURN
IF (
TotalQuantity > 1000,
TotalQuantity * 0.95,
TotalQuantity * 1.25
)
Rounding functions
o Many different rounding functions:
• FLOOR ( Value, 0.01 )
• TRUNC ( Value, 2 )
• ROUNDDOWN ( Value, 2 )
• MROUND ( Value, 0.01 )
• ROUND ( Value, 2 )
• CEILING ( Value, 0.01 )
• ISO.CEILING ( Value, 2 )
• ROUNDUP ( Value, 2 )
• INT ( Value )
Text functions
o Very similar to Excel ones
• CONCATENATE,
• FIND, FIXED, FORMAT,
• LEFT, LEN, LOWER, MID,
• REPLACE, REPT, RIGHT,
• SEARCH, SUBSTITUTE, TRIM,
• UPPER, VALUE, EXACT
• CONCATENATEX
Date functions
o Many useful functions
• DATE, DATEVALUE, DAY, EDATE,
• EOMONTH, HOUR, MINUTE,
• MONTH, NOW, SECOND, TIME,
• TIMEVALUE, TODAY, WEEKDAY,
• WEEKNUM, YEAR, YEARFRAC
Table Functions
Table functions
o Basic functions that work on full tables
• FILTER
• ALL
• VALUES
• DISTINCT
• RELATEDTABLE
o Their result is often used in other functions
o They can be combined together to form complex expressions
o We will discover many other table functions later in the course
Filtering a table
SUMX (
FILTER (
Orders,
Orders[Price] > 1
),
Orders[Quantity] * Orders[Price]
)
The FILTER function
o FILTER
• Adds a new condition
• Restricts the number of rows of a table
• Returns a table
• Can be iterated by an «X» function
o Needs a table as input
o The input can be another FILTER
Removing filters
SUMX (
ALL ( Orders ),
Orders[Quantity] * Orders[Price]
)
The ALL function
o ALL
• Returns all the rows of a table
• Ignores any filter
• Returns a table
• That can be iterated by an «X» function
o Needs a table as input
o Can be used with a single column
• ALL ( Customers[CustomerName] )
• The result contains a table with one column
ALL with many columns
Returns a table with all the values of all the columns passed as parameters
COUNTROWS (
ALL (
-- Columns of the same table
Orders[Channel],
Orders[Color],
Orders[Size]
)
)
ALLEXCEPT
Returns a table with all existing combinations of the given columns
ALL (
-- Orders[City] not listed here
Orders[Channel],
Orders[Color],
Orders[Size],
Orders[Quantity],
Orders[Price],
Orders[Amount]
)
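The ALL call above lists every column except Orders[City]. Assuming Orders contains only those columns plus City, the same filter removal can be sketched with ALLEXCEPT used as a CALCULATE modifier. The SUM expression is only a placeholder for illustration.

```dax
CALCULATE (
    SUM ( Orders[Amount] ),
    -- Removes the filters from all the columns of Orders,
    -- keeping only the filters on Orders[City]
    ALLEXCEPT ( Orders, Orders[City] )
)
```

The advantage of ALLEXCEPT is that the formula does not need to be updated when a new column is added to the table.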
NumOfProducts :=
COUNTROWS (
DISTINCT ( Product[ProductCode] )
)
How many values for a column?
Amount | ProductId          ProductId | Product
25.00  | 1                  1         | Coffee
12.50  | 2                  2         | Pasta
2.25   | 3                  3         | Tomato
2.50   | 3                  BLANK     | BLANK
14.00  | 4
Relationship: ProductId (left table) to ProductId (right table)
NumOfProducts :=
COUNTROWS (
VALUES ( Product[ProductCode] )
)
ALLNOBLANKROW
ALL returns the blank row, if it exists. ALLNOBLANKROW omits it
--
-- Returns only the existing products
--
= COUNTROWS (
ALLNOBLANKROW ( Products[ProductKey] )
)
Counting different values
VisibleValues =
CONCATENATEX (
VALUES ( Customer[Continent] ),
Customer[Continent],
", "
)
SelectedValues =
CONCATENATEX (
ALLSELECTED ( Customer[Continent] ),
Customer[Continent],
", "
)
RELATEDTABLE
Returns a table with all the rows related to the current one.
NumOfProducts =
COUNTROWS (
RELATEDTABLE ( Product )
)
Example of RELATEDTABLE
Compute the number of red products for a category. Build a calculated column in the Categories
table:
NumOfRedProducts =
COUNTROWS (
FILTER (
RELATEDTABLE ( Product ),
Product[Color] = "Red"
)
)
Tables and relationships
The result of a table function inherits the relationships of its columns
=SUMX (
FILTER (
ProductCategory,
COUNTROWS (
RELATEDTABLE ( Product )
) > 10
),
SUMX (
RELATEDTABLE ( Sales ),
Sales[SalesAmount]
)
)
Variables can store tables too 2015
VAR
SalesGreaterThan10 = FILTER ( Sales, Sales[Quantity] > 10 )
RETURN
SUMX (
FILTER (
SalesGreaterThan10,
RELATED ( Product[Color] ) = "Red"
),
Sales[Amount]
)
Tables with one row and one column
When a table contains ONE row and ONE column, you can treat it as a scalar value
IF (
HASONEVALUE ( Customers[YearlyIncome] ),
DIVIDE (
SUM ( Sales[Sales Amount] ),
VALUES ( Customers[YearlyIncome] )
)
)
SELECTEDVALUE 2017
Retrieves the current value from the filter context if only one is available; otherwise it returns
a default value (BLANK, unless a different one is specified)
DIVIDE (
SUM ( Sales[Sales Amount] ),
SELECTEDVALUE ( Customers[YearlyIncome] )
)
Equivalent to:
DIVIDE (
SUM ( Sales[Sales Amount] ),
IF (
HASONEVALUE ( Customers[YearlyIncome] ),
VALUES ( Customers[YearlyIncome] )
)
)
SELECTEDVALUE 2017
SELECTEDVALUE ( Customers[NumberOfChildrenAtHome], 0 )
Equivalent to:
IF (
HASONEVALUE ( Customers[NumberOfChildrenAtHome] ),
VALUES ( Customers[NumberOfChildrenAtHome] ),
0
)
Calculated tables 2015
ISEMPTY checks if a table is empty; it might be faster than the equivalent expression with COUNTROWS
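The description above matches the ISEMPTY function. A minimal sketch of the two equivalent tests, assuming a Sales table (the table name is only an example):

```dax
-- TRUE when no row of Sales is visible in the current filter context
ISEMPTY ( Sales )

-- Same test written with COUNTROWS: on an empty table COUNTROWS returns BLANK,
-- and BLANK = 0 evaluates to TRUE
COUNTROWS ( Sales ) = 0
```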
Evaluation Contexts
Evaluation contexts
o Simple concepts, hard to learn
o Refine the concept over time
• First, this introduction
• Next, several examples
• Last, the unifying theory of expanded tables
o At the beginning, it looks very easy
o When you use it, it turns into a nightmare
What is an evaluation context?
[Figure: a report where rows and slicers define the evaluation context]
Example of a filter context
Filter context in Power BI
[Figure: report filtered by Calendar Year = CY 2007; Education = High School, Partial College; Brand = Contoso]
Filter context
o Defined by
• Row Selection
• Column Selection
• Report Filters
• Slicers Selection
o Rows outside of the filter context
• Are not considered for the computation
o Defined automatically by the client, too
o Can also be created with specific functions
Row context
o Defined by
• Calculated column definition
• Defined automatically for each row
• Row Iteration functions
• SUMX, AVERAGEX …
• All «X» functions and iterators
• Defined by the user formulas
o Needed to evaluate column values, it is the concept of
“current row”
SUMX ( Orders, Orders[Quantity] * Orders[Price] )
Filter context: Channel = Internet
16 x 7 = 112
32 x 5 = 160
64 x 3 = 192
128 x 1 = 128
SUM = 592
There are always two contexts
o Filter context
• Filters tables
• Might be empty
• All the tables are visible
• But this never happens in the real world
o Row context
• Iterates rows
• For the rows active in the filter context
• Might be empty
• There is no iteration running
o Both are «evaluation contexts»
Context errors
o Orders[Quantity] * Orders[Price]
o In a calculated column
• Works fine
o In a measure
• «Cannot be determined in the current context»
• You cannot evaluate a column value outside of a row
context
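The error disappears as soon as an iterator provides the row context. A minimal sketch, where the measure name Amount is hypothetical:

```dax
-- Not valid as a measure: no row context exists for the column references
-- Amount := Orders[Quantity] * Orders[Price]

-- Valid: SUMX creates a row context on Orders, then aggregates the results
Amount :=
SUMX ( Orders, Orders[Quantity] * Orders[Price] )
```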
Working with evaluation contexts
o Evaluation contexts
• Modified by the user
• With the user interface
• Modified Programmatically
• Adding / Removing filters
• Creating row contexts
• Creating new filter contexts
o Using contexts is the key to many DAX advanced
formulas
Filtering a table
SUMX (
FILTER (
Orders,
Orders[Price] > 1
),
Orders[Quantity] * Orders[Price]
)
Filter context: Channel = Internet, Color = Green
Ignoring filters
SUMX (
ALL ( Orders ),
Orders[Quantity] * Orders[Price]
)
Filter context: Channel = Internet, Color = Green
Using RELATED in a row context
Starting from a row context, you can use RELATED to access columns in related tables.
SUMX (
Sales,
Sales[Quantity]
* RELATED ( Products[ListPrice] )
* RELATED ( Categories[Discount] )
)
SUMX (
Categories,
SUMX (
RELATEDTABLE ( Products ),
SUMX (
RELATEDTABLE ( Sales ),
( Sales[Quantity] * Products[ListPrice] ) * Categories[Discount]
)
)
)
Three row contexts:
• Categories
• Products of category
• Sales of product
Ranking by price
o Create a calculated column
o Ranks products by list price
o Most expensive product is ranked 1
Nesting row contexts 2015
When you nest row contexts on the same table, you can use a variable to save the value of the
outer row context
Products[RankOnPrice] =
-- Row context of the calculated column: get ListPrice from the outer row context
VAR CurrentListPrice = Products[ListPrice]
RETURN
COUNTROWS (
FILTER (
Products,
Products[ListPrice] > CurrentListPrice
)
) + 1
-- Same ranking written with EARLIER instead of a variable
COUNTROWS (
FILTER (
Products,
Products[ListPrice] > EARLIER ( Products[ListPrice] )
)
) + 1
CALCULATE
CALCULATE syntax
Filters are evaluated in the outer filter context, then combined together in AND and finally used
to build a new filter context into which DAX evaluates the expression
CALCULATE (
Expression,
Filter1,
…, -- repeated many times, as needed
FilterN
)
NumOfBigSales :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
Sales[SalesAmount] > 100
)
CALCULATE (
SUM ( Sales[SalesAmount] ),
Sales[SalesAmount] > 100
)
Is equivalent to
CALCULATE (
SUM ( Sales[SalesAmount] ),
FILTER (
ALL ( Sales[SalesAmount] ),
Sales[SalesAmount] > 100
)
)
CALCULATE examples
Compute the sum of sales amount for all of the products, regardless of the user selection
SalesInAllColors :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
ALL ( Product[Color] )
)
NumOfRedProducts :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
Product[Color] = "Red"
)
Filter and SUM are on different tables: here CALCULATE is your best option
What is a filter context?
CALCULATE (
…,
Product[Color] IN { "Red", "Black" },
FILTER (
ALL ( 'Date'[Year], 'Date'[Month] ),
OR (
AND ( 'Date'[Year] = 2006, 'Date'[Month] = "December" ),
AND ( 'Date'[Year] = 2007, 'Date'[Month] = "January" )
)
)
)
Intersection of filter context
Used by CALCULATE to put filters in AND
[Figure: INTERSECT and OVERWRITE of filters on Color (values shown: Yellow, Black, Blue)]
KEEPFILTERS
KEEPFILTERS retains the previous filters, instead of replacing them
CALCULATE (
CALCULATE (
…,
KEEPFILTERS ( Product[Color] IN { "Yellow", "Black" } )
),
Product[Color] IN { "Black", "Blue" }
)
[Figure: KEEPFILTERS intersects Color { Yellow, Black } with Color { Black, Blue }]
At the end, only BLACK remains visible
ALL performs removal of filters
o ALL has a very special behavior
• ALL is a table function
• But, in a CALCULATE filter it is different
o Should be named REMOVEFILTER
• Removes the filters from all the columns contained in the
table it returns
o Only when used in CALCULATE – otherwise, it is a table
function
What seems to be happening
CALCULATE (
CALCULATE (
…,
ALL ( 'Date'[Year] )
),
'Date'[Year] = 2007
)
[Figure: the filter Year = 2007 seems to be OVERWRITTEN by a filter containing all the years (2005, 2006, 2007)]
What really happens
CALCULATE (
CALCULATE (
…,
ALL ( 'Date'[Year] )
),
'Date'[Year] = 2007
)
[Figure: ALL REMOVEs the filter Year = 2007, leaving an empty filter]
CALCULATE options
o Overwrite a filter context at the individual columns level
o Remove previously existing filters (ALL)
o Add filters (KEEPFILTERS)
o In DAX you work by manipulating filters with the
following internal operators:
• INTERSECT (multiple filters in CALCULATE)
• OVERWRITE (nested CALCULATE)
• REMOVEFILTERS (using ALL)
• ADDFILTER (using KEEPFILTERS)
Use one column only in compact syntax
o Boolean filters can use only one column
• You cannot mix multiple columns in the same
boolean expression
-- Not allowed: the Boolean filter references two columns
CALCULATE (
SUM ( Orders[Amount] ),
Orders[Quantity] * 2 < Orders[Price]
)
-- Works, but it filters the whole table
CALCULATE (
SUM ( Orders[Amount] ),
FILTER (
Orders,
Orders[Quantity] * 2 < Orders[Price]
)
)
Filters on tables are bad. Not only bad… REALLY BAD!
Later we will see the details, for now just remember: BAD BAD BAD!
Overwriting filters on multiple columns
If you want to overwrite a filter on multiple columns, ALL with many columns is your best friend
CALCULATE (
SUM ( Orders[Amount] ),
FILTER (
ALL ( Orders[Quantity], Orders[Price] ),
Orders[Quantity] * 2 < Orders[Price]
)
)
Cannot use aggregators in compact syntax
Boolean filters cannot use aggregators
CALCULATE (
SUM( Orders[Amount] ),
Orders[Quantity] < SUM ( Orders[Quantity] ) / 100
)
ALLSELECTED ( table[column] )
ALLSELECTED ( table )
ALLSELECTED ()
ALLSELECTED: visual totals
CALCULATE (
SUM ( [SalesAmount] ),
ALLSELECTED ( DimCustomer[Occupation] )
)
= 11,467,188.64
ALLSELECTED ( )
CALCULATE (
SUM ( [SalesAmount] ),
ALLSELECTED ( )
)
= 22,918,596.36
Variables and evaluation contexts 2015
Variables are computed in the evaluation context where they are defined, not in the one where they are
used
WrongRatio :=
VAR
Amt = SUM ( Sales[SalesAmount] )
RETURN
DIVIDE (
Amt,
CALCULATE ( Amt, ALL ( Sales ) )
)
Result is always 1
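For comparison, a possible correct version evaluates the expression (not the variable) inside CALCULATE. This is a minimal sketch; the names CorrectRatio and AllAmt are hypothetical.

```dax
CorrectRatio :=
VAR Amt = SUM ( Sales[SalesAmount] )
-- The aggregation is written inside CALCULATE, so it is evaluated
-- in the filter context modified by ALL ( Sales )
VAR AllAmt = CALCULATE ( SUM ( Sales[SalesAmount] ), ALL ( Sales ) )
RETURN
    DIVIDE ( Amt, AllAmt )
```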
Context Transition
o CALCULATE performs another task
o If executed inside a row context
• It takes the row context
• Transforms it into an equivalent filter context
• Applies it to the data model
• Before computing its expression
o Very important and useful feature
• Better to learn it writing some code…
Automatic CALCULATE
Whenever a measure is computed, an automatic CALCULATE is added around the measure
SUMX (
Orders,
[Sales Amount]
)
Is equivalent to
SUMX (
Orders,
CALCULATE ( [Sales Amount] )
)
Equivalent filter context
The filter context created during context transition filters all equivalent rows.
By «equivalent» we mean rows with all columns identical (possibly duplicated rows).
TotalNetSales :=
SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
SalesWrong :=
SUMX (
FILTER ( Sales, Sales[Unit Discount] > 0 ),
[TotalNetSales]
)
SalesCorrect :=
SUMX (
FILTER ( Sales, Sales[Unit Discount] > 0 ),
Sales[Quantity] * Sales[Net Price]
)
Circular dependency
A circular dependency happens when two formulas depend on each other
DimProduct[ProfitPct] :=
DimProduct[Profit] / DimProduct[ListPrice]
DimProduct[Profit] :=
DimProduct[ProfitPct] * DimProduct[StandardCost]
CALCULATE dependencies
SumOfListPrice =
CALCULATE (
SUM ( Products[ListPrice] )
)
Sums the value of ListPrice for all the rows in the DimProduct table which have
the same value for ProductKey, ProductAlternateKey, StandardCost and
ListPrice.
Circular dependencies
NewSumOfListPrice =
CALCULATE (
SUM ( Products[ListPrice] )
)
SumOfListPrice: sums the value of ListPrice for all the rows in the DimProduct table which have the
same value for ProductKey, ProductAlternateKey, StandardCost, ListPrice and
NewSumOfListPrice.
NewSumOfListPrice: sums the value of ListPrice for all the rows in the DimProduct table which have the
same value for ProductKey, ProductAlternateKey, StandardCost, ListPrice and
SumOfListPrice.
Circular dependency solution
o Either:
• Mark a column as the row identifier
• Create an incoming relationship
• Use ALLEXCEPT to remove dependencies
o With a row identifier
• CALCULATE columns depend on the row id
• All of them, no circular dependency
o Row Identifiers are expensive for fact tables
• Maximum number of distinct values
• Use with care, avoid if possible
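The ALLEXCEPT option can be sketched as follows: the context transition is restricted to a single column, so the calculated column no longer depends on the other calculated columns of the table. This assumes Products[ProductKey] uniquely identifies each row.

```dax
SumOfListPrice =
CALCULATE (
    SUM ( Products[ListPrice] ),
    -- Keep only the filter on the key column generated by context transition;
    -- the filters on all the other columns (calculated ones included) are removed
    ALLEXCEPT ( Products, Products[ProductKey] )
)
```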
Evaluation Contexts / part 2
o Time to write some DAX code by yourself.
o The next exercise session focuses on some DAX code that
uses evaluation contexts.
o Please refer to lab number 3 on the hands-on manual –
just complete these exercises:
• Finding the Best Customers (p. 11-12)
• Understanding CALCULATE (p. 15-18)
Time to start thinking in DAX
There are many useful iterators, they all behave in the same way: iterate on a table, compute an
expression and aggregate its value
MAXX
MINX
AVERAGEX
SUMX
PRODUCTX
CONCATENATEX
VARX
STDEVX
PERCENTILEX.EXC | .INC
GEOMEANX
MIN-MAX sales per customer
Iterators can be used to compute values at a different granularity than the one set in the report
MinSalesPerCustomer :=
MaxSalesPerCustomer :=
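The bodies of the two measures are not present in the text above. One possible way to write them, assuming a Customer table related to Sales and a [Sales Amount] measure (both are assumptions), is to iterate the customers and rely on context transition:

```dax
-- For each visible customer, [Sales Amount] is computed through context
-- transition; the iterator then returns the smallest or largest value
MinSalesPerCustomer :=
MINX ( Customer, [Sales Amount] )

MaxSalesPerCustomer :=
MAXX ( Customer, [Sales Amount] )
```

Whatever the granularity of the report, the calculation happens customer by customer.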
VAR.S ( <column> )
VAR.P ( <column> )
VARX.S ( <table>, <expression> )
VARX.P ( <table>, <expression> )
STDEV.S ( <column> )
STDEV.P ( <column> )
STDEVX.S ( <table>, <expression> )
STDEVX.P ( <table>, <expression> )
PERCENTILE functions 2015
Returns the k-th percentile of values in a range, where k is in the range 0..1 (Exclusive or
Inclusive)
PERCENTILE.EXC
PERCENTILE.INC
PERCENTILEX.EXC
PERCENTILEX.INC
MEDIAN
MEDIANX
GEOMEAN
GEOMEANX
XIRR
XNPV
PRODUCTX
The RANKX function
Useful to compute ranking, not so easy to master…
--
-- Syntax is easy…
--
Ranking :=
RANKX (
Products,
Products[ListPrice]
)
How RANKX works
RANKX ( Table, Expression )
Step 1: build the lookup table. Expression is computed in a row context on Table, one value for each
row during iteration, and the results are sorted by Expression
Row | Expression
1 | 120
3 | 350
2 | 490
4 | 560
RankOnSales :=
RANKX (
ALL ( Products ),
CALCULATE ( SUM ( Sales[Sales] ) )
)
RANKX 3rd parameter
o Expression
• Evaluated in the row context of <table>
o Value
• Evaluated in the row context of the caller
• Defaults to <Expression>
• But can be a different one
RANKX (
Table,
Expression, -- Row context of Table
Value -- Context of the caller
)
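RANKX also accepts two further optional arguments, the sort order and the handling of ties. A minimal sketch based on the RankOnSales measure shown earlier; the measure name RankOnSalesDense is hypothetical.

```dax
RankOnSalesDense :=
RANKX (
    ALL ( Products ),
    CALCULATE ( SUM ( Sales[Sales] ) ),
    ,        -- Value left empty: it defaults to Expression
    DESC,    -- the highest value is ranked 1
    DENSE    -- ties do not leave gaps in the ranking
)
```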
Why the RANKX 3rd parameter is needed
Sometimes the expression can be invalid in one of the contexts.
RankOnPrice:=
IF (
HASONEVALUE ( Products[Product] ),
RANKX (
ALLSELECTED ( Products ),
Products[Price],
VALUES ( Products[Price] )
)
)
Working with iterators
Time to write some DAX code by yourself.
The next exercise session focuses on some simple and not-so-simple DAX code.
Choose the exercise that best fits your skills.
Please refer to lab number 4 on the hands-on manual.
Probably the most important table in your model
Automatically creates a calendar table based on the database content. Optionally you can specify the
last month (useful for fiscal years)
--
-- The parameter is the last month
-- of the fiscal year
--
= CALENDARAUTO (
6
)
Beware: CALENDARAUTO uses all the dates in your model, excluding only calculated columns and tables
CALENDAR 2015
Returns a table with a single column named “Date” containing a contiguous set of dates in the given
range, inclusive.
CALENDAR (
DATE ( 2005, 1, 1 ),
DATE ( 2015, 12, 31 )
)
CALENDAR (
MIN ( Sales[Order Date] ),
MAX ( Sales[Order Date] )
)
CALENDAR 2015
If you have multiple fact tables, you need to compute the correct values
=CALENDAR (
MIN (
MIN ( Sales[Order Date] ),
MIN ( Purchases[Purchase Date] )
),
MAX (
MAX ( Sales[Order Date] ),
MAX ( Purchases[Purchase Date] )
)
)
Mark as date table
o Need to mark the calendar as date table
o Set the column containing the date
o Needed to make time intelligence work if the relationship
does not use a Date column
o Multiple tables can be marked as date table
o Used by client tools as metadata information
• Q&A
• Excel
Set sorting options
o Month names do not sort alphabetically
• April is not the first month of the year
o Use Sort By Column
o Set all sorting options in the proper way
o Beware of sorting granularity
• 1:1 between names and sort keys
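A calendar built as a calculated table can carry both the month name and its sort key. This is a minimal sketch; the table name and the column names are assumptions, and Sort By Column still has to be set in the model so that "Month" is sorted by "Month Number".

```dax
Date =
ADDCOLUMNS (
    CALENDARAUTO (),
    "Year", YEAR ( [Date] ),
    "Month", FORMAT ( [Date], "MMMM" ),    -- month name shown to the user
    "Month Number", MONTH ( [Date] )       -- sort key, 1:1 with the month name
)
```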
Multiple dates
o Date is often a role dimension
• Many roles for a date
• Many date tables
o How many date tables?
• Try to use only one table
• Use many, only if needed by the model
• Many date tables lead to confusion
• And issues when slicing
o Use proper naming convention
Time intelligence functions
SalesAmount20150515 :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
FILTER (
ALL ( 'Date'[Date] ),
AND (
'Date'[Date] >= DATE ( 2015, 1, 1 ),
'Date'[Date] <= DATE ( 2015, 5, 15 )
)
)
)
Sales 2015 up to 05-15 (v2)
You can replace the FILTER with DATESBETWEEN.
The result is always a table with a column.
SalesAmount20150515 :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESBETWEEN (
'Date'[Date],
DATE ( 2015, 1, 1 ),
DATE ( 2015, 5, 15 )
)
)
Sales Year-To-Date (v1)
Replace the static dates using DAX expressions that retrieve the last day in the current filter
SalesAmountYTD :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESBETWEEN (
'Date'[Date],
DATE ( YEAR ( MAX ( 'Date'[Date] ) ), 1, 1 ),
MAX ( 'Date'[Date] )
)
)
Year to date (Time Intelligence)
DATESYTD makes filtering much easier
SalesAmountYTD :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESYTD ( 'Date'[Date] )
)
Year to date: the easy way
TOTALYTD: the “DAX for dummies” version
It hides the presence of CALCULATE, so we suggest not to use it
SalesAmountYTD :=
TOTALYTD (
SUM ( Sales[SalesAmount] ),
'Date'[Date]
)
Use the correct parameter
The parameter is the date column in the Calendar table, not the Sales[OrderDate]. Otherwise, you get
wrong results
LineTotalYTD :=
TOTALYTD (
SUM ( Sales[SalesAmount] ),
Sales[OrderDate]
)
Handling fiscal year
The last, optional, parameter is the end of the fiscal year
Default: 12-31 (or 31/12 - locale dependent)
SalesAmountYTD :=
TOTALYTD (
SUM ( Sales[SalesAmount] ),
'Date'[Date],
"06-30"
)
SalesAmountYTD :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESYTD ( 'Date'[Date], "06-30" )
)
Same period last year
Same period in previous year. CALCULATE is needed
Specialized version of DATEADD.
Sales SPLY :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
SAMEPERIODLASTYEAR ( 'Date'[Date] )
)
Mixing time intelligence functions
YTD on the previous year. In DAX, it is very simple, just mix the functions to obtain the result
Sales YTDLY :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESYTD (
SAMEPERIODLASTYEAR ( 'Date'[Date] )
)
)
Parameters of time intelligence functions are tables.
Using a column, as we did so far, is only syntactic sugar
DATEADD
Similar to SAMEPERIODLASTYEAR, used to calculate different periods: YEAR, MONTH, DAY …
Does not sum dates, it shifts periods over time
Sales SPLY :=
CALCULATE (
SUM( Sales[SalesAmount] ),
DATEADD ( 'Date'[Date] , -1, YEAR )
)
PARALLELPERIOD
Returns a set of dates (a table) shifted in time
The whole period is returned, regardless of the dates in the first parameter
Sales_PPLY :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
PARALLELPERIOD ( 'Date'[Date] , -1, YEAR )
)
-- same as
Sales_PPLY :=
CALCULATE (
SUM ( Sales[SalesAmount] ),
PREVIOUSYEAR ( 'Date'[Date] )
)
Running total
Running total requires an explicit filter, we use a variable to store the last visible date in the
current filter context
SalesAmountRT :=
VAR LastVisibleDate = MAX ( 'Date'[Date] )
RETURN
CALCULATE (
SUM ( Sales[SalesAmount] ),
FILTER (
ALL ( 'Date' ),
'Date'[Date] <= LastVisibleDate
)
)
Moving annual total (v1)
Moving window from the current date back one year
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESBETWEEN (
'Date'[Date],
NEXTDAY (
SAMEPERIODLASTYEAR (
LASTDATE ( 'Date'[Date] )
)
),
LASTDATE ( 'Date'[Date] )
)
)
Beware of function order!
Time intelligence functions return sets of dates, and the set of dates needs to exist.
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESBETWEEN (
'Date'[Date],
SAMEPERIODLASTYEAR (
NEXTDAY (
LASTDATE ( 'Date'[Date] )
)
),
LASTDATE ( 'Date'[Date] )
)
)
Moving annual total (v2)
DATESINPERIOD makes everything much easier.
CALCULATE (
SUM ( Sales[SalesAmount] ),
DATESINPERIOD (
'Date'[Date],
LASTDATE ( 'Date'[Date] ),
-1,
YEAR
)
)
Semiadditive measures
Semi additive measures
o Additive Measure
• SUM over all dimensions
o Semi Additive Measure
• SUM over some dimensions
• Different function over other dimensions
• Time is the standard exception for aggregations
• Examples
• Warehouse stocking
• Current account balance
Current account balance
LastBalance :=
CALCULATE (
SUM ( Balances[Balance] ),
LASTDATE ( 'Date'[Date] )
)
LASTNONBLANK
LASTNONBLANK iterates Date searching the last value for which its second parameter is not a
BLANK. Thus, it searches for the last date related to any row in the fact table
LastBalanceNonBlank :=
CALCULATE (
SUM ( Balances[Balance] ),
LASTNONBLANK (
'Date'[Date],
CALCULATE ( COUNTROWS ( Balances ) )
)
)
LASTNONBLANK by customer
Using a filter context with multiple columns, you can also compute the last non-blank value on a
customer-by-customer basis
LastBalanceNonBlankPerCustomer :=
CALCULATE (
SUM ( Balances[Balance] ),
TREATAS (
ADDCOLUMNS (
VALUES ( Balances[Name] ),
"LastAvailableDate", CALCULATE ( MAX ( Balances[Date] ) )
),
Balances[Name],
'Date'[Date]
)
)
There are many week scenarios, depending on what you mean by «week»… CALCULATE is
your best friend here
IF (
HASONEVALUE ( 'Date'[ISO Year] ),
CALCULATE (
[Sales Amount],
ALL ( 'Date' ),
FILTER ( ALL ( 'Date'[Date] ), 'Date'[Date] <= MAX ( 'Date'[Date] ) ),
VALUES ( 'Date'[ISO Year] )
)
)
For more calculations see www.daxpatterns.com and
www.sqlbi.com/articles/week-based-time-intelligence-in-dax/
Time intelligence: conclusions
o Based on evaluation contexts
• Replace filter on date
• Many predefined functions
• You can author your own functions
o Basic Time Intelligence
o Semi Additive Measures
o Working with ISO weeks
Time intelligence
The next exercise session is focused on some common
scenarios where time intelligence is required, along with
some evaluation context skills.
Please refer to lab number 5 on the exercise book.
Querying with DAX, and a lot of new table functions
DEFINE
START AT <value>, …
EVALUATE example
The easiest EVALUATE: query a full table
EVALUATE
Store
ORDER BY
Store[Continent],
Store[Country],
Store[City]
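A query can also define its own measures and start from a given value of the sort column. The sketch below is only illustrative: the measure name Total Quantity and the starting value "Italy" are assumptions, and it presumes a relationship between Store and Sales.

```dax
DEFINE
    -- Query-scoped measure, visible only inside this query
    MEASURE Sales[Total Quantity] = SUM ( Sales[Quantity] )
EVALUATE
ADDCOLUMNS (
    VALUES ( Store[Country] ),
    "Quantity", [Total Quantity]
)
ORDER BY Store[Country]
-- START AT requires ORDER BY: rows sorted before "Italy" are skipped
START AT "Italy"
```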
EVALUATE
FILTER (
Product,
CALCULATE (
SUM ( Sales[Quantity] ) > 100000
)
)
CALCULATETABLE
CALCULATETABLE is often used to apply filters to the query
EVALUATE
CALCULATETABLE (
Product,
Product[Color] = "Red"
)
Evaluation order of CALCULATE
CALCULATE and CALCULATETABLE evaluate the first parameter only after the latter ones have
been computed
EVALUATE
CALCULATETABLE (
CALCULATETABLE (
Product, -- 3: evaluated last
ALL ( Product[Color] ) -- 2
),
Product[Color] = "Red" -- 1: evaluated first
)
EVALUATE
ADDCOLUMNS (
'Product Category',
"Sales",
CALCULATE ( SUM ( Sales[Quantity] ) )
)
Remember the automatic CALCULATE which surrounds any measure calculation
SUMMARIZE
Performs GROUP BY in DAX and optionally computes subtotals.
WARNING: function almost deprecated for aggregating data (more on this later).
EVALUATE
SUMMARIZE (
Sales, -- Source table
'Product Category'[Category], -- GROUP BY column(s)
"Sales", SUM ( Sales[Quantity] ) -- Result expression(s) added to the source table
)
Beware of SUMMARIZE
Never, ever use SUMMARIZE to compute calculated columns, always use a mix of SUMMARIZE and
ADDCOLUMNS.
EVALUATE
ADDCOLUMNS (
SUMMARIZE (
Sales,
'Product Category'[Category],
'Product Subcategory'[Subcategory]
),
"Sales", CALCULATE ( SUM ( Sales[Quantity] ) )
)
SUMMARIZECOLUMNS 2015
EVALUATE
SUMMARIZECOLUMNS (
'Date'[Calendar Year],
Product[Color],
"Sales", SUM ( Sales[Quantity] )
)
Equivalent to:
EVALUATE
FILTER (
SUMMARIZE (
CROSSJOIN (
VALUES ( 'Date'[Calendar Year] ),
VALUES ( Product[Color] )
),
'Date'[Calendar Year],
Product[Color],
"Sales", SUM ( Sales[Quantity] )
),
NOT ( ISBLANK ( [Sales] ) )
)
SUMMARIZECOLUMNS 2015
o Simpler syntax
o Empty rows are automatically removed
o Multiple fact tables scanned at once
o Some limits when used in a measure:
• http://www.sqlbi.com/articles/introducing-summarizecolumns/
o Few additional features
• IGNORE
• ROLLUPGROUP
• ROLLUPADDISSUBTOTAL
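Of the additional features listed above, IGNORE is the simplest to show: it excludes an expression from the test that removes empty rows. A minimal sketch, assuming the same Product and Sales tables used elsewhere in this module; the "Products" column is a hypothetical addition:

```dax
EVALUATE
SUMMARIZECOLUMNS (
    Product[Color],
    "Sales", SUM ( Sales[Quantity] ),
    -- COUNTROWS is never blank for an existing color; IGNORE keeps it
    -- from forcing colors without sales into the result
    "Products", IGNORE ( COUNTROWS ( Product ) )
)
```

Without IGNORE, every color would be returned, because the "Products" expression is never blank.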
SUMMARIZECOLUMNS 2015
SUMMARIZECOLUMNS (
Customer[Company Name],
ROLLUPADDISSUBTOTAL (
ROLLUPGROUP ( Customer[City] ),
"CityTotal"
),
FILTER (
Customer,
Customer[Country] = "France"
),
"Sales", SUM ( Sales[Quantity] )
)
SELECTCOLUMNS 2015
EVALUATE
SELECTCOLUMNS (
Product,
"Color", Product[Color],
"ProductKey", Product[ProductKey]
)
Using CROSSJOIN
CROSSJOIN does what its name suggests: it performs a cross join between two tables, which is very useful
when querying Tabular
EVALUATE
ADDCOLUMNS (
CROSSJOIN (
DISTINCT ( 'Product'[Color] ),
DISTINCT ( 'Product'[Size] )
),
"Products",
COUNTROWS ( RELATEDTABLE ( Product ) )
)
Using GENERATE
GENERATE is the equivalent of APPLY in SQL. It comes in handy when working with many-to-many relationships
EVALUATE
GENERATE (
VALUES ( 'Product Category'[Category] ),
SELECTCOLUMNS (
RELATEDTABLE ( 'Product Subcategory' ),
"Subcategory", 'Product Subcategory'[Subcategory]
)
)
ORDER BY [Category], [Subcategory]
Tables and relationships
Remember: tables resulting from table functions inherit relationships from the columns they use
EVALUATE
ADDCOLUMNS (
CROSSJOIN (
VALUES ( 'Date'[Calendar Year] ),
VALUES ( Product[Color] )
),
"Sales Amount",
CALCULATE (
SUM ( Sales[Quantity] ) -- Which sales?
)
)
Sales of the given year and color
Columns and expressions
Expressions are not columns.
Columns have relationships, expressions do not
EVALUATE
ADDCOLUMNS (
    SELECTCOLUMNS (
        FILTER (
            Product,
            Product[Color] = "Red"
        ),
        -- Becoming an expression, ProductKey no longer has relationships
        "ProductKey", Product[ProductKey] + 0
    ),
    "Sales", CALCULATE ( SUM ( Sales[Quantity] ) )
)
Using ROW
Creates a one-row table, often used to get results, more rarely used inside calculations and complex
queries
EVALUATE
ROW (
"Jan Sales", CALCULATE (
SUM ( Sales[Quantity] ),
'Date'[Month] = "January"
),
"Feb Sales", CALCULATE (
SUM ( Sales[Quantity] ),
'Date'[Month] = "February"
)
)
Using DATATABLE
Creates a full table with a single function call, useful to build temporary tables
EVALUATE
DATATABLE (
"Price Range", STRING,
"Min Price", CURRENCY,
"Max Price", CURRENCY,
{
{ "Low", 0, 10 },
{ "Medium", 10, 100 },
{ "High", 100, 9999999 }
}
)
Using CONTAINS
Checks for the existence of a row in a table
EVALUATE
FILTER (
Product,
CONTAINS (
RELATEDTABLE ( Sales ),
Sales[Unit Price], 56
)
)
Using LOOKUPVALUE
Looks up a value in a table. The same result can be obtained with CALCULATE ( VALUES () ), but
LOOKUPVALUE reads much better. Often used in calculated columns to perform lookups
EVALUATE
ROW (
"Country",
LOOKUPVALUE (
Geography[Country Region Name],
Geography[Country Region Code], "CA"
)
)
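For comparison, here is a sketch of the CALCULATE ( VALUES () ) form of the same lookup, using the same Geography columns as the query above. It assumes the code "CA" identifies a single country name; if several names matched, VALUES would return more than one row and the query would raise an error:

```dax
EVALUATE
ROW (
    "Country",
    CALCULATE (
        VALUES ( Geography[Country Region Name] ),   -- must return exactly one value
        Geography[Country Region Code] = "CA"
    )
)
```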
Using TOPN
Returns the TOP N products sorting the table by Sales
EVALUATE
CALCULATETABLE (
TOPN (
10,
ADDCOLUMNS (
VALUES ( Product[Product Name] ),
"Sales", CALCULATE ( SUM ( Sales[Quantity] ) )
),
[Sales]
),
Product[Color] = "Red"
)
TOPN and GENERATE
Returns the TOP 3 products for each category sorting by Sales
EVALUATE
GENERATE (
VALUES ( 'Product Category'[Category] ),
CALCULATETABLE (
TOPN (
3,
ADDCOLUMNS (
VALUES ( Product[Product Name] ),
"Sales", CALCULATE ( SUM ( Sales[Quantity] ) )
),
[Sales]
)
)
)
UNION 2015
EVALUATE
UNION (
SUMMARIZE (
CALCULATETABLE (
Sales,
'Date'[Calendar Year] = "CY 2008"
),
Product[ProductKey]
),
SELECTCOLUMNS (
FILTER (
Product,
Product[Color] = "Red"
),
"ProductKey", Product[ProductKey]
)
)
ORDER BY Product[ProductKey]
Columns in UNION 2015
The first table defines the column names. Columns can come from different sources.
EVALUATE
UNION (
ROW ( "DAX", 1 ),
ROW ( "is a", 1 ),
ROW ( "Language", 2 )
)
INTERSECT 2015
EVALUATE
INTERSECT (
VALUES ( Product[ProductKey] ),
SUMMARIZE (
FILTER (
Product,
Product[Color] = "Red"
),
Product[ProductKey]
)
)
UNION and INTERSECT 2015
INTERSECT gets column names and lineage from the first table, whereas UNION only keeps
lineage if all the tables have the same one.
EVALUATE
INTERSECT (
SUMMARIZE (
FILTER (
Product,
Product[Color] = "Red"
),
Product[ProductKey]
),
SELECTCOLUMNS (
VALUES ( Product[ProductKey] ),
"Test", Product[ProductKey] + 1
)
)
EXCEPT 2015
EVALUATE
EXCEPT (
SELECTCOLUMNS (
VALUES ( Product[ProductKey] ),
"Test", Product[ProductKey]
),
SUMMARIZE (
FILTER (
Product,
Product[Color] = "Red"
),
Product[ProductKey]
)
)
GROUPBY 2015
Similar to SUMMARIZE, but has the additional feature of CURRENTGROUP to iterate over the subset of
rows
EVALUATE
GROUPBY (
Sales,
'Date'[Calendar Year],
Product[Color],
"Sales",
SUMX (
CURRENTGROUP (),
Sales[Quantity] * Sales[Unit Price]
)
)
GROUPBY 2015
Without GROUPBY, the query from the previous slide is much more complex to write, although you
have more freedom
EVALUATE
FILTER (
ADDCOLUMNS (
SUMMARIZE (
CROSSJOIN (
VALUES ( 'Date'[Calendar Year] ),
VALUES ( Product[Color] )
),
'Date'[Calendar Year],
Product[Color]
),
"Sales",
SUMX (
CALCULATETABLE ( Sales ),
Sales[Quantity] * Sales[Unit Price]
)
),
NOT ( ISBLANK ( [Sales] ) )
)
Query measures
Measures defined locally in the query make the authoring of complex queries much easier
DEFINE
MEASURE Sales[Subcategories] =
COUNTROWS ( RELATEDTABLE ( 'Product Subcategory' ) )
MEASURE Sales[Products] =
COUNTROWS ( RELATEDTABLE ( 'Product' ) )
EVALUATE
ADDCOLUMNS (
'Product Category',
"SubCategories", [Subcategories],
"Products Count", [Products]
)
DAX measures in MDX
An interesting option is the ability to define DAX measures inside an MDX query
WITH
SELECT
{ Measures.[Ship Sales 2003] }
ON COLUMNS,
NON EMPTY
[Product Category].[Product Category Name].[Product Category Name]
ON ROWS
FROM [Model]
Querying: conclusions
o DAX as a query language
• Is very powerful
• Pretty simple to author
• Resembles
• SQL
• MDX
• But it is different from both
o Some functions are used more in queries than in code
• SUMMARIZE, ADDCOLUMNS…
Querying in DAX
Querying is not very complex, but requires a good DAX
mindset.
The next set of exercises is all about authoring some DAX
queries, from simple to more complex ones.
Please refer to lab number 6 in the exercise book.
Let us go deeper in the analysis of filter contexts
NumOfSubcategories =
COUNTROWS ( 'Subcategory' )
Filters are tables
Each filter is a table. Boolean expressions are shortcuts for more complex table expressions
CALCULATE (
SUM ( Sales[SalesAmount] ),
Sales[SalesAmount] > 100
)
Is equivalent to
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    FILTER (
        ALL ( Sales[SalesAmount] ),
        Sales[SalesAmount] > 100
    )
)
What happens if we use a full table in a CALCULATE expression?
Strange results!
What is happening here?
Cross table filtering is in action,
and… well, it’s not easy.
Filtered By Product =
CALCULATE(
[NumOfSubcategories],
'Product'
)
Key topics
The key to really understanding filter context is to understand:
o Base Tables
o Expanded Tables
o What Filters in the context are
o Filter Operators
o Block semantic of ALL
Base tables
Filtered By Product =
CALCULATE(
[NumOfSubcategories],
'Product'
)
What is happening here?
If the filter is working on the Product
table, why is it showing 44?
NumOfSubcategories =
COUNTROWS ( 'Subcategory' )
Slicers filter columns, not tables!
A slicer filters a column, not a table.
Filtered By Column =
CALCULATE (
[NumOfSubcategories],
VALUES ( 'Product'[Product Name] )
)
Filtering Columns
Filtering a column of DimProduct, which is not in common with other
tables, does not affect other tables.
DIVIDE (
[Sales Amount],
CALCULATE (
[Sales Amount],
ALL ( ProductSubcategory )
)
)
Wrong formula in action
DIVIDE (
[Sales Amount],
CALCULATE (
[Sales Amount],
ALL ( ProductSubcategory )
)
)
Correct formula
DIVIDE (
[Sales Amount],
CALCULATE (
[Sales Amount],
ALL ( ProductSubcategory ),
ProductCategory
)
)
Correct formula in action
DIVIDE (
[Sales Amount],
CALCULATE (
[Sales Amount],
ALL ( ProductSubcategory ),
ProductCategory
)
)
Same result using ALLEXCEPT
You can write the same code with ALLEXCEPT, removing filters from ProductSubcategory but not
from the expanded ProductCategory
DIVIDE (
[Sales Amount],
CALCULATE (
[Sales Amount],
ALLEXCEPT ( ProductSubcategory, ProductCategory )
)
)
Advanced filter context
The next exercise session consists of only a couple of
formulas, nothing complex. To solve them, you need to
stretch your mind and use all the knowledge of this last
module. Good luck and… don’t look at the solution too
early.
Please refer to lab number 7 in the hands-on manual.
Let’s move a step further from plain vanilla relationships
Advanced Relationships
Complex relationships
o Not all the relationships are one-to-many relationships
based on a single column
o When relationships are complex ones, more DAX skills
are needed
o Data modeling, in SSAS Tabular and Power BI, is very
basic
o DAX is the key to unleash more analytical power
Many ways to handle many-to-many relationships
Many-to-many
Many-to-many relationships
o Whenever a relationship is not just one-to-many
o Examples
• Bank current account and holders
• Companies and shareholders
• Teams, roles and members
• House and householders
o They are a powerful tool, not an issue at all
o Learning how to use them opens the doors to very
powerful scenarios
Many-to-many challenges
o M2M do not work by default
• You need to write code or update the model
o M2M generate non-additive calculations
• Hard to understand at first glance
• Yet, it is in their nature to be non-additive
o Performance might be an issue
• Not always, but you need to pay attention to details
Current account example of M2M
SumOfAmount CrossFilter =
CALCULATE (
    SUM ( Transactions[Amount] ),
    CROSSFILTER (
        AccountsCustomers[AccountKey],
        Accounts[AccountKey],
        BOTH   -- Options: NONE, ONEWAY, BOTH
    )
)
MANY-TO-MANY with SUMMARIZE
You can use SUMMARIZE to move the filter from AccountCustomer (already filtered by
customer) by summarizing it by Account, so that the Account key filters the fact table
AmountM2M :=
CALCULATE (
SUM ( Transaction[Amount] ),
SUMMARIZE (
AccountCustomer,
Account[ID_Account]
)
)
Using expanded table filtering
Leveraging table expansion and filter context, you can obtain the same result, without enabling
bidirectional filtering on relationships. Useful for older versions of DAX
AmountM2M :=
CALCULATE (
SUM ( Transaction[Amount] ),
AccountCustomer
)
CROSSFILTER versus expanded tables
o Using CROSSFILTER the filter is propagated only when
some filtering is happening
o Using expanded tables, the filter is always active
• Optimization for expanded table described here:
https://www.sqlbi.com/articles/many-to-many-relationships-in-power-bi-and-excel-2016/
Which technique to choose?
o Bidirectional filtering
• Set in the model
• Works with any measure
o Using CROSSFILTER
• Set in the formula
• Requires additional coding
o Expanded table filtering
• Set in the formula
• Works on any version of DAX
• Filter is always active, might slow down the model
Understanding non-additivity
o M2M generate non-additive calculations
o This is not an issue: it is the way many-to-many work
Non-additivity when coding
You need to be aware of non-additivity when writing code, because the results might be wrong if you do
not take into account the additivity behavior of your model
Interest :=
[SumOfAmount] * 0.01
Interest SUMX :=
SUMX (
Customers,
[SumOfAmount] * 0.01
)
Bidirectional filters look promising, but you need to avoid creating complex models
Multi-column relationships
Multi-Column Relationships
o Tabular supports standard 1:N relationships
o Sometimes you need relationships that span more
than a single column
Multi-Column Relationships
1st Solution: Create Relationship
If the relationship is needed in the model, then you need to create a calculated column to set
the relationship
ProductAndReseller =
Discount =
LOOKUPVALUE (
Discounts[MaxDiscount],
Discounts[ProductKey], FactResellerSales[ProductKey],
Discounts[ResellerKey], FactResellerSales[ResellerKey]
)
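For the calculated-column solution, one common way to build a single-column key is to concatenate the two key columns on both sides of the relationship. The sketch below is an assumption, not the slide's own expression: the "-" separator and the choice to add the column to both Discounts and FactResellerSales are illustrative.

```dax
-- Hypothetical calculated column in Discounts
Discounts[ProductAndReseller] =
Discounts[ProductKey] & "-" & Discounts[ResellerKey]

-- Hypothetical matching calculated column in FactResellerSales
FactResellerSales[ProductAndReseller] =
FactResellerSales[ProductKey] & "-" & FactResellerSales[ResellerKey]
```

With both columns in place, a standard 1:N relationship can be created between Discounts[ProductAndReseller] and FactResellerSales[ProductAndReseller].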
Static Segmentation
Static segmentation
o Analysis of sales based on unit price
Static segmentation: the model
The quick and dirty solution
Quick and dirty, but it works.
Very useful for prototyping a data model with the customer
= IF (
    Sales[UnitPrice] <= 5,
    "01 LOW",
    IF (
        Sales[UnitPrice] <= 30,
        "02 MEDIUM LOW",
        IF (
            Sales[UnitPrice] <= 100,
            "03 MEDIUM HIGH",
            IF (
                Sales[UnitPrice] <= 500,
                "04 HIGH",
                "05 VERY HIGH"
            )
        )
    )
)
Static segmentation: the formula
By using a calculated column you can denormalize the price segment in the fact table.
Being a small-cardinality column, the size in RAM is very small
Sales[PriceRange] =
CALCULATE (
VALUES ( PriceRanges[PriceRange] ),
FILTER (
PriceRanges,
AND (
PriceRanges[MinPrice] <= Sales[Net Price],
PriceRanges[MaxPrice] > Sales[Net Price]
)
)
)
Denormalizing the key
Using a similar technique you can denormalize the key of the table, so as to avoid replicating all of
the columns. You need to pay attention to circular references: this is why we need to use ALL
and DISTINCT
PriceRangeKey =
CALCULATE (
IFERROR (
DISTINCT ( PriceRanges[PriceRangeKey] ),
BLANK ()
),
FILTER (
PriceRanges,
AND (
PriceRanges[MinPrice] <= Sales[Net Price],
PriceRanges[MaxPrice] > Sales[Net Price]
)
),
ALL ( Sales )
)
Circular dependency in calculated tables
o Sales depends on PriceRanges
o If you build the relationship, then the model contains a circular
dependency, because – on the one side – the blank row might
appear. Thus, PriceRanges depends on Sales
o You need to use functions that do not use the blank row
• DISTINCT, ALLNOBLANKROW
Denormalizing keys
o Denormalizing the key goes further than you might think
at first
o Using DAX, you can build any kind of relationship, no matter
how fancy it is, as long as you can compute it in a DAX
expression
o This is an extremely powerful technique; we call it
Calculated Relationships
Dynamic Segmentation
Dynamic segmentation
CustInSegment :=
COUNTROWS (
FILTER (
Customer,
AND (
[Sales Amount] > MIN ( Segments[MinSale] ),
[Sales Amount] <= MAX ( Segments[MaxSale] )
)
)
)
Beware of slicers…
o The previous formula is error-prone
o If users filter data using a slicer, they can break your
code very easily
Dynamic segmentation: a better formula
By avoiding MIN and MAX and using iteration instead, you can obtain additivity at the segment
level
CustInSegment :=
SUMX (
Segments,
COUNTROWS (
FILTER (
Customer,
AND (
[Sales Amount] > Segments[MinSale],
[Sales Amount] <= Segments[MaxSale]
)
)
)
)
The measure is still
non-additive over years
Relationships at different granularities are a challenge
You can use DAX to move the filter from the Product[Brand] column to the Budget[Brand] one, and
repeat the same operation for the CountryRegion pair of columns.
Budget 2009 :=
CALCULATE (
SUM ( Budget[Budget] ),
INTERSECT (
VALUES ( Budget[Brand] ),
VALUES ( 'Product'[Brand] )
),
INTERSECT (
VALUES ( Budget[CountryRegion] ),
VALUES ( Store[CountryRegion] )
)
)
Use DAX to move the filters (before 2015)
In Excel 2013 and SSAS 2014, the INTERSECT function is not available. You can obtain the same
result by using the CONTAINS function along with an iteration on the Budget columns.
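A sketch of the CONTAINS pattern described above, written against the same Budget, Product and Store columns as the INTERSECT version. The measure name is hypothetical, chosen to avoid clashing with the existing Budget 2009 measure:

```dax
Budget 2009 Contains :=
CALCULATE (
    SUM ( Budget[Budget] ),
    -- Keep only the budget brands that are visible in Product
    FILTER (
        ALL ( Budget[Brand] ),
        CONTAINS (
            VALUES ( 'Product'[Brand] ),
            'Product'[Brand], Budget[Brand]
        )
    ),
    -- Keep only the budget countries that are visible in Store
    FILTER (
        ALL ( Budget[CountryRegion] ),
        CONTAINS (
            VALUES ( Store[CountryRegion] ),
            Store[CountryRegion], Budget[CountryRegion]
        )
    )
)
```

Each FILTER iterates a Budget column and keeps the values that also appear in the current selection of the corresponding Product or Store column.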
TREATAS can change the lineage of a column, transforming the lineage of Product and Store
columns into that of the Budget ones.
Budget 2009 :=
CALCULATE (
SUM ( Budget[Budget] ),
TREATAS (
VALUES ('Product'[Brand] ),
Budget[Brand]
),
TREATAS (
VALUES (Store[CountryRegion] ),
Budget[CountryRegion]
)
)
TREATAS with table constructors 2017
TREATAS is not limited to a single column: you can use it with multiple columns (of different
tables) in a constructed table. (Note: the result could be different from the previous slide.)
CALCULATE (
SUM ( Budget[Budget] ),
TREATAS (
SUMMARIZE (
Sales,
'Store'[CountryRegion],
'Product'[Brand]
),
Budget[CountryRegion],
Budget[Brand]
)
)
Using DAX to move filter
o Flexibility
• You change the filter context in a very dynamic way
• Full control over the functions used
o Complexity
• Every measure needs to be authored with the pattern
• Error-prone
o Speed
• Using DAX to move a filter is sub-optimal
• Leverages the slower part of the DAX engine (FE)
Filtering through relationships
Using calculated tables
In Power BI and SSAS 2016 you can build the intermediate tables using DAX calculated tables
Brands =
DISTINCT (
    UNION (
        DISTINCT ( Product[Brand] ),
        DISTINCT ( Budget[Brand] )
    )
)
CountryRegions =
DISTINCT (
    UNION (
        DISTINCT ( Store[CountryRegion] ),
        DISTINCT ( Budget[CountryRegion] )
    )
)
Use ALLNOBLANKROW or DISTINCT to avoid circular dependency issues.
In fact, if you use ALL or VALUES, you depend on the existence of the blank row.
Use the correct column to slice
ProductsAtBudgetGranularity :=
CALCULATE (
COUNTROWS ( Product ),
ALL ( Product ),
VALUES ( Product[Brand] )
)
ProductsAtSalesGranularity :=
COUNTROWS ( Product )
Hiding the wrong granularity
Using a simple IF statement, you can clear out values which cannot be safely computed
Budget 2009 :=
IF (
AND (
[ProductsAtBudgetGranularity] = [ProductsAtSalesGranularity],
[StoresAtBudgetGranularity] = [StoresAtSalesGranularity]
),
SUM ( Budget[Budget] )
)
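The two Store measures used in the test above are not shown. A sketch of how they might be defined, by analogy with ProductsAtBudgetGranularity and ProductsAtSalesGranularity, assuming Store[CountryRegion] is the budget granularity for stores:

```dax
StoresAtBudgetGranularity :=
CALCULATE (
    COUNTROWS ( Store ),
    ALL ( Store ),                     -- remove every filter on Store...
    VALUES ( Store[CountryRegion] )    -- ...then restore only the CountryRegion filter
)

StoresAtSalesGranularity :=
COUNTROWS ( Store )
```

When the two counts differ, the selection is finer than CountryRegion and the budget cannot be computed safely.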
Hiding or reallocating?
o We hid the wrong values using DAX
o We strive for perfection, can we do something better?
o A viable option is to define a reallocation factor
• Using sales in 2009
• Compute the percentage of the selection against the total
• Use the percentage to dynamically reallocate the budget
o The code is slightly more complex
o Yet very dynamic and powerful
Allocating the budget
Allocation factor: the first formula
You compute the sales at the budget granularity and then multiply the budget by the ratio between the sales
of the current selection and the sales at budget granularity, using a technique similar to the one used to hide values
Sales2008AtBudgetGranularity :=
CALCULATE (
    [Sales 2008],
    -- Beware: this is NOT the same as ALLEXCEPT
    ALL ( Store ),
    VALUES ( Store[CountryRegion] ),
    ALL ( Product ),
    VALUES ( Product[Brand] )
)
AllocationFactor :=
DIVIDE (
[Sales 2008],
[Sales2008AtBudgetGranularity]
)
Allocated Budget := SUM ( Budget[Budget] ) * [AllocationFactor]
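To make the arithmetic concrete, here is a small self-contained query with made-up numbers. All three input values are hypothetical, and the query does not touch the model:

```dax
EVALUATE
VAR BudgetAmount = 1000      -- hypothetical budget for one brand and country
VAR SalesOfSelection = 30    -- hypothetical 2008 sales of the current selection
VAR SalesAtBudgetLevel = 120 -- hypothetical 2008 sales of the whole brand and country
VAR Factor = DIVIDE ( SalesOfSelection, SalesAtBudgetLevel )
RETURN
    ROW (
        "Allocation Factor", Factor,                 -- 30 / 120 = 0.25
        "Allocated Budget", BudgetAmount * Factor    -- 1000 * 0.25 = 250
    )
```

A selection that accounted for a quarter of the 2008 sales receives a quarter of the budget.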
Allocation factor: the final formula
The final formula is more complex because the allocation is valid at Brand / CountryRegion
granularity.
Allocated Budget :=
SUMX (
KEEPFILTERS (
CROSSJOIN (
VALUES ( 'Product'[Brand] ),
VALUES ( Store[CountryRegion] )
)
),
[AllocationFactor] * CALCULATE ( SUM ( Budget[Budget] ) )
)
Advanced Relationships
Time to write some DAX code by yourself.
The next exercise session focuses on some simple and
not-so-simple DAX code. Choose the exercise that best fits your
skills.
Please refer to lab number 9 in the hands-on manual.
Thank you!