PowerBI Interview
Standard Mode: This version of the on-premises data gateway allows connection with
multiple on-premises data sources for more than one user.
Personal Mode: The personal mode of the on-premises data gateway is used by only one
user to connect to different data sources.
Power BI refresh:
Refreshing data typically means importing data from the original data sources
into a dataset, either on a refresh schedule or on demand. The
Scheduled refresh section is where you define the frequency and time slots to refresh a
dataset. You can configure up to eight daily time slots if your
dataset is on shared capacity, or 48 time slots on Power BI Premium.
A tile is a report visual pinned to a dashboard, and dashboard tile refreshes happen about every
hour so that the tiles show recent results.
When a single source query accesses both on-premises and cloud sources, Power BI must use
a gateway for the cloud data sources as well.
Power BI doesn’t need a gateway to connect to a cloud data source on its own, but for a combination of the two it
needs a gateway.
In DirectQuery mode, you are also limited in your DAX expressions; you cannot write all types of
expressions. Many functions are not supported; for example, time intelligence
functions are not supported.
Slower connection: refreshing data takes longer.
Live Connection:
No Power Query, just visualization.
The big disadvantage of this method is that you do not have even simple Power Query
transformations. With this method, you have only the Report tab.
Paginated report:
Paginated reports are designed to be printed or shared. They're called paginated because they're formatted to fit well on a
page. They display all the data in a table, even if the table spans multiple pages. They're also called pixel perfect because you
can control their report page layout exactly. Power BI Report Builder is the standalone tool for authoring paginated reports
for the Power BI service.
Performance tuning:
Using Performance Analyzer, we can monitor the performance of a Power BI report.
Data flow:
A dataflow is basically an ETL tool for data loading and preparation that is deployed in the cloud; the data is stored in
Azure Data Lake Storage Gen2.
Files: Data can be imported from Excel (.xlsx, .xlsm), Power BI Desktop files
(.pbix), and comma-separated value files (.csv).
Content Packs: A content pack is a collection of related documents or files that are stored as
a group. In Power BI, there are two types of content packs: those from
service providers like Google Analytics, Marketo, or Salesforce, and
those created and shared by other users in your organization.
Connectors to databases and other datasets, such as Azure SQL Database and
SQL Server Analysis Services tabular data.
Types of filters>>>>>>
There are three levels of filters in Power BI: report, page, and visual.
Power BI provides a variety of options to filter reports, data, and visualizations. The following
is the list of filter types.
A tile is a snapshot of your data, pinned to the dashboard. A tile can be created from a
report, dataset, dashboard, the Q&A box, Excel, SQL Server Reporting Services
(SSRS) reports, and more.
If the calculation is row by row (example: Profit = Sales - Cost, or Full name = First
Name & " " & Last Name), then a calculated column is what you need.
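A minimal sketch of the two row-by-row examples above written as DAX calculated columns. The table names Sales and Customer and their column names are assumptions made for illustration; each definition would be entered as a separate calculated column.

```dax
-- Calculated column on a hypothetical Sales table:
-- evaluated once per row, using that row's values (row context)
Profit = Sales[Sales] - Sales[Cost]

-- Calculated column on a hypothetical Customer table:
-- concatenates two text columns with a space between them
Full Name = Customer[First Name] & " " & Customer[Last Name]
```

Because the result is stored for every row, a calculated column suits values you want to slice or filter by; aggregations over the whole table are usually better written as measures.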
Import: The selected tables and columns are imported into Power BI Desktop. As you
create or interact with a visualization, Power BI Desktop uses the imported data. To see
underlying data changes since the initial import or the most recent refresh, you must
refresh the data, which imports the full dataset again.
The default mode is Import, and it lets you make full use of the Power BI Desktop
capabilities. With DirectQuery, you get a direct connection to query data from only a few supported
data sources. Power BI does not store the data, just metadata such as table and column names. Like
DirectQuery, Live Connection also does not store data. However, it can only query three data sources:
SQL Server Analysis Services, Azure Analysis Services, and Power BI Service.
SUMX: Returns the sum of an expression evaluated for each row in a table.
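As a small illustration of that definition, the measure below multiplies two columns row by row and then sums the results. The Sales table and its Quantity and UnitPrice columns are hypothetical names, not taken from a real model.

```dax
-- For each row of the (assumed) Sales table, compute Quantity * UnitPrice,
-- then add up the per-row results
Total Revenue = SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] )
```

A plain SUM cannot do this, because it accepts only a single column rather than an expression.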
ALLEXCEPT function>> ALLEXCEPT removes the filters from the expanded table
specified in the first argument, keeping only the filters in the columns specified in the
following arguments.
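A sketch of how ALLEXCEPT is typically used inside CALCULATE. Sales[SalesAmount] follows the column used elsewhere in these notes; Sales[Region] is an assumed column added for the example.

```dax
-- Total sales per region: every filter on the Sales table is removed
-- except the one on Sales[Region]
Sales By Region =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    ALLEXCEPT ( Sales, Sales[Region] )
)
```

In a visual that shows Region and Product, this measure would return the same regional total on every product row, which is handy for percent-of-region calculations.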
Row context applies when your calculation is evaluated for each detail row of an input table.
Filter context is the set of filters applied to the data before the calculation is evaluated.
TotalSales = SUM ( Sales[SalesAmount] )
= TOTALYTD ( [TotalSales], 'Date'[Date] )
PARALLELPERIOD:
CALCULATE ( [TotalSales], PARALLELPERIOD ( DateTime[DateKey], -1, YEAR ) )
DATEADD>> Returns a table that contains a column of dates, shifted either forward or
backward in time by the specified number of intervals from the dates in the current
context.
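A brief sketch of DATEADD used as a CALCULATE filter. It assumes a date table named 'Date' with a continuous Date column that is related to the Sales table; those names are illustrative.

```dax
-- Sales for the period one month before the dates in the current context
Sales Previous Month =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    DATEADD ( 'Date'[Date], -1, MONTH )
)
```

Changing the interval argument to DAY, QUARTER, or YEAR, or making the number positive, shifts the period in the corresponding direction.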
SAMEPERIODLASTYEAR>> Returns a table that contains a column of dates shifted one year
back in time from the dates in the specified dates column, in the current context.
= CALCULATE ( [TotalSales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
VALUES>>> When the input parameter is a column name, returns a one-column table
that contains the distinct values from the specified column. Duplicate values are
removed and only unique values are returned.
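One common use of VALUES, sketched with an assumed Sales[Country] column, is counting the distinct values that are visible under the current filters.

```dax
-- Number of distinct countries in the current filter context
Country Count = COUNTROWS ( VALUES ( Sales[Country] ) )
```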
Unpivot Columns>>> The Unpivot Columns feature turns multiple column headers into
rows of a single column. The values that were stored under the original columns go into another
column. Use Unpivot Columns to convert column data fields to row data fields.
Query folding is when steps defined in Power Query/Query Editor are translated into
SQL and executed by the source database rather than the client machine. It’s important
for processing performance and scalability, given limited resources on the client
machine.
DATESYTD>>> Returns a table that contains a column of the dates for the year to date, in
the current context.
CALCULATE(SUM(InternetSales_USD[SalesAmount_USD]),
DATESYTD(DateTime[DateKey]))
The following example returns the current date and time plus 3.5 days:
= NOW()+3.5
Ans: SUMMARIZE() groups a table by the specified columns.
SUMMARIZECOLUMNS() is the newer group-by function for SSAS and Power BI Desktop; it is more efficient.
Specify group-by columns, tables, and expressions.
It gives us output columns based on the filter condition specified on the current
table.
SummarizeColumns =
SUMMARIZECOLUMNS ( employee[employee_id], employee[salary],
FILTER ( employee, employee[salary] > 15000 ) )
By declaring and evaluating a variable, the variable can be reused multiple times in a
DAX expression, thus avoiding additional queries of the source database.
Variables can make DAX expressions more intuitive/logical to interpret.
Variables are only scoped to their measure or query; they cannot be shared among
measures or queries, or be defined at the model level.
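The points above can be sketched with a year-over-year measure. The Sales and 'Date' table names are assumptions; the pattern to notice is that PriorSales is defined once and referenced more than once.

```dax
Sales Growth % =
VAR CurrentSales = SUM ( Sales[SalesAmount] )
VAR PriorSales =
    CALCULATE (
        SUM ( Sales[SalesAmount] ),
        SAMEPERIODLASTYEAR ( 'Date'[Date] )
    )
RETURN
    -- PriorSales is referenced three times but evaluated only once
    IF (
        NOT ISBLANK ( PriorSales ),
        DIVIDE ( CurrentSales - PriorSales, PriorSales )
    )
```

Without variables, the prior-year expression would have to be repeated in each place, which is harder to read and may be evaluated more than once.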
Create a duplicate of the table, then perform a Group By on the duplicate table to get
distinct values. Merge the two tables on the common column using an inner join; this
removes the duplicates from the original table.
When you have one or more columns that you’d like to add to another query,
you merge the queries.
When you have additional rows of data that you’d like to add to an existing query,
you append the query.
Q) What problems have you faced while creating a report or after creating it?
Ans: An app is a Power BI content type that combines related dashboards and reports, all in
one place. An app can have one or more dashboards and one or more reports, all bundled
together. Apps are created by Power BI designers who distribute and share the apps with their
colleagues.
Advantage:
Apps are an easy way for designers to share different types of content at one time.
App designers create the dashboards and reports and bundle them together into an app.
The designers then share or publish the app to a location where you, the business user, can
access it. Because related dashboards and reports are bundled together, it's easier for you to
find and install in both the Power BI service (https://powerbi.com) and on your mobile device.
One-to-many
One-to-one
Many-to-many
Many-to-one
But we can only have one active relationship between two tables.
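Because only one relationship between two tables can be active, any additional relationship stays inactive until a measure switches it on. A minimal sketch, assuming an inactive relationship between a hypothetical Sales[ShipDate] column and 'Date'[Date]:

```dax
-- Uses the inactive ShipDate relationship for this calculation only;
-- the active relationship is left unchanged for every other measure
Sales By Ship Date =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    USERELATIONSHIP ( Sales[ShipDate], 'Date'[Date] )
)
```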
Ans: The following are the Building Blocks (or) key components of Power BI:
FILTER is an iterator and thus can negatively impact performance over large
source tables.
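One common way to reduce that cost is to iterate only the distinct values of the column being tested rather than every row of the table. The sketch below uses assumed names and an arbitrary threshold of 1000.

```dax
-- Iterates every row of the Sales table (can be slow on large tables)
Large Sales Slow =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    FILTER ( Sales, Sales[SalesAmount] > 1000 )
)

-- Iterates only the distinct values of a single column
Large Sales Fast =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    FILTER ( ALL ( Sales[SalesAmount] ), Sales[SalesAmount] > 1000 )
)
```

The two measures are not always equivalent: the second replaces any existing filter on Sales[SalesAmount], while the first keeps the filters already applied to the Sales table.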
CALCULATETABLE(<expression>[, <filter1>[, <filter2>[, …]]])
Note: CALCULATE performs exactly the same function, except that it modifies the filter context
applied to an expression that returns a scalar value.
-> Whenever we need to return a scalar (single) value, we use CALCULATE. However, when
we need to return a table, we use CALCULATETABLE.
Cal = CALCULATE ( SUM ( Sales[Sales] ), Sales[Country] = "Spain" ) -- assuming both columns are in a Sales table
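To make the scalar-versus-table distinction concrete, here is a small sketch with assumed Sales[SalesAmount] and Sales[Country] columns. Both measures apply the same filter; only the type of the inner result differs.

```dax
-- CALCULATE returns a scalar: total sales for Spain
Spain Sales =
CALCULATE ( SUM ( Sales[SalesAmount] ), Sales[Country] = "Spain" )

-- CALCULATETABLE returns a table: the Sales rows for Spain,
-- which COUNTROWS then reduces to a single number
Spain Sales Rows =
COUNTROWS ( CALCULATETABLE ( Sales, Sales[Country] = "Spain" ) )
```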
Import data from a wide range of sources: files, databases, big data, social
media data, etc.
Q) What are the three Edit Interactions options of a visual tile in Power BI Desktop?
Filter: It completely filters a visual/tile based on the filter selection of another visual/tile.
Highlight: It highlights only the related elements on the visual/tile and grays out the non-related
items.
None: It ignores the filter selection from another visual/tile.
It allows users to transform data into visuals and share them with anyone
It establishes a connection for Excel queries and dashboards for fast analysis
It does not accept files larger than 1 GB and doesn’t mix imported data
with data accessed through real-time connections
There are very few data sources that allow real-time connections to Power BI
reports and dashboards
Dashboards and reports are only shared with users logged in with the same
email address
Dashboard doesn’t accept or pass user, account, or other entity parameters
It allows you to drill through the page, bookmarks, and selection pane and also
lets you create various tiles and integrate URLs
A dashboard can also help you set report layout to mobile view
Q) Can you have a table in the model which does not have any relationship with other
tables?
Yes. There are two main reasons why you can have disconnected tables:
The table is used to present the user with parameter values to be exposed and
selected in slicers
There are three different views in Power BI, each of which serves a different purpose:
Report View - In this view, users can add visualizations and additional report
pages and publish the same on the portal.
Data View - In this view, data shaping can be performed using Query Editor tools
Q) What are the main components of the Power BI toolkit, and what do they do?
Power Query: lets you discover, access, and consolidate info from different
sources
Power View: a presentation tool for creating charts, tables, and more
Power Q&A: lets you use natural language to get answers to questions; for
example, “What were the total sales last week?”
Q) Explain the term data alerts.
Alerts work on refreshed data: after a refresh, Power BI checks whether an alert is set, and if the data
reaches the alert threshold or limit, the alert is triggered.
The Selection pane helps you control which visuals are
displayed and which are hidden. It lets you group
multiple visuals on a page and is also used in bookmarking.
You can use the cross-filtering option in Power BI to address many-to-many
relationships.
It is the main engine used in Power Pivot. It allows you to load
large sets of data into Power BI.
The only difference between the two functions is that the Distinct one helps you
count null values.
If you want to calculate over numeric values, use MAX. However, for
non-numeric values, use MAXA.
Fact Tables:
The central table in a star schema of a data warehouse is a fact table that
stores quantitative information for analysis, which is not normalized in most
cases.
Dimension Tables:
It is a table in the star schema which helps you to store attributes and
dimensions which describe objects that are stored in a fact table.
Q) What is the method to hide and unhide a specific report in Power BI?
To hide and unhide a specific report, go to the Selection pane in the
menu bar and press the hide/unhide toggle button, then bookmark the result.
Q) How can you compare Target and Actual Value from a Power BI
report?
Q) Can you refresh Power BI reports after they are published to the
cloud?
DATEDD function helps you to convert any input to a date format. This input
can be a number, a string, or a date type.
DATENAME function shows the name of the specific part of the date that is
given.
This function gives the difference between 2 dates based on the specified date
part.
INDEX function helps you to retrieve the index of the respective row.
LTRIM function helps you to remove the white space from the left of the
string. RTRIM helps you to remove it from the right.
MID function returns the string characters from the specified index position.
SPLIT function is used to split the string based on the given delimiter.
Data Editing helps you to change and reshape data in Power BI.
Power BI reports can be refreshed using the Data Management Gateway and the
Power BI Personal Gateway.
Reporting:
Paginated reports visual (preview)
Analytics
Modeling
Format strings now persisted when using DirectQuery for Power BI datasets and
Azure Analysis Services (Preview)
Data preparation
Data connectivity
Service
Datasets discoverability
Request access to datasets
Mandatory label policy for Microsoft Information Protection sensitivity labels
(preview)
Admin API to Set and Remove Microsoft Information Protection sensitivity labels
Automate deployments with new APIs and PowerShell samples (preview)
Manage Dataflows in deployment pipelines (preview)
Admin APIs for deployment pipelines
Mobile
Visualizations
New visuals
Updated visuals
Editor’s picks
Template apps
KEY DIFFERENCE
A database is a collection of related data that represents some elements of the real world, whereas a data
warehouse is an information system that stores historical and cumulative data from single or multiple
sources.
A database is designed to record data, whereas a data warehouse is designed to analyze data.
A database is an application-oriented collection of data, whereas a data warehouse is a subject-oriented
collection of data.
Database uses Online Transactional Processing (OLTP) whereas Data warehouse uses Online
Analytical Processing (OLAP).
Database tables and joins are complicated because they are normalized whereas Data Warehouse
tables and joins are easy because they are denormalized.
ER modeling techniques are used for designing Database whereas data modeling techniques are
used for designing Data Warehouse.
Examples of databases: Oracle Database, Microsoft SQL Server. Examples of data warehouses: Amazon
Redshift, Informatica, Snowflake.
Q) Data processing
Data processing is the process of collecting data and translating it into usable information.
1. Data collection
Collecting data is the first step in data processing. Data is pulled from available sources, including data lakes and data
warehouses.
2. Data preparation
The collected data is cleaned up and organized for the following stage of data processing. During preparation, raw data is
checked for any errors. The purpose of this step is to eliminate bad data (redundant, incomplete, or incorrect data) and
begin to create high-quality data for the best business intelligence.
3. Data input
Data input is the first stage in which raw data begins to take the form of usable information.
4. Processing
During this stage, the data inputted to the computer in the previous stage is actually processed for interpretation. Processing
is done using machine learning algorithms, though the process itself may vary slightly depending on the source of data
being processed (data lakes, social networks, connected devices etc.) and its intended use (examining advertising patterns,
medical diagnosis from connected devices, determining customer needs, etc.).
5. Data output/interpretation
The output/interpretation stage is the stage at which data is finally usable to non-data scientists. It is translated, readable,
and often in the form of graphs, videos, images, plain text, etc. Members of the company or institution can now begin
to self-serve the data for their own data analytics projects.
6. Data storage
The final stage of data processing is storage. After all of the data is processed, it is then stored for future use. While some
information may be put to use immediately, much of it will serve a purpose later on. Plus, properly stored data is a necessity
for compliance with data protection legislation like GDPR. When data is properly stored, it can be quickly and easily
accessed by members of the organization when needed.
Q) Data modeling
Data modeling (data modelling) is the process of creating a data model for the data to be stored in a database.
Data models are made up of entities, which are the objects or concepts we want to track data about, and they
become the tables in a database. The tables are referred to as entities in our data model, and the entities are
connected to each other; each connection is called a relationship.
Agility: all the data is stored in the warehouse and readily available to use. You don’t have to
think about how to structure the data before you load it into the warehouse. The data modeling
to transform the raw data can be set up as and when it is needed.
Simplicity: Transformations in the data warehouse are generally written in SQL, a language that
the entire data team (data engineers, data scientists, data analyst) understands. This allows the
entire team to contribute to the transformation logic.
Self-service analytics: If all of your raw data is within the warehouse, you can use BI tools to drill
down from aggregated summary statistics to the raw data underlying them.
Fixing bugs: If you find errors in your transformation pipeline, you can fix the bug and re-run just
the transformations to fix your data. With an ETL approach, the entire extract-transform-load
process would need to be re-run.
Q) What is a view?
A view is a database object that is created using a SELECT query, often with complex logic, so
views are said to be a logical representation of the physical data; i.e., views behave like a
physical table, and users can use them as database objects in any part of SQL queries.
Q) Merge query
When you have one or more columns that you’d like to add to another query,
you merge the queries.
Advantages
Disadvantages
Indexes take additional disk space.
Indexes slow down INSERT, UPDATE, and DELETE, but they speed up UPDATE if the
WHERE condition uses an indexed field. INSERT, UPDATE, and DELETE become
slower because the indexes must also be updated on each operation.
Q) Why is it important to have a continuous date range in our dataset? What will happen if we
have a discontinuous date range? How will the graph look with it?
Q) We have 3 measures and want to show them over a time range. Which chart will we use, and
why?
Q) We import a dataset into Power BI, and then one of the column names changes in the
original dataset. How do we apply the change in our Power BI dataset? Look for an alternative process to changing
it in the Power Query steps.
Left Outer: All rows from the left table and matching rows from the right.
Right Outer: All rows from the right table and matching rows from the left.
Full Outer: All rows from both tables (matching or not).
Inner: Only matching rows from both tables.
Left Anti: Only rows from the left table that have no match in the right.
Right Anti: Only rows from the right table that have no match in the left.
Q) DAX function to calculate the number of active employees on the basis of the release date column?
Employees At End of Period =
VAR MaxDate = MAX ( 'Date'[Date] )
VAR EmpCnt =
    CALCULATE (
        COUNTROWS (
            CALCULATETABLE ( 'Employees', 'Employees'[HireDate] <= MaxDate, ALL ( 'Date' ) )
        ),
        ( ISBLANK ( 'Employees'[TerminationDate] ) || 'Employees'[TerminationDate] > MaxDate )
    )
RETURN
    IF ( ISBLANK ( EmpCnt ), 0, EmpCnt )
Headcount:
CALCULATE (
    COUNTX (
        FILTER (
            employee,
            employee[hiredate] <= MAX ( 'date'[date] )
                && ( ISBLANK ( employee[terminatedate] ) || employee[terminatedate] > MAX ( 'date'[date] ) )
        ),
        employee[ID]
    )
)
Hire:
CALCULATE ( COUNT ( employee[id] ), USERELATIONSHIP ( employee[hiredate], 'date'[date] ) )
Terminate:
CALCULATE (
    COUNT ( employee[id] ),
    USERELATIONSHIP ( employee[terminatedate], 'date'[date] ),
    NOT ( ISBLANK ( employee[terminatedate] ) )
)
Employeeturnover:
CALCULATE (
    COUNTROWS ( employee ),
    FILTER (
        VALUES ( employee[terminatedate] ),
        employee[terminatedate] <= MIN ( 'date'[date] ) && NOT ( ISBLANK ( employee[terminatedate] ) )
    )
)
Turnover rate:
VAR AvgHeadcount = ( [Headcount] + [Last period employee] ) / 2
RETURN
    DIVIDE ( [Employeeturnover], AvgHeadcount ) + 0
AVG Age:
CALCULATE ( AVERAGE ( employee[age] ), USERELATIONSHIP ( employee[hiredate], 'date'[date] ) )