
PowerBI Interview


Power BI gateway:>>>>>>

Power BI Gateway is software that is required to access data located in an on-premises
network. The gateway acts as a gatekeeper for the on-premises data source. If
anyone needs to access on-premises data from a cloud or web-based app, the request
goes through the gateway.

Types of Power BI Gateway

You can select from two available types of gateways:

Standard Mode: This version of the on-premises data gateway allows connection with
multiple on-premises data sources for more than one user.

Personal Mode: The personal mode of the on-premises data gateway is used by only one
user to connect to different data sources.

Power BI refresh:>>>>>

Refreshing data typically means importing data from the original data sources
into a dataset, either based on a refresh schedule or on demand. The
Scheduled refresh section is where you define the frequency and time slots to refresh a
dataset. You can configure up to eight daily time slots if your
dataset is on shared capacity, or 48 time slots on Power BI Premium.

A tile is a report visual pinned to a dashboard, and dashboard tile refreshes happen about every
hour so that the tiles show recent results.

Accessing on-premises and cloud sources in the same source query: Power BI must use
a gateway for the cloud data sources as well.
Power BI doesn't need a gateway to connect to a cloud data source on its own, but a combination of
on-premises and cloud sources in one query does need a gateway.

DirectQuery: Limited Functionality: Few Power Query Operations, Mainly Visualization
This method does not have the full functionality of Power BI. With this method, you would
have only two tabs in Power BI Desktop: Report and Relationship. You can change the
relationships in this mode.

You are also limited in your DAX expressions; you cannot write all types of
expressions. Many functions are not supported; as an example, Time Intelligence
functions are not fully supported.
Slower connection: every interaction queries the data source, so visuals take longer to refresh.

Live Connection:
No Power Query. Just Visualization
The big disadvantage of this method is that you will not have even simple Power Query
transformations. With this method, you will only have the Report tab.

Import Data or Schedule Refresh


Advantages
 Fastest Possible Connection
 Power BI Fully Functional
 Combining Data from different sources
 Full DAX expressions
 Full Power Query transformations
Disadvantages
 Power BI file size limitation (It is different for Premium)
DirectQuery
Advantages
 Large Scale data sources supported. No size limitation.
 Pre-Built models in some data sources can be used instantly
Disadvantages
 Very Limited Power Query functionality
 DAX very limited
 Cannot combine data from multiple sources
 Slower Connection type: Performance Tuning in the data source is MUST DO
Live Connection
Advantages
 Large Scale data sources supported. No size limitation as far as SSAS Supports.
 Many organizations already have SSAS models built. So they can use it as a Live Connection
without the need to replicate that into Power BI.
 Report Level Measures
 MDX or DAX analytical engines in the data source of SSAS can be great asset for modeling
compared to DirectQuery
Disadvantages
 No Power Query
 Cannot combine data from multiple sources
 Slower Connection type: Performance Tuning in the data source is MUST DO

Incremental refresh in Power BI and how it works?


With incremental refresh, the service dynamically partitions and separates data that needs to be refreshed frequently from
data that can be refreshed less frequently.

Paginated report:

Paginated reports are designed to be printed or shared. They're called paginated because they're formatted to fit well on a
page. They display all the data in a table, even if the table spans multiple pages. They're also called pixel perfect because you
can control their report page layout exactly. Power BI Report Builder is the standalone tool for authoring paginated reports
for the Power BI service.

Performance tuning:
Using Performance Analyzer, we can monitor the performance of our Power BI report.

Dataflow:
A dataflow is basically an ETL tool for data loading and preparation that is deployed in the cloud; the data is stored in
Azure Data Lake Storage Gen2.

How do we add 3000 users in RLS:

Create a security group, add all 3000 members to it, and then add the group to the RLS role.

 Files: Data can be imported from Excel (.xlsx, .xlsm), Power BI Desktop files
(.pbix) and Comma Separated Value (.csv) files.
 Content Packs: A content pack is a collection of related documents or files that are stored as
a group. In Power BI, there are two types of content packs: firstly those from
service providers like Google Analytics, Marketo or Salesforce, and secondly
those created and shared by other users in your organization.
 Connectors to databases and other datasets such as Azure SQL Database and
SQL Server Analysis Services tabular data, etc.

Types of filters>>>>>>

There are three levels of filters in Power BI: report, page, and visual

Power BI provides a variety of options to filter reports, data, and visualizations. The following
is the list of filter types.

 Visual-level Filters: These filters work on only an individual visualization,
reducing the amount of data that the visualization can see. Moreover, visual-level
filters can filter both data and calculations.
 Page-level Filters: These filters work at the report-page level. Different
pages in the same report can have different page-level filters.
 Report-level Filters: These filters work on the entire report, filtering all
pages and visualizations included in the report.

Row level security>>>>>


Row-level security (RLS) with Power BI can be used to restrict data access for given
users. Filters restrict data access at the row level, and you can define filters within roles.
In the Power BI service, members of a workspace have access to datasets in the
workspace. RLS doesn't restrict this data access.

In RLS, USERPRINCIPALNAME() returns the user's email-style login (UPN), and USERNAME() returns the user ID (DOMAIN\user in Power BI Desktop).
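
A minimal sketch of a dynamic RLS role filter follows. It assumes a hypothetical Users table with an Email column that holds each user's sign-in address; neither name comes from the notes above. The expression would be entered as the table filter of a role (Manage roles) on that table.

```dax
-- Role filter on the hypothetical Users table.
-- Each signed-in user sees only the rows whose Email matches their UPN.
[Email] = USERPRINCIPALNAME()
```

If Users is related to the fact tables, this single filter propagates through the relationships, so one role can serve every user in the security group.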

Tiles (individual placeholder)>>>>>

A tile is a snapshot of your data, pinned to the dashboard. A tile can be created from a
report, dataset, dashboard, the Q&A box, Excel, SQL Server Reporting Services
(SSRS) reports, and more.

measures vs calculated column

Calculated column                          | Measure
Expands table by creating new column       | Summarizes data into a single value
Stored along with table, consumes memory   | Calculated at runtime/stored temporarily
Less analytical capability                 | Rich analytical capabilities

If the calculation is row by row (example: Profit = Sales – Cost, or Full name = First
Name & ” ” & Last Name), then a Calculated Column is what you need.

If the calculation is an aggregation or it is going to be affected by filter criteria in the
report (example: Sum of Sales = Sum(Sales), or Sales Year to Date = TotalYTD(….)),
then a Measure is what you need.
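
The same profit calculation can be sketched both ways. The Sales table and its SalesAmount and CostAmount columns are assumed here, borrowed from the SUMX example elsewhere in these notes.

```dax
-- Calculated column on Sales: evaluated once per row and stored in the model.
Profit = Sales[SalesAmount] - Sales[CostAmount]

-- Measure: evaluated at query time and re-aggregated under whatever
-- filters the report applies.
Total Profit = SUMX ( Sales, Sales[SalesAmount] - Sales[CostAmount] )
```

The column is useful when profit must appear on a slicer or axis. The measure is the usual choice when only the aggregated value is shown.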

Direct and import query>>>>

Import: The selected tables and columns are imported into Power BI Desktop. As you
create or interact with a visualization, Power BI Desktop uses the imported data. To see
underlying data changes since the initial import or the most recent refresh, you must
refresh the data, which imports the full dataset again.

DirectQuery: No data is imported or copied into Power BI Desktop. For relational
sources, the selected tables and columns appear in the Fields list. For multi-
dimensional sources like SAP Business Warehouse, the dimensions and measures of
the selected cube appear in the Fields list. As you create or interact with a visualization,
Power BI Desktop queries the underlying data source, so you’re always viewing current
data.

The default mode is Import, and it enables you to make full use of the Power BI Desktop
capabilities. With DirectQuery, you get a direct connection to query data from only a few supported
data sources. Power BI will not store the data, just metadata such as table and column names. Like
DirectQuery, Live Connection also does not store data. However, it can only query three data sources:
SQL Server Analysis Services, Azure Analysis Services, and Power BI Service.

Bookmarks>>> Bookmarks capture the currently configured view of a report page,
including filters, slicers, and the state of visuals.

Sum: Adds all the numbers in a column.

Sumx: Returns the sum of an expression evaluated for each row in a table.
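
A short sketch of the difference. The Sales[Quantity] and Sales[UnitPrice] columns are assumed for illustration.

```dax
-- SUM aggregates one existing column.
Total Quantity = SUM ( Sales[Quantity] )

-- SUMX iterates the table, evaluates the expression for each row,
-- then adds up the row results.
Revenue = SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] )
```

SUM cannot take an expression such as Quantity * UnitPrice, which is the case SUMX is for.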

Running total DAX:
Running Total = CALCULATE([TotalSales], FILTER(ALLSELECTED('Date'[Date]),
'Date'[Date] <= MAX('Order'[Date])))

Allselected function>> ALLSELECTED removes the filters applied by the rows and columns
of the current visual, while keeping the filters that come from outside it, such as
slicers and page or report filters.

allexcept function>> ALLEXCEPT removes the filters from the expanded table
specified in the first argument, keeping only the filters in the columns specified in the
following arguments.
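
As an illustration, assuming a Sales table with Country and SalesAmount columns (hypothetical names), ALLEXCEPT can give a per-country total that ignores every other filter on Sales:

```dax
-- Removes all filters on Sales except the one on Country, so every row
-- of a visual shows the total for its own country.
Country Sales =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    ALLEXCEPT ( Sales, Sales[Country] )
)
```

Dividing a plain sales measure by Country Sales would then give each row's share of its country total.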

Filter context and row context>>>

Row context is when your calculation is evaluated for each detail row of an input table
(which can also be a calculated table).

Profit = SUMX(Sales, Sales[SalesAmount] - Sales[CostAmount])

Filter context is the set of filters applied to the model before the expression is evaluated.
TotalSales = SUM(Sales[SalesAmount])

Applying filters from the Filters pane/slicers/visuals (Year = 2017, Country = "India",
City = "Chennai")
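
The same filter context can also be set inside a measure with CALCULATE. This is a sketch only: 'Date'[Year], Geography[Country] and Geography[City] are hypothetical columns, and [TotalSales] is assumed to be the SUM measure shown above.

```dax
-- Equivalent to picking Year = 2017, Country = India, City = Chennai
-- in slicers, but fixed inside the measure.
Chennai Sales 2017 =
CALCULATE (
    [TotalSales],
    'Date'[Year] = 2017,
    Geography[Country] = "India",
    Geography[City] = "Chennai"
)
```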

TOTALYTD, TOTALQTD, TOTALMTD

Evaluate the year-to-date (quarter-to-date, month-to-date) value of the expression in the current context.

= TOTALYTD([TotalSales], 'Date'[Date])
PARALLELPERIOD:

Returns a table that contains a column of dates that represents a period
parallel to the dates in the specified dates column, in the current context, with
the dates shifted a number of intervals either forward in time or back in time.

CALCULATE([TotalSales], PARALLELPERIOD(DateTime[DateKey], -1, YEAR))

Dateadd>> Returns a table that contains a column of dates, shifted either forward or
backward in time by the specified number of intervals from the dates in the current
context.

Sameperiodlastyear>> Returns a table that contains a column of dates shifted one year
back in time from the dates in the specified dates column, in the current context.

= CALCULATE([TotalSales], SAMEPERIODLASTYEAR('Date'[Date]))

= CALCULATE([TotalSales], DATEADD('Date'[Date], -1, MONTH))

value>>Converts a text string that represents a number to a number.

values>>> When the input parameter is a column name, returns a one-column table
that contains the distinct values from the specified column. Duplicate values are
removed and only unique values are returned.

unpivot column>>> The Unpivot Columns feature turns multiple column headers into the
rows of a single attribute column. The values that sat under the original columns are
stored in a second value column. Use Unpivot Columns to convert column data fields to row data fields.

How do we do query folding->>


Query folding is the ability for a Power Query query to generate a single query
statement that retrieves and transforms source data.
In the Power Query Editor window, it is possible to determine when a Power Query
query can be folded. In the Query Settings pane, when you right-click the last applied
step, if the View Native Query option is enabled (not greyed out), then the entire query
can be folded.

Query folding is when steps defined in Power Query/Query Editor are translated into
SQL and executed by the source database rather than the client machine. It’s important
for processing performance and scalability, given limited resources on the client
machine.

How to avoid many-to-many relationship-->>>>

 Add a bridging table to store associated entities


 We need to create a bridge table and Create one-to-many relationships between
the three tables.

Calculate function>>> Evaluates an expression in a modified filter context.

Datesytd>>> Returns a table that contains a column of the dates for the year to date, in
the current context.
CALCULATE(SUM(InternetSales_USD[SalesAmount_USD]),
DATESYTD(DateTime[DateKey]))

Calendar>> CALENDAR(<start_date>, <end_date>) returns a table with a single column of contiguous dates from the start date to the end date.
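
A minimal sketch of a date table built as a calculated table. The date range and the extra column names are illustrative choices, not taken from the notes.

```dax
-- CALENDAR returns one column named [Date];
-- ADDCOLUMNS adds year and month attributes to it.
Date =
ADDCOLUMNS (
    CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2021, 12, 31 ) ),
    "Year", YEAR ( [Date] ),
    "Month", FORMAT ( [Date], "MMM" )
)
```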

Format >> Converts a value to text according to the specified format.

Today>>>Returns the current date.

YEAR(TODAY()) - 1963 can be used to find the approximate age of a person born in 1963.

now>>>Returns the current date and time in datetime format.

The following example returns the current date and time plus 3.5 days:

= NOW()+3.5

Q) What is the common table function for grouping data?

Ans:  SUMMARIZE()

 Main groupby function in SSAS.


 Recommended practice is to specify the table and group-by columns but not metrics. You
can add metrics with the ADDCOLUMNS function.
 To get an output of some columns from a table, we use:
Summarize = SUMMARIZE(employee, employee[employee_id], employee[salary])
 SUMMARIZECOLUMNS

 New group by function for SSAS and Power BI Desktop; more efficient.
 Specify group by columns, table, and expressions.
 It gives us output columns based on the filter condition specified on the current
table.
SummarizeColumns =
SUMMARIZECOLUMNS(employee[employee_id], employee[salary],
FILTER(employee, employee[salary] > 15000))
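
The recommended SUMMARIZE plus ADDCOLUMNS pattern can be sketched as a DAX query. The employee table follows the example above; the department column is an assumption added for illustration.

```dax
-- SUMMARIZE supplies only the group-by column; ADDCOLUMNS adds the metric.
-- CALCULATE turns each department row into a filter (context transition),
-- so the SUM is evaluated per department.
EVALUATE
ADDCOLUMNS (
    SUMMARIZE ( employee, employee[department] ),
    "Total Salary", CALCULATE ( SUM ( employee[salary] ) )
)
```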

Q) What are some benefits of using Variables in DAX ?

Ans: Below are some of the benefits: 

 By declaring and evaluating a variable, the variable can be reused multiple times in a
DAX expression, thus avoiding additional queries of the source database.
 Variables can make DAX expressions more intuitive/logical to interpret.
 Variables are only scoped to their measure or query, they cannot be shared among
measures, queries or be defined at the model level.
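
A short sketch of variables in a measure. It assumes a [TotalSales] measure and a 'Date' table marked as a date table. The prior-year value is computed once and reused twice.

```dax
Sales YoY % =
VAR CurrentSales = [TotalSales]
VAR PriorSales =
    CALCULATE ( [TotalSales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
RETURN
    -- DIVIDE returns BLANK instead of an error when PriorSales is zero or blank.
    DIVIDE ( CurrentSales - PriorSales, PriorSales )
```

Without variables, the CALCULATE expression would have to be written twice, once in the numerator and once in the denominator.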

Q). What are the different Excel BI add-in?

Ans: Below are the most important BI add-in to Excel:

 Power Query: It helps in finding, editing and loading external data.


 Power Pivot: It's mainly used for data modeling and analysis.

 Power View: It is used to design visual and interactive reports.


 Power Map: It helps to display insights on 3D Map.

Q) How do we find and delete duplicates?

Create a duplicate of the table, then perform a group-by on the duplicate table to get
distinct values. Merge the two tables on the common column using an inner join; this
removes the duplicates from the original table.

Q) Difference between merge and append query?

Ans: There are two primary ways of combining queries: merging and appending.

 When you have one or more columns that you’d like to add to another query,
you merge the queries.
 When you have additional rows of data that you’d like to add to an existing query,
you append the query.
Q) Problems faced while creating a report or after creating a report?

nth highest value->> We can use the TOPN or RANKX function.
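
Both approaches can be sketched as measures. Product[ProductName] and the [TotalSales] measure are assumed names, and n is fixed at 3 for illustration.

```dax
-- RANKX: rank each product by sales; a visual can then be filtered to rank = 3.
Sales Rank =
RANKX ( ALL ( Product[ProductName] ), [TotalSales], , DESC, DENSE )

-- TOPN: keep the three best products, then take the smallest of their values.
Third Highest Sales =
MINX (
    TOPN ( 3, ALL ( Product[ProductName] ), [TotalSales], DESC ),
    [TotalSales]
)
```

Note that TOPN can return more than three rows when values tie at the cut-off, and fewer when the table holds fewer than three products.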

Q) What is the use of an app (it is created from a workspace)?

Ans: An app is a Power BI content type that combines related dashboards and reports, all in
one place. An app can have one or more dashboards and one or more reports, all bundled
together. Apps are created by Power BI designers who distribute and share the apps with their
colleagues.
Advantage:
Apps are an easy way for designers to share different types of content at one time.
App designers create the dashboards and reports and bundle them together into an app.
The designers then share or publish the app to a location where you, the business user, can
access it. Because related dashboards and reports are bundled together, it's easier for you to
find and install in both the Power BI service (https://powerbi.com) and on your mobile device.

Q) When to use gauge chart?

Ans: Radial gauges are a great choice to:

 Show progress toward a goal.

 Represent a percentile measure, like a KPI.

 Show the health of a single measure.

 Display information you can quickly scan and understand.

Q) What are the options available under the three dots in the Bookmarks pane?

Ans: Update, Rename, Delete, and Group.


You can also select whether each bookmark will apply Data properties, such as filters and
slicers; Display properties, such as spotlight and its visibility; and Current page changes, which
present the page that was visible when the bookmark was added. These capabilities are useful
when you use bookmarks to switch between report views or selections of visuals, in which case
you'd likely want to turn off data properties, so that filters aren't reset when users switch views
by selecting a bookmark.

Q) How many relationships can be established between two tables?

Ans: We can have 4 types of relationship:

 One-to-many
 One-to-one
 Many-to-many
 Many-to-one
But we can only have one active relationship between two tables.
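
When two tables share more than one relationship, an inactive one can be switched on for a single calculation. In this sketch, Sales[ShipDate], 'Date'[Date] and the [TotalSales] measure are assumed names, with an inactive relationship assumed between the two columns.

```dax
-- Uses the inactive ShipDate relationship for this measure only;
-- the active relationship (for example on OrderDate) is bypassed here.
Sales by Ship Date =
CALCULATE (
    [TotalSales],
    USERELATIONSHIP ( Sales[ShipDate], 'Date'[Date] )
)
```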

Q). What are Building Blocks in Power BI?

Ans: The following are the Building Blocks (or) key components of Power BI:

1. Visualizations: Visualization is a visual representation of data.
Example: Pie Chart, Line Graph, Side by Side Bar Charts, Graphical
Presentation of the source data on top of Geographical Map, Tree Map, etc.

2. Datasets: Dataset is a collection of data that Power BI uses to create its
visualizations.
Example: Excel sheets, Oracle or SQL server tables.

3. Reports: Report is a collection of visualizations that appear together on one or
more pages.
Example: Sales by Country, State, City Report, Logistic Performance report,
Profit by Products report etc.

4. Dashboards: Dashboard is single layer presentation of multiple visualizations,
i.e. we can integrate one or more visualizations into one page layer.
Example: Sales dashboard can have pie charts, geographical maps and bar
charts.

5. Tiles: Tile is a single visualization in a report or on a dashboard.
Example: Pie Chart in Dashboard or Report.

Q)How is the FILTER function used?


Ans: The FILTER function returns a table with a filter condition applied for each of its
source table rows. The FILTER function is rarely used in isolation, it’s generally used as
a parameter to other functions such as CALCULATE. 

 FILTER is an iterator and thus can negatively impact performance over large
source tables.

 Complex filtering logic can be applied, such as referencing a measure in a filter
expression.

o FILTER(MyTable, [SalesMetric] > 500)

Q) Difference between CALCULATE and CALCULATETABLE function?

Calculate: Evaluates an expression in a modified filter context.

Calculatetable: Evaluates a table expression in a modified filter context.


Syntax:
CALCULATE(<expression>[, <filter1>, <filter2>, …])

CALCULATETABLE(<expression>[, <filter1>, <filter2>, …])

Note: CALCULATE performs exactly the same function as CALCULATETABLE, except that it modifies
the filter context applied to an expression that returns a scalar value.
-> Whenever we need to return a scalar/single value we use CALCULATE. However, when
we need to return a table we use CALCULATETABLE.

Examples (assuming a table named Sales):

Cal = CALCULATE(SUM(Sales[Sales]), Sales[Country] = "Spain")

Cal_Table = SUMX(CALCULATETABLE(Sales, Sales[Country] = "Spain"), Sales[Sales])

Q) What is Power Query?


Ans: Power query is an ETL Tool used to shape, clean and transform data using
intuitive interfaces without having to use coding. It helps the user to:

 Import Data from wide range of sources from files, databases, big data, social
media data, etc.

 Join and append data from multiple data sources. 

 Shape data as per requirement by removing and adding data.

Q) What are the three Edit Interactions options of a visual tile in Power BI Desktop? 

Ans: The 3 edit interaction options are  Filter, Highlight, and None.

Filter: It completely filters a visual/tile based on the filter selection of another visual/tile.

Highlight: It highlights only the related elements on the visual/tile and grays out the non-
related items.

None: It ignores the filter selection from another tile/visual.

Q)List out some benefits of using Power BI.

Here are some benefits of using Power BI:

 It helps build interactable data visualization in data centres

 It allows users to transform data into visuals and share them with anyone

 It establishes a connection for Excel queries and dashboards for fast analysis

 It provides quick and accurate solutions

 It enables users to perform queries on reports using simple English words 

Q) List out some drawbacks/limitations of using Power BI.

Here are some limitations of using Power BI:

 It does not accept file sizes larger than 1 GB and also doesn’t mix imported data,
which is accessed from real-time connections

 There are very few data sources that allow real-time connections to Power BI
reports and dashboard

 Dashboards and reports are only shared with users logged in with the same
email address
 Dashboard doesn’t accept or pass user, account, or other entity parameters

Q) What is a dashboard in Power BI? 

A dashboard is a single-layer presentation sheet of multiple visualizations and reports. The
main features of the Power BI dashboard are:

 It allows you to drill through the page, bookmarks, and selection pane and also
lets you create various tiles and integrate URLs

 A dashboard can also help you set report layout to mobile view

Q) Can you have a table in the model which does not have any relationship with other
tables? 

Yes. There are two main reasons why you can have disconnected tables:

 The table is used to present the user with parameter values to be exposed and
selected in slicers 

 The table is used as a placeholder for metrics in the user interface
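
A sketch of the slicer-parameter case follows. Threshold is a hypothetical disconnected table with a single Value column, for example 100, 500 and 1000. Sales[SalesAmount] is also an assumed name.

```dax
-- Reads the value picked in a slicer on the disconnected Threshold table.
-- Falls back to 500 when nothing, or more than one value, is selected.
Sales Above Threshold =
VAR MinAmount = SELECTEDVALUE ( Threshold[Value], 500 )
RETURN
    SUMX (
        FILTER ( Sales, Sales[SalesAmount] > MinAmount ),
        Sales[SalesAmount]
    )
```

Because Threshold has no relationship to Sales, the slicer filters nothing by itself. It only feeds the measure.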

Q) What are the different views available in Power BI Desktop? 

There are three different views in Power BI, each of which serves a different purpose: 

 Report View - In this view, users can add visualizations and additional report
pages and publish the same on the portal.

 Data View - In this view, data shaping can be performed using Query Editor tools

 Relationship View - In this view, users can manage relationships between
datasets

Q) What are the main components of the Power BI toolkit, and what do they do?

 Power Query: lets you discover, access, and consolidate info from different
sources

 Power Pivot: a modeling tool

 Power View: a presentation tool for creating charts, tables, and more

 Power Map: lets you create geospatial representations of your data

 Power Q&A: lets you use natural language to get answers to questions; for
example, “What were the total sales last week?”
Q)  Explain the term data alerts.

Alerts work on data that is refreshed. After a refresh, Power BI checks whether an
alert is set; if the value reaches the alert threshold or limit, the alert is triggered.

Q) Explain data source filter.

It is a parameter to filter the data into machines.

Q) Why use selection pane in Power BI?

The Selection pane helps you control which visuals are
displayed and which are hidden. It allows you to combine
multiple visuals into a group and is also used in bookmarking.

Q) . How to handle Many to Many relationships in Power BI?

You can use Crossfiltering option in Power BI to address the Many to Many
relationships.

Q) Explain the xVelocity in-memory engine.

It is the main engine used in Power Pivot. It allows you to load
large sets of data into the Power BI data model.

Q) State the main difference between DISTINCT() and VALUES() in DAX?

The only difference between the two functions is that VALUES also returns the
blank row that the engine adds for unmatched rows in a relationship, while DISTINCT does not.
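
The difference shows up when a fact table holds keys that are missing from the dimension. A sketch with an assumed Product[ProductName] column:

```dax
-- If Sales contains product keys that do not exist in Product, the engine
-- adds one blank row to Product. VALUES counts it; DISTINCT does not.
Product Count Values = COUNTROWS ( VALUES ( Product[ProductName] ) )

Product Count Distinct = COUNTROWS ( DISTINCT ( Product[ProductName] ) )
```

When every Sales row matches a product, the two measures return the same number.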

Q) State the major differences between MAX and MAXA functions

Use MAX for columns of numbers, dates, or text. Use MAXA when the column
holds logical values, which MAXA evaluates as TRUE = 1 and FALSE = 0.

Q)  What kind of data can you store in Power BI?

In Power BI, you can store mainly two types of data.

Fact Tables:
The central table in a star schema of a data warehouse is a fact table that
stores quantitative information for analysis, which is not normalized in most
cases.

Dimension Tables:

It is a table in the star schema which helps you to store attributes and
dimensions which describe objects that are stored in a fact table.

Q) What are the method to hide and unhide a specific report in Power BI?

To hide or unhide a specific visual in a report, open the Selection pane from the
menu bar and use the hide/unhide toggle; the state can then be saved in a bookmark.

Q) How can you compare Target and Actual Value from a Power BI
report?

You need to use a Gauge chart to compare two different measures.

Q) Can you refresh Power BI reports after they are published to the
cloud?

Yes, it is possible. Gateways can be used to do so.

 For SharePoint: Data Management Gateway


 For Powerbi.com: Power BI Personal Gateway

Q) State the difference between COUNT and COUNTD functions.

COUNT returns the number of values, excluding NULL values, whereas COUNTD
returns the number of distinct values, also excluding NULL values. (COUNTD is not a DAX
function; the DAX equivalent is DISTINCTCOUNT.)
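
In DAX the same comparison can be sketched as follows. Sales[CustomerID] is an assumed column.

```dax
-- COUNT: number of non-blank values in the column.
Customer Rows = COUNT ( Sales[CustomerID] )

-- DISTINCTCOUNT: number of distinct values; BLANK counts as one value.
Unique Customers = DISTINCTCOUNT ( Sales[CustomerID] )

-- DISTINCTCOUNTNOBLANK: number of distinct values, ignoring BLANK.
Unique Customers No Blank = DISTINCTCOUNTNOBLANK ( Sales[CustomerID] )
```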

Q) Explain DATEDD function in Power BI.

DATEDD function helps you to convert any input to a date format. This input
can be a number, a string, or a date type input.

Q) What does DATENAME function do?

DATENAME function shows the name of the specific part of the date that is
given.

Q) What is the DATEPART function?


It returns the specified part of a date as an integer. The DATENAME function does the
same thing, except it returns the name of the part of the date.

Q) What does DATEDIFF function do?

This function gives a difference between 2 dates based on the specified Date
part.

Q) What is the use of INDEX Function in Power BI?

INDEX function helps you to retrieve the index of the respective row.

Q) What is the main difference between LTRIM and RTRIM?

LTRIM function helps you to remove the white space from the left of the
string. RTRIM helps you to remove it from the right, up to the last index.

Q)  What is the use of MID function?

MID function returns the string character from the specified index position.

Q) What is the use of split function?

SPLIT function is used to split the string based on the given delimiter.

Q) What area do you go to change and reshape data in Power BI?

Data Editing helps you to change and reshape data in Power BI.

Q) What is the process to refresh Power BI reports when they are uploaded
into the cloud?

Power BI reports can be refreshed using the Data Management Gateway and the
Power BI Personal Gateway.

Q) What context styles are allowed by Power BI DAX?

Power BI DAX supports both row context and filter context.

Q) June 2021 Update

Reporting:
 Paginated reports visual (preview)

 Area chart transparency sliders

 Inner padding for continuous axes

 Small multiples (preview): responsiveness and conditional formatting

Analytics

 Q&A improvement for inferred results

Modeling

 Format strings now persisted when using DirectQuery for Power BI datasets and
Azure Analysis Services (Preview)

Data preparation

 DirectQuery support for Dataflows GA


 Select all operation is now supported for Dynamic M Query Parameters (preview)

Data connectivity

 Assemble Views (new connector)


 BQE Core (new connector)
 SumTotal (new connector)
 Updated connectors
o Adobe Analytics (updated connector)
o Anaplan (updated connector)
o Azure Databricks (updated connector)
o Cognite Data Fusion (updated connector)
o Dynamics 365 Business Central (updated connector)
o FactSet Analytics (updated connector)
o Google BigQuery (updated connector)
o Starburst Enterprise (updated connector)
o Vessel Insight (updated connector)
o Workplace Analytics (updated connector)
o Azure Consumption Insights (connector deprecated)

Service

 Datasets discoverability
 Request access to datasets
 Mandatory label policy for Microsoft Information Protection sensitivity labels
(preview)
 Admin API to Set and Remove Microsoft Information Protection sensitivity labels
 Automate deployments with new APIs and PowerShell samples (preview)
 Manage Dataflows in deployment pipelines (preview)
 Admin APIs for deployment pipelines

Mobile

 A new look for the Power BI Windows app (preview)


 Passing URL parameters to paginated reports

Visualizations

New visuals

 Power Slider by TME AG


 Growth Rate Chart by Djeeni BV
 Stratada Program Taskboard by Stratada
 Charturo Interactive Line Chart by Charturo
 Multiple Sparklines by excelnaccess.com

Updated visuals

 Drill Down Combo Bar PRO by ZoomCharts Learn more: https://zoomcharts.com/en/microsoft-p...
 Dumbbell Bar Chart by Nova Silva Learn more: https://visuals.novasilva.com/
 graphomate bubbles 2021.2 Learn more: https://graphomate.atlassian.net/wiki...
 Zebra BI Tables 5.0 Learn more: https://zebrabi.com/power-bi-custom-v...
 Zebra BI Charts 5.0 Learn more: https://zebrabi.com/power-bi-custom-v...

Editor’s picks

 Drill Down Map PRO by ZoomCharts


 Brick Chart by MAQ Software
 HierarchySlicer by Jan Pieter
 Horizon Chart by xViz
 Card with States by xViz

Template apps

 Template app one-click update and republish


 Salesforce Analytics for Sales Managers

Q) difference between data base & data warehousing?

KEY DIFFERENCE
 Database is a collection of related data that represents some elements of the real world whereas Data
warehouse is an information system that stores historical and cumulative data from single or multiple
sources.
 Database is designed to record data whereas the Data warehouse is designed to analyze data.
 Database is application-oriented-collection of data whereas Data Warehouse is the subject-oriented
collection of data.
 Database uses Online Transactional Processing (OLTP) whereas Data warehouse uses Online
Analytical Processing (OLAP).
 Database tables and joins are complicated because they are normalized whereas Data Warehouse
tables and joins are easy because they are denormalized.
 ER modeling techniques are used for designing Database whereas data modeling techniques are
used for designing Data Warehouse.
 Examples of databases are Oracle Database and Microsoft SQL Server. Examples of data warehouses are Amazon
Redshift, Informatica, and Snowflake.

Q) data processing
Data processing is the process of collecting data and translating it into usable information.

Six stages of data processing

1. Data collection
Collecting data is the first step in data processing. Data is pulled from available sources, including data lakes and data
warehouses.

2. Data preparation
The collected data is cleaned up and organized for the following stage of data processing. During preparation, raw data is
checked for any errors. The purpose of this step is to eliminate bad data (redundant, incomplete, or incorrect data) and
begin to create high-quality data for the best business intelligence.

3. Data input
Data input is the first stage in which raw data begins to take the form of usable information.

4. Processing
During this stage, the data inputted to the computer in the previous stage is actually processed for interpretation. Processing
is done using machine learning algorithms, though the process itself may vary slightly depending on the source of data
being processed (data lakes, social networks, connected devices etc.) and its intended use (examining advertising patterns,
medical diagnosis from connected devices, determining customer needs, etc.).
5. Data output/interpretation
The output/interpretation stage is the stage at which data is finally usable to non-data scientists. It is translated, readable,
and often in the form of graphs, videos, images, plain text, etc. Members of the company or institution can now begin
to self-serve the data for their own data analytics projects.

6. Data storage
The final stage of data processing is storage. After all of the data is processed, it is then stored for future use. While some
information may be put to use immediately, much of it will serve a purpose later on. Plus, properly stored data is a necessity
for compliance with data protection legislation like GDPR. When data is properly stored, it can be quickly and easily
accessed by members of the organization when needed.

Q) Data modeling

Data modeling (data modelling) is the process of creating a data model for the data to be stored in a database.
Data models are made up of entities, which are the objects or concepts we want to track data about, and they
become the tables in a database. The entities are connected to each other, and each connection is called a
relationship.

Q) Difference between ETL & ELT

ETL is extract, transform, and load, whereas ELT is extract, load, and transform.
So essentially the main difference between ETL and ELT is the order in which these steps take place.
So why is it beneficial to transform your data after loading it into the data warehouse?

• Agility: all the data is stored in the warehouse and readily available to use. You don’t have to
think about how to structure the data before you load it into the warehouse. The data modeling
to transform the raw data can be set up as and when it is needed.
• Simplicity: Transformations in the data warehouse are generally written in SQL, a language that
the entire data team (data engineers, data scientists, data analysts) understands. This allows the
entire team to contribute to the transformation logic.
• Self-service analytics: If all of your raw data is within the warehouse, you can use BI tools to drill
down from aggregated summary statistics to the raw data underlying them.
• Fixing bugs: If you find errors in your transformation pipeline, you can fix the bug and re-run just
the transformations to fix your data. With an ETL approach, the entire extract-transform-load
process would need to be re-run.
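
The ordering difference can be sketched in Python, using an in-memory SQLite database as a stand-in for the data warehouse. The table names (raw_orders, orders_clean), the sample rows and the helper functions are all hypothetical and chosen only for illustration; a real pipeline would use its own tools.

```python
import sqlite3

# Hypothetical source extract: ids and amounts arrive as untidy text.
RAW_ROWS = [("1", " 10.5 "), ("2", "bad"), ("3", "7")]


def is_number(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def etl(conn):
    """ETL: transform in the pipeline, then load only the cleaned rows."""
    cleaned = [(int(i), float(a)) for i, a in RAW_ROWS if is_number(a)]
    conn.execute("CREATE TABLE orders_clean (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders_clean VALUES (?, ?)", cleaned)


def elt(conn):
    """ELT: load the raw rows first, then transform inside the warehouse with SQL."""
    conn.execute("CREATE TABLE raw_orders (id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", RAW_ROWS)
    # This step can be fixed and re-run on its own, without re-extracting.
    conn.execute("DROP TABLE IF EXISTS orders_clean")
    conn.execute(
        "CREATE TABLE orders_clean AS "
        "SELECT CAST(id AS INTEGER) AS id, CAST(TRIM(amount) AS REAL) AS amount "
        "FROM raw_orders WHERE TRIM(amount) GLOB '[0-9]*'"
    )
```

Both functions end with the same orders_clean table, but only the ELT version keeps the raw rows in the warehouse, which is what makes drill-down and re-running the transformation possible.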

Q) Process of working in ADF


Q) How many types of schema are there?
There are two common types: star schema and snowflake schema.
A star schema consists of a fact table and dimension tables arranged in a star shape. The fact table
contains quantitative information for analysis, and the dimension tables contain the descriptive attributes
(dimensions) for the quantitative data in the fact table.
As with the star schema, the snowflake schema has a central fact table that stores the main data
points and references to its dimension tables. Unlike the star schema, the snowflake schema's
dimension tables can have their own dimension tables, thus expanding how descriptive a
dimension can be.
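
As a rough illustration, the sketch below builds a tiny snowflake-style model in an in-memory SQLite database. All table and column names (fact_sales, dim_product, dim_category) and the sample rows are made up for this example. In a pure star schema the category name would simply be a column on dim_product; here it is split into its own table, which is the snowflake variation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension of a dimension: this extra level is what makes it a snowflake.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    product     TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
-- Fact table: quantitative values plus keys pointing at the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO dim_product VALUES
    (10, 'Road bike', 1), (11, 'City bike', 1), (20, 'Kids helmet', 2);
INSERT INTO fact_sales VALUES
    (10, 1, 900.0), (11, 2, 1000.0), (20, 3, 60.0), (10, 1, 950.0);
""")

# Aggregate the fact table by an attribute that lives two joins away.
sales_by_category = conn.execute("""
    SELECT c.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_category c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()
```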

Q) Data cleaning process

Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly
formatted, duplicate, or incomplete data within a dataset.
Steps involved in data cleaning include:
Removing duplicate data
Fixing errors
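
A minimal Python sketch of these steps is shown here. The record layout (name and email fields) and the validity rule are assumptions made for the example; in Power BI the same work would normally be done in Power Query.

```python
def clean(rows):
    """Normalise formatting, drop incomplete rows, then drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Fix formatting errors: stray whitespace and inconsistent case.
        name = (row.get("name") or "").strip().title()
        email = (row.get("email") or "").strip().lower()
        # Remove incomplete or clearly invalid records.
        if not name or "@" not in email:
            continue
        # Remove duplicates, compared after normalisation.
        key = (name, email)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "email": email})
    return cleaned
```

Normalising before comparing matters: two rows that differ only in case or spacing are treated as the same record.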

Q) What is Power Query?

Power Query is the ETL tool in Power BI, used to connect to data sources and to transform and load data.

Q) How is the Power BI infrastructure maintained?


Q) What is a data source in Power BI?
Q) Connection modes in Power Query
Q) Performance issue in DirectQuery
The major performance issue with DirectQuery is that every visual sends its queries to the underlying
data source instead of reading from an imported, in-memory copy, so reports and dashboards usually
take longer to load than in Import mode.

Q) What is a view?
A view is a database object defined by a stored SELECT query, which may contain complex logic.
Views are said to be a logical representation of the physical data, i.e. a view behaves like a
physical table, and users can reference it like a table in any part of a SQL query.
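
A small sketch using Python's built-in sqlite3 module shows the idea. The orders table, the customer_totals view and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'Asha', 120.0), (2, 'Ben', 40.0), (3, 'Asha', 30.0);

-- The view stores only the query, not a copy of the data.
CREATE VIEW customer_totals AS
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer;
""")

# The view can be queried and filtered like a table.
big_customers = conn.execute(
    "SELECT customer, total FROM customer_totals WHERE total > 100"
).fetchall()

# Because the view is logical, it reflects later changes to the base table.
conn.execute("INSERT INTO orders VALUES (4, 'Ben', 90.0)")
totals_after_insert = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```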

Q) Merge query
• When you have one or more columns that you’d like to add to another query,
you merge the queries.

Q) What is an index in SQL? Advantages & disadvantages

A SQL index is a data structure used to retrieve data from a database table quickly.

Advantages

• Speeds up SELECT queries.
• Helps to make a row unique or without duplicates (primary key, unique index).
• If the index is a full-text index, we can search against large string values, for
example to find a word in a sentence.

Disadvantages
• Indexes take additional disk space.
• Indexes slow down INSERT, UPDATE and DELETE, because on each operation the indexes must
also be updated. An UPDATE or DELETE can still locate its rows faster if the WHERE condition
uses an indexed field.
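
The effect of an index can be observed with Python's built-in sqlite3 module, as in the sketch below. The employees table, the index name idx_employees_email and the generated rows are assumptions for the example, and the exact query plan wording differs between database engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO employees (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's query plan detail text for a statement."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT name FROM employees WHERE email = 'user500@example.com'"
plan_before = plan(query)  # without an index: a scan of the whole table

conn.execute("CREATE UNIQUE INDEX idx_employees_email ON employees (email)")
plan_after = plan(query)   # with the index: a search that uses it

# A unique index also enforces "no duplicates" on the indexed column.
try:
    conn.execute("INSERT INTO employees (email, name) VALUES ('user500@example.com', 'Dup')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The cost side is not measured here: every later INSERT, UPDATE or DELETE on employees also has to maintain idx_employees_email.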

Q) DAX functions in Power BI

Q) Why is it important to have a continuous date range in our dataset? What will happen if we
have a discontinuous date range? How will the graph look with it?

Q) We have 3 measures and we want to show them over a time range. Which chart will we use, and
why?

Q) We import a dataset into Power BI, and then one of the column names changes in the
original data source. How do we handle the change in our Power BI dataset? Look for an alternative
to changing it in the Power Query steps.

Q) Use of the Data option in the bookmarks ellipsis (...) menu?


Q) How do we connect our dataset to a gateway?
Add a data source
1. From the page header in the Power BI service, select Settings. ...
2. Select a gateway and then select Add data source. ...
3. Assign a name to your data source, then select the Data Source Type. ...
4. Enter information about the data source. ...
5. Select an Authentication Method to use when connecting to the data source.

Q) Difference between filter and slicer?

The main difference between a Power BI slicer and a filter is that a slicer is an on-canvas,
dynamic feature, whereas a filter (page level in this case) is a hidden, static feature.
Furthermore, a filter can refine an entire report, just a page, or simply a visual on the canvas.
Q) Grouping vs binning?
Q) Does query folding happen in Power Query or in the source database?
Q) We have 30 card visuals and a SQL database of 500 MB, but the report is slow to load. What can be done to
improve the performance?
Q) Paginated report?
Q) There are 3000 users; how do we add them to RLS (row-level security)?
Q) Dataflow?
Q) We have 2 line charts and we want only one of them to be displayed at
a time. How do we set that up?
The charts are named Yearly and Quarterly. If we select the Yearly chart, the Quarterly chart will be hidden from our
canvas.
Q) What is the difference between a bar chart and a column chart? Why do we use them, and when should
each be used?

Q) How do we configure incremental refresh?


Q) Can we have slicers in a dashboard?
Q) There are six types of joins, as below:

• Left Outer: All rows from the left table, with matching rows from the right.
• Right Outer: All rows from the right table, with matching rows from the left.
• Full Outer: All rows from both tables (matching or not matching).
• Inner: Only matching rows from both tables.
• Left Anti: Only rows from the left table that have no match in the right.
• Right Anti: Only rows from the right table that have no match in the left.
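
The six join kinds listed above can be sketched in plain Python over lists of dictionaries. This is only an illustration of the semantics, not how Power Query implements a merge; the function name merge, the kind labels and the sample tables are made up for the example.

```python
def merge(left, right, key, kind):
    """Join two lists of dicts on `key` using one of six join kinds."""
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    if kind == "left_anti":
        return [l for l in left if l[key] not in right_keys]
    if kind == "right_anti":
        return [r for r in right if r[key] not in left_keys]
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        for r in matches:
            out.append({**l, **r})  # matching rows, columns combined
        if not matches and kind in ("left_outer", "full_outer"):
            out.append(dict(l))     # unmatched left row (right columns absent)
    if kind in ("right_outer", "full_outer"):
        # Unmatched right rows (left columns absent).
        out.extend(dict(r) for r in right if r[key] not in left_keys)
    return out


employees = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ben"}]
offices = [{"id": 2, "city": "Pune"}, {"id": 3, "city": "Delhi"}]
inner_rows = merge(employees, offices, "id", "inner")
```

In this sketch a missing column is simply absent from the row, whereas Power Query and SQL would show it as null.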

Q) Do we need a gateway to connect to SharePoint? No for SharePoint Online, since it is a cloud source; an on-premises SharePoint Server does need a gateway.


Q) Prerequisites to connect to Azure SQL Server in Power BI
Q) Prerequisites to use a date table in our database?
Q) When to use a calculated column and when to use a measure?
Q) What is a workspace?
A workspace is a repository where we store our dashboards, reports, datasets, workbooks and
dataflows in Power BI. Workspaces are used to collaborate and share content with colleagues:
you can add colleagues to your workspaces and work together on dashboards, reports,
workbooks, and datasets.
Q) Why do we have inactive relationship in our data model?
Q) What is Power BI Designer?
Power BI Designer (the earlier name of Power BI Desktop) is a free desktop program that offers a compilation of the most
commonly used tools - Power Query, Power Pivot, Power View, and Power Map - all in one
place. It is used to aggregate and model your data, and to create reports and dashboards to be
shared.

Q) DAX measure to calculate the number of active employees on the basis of the release date column?
Employees At End of Period =
VAR MaxDate = MAX ( 'Date'[Date] )
VAR EmpCnt =
    CALCULATE (
        COUNTROWS (
            CALCULATETABLE ( 'Employees', 'Employees'[HireDate] <= MaxDate, ALL ( 'Date' ) )
        ),
        ( ISBLANK ( 'Employees'[TerminationDate] ) || 'Employees'[TerminationDate] > MaxDate )
    )
RETURN
    IF ( ISBLANK ( EmpCnt ), 0, EmpCnt )
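
The rule behind this measure (hired on or before the end of the period, and either not terminated or terminated after it) can be checked with a small Python sketch. The tuple layout and the sample dates are assumptions for illustration only.

```python
from datetime import date


def headcount(employees, period_end):
    """Count employees hired on or before period_end and still active after it.

    `employees` is a list of (hire_date, termination_date) tuples, where
    termination_date is None for people who have not left.
    """
    return sum(
        1
        for hire_date, termination_date in employees
        if hire_date <= period_end
        and (termination_date is None or termination_date > period_end)
    )


staff = [
    (date(2020, 1, 15), None),               # still employed
    (date(2020, 3, 1), date(2021, 6, 30)),   # left in mid-2021
    (date(2022, 2, 1), None),                # hired later
]
headcount_2020 = headcount(staff, date(2020, 12, 31))
```

Note that, as in the measure, someone terminated exactly on the period end date is not counted as active for that period.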

Headcount:
CALCULATE (
    COUNTX (
        FILTER (
            Employee,
            Employee[HireDate] <= MAX ( 'Date'[Date] )
                && ( ISBLANK ( Employee[TerminateDate] ) || Employee[TerminateDate] > MAX ( 'Date'[Date] ) )
        ),
        Employee[ID]
    )
)

Hire:
CALCULATE ( COUNT ( Employee[ID] ), USERELATIONSHIP ( Employee[HireDate], 'Date'[Date] ) )

Terminate:
CALCULATE (
    COUNT ( Employee[ID] ),
    USERELATIONSHIP ( Employee[TerminateDate], 'Date'[Date] ),
    NOT ( ISBLANK ( Employee[TerminateDate] ) )
)

Employeeturnover:
CALCULATE (
    COUNTROWS ( Employee ),
    FILTER (
        VALUES ( Employee[TerminateDate] ),
        Employee[TerminateDate] <= MIN ( 'Date'[Date] ) && NOT ( ISBLANK ( Employee[TerminateDate] ) )
    )
)

Turnover rate:
-- [Last Period Employee] is assumed to be a measure defined elsewhere
VAR AvgHeadcount = ( [Headcount] + [Last Period Employee] ) / 2
RETURN
    DIVIDE ( [Employeeturnover], AvgHeadcount ) + 0

AVG Age:
CALCULATE ( AVERAGE ( Employee[Age] ), USERELATIONSHIP ( Employee[HireDate], 'Date'[Date] ) )
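
The arithmetic behind the turnover rate measure (terminations divided by the average of the current and previous period headcount) can be sketched in Python. The function and parameter names are hypothetical, and returning 0 for an empty period is an assumption that mirrors the "+ 0" in the measure.

```python
def turnover_rate(headcount_start, headcount_end, terminations):
    """Terminations divided by the average headcount over the period."""
    average_headcount = (headcount_start + headcount_end) / 2
    if average_headcount == 0:
        return 0.0  # show 0 rather than failing when nobody was employed
    return terminations / average_headcount


# Example: 100 employees last period, 90 this period, 19 leavers.
example_rate = turnover_rate(100, 90, 19)  # 19 / 95
```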

FORMAT ( 'Date'[Date], "YYYY" ) = 2012
FORMAT ( 'Date'[Date], "MMM" ) = Jan
FORMAT ( 'Date'[Date], "MMMM" ) = January
FORMAT ( 'Date'[Date], "MM" ) = 01
FORMAT ( 'Date'[Date], "DD" ) = 03
FORMAT ( 'Date'[Date], "DDD" ) = Mon
FORMAT ( 'Date'[Date], "DDDD" ) = Monday
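
For readers who know Python better than DAX, the format strings above correspond roughly to strftime codes. The mapping below is an informal comparison rather than anything Power BI provides, and the sample date (2 January 2012, a Monday) is chosen only for illustration.

```python
from datetime import date

sample = date(2012, 1, 2)  # a Monday

# DAX FORMAT string -> approximate Python strftime code
formats = {
    "YYYY": "%Y",  # four-digit year
    "MMM": "%b",   # abbreviated month name
    "MMMM": "%B",  # full month name
    "MM": "%m",    # two-digit month
    "DD": "%d",    # two-digit day of month
    "DDD": "%a",   # abbreviated weekday name
    "DDDD": "%A",  # full weekday name
}
rendered = {dax: sample.strftime(py) for dax, py in formats.items()}
```

As with FORMAT in DAX, every result is text, so a value such as "01" keeps its leading zero.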

Q) What is grouping and aggregation?


Q) Difference between a content pack and a Power BI app?
