
Manual


Power Query

1. Merging columns
1. Go to Data>Get & transform data>Get Data>from Excel workbook.
2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. Arrange the target columns in the order you want them merged.
5. Go to Transform/Add column*>Merge Columns. Select the separator
and rename the merged column.
6. Click OK.
*The two ribbons behave differently:
“Add Column” ribbon → hold Ctrl and select the columns in the order you want them merged → Merge
Columns → select the separator and name the column → a NEW column is created and the original columns are kept.
“Transform” ribbon → hold Ctrl and select the columns in the order you want them merged → Merge Columns
→ select the separator and name the column → the selected columns are replaced by ONE merged column; no column is added.

Tips: You can also build the merged column yourself. Go to Add Column>Custom Column and write a formula
that joins the target columns into the new column. This method cannot join columns in “number” format
directly, so change the format to “text” before creating the custom column.
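
For readers who think in code, the merge can be sketched in Python. This only illustrates what Merge Columns does to each row; it is not how Power Query works internally, and the column names (Code, Year, Region) and the merge_columns helper are made up for the example.

```python
# Hypothetical sample rows; "Year" is a number column.
rows = [
    {"Code": "A", "Year": 2023, "Region": "North"},
    {"Code": "B", "Year": 2024, "Region": "South"},
]


def merge_columns(row, columns, separator):
    """Join the chosen columns, in the given order, into one text value."""
    # Numbers are converted to text first, mirroring the tip above.
    return separator.join(str(row[name]) for name in columns)


for row in rows:
    # Adding a new key corresponds to the "Add Column" ribbon behaviour.
    row["Merged"] = merge_columns(row, ["Code", "Year", "Region"], "-")

print([row["Merged"] for row in rows])
```

Changing the order of the names in the list changes the order of the parts in the merged text, which is why the selection order matters in step 4.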
Tips:
1. To load your edited data into Excel, click Home>Close & Load.
2. To edit the table in Power Query, click Query>Edit.
3. The data in Excel is linked to the original data through Power Query, so even if the loaded data is
deleted, clicking Refresh loads it back from the original data. If you need the loaded data for a
report, unlink it first: go to “Table Design” → Convert to Range / Unlink.

2. Append / merge data


1. Go to Data>Get & transform data>Get Data>from Excel workbook.
2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. Go to Power Query, select Home>Combine>Append Queries as New.
5. Select the target tables to append
6. After appending, load the final query into Excel

Tips: The column headers must be the same in every table. Appending stacks the tables row by row; in
layman's terms, it combines all the data under the same column headers.
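
As a rough illustration of appending, the Python sketch below stacks two small tables that share the same headers. The table contents and the append_tables helper are invented for the example, and the sketch simply takes its headers from the first table, which is an assumption of this example rather than documented Power Query behaviour.

```python
# Two hypothetical tables with identical column headers.
january = [{"Item": "Pen", "Qty": 10}, {"Item": "Book", "Qty": 4}]
february = [{"Item": "Pen", "Qty": 7}]


def append_tables(*tables):
    """Stack the rows of every table under the headers of the first table."""
    headers = list(tables[0][0].keys())
    combined = []
    for table in tables:
        for row in table:
            # A header missing from a row becomes an empty (None) cell.
            combined.append({name: row.get(name) for name in headers})
    return combined


appended = append_tables(january, february)
print(len(appended), "rows after appending")
```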
1. Go to Data>Get & transform data>Get Data>from Excel workbook.
2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. Go to Power Query, select Home>Combine>Merge Queries as New.
5. Select the target tables and the matching columns (hold Ctrl to select
more than one column) to merge.
6. After merging, load the final query into Excel

Tips: The columns need not be the same. The tables only need a matching column to link their rows.

Tips: Join Kind


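Left Outer - all rows from the first table, matching rows from the second
Right Outer - all rows from the second table, matching rows from the first
Full Outer - all rows from both tables
Inner - only the matching rows
Left Anti - rows only in the first table
Right Anti - rows only in the second table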
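
The six join kinds differ only in which rows they keep. The Python sketch below shows this with set operations on the matching column. It is a simplification that assumes each ID appears once per table and that tracks only the IDs, not the full rows; the tables and the join_ids helper are hypothetical.

```python
# Hypothetical tables: IDs 2 and 3 exist in both, 1 only on the left, 4 only on the right.
left = [{"ID": 1, "Name": "Ali"}, {"ID": 2, "Name": "Mei"}, {"ID": 3, "Name": "Raj"}]
right = [{"ID": 2, "Sales": 50}, {"ID": 3, "Sales": 70}, {"ID": 4, "Sales": 90}]


def join_ids(left_rows, right_rows, key, kind):
    """Return the sorted key values that the given join kind keeps."""
    left_keys = {row[key] for row in left_rows}
    right_keys = {row[key] for row in right_rows}
    kept = {
        "Left Outer": left_keys,
        "Right Outer": right_keys,
        "Full Outer": left_keys | right_keys,
        "Inner": left_keys & right_keys,
        "Left Anti": left_keys - right_keys,
        "Right Anti": right_keys - left_keys,
    }
    return sorted(kept[kind])


for kind in ["Left Outer", "Right Outer", "Full Outer", "Inner", "Left Anti", "Right Anti"]:
    print(kind, join_ids(left, right, "ID", kind))
```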

3. Combining data from folder

1. Go to Data>New Query>From File>From Folder.


2. Browse folder to obtain the folder path. Click OK.
3. Combine>Combine & Load (or Combine & transform data if you wish to
edit).
4. Select First File, click on the target sheet under Sample File Parameter
5. Click OK.
6. Go to Power Query to check your results.

Tips:
1. If the column headers are not the same, the data will be loaded into separate columns. Even a
difference in spacing makes Excel treat two headers as different.
2. If new tables are added to the folder and you wish to update the result, click Home>Refresh to have
the new tables combined.
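
The first tip can be illustrated with a small Python sketch. It follows the behaviour described in the tip (mismatched headers end up as separate columns) and is not a description of Power Query internals; the file contents and the combine helper are hypothetical.

```python
# Hypothetical contents of two files; note the trailing space in "Sales ".
file_a = [{"Item": "Pen", "Sales": 10}]
file_b = [{"Item": "Book", "Sales ": 4}]


def combine(tables, clean_headers=False):
    """Stack tables; headers that differ in any way become separate columns."""
    if clean_headers:
        # Remove leading and trailing spaces from every header first.
        tables = [[{k.strip(): v for k, v in row.items()} for row in t] for t in tables]
    headers = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in headers:
                    headers.append(name)
    return [{name: row.get(name) for name in headers} for table in tables for row in table]


raw = combine([file_a, file_b])
cleaned = combine([file_a, file_b], clean_headers=True)
print(list(raw[0].keys()))      # three columns: the spacing difference splits "Sales"
print(list(cleaned[0].keys()))  # two columns once the headers are cleaned
```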
4. Pivoting columns

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. In Power Query, select the Key* column, select Transform>Pivot
Columns.
5. Under Values Column, select Value.
6. Under Advanced options, select Don't Aggregate.

*Key means the column that you wish to put as header.


Tips: You can promote the first row to headers where required: go to Home>Use First Row as Headers. You
can also demote the headers to the first row: go to Home>Use Headers as First Row.
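
A minimal Python sketch of pivoting without aggregation, assuming a table with an ID column plus Key and Value columns (all names and data are invented for the example). In this sketch, if the same key appeared twice for one ID, the later value would simply overwrite the earlier one; that is a shortcut of the sketch, not a statement about how Power Query handles such a case.

```python
# Hypothetical long-format rows: "Key" holds the future headers, "Value" the cells.
rows = [
    {"ID": 1, "Key": "Name", "Value": "Ali"},
    {"ID": 1, "Key": "City", "Value": "Ipoh"},
    {"ID": 2, "Key": "Name", "Value": "Mei"},
    {"ID": 2, "Key": "City", "Value": "Melaka"},
]


def pivot(rows, key_col, value_col):
    """Turn each distinct key into a column, with no aggregation."""
    result = {}
    for row in rows:
        # Every column except the key and value columns identifies the output row.
        group = tuple((k, v) for k, v in row.items() if k not in (key_col, value_col))
        result.setdefault(group, dict(group))[row[key_col]] = row[value_col]
    return list(result.values())


pivoted = pivot(rows, "Key", "Value")
print(pivoted)
```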

5. Unpivoting columns

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. Select target columns, go to Transform>Unpivot Columns.

Tips: How do you determine which columns to select?


Unpivot means combining all the columns that you have selected into one single column. So, the
columns that you select should be the columns that you want to combine. Try it and see the result.
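
As a sketch of what unpivoting does, the Python below turns the selected month columns into rows. The data and the unpivot helper are hypothetical; the output column names Attribute and Value follow the names Power Query gives the two new columns.

```python
# Hypothetical wide table: one column per month.
wide = [
    {"Product": "Pen", "Jan": 10, "Feb": 7},
    {"Product": "Book", "Jan": 4, "Feb": 9},
]


def unpivot(rows, selected, attribute="Attribute", value="Value"):
    """Combine the selected columns into one attribute column and one value column."""
    out = []
    for row in rows:
        # Columns that were not selected are repeated on every output row.
        kept = {k: v for k, v in row.items() if k not in selected}
        for name in selected:
            out.append({**kept, attribute: name, value: row[name]})
    return out


long_rows = unpivot(wide, ["Jan", "Feb"])
for row in long_rows:
    print(row)
```

Each input row produces one output row per selected column, so two products and two months give four rows.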

6. Transpose data

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. Demote the headers to the first row where required, go to Home>Use Headers
as First Row.
5. Go to Transform>Transpose.
6. Promote first row as headers if necessary.
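
Steps 4 to 6 can be sketched in Python as follows. The table is made up for the example; the point is only to show why the headers are demoted before transposing and promoted again afterwards.

```python
# Hypothetical table: headers plus two data rows.
headers = ["Month", "Pen", "Book"]
data = [["Jan", 10, 4], ["Feb", 7, 9]]

# Step 4: demote the headers so they become the first row of data.
table = [headers] + data

# Step 5: transpose, so that rows become columns and columns become rows.
transposed = [list(column) for column in zip(*table)]

# Step 6: promote the first row of the transposed table to headers.
new_headers, new_data = transposed[0], transposed[1:]

print(new_headers)
print(new_data)
```

Without step 4 the original headers would not take part in the transpose and would be lost.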
7. Splitting columns

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple workbooks.
3. Select transform data.
4. Analyse your data and choose the best way to split your data.
5. Below are two examples:

Example 1: Split by delimiter
1. In Power Query, go to Home>Split Column>By Delimiter.
2. Select the correct delimiter.
3. Select the correct split.
4. Go to Advanced options to split into either Columns or Rows. Select the Number of columns
to split into. Select Quote Character. You may also select Split using special characters.
5. Click OK when done.

Example 2: Split by number of characters
1. In Power Query, go to Home>Split Column>By Number of Characters.
2. Select the Number of characters.
3. Select the Split.
4. Go to Advanced options to split into either Columns or Rows. Select the Number of
columns to split into.
5. Click OK when done.

Tips:
1. You may need to TRIM your data before you start splitting.
2. You may need to split more than one time in a single table, and may need to use more than one type of
splitting method.
Try it!
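
The two splitting methods, with trimming applied first, might look like this in Python. The helper names and sample text are hypothetical. When a maximum number of columns is given, this sketch keeps any leftover text in the last column; Power Query may treat the extra parts differently.

```python
def split_by_delimiter(text, delimiter, max_columns=None):
    """Trim the text, then split it at each delimiter."""
    text = text.strip()
    if max_columns is None:
        return text.split(delimiter)
    # Stop splitting once the requested number of columns is reached.
    return text.split(delimiter, max_columns - 1)


def split_by_length(text, size):
    """Cut the text into pieces of `size` characters each."""
    return [text[i:i + size] for i in range(0, len(text), size)]


address = "  Kuala Lumpur, Selangor, Malaysia "
print(split_by_delimiter(address, ", "))
print(split_by_delimiter(address, ", ", max_columns=2))
print(split_by_length("MY2024KL", 2))
```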

8. Indexing columns

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. In Power Query, go to Add Column>Index Column and select the
desired index format.
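
An index column is simply a running number added to each row. A minimal Python sketch, with a hypothetical add_index helper whose start and step parameters stand in for the index options:

```python
rows = [{"Item": "Pen"}, {"Item": "Book"}, {"Item": "Ruler"}]


def add_index(rows, start=0, step=1, name="Index"):
    """Return copies of the rows with a running index column added."""
    return [{**row, name: start + i * step} for i, row in enumerate(rows)]


print(add_index(rows))                    # index starting from 0
print(add_index(rows, start=1))           # index starting from 1
print(add_index(rows, start=10, step=5))  # custom start and increment
```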
9. Extracting a Web Query & Merging with local files
1. Go to Power Query, Data>New Query>From Other Sources>From Web.
2. Paste the provided URL into the URL box.
3. Select desired table and click transform data.
4. Cleanse data & load to Excel.
5. Load the second table from the local folder into Power Query. The
steps are similar to your earlier tables downloaded from local folders.
6. Cleanse data if necessary.
7. Merge the tables.
URL for step 2: https://en.wikipedia.org/wiki/List_of_Malaysian_states_by_GDP

Revision!
Combining data from folder – use “Sales Data” folder!
Can you do it?

10. Subtraction for date


1. Go to Data>Get & transform data>Get Data.
2. Select the source of your data.
3. Browse folder to obtain the folder path. Click OK.
4. Combine>Combine & transform.
5. Select First File, click on the target sheet under Sample File Parameter
6. Click OK.
7. Go to Power Query to check your results.
8. Select the first date column you wish to include for subtraction.
9. Press “ctrl” button and select the second date column you wish to
include.
10. Go to Add Column>Date>Subtract days.
11. Rename the column appropriately.

You can also subtract columns of other data types, e.g. numbers.
Try to find the profit!
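
A minimal Python sketch of both subtractions, using invented column names and values. Note that the sign of the result depends on which column is subtracted from which; here the earlier date is subtracted from the later one.

```python
from datetime import date

# Hypothetical orders with two date columns and two number columns.
orders = [
    {"Order Date": date(2024, 1, 5), "Delivery Date": date(2024, 1, 12),
     "Revenue": 500, "Cost": 320},
    {"Order Date": date(2024, 2, 27), "Delivery Date": date(2024, 3, 2),
     "Revenue": 250, "Cost": 200},
]

for order in orders:
    # Subtracting two dates gives a duration; .days is its length in whole days.
    order["Days"] = (order["Delivery Date"] - order["Order Date"]).days
    # The same idea applied to number columns gives the profit.
    order["Profit"] = order["Revenue"] - order["Cost"]

print([(order["Days"], order["Profit"]) for order in orders])
```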
11. Remove duplicates

1. Go to Data>Get & transform data>Get Data>from Excel workbook.


2. Select the workbook you wish to load. You can select multiple
workbooks.
3. Select transform data.
4. You can first check for duplicates by grouping the data. We want to
group the data so that duplicated unique IDs are detected, showing how
many duplicates there are and what they are.
5. Select the unique ID column, then go to Transform>Group By.
6. Select Advanced.
7. Fill in the column name and select the appropriate operations.
According to no.4, we would select “count rows” at first level, and then
select “all rows” at second level.
8. Click OK.
9. Sort the “count rows” column in descending order.
10. You will be able to see how many duplicates there are, and you can click on the
cell named “table” to see the duplicated rows.
11. Once you confirm that you want to remove all duplicates, you can cancel the
“group by” steps*.
12. Select the unique ID column, go to Home>Remove Rows>Remove
Duplicates.

*Tips: You can always “undo” by cancelling the steps you have done. Just click the cross button beside
the steps listed under “Applied Steps” at right hand side of Power Query window.
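
The check-then-remove workflow above can be sketched in Python. The rows are hypothetical, and this sketch keeps the first occurrence of each ID, which is an assumption of the example.

```python
from collections import Counter

# Hypothetical rows; ID "A1" appears three times.
rows = [
    {"ID": "A1", "Name": "Ali"},
    {"ID": "B2", "Name": "Mei"},
    {"ID": "A1", "Name": "Ali (copy)"},
    {"ID": "C3", "Name": "Raj"},
    {"ID": "A1", "Name": "Ali (old)"},
]

# "Count rows": how many times each ID occurs, sorted in descending order.
counts = Counter(row["ID"] for row in rows)
ranked = counts.most_common()

# "All rows": the full rows behind each ID, for inspecting the duplicates.
grouped = {}
for row in rows:
    grouped.setdefault(row["ID"], []).append(row)

# Remove duplicates: keep only the first row seen for each ID.
seen = set()
deduplicated = []
for row in rows:
    if row["ID"] not in seen:
        seen.add(row["ID"])
        deduplicated.append(row)

print(ranked[0])
print([row["ID"] for row in deduplicated])
```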

Data Analytics
You need the Analysis ToolPak add-in to run this session.
https://support.microsoft.com/en-us/office/load-the-analysis-toolpak-in-excel-6a63e598-cd6d-42e3-9317-6b40ba1a66b4
Click the File tab, click Options, and then click the Add-Ins category.
In the Manage box, select Excel Add-ins, and then click Go.
In the Add-Ins box, check the Analysis ToolPak check box, and then click OK.

- If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to locate it.
- If you are prompted that the Analysis ToolPak is not currently installed on your
computer, click Yes to install it.
12. Descriptive Analysis

1. Open any Excel workbook.


2. Go to Data>Data Analysis.
3. Select Descriptive Statistics.
4. Select range of data to be analysed.
5. Select Summary statistics.

Tips:
Mean – Average
Standard Error – A measure of how accurate the mean of a sample is likely to be compared to the true
population mean. A small standard error indicates that the sample mean is a reliable estimate of the
population mean, while a large standard error means that the sample mean may vary a lot from the
population mean. The standard error decreases as the sample size increases.
Median – The middle value when the data are sorted
Mode – The value that occurs most frequently in a given set of data.
Standard Deviation - A measure of the amount of variation or dispersion of a set of values.
The smallest possible value for the standard deviation is 0, and that happens only in contrived situations
where every single number in the data set is exactly the same (no deviation).

Sample Variance – A measure of how far a set of numbers is spread out from their average value
Kurtosis – The sharpness of the peak of a frequency-distribution curve. A positive value for the kurtosis
indicates a distribution more peaked than normal, while a negative kurtosis indicates a shape flatter than
normal.
Skewness – A measure of the asymmetry of a distribution. It can be used to determine whether a dataset is
symmetric or skewed. A negative value for skewness indicates that the tail is on the left side of the
distribution, which extends towards more negative values. A positive value for skewness indicates that the
tail is on the right side of the distribution, which extends towards more positive values. Perfectly
symmetrical data have a skewness of 0.
Range = Max - Min
Minimum – Smallest value
Maximum – Largest value
Sum – Addition of all values
Count – The number of values
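
To see how these figures relate to each other, here is a small Python sketch using the standard statistics module on made-up data. The skewness function uses the adjusted sample skewness formula; that this matches the figure Excel reports is an assumption of the example, and kurtosis is left out to keep the sketch short.

```python
import math
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical sample


def sample_skewness(values):
    """Adjusted sample skewness (assumed to match the spreadsheet figure)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


n = len(data)
stdev = statistics.stdev(data)  # sample standard deviation
summary = {
    "Mean": statistics.mean(data),
    "Standard Error": stdev / math.sqrt(n),  # shrinks as n grows
    "Median": statistics.median(data),
    "Mode": statistics.mode(data),
    "Standard Deviation": stdev,
    "Sample Variance": statistics.variance(data),  # stdev squared
    "Skewness": sample_skewness(data),
    "Range": max(data) - min(data),
    "Minimum": min(data),
    "Maximum": max(data),
    "Sum": sum(data),
    "Count": n,
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The standard error is the standard deviation divided by the square root of the count, which is why it decreases as the sample size increases.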

Tips: You can also do descriptive analysis using a pivot table. How?

13. Correlation

1. Open any Excel workbook.


2. Insert Pivot Table.
3. Drag the independent variable into “Rows” field. Group your data as
desired.
4. Drag the dependent variable into “Values” field. Change the Value
Field Setting to average*.
5. Insert a PivotChart from the pivot table and add the chart element Trendline. The trendline represents the correlation.

*Why must the value be average?


Because you are trying to find out how the dependent variable changes at each level of the independent
variable. You cannot choose sum, because the sum grows with the number of records in each group, so it
reflects how many records there are at that level rather than the typical value of the dependent variable.
Try other combinations!
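
The difference between sum and average can be seen in a short Python sketch. The records, column names and the summarise helper are invented for the example.

```python
from collections import defaultdict

# Hypothetical records: three people with 1 year of experience, one with 5.
records = [
    {"Experience": 1, "Salary": 3000},
    {"Experience": 1, "Salary": 3200},
    {"Experience": 1, "Salary": 3100},
    {"Experience": 5, "Salary": 5000},
]


def summarise(records, group_col, value_col):
    """Group by the independent variable and report sum and average."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_col]].append(record[value_col])
    return {
        level: {"sum": sum(values), "average": sum(values) / len(values)}
        for level, values in sorted(groups.items())
    }


result = summarise(records, "Experience", "Salary")
for level, stats in result.items():
    print(level, stats)
```

By sum, the 1-year group looks larger only because it has more records; by average, salary rises with experience, which is the relationship the trendline is meant to show.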

Correlations of multiple variables

1. Open any Excel workbook.


2. Go to Data>Data Analysis.
3. Select Correlation.
4. Select Input Range.
5. Tick Labels in First Row.
6. Click OK.
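
The Correlation tool produces a table of correlation coefficients for every pair of columns. A minimal Python sketch of the same idea, using the Pearson correlation coefficient on made-up columns (the data and the pearson helper are hypothetical):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical input range; the keys play the role of the labels in the first row.
columns = {
    "Hours": [1, 2, 3, 4, 5],
    "Score": [52, 54, 56, 58, 60],  # rises in a straight line with Hours
    "Errors": [9, 7, 6, 4, 2],      # falls as Hours rises
}

matrix = {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}

for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

A coefficient close to 1 means the two variables rise together, a coefficient close to -1 means one falls as the other rises, and a coefficient close to 0 means there is little linear relationship.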
