Basic Statistical Functions in Excel
1. Count Function
We use the COUNT function when we need to count the number of cells that contain a number.
Remember: only numbers are counted! Let’s see the function:
COUNT(value1, [value2], …)
So, let’s try to find the answer to our first question – How many items were on discount?
2. Counta Function
While the count function only counts the numeric values, the COUNTA function counts all
the cells in a range that are not empty. The function is useful for counting cells containing
any type of information, including error values and empty text.
COUNTA(value1, [value2], …)
We’ll answer the second question using the counta function since it is able to count all the
non-empty values – How many items/pieces of equipment are sold by the store?
3. Countblank Function
The COUNTBLANK function counts the number of empty cells in a range of cells. Cells
with formulas that return empty text are also counted here but cells with zero values are not
counted. This is a great function for summarizing empty cells while analyzing any data.
COUNTBLANK(range)
Summarizing empty cells is the requirement for our third question – What products are not in
the discount section? Let’s apply the function!
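To make the difference between the three counting functions concrete, here is a minimal Python sketch. Python is only a stand-in here, and the cell values are made up: None stands for a truly empty cell and "" for a formula that returned empty text. The helper names count, counta and countblank are hypothetical and simply mimic the rules described above.

```python
# Hypothetical column of cells (not taken from the store worksheet).
# None = truly empty cell, "" = formula that returned empty text.
cells = [10, 25, "N/A", None, "", 0, 15.5, None]

def count(cells):
    """Mimic COUNT: only numeric cells are counted."""
    return sum(1 for c in cells
               if isinstance(c, (int, float)) and not isinstance(c, bool))

def counta(cells):
    """Mimic COUNTA: every cell that is not truly empty, including ""."""
    return sum(1 for c in cells if c is not None)

def countblank(cells):
    """Mimic COUNTBLANK: truly empty cells plus empty-text results."""
    return sum(1 for c in cells if c is None or c == "")

print(count(cells), counta(cells), countblank(cells))  # prints: 4 6 3
```

Note that the zero is counted by count and counta but not by countblank, and the empty text is counted by both counta and countblank.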
4. Countifs Function
COUNTIFS is one of the most used statistical functions in Excel. The COUNTIFS function
applies one or more conditions to the cells in the given ranges and returns the number of cells
that fulfill all of the conditions:
COUNTIFS(criteria_range1, criteria1, [criteria_range2, criteria2], …)
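The idea of applying several conditions at once can be sketched in Python. This is only an illustration under assumptions: the cost and discount values are invented, and countifs is a hypothetical helper that takes (range, test) pairs instead of Excel's criteria strings.

```python
# Hypothetical data; the real worksheet values are not shown in the text.
costs = [50, 2000, 2500, 1200, 3000]
discounts = [0.10, 0.10, 0.60, 0.55, 0.20]

def countifs(*pairs):
    """Count the positions where every (range, test) pair passes."""
    ranges = [r for r, _ in pairs]
    if len({len(r) for r in ranges}) != 1:
        raise ValueError("all criteria ranges must have the same size")
    return sum(
        all(test(r[i]) for r, test in pairs)
        for i in range(len(ranges[0]))
    )

# Cost greater than 2000 AND discount greater than 50%.
n = countifs((costs, lambda c: c > 2000), (discounts, lambda d: d > 0.5))
print(n)  # prints: 1
```

A position is counted only when it passes every test, which is why ranges of different sizes are rejected.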
Note: Every additional range must have the same number of rows and columns as
the criteria_range1 argument. The ranges do not have to be adjacent to each other.
This function seems perfect to answer the fourth question – Are there any products sold that
cost more than 2000 and have a discount rate greater than 50%?
The question seemed complex, but it was really easy to find the answer in Excel. Only 1
product, i.e., sneakers, cost more than 2000 and sold at a discount rate greater than
20%.
Wonderful, isn’t it? We have gone through some basic statistical functions in MS Excel
so far. Next, let’s have a look at the intermediate statistical functions.
5. Average Function
The most common function we usually use in our daily lives is the average (or mean). The
AVERAGE function simply returns the arithmetic mean of all the cells in a given range:
AVERAGE(number1, [number2], …)
But there’s one simple drawback to using averages – they are sensitive to outliers. Therefore,
they can paint a very unrealistic picture in our analysis. Let’s find out the average number of
goods sold:
The average comes out to be ~ 365.2. We will be doing similar calculations for cost as well.
6. Median Function
The problem of outliers can be solved by using another function for the central tendency –
median. The median function returns the middle value of the given range of cells. The syntax
is quite simple:
MEDIAN(number1, [number2], …)
Let’s find the median of the number of goods sold in our sports store and see how close this
is to our average value:
We see that the median comes out to be ~ 320, which is pretty close to the average value. This
suggests that the data is not distorted by extreme values. Let’s see if this is the case for the
cost of goods:
The median and the average value for the cost of each item vary a lot. For example, the cost
of a ball is 50 but the cost of a bat is 2000 – resulting in high dispersion.
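The effect of a single large value on the two measures can be seen in a short Python sketch. The costs are invented for illustration; statistics.mean and statistics.median play the roles of AVERAGE and MEDIAN.

```python
from statistics import mean, median

# Hypothetical item costs, chosen only to show the effect of one large value.
costs = [50, 60, 70, 80, 2000]

print(mean(costs))    # pulled upward by the single expensive item
print(median(costs))  # the middle value, unaffected by it
```

Here the mean is 452 while the median is 70, so the gap between the two is itself a hint that an outlier is present.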
7. Mode Function
For numerical values, mean and median usually suffice, but what about categorical values?
Here, mode comes into the picture. Mode returns the most frequently occurring value in the
given range of values:
MODE.SNGL(number1,[number2],…)
Note: MODE.SNGL returns only a single value whereas MODE.MULT returns an array of the
most commonly occurring values. Both functions work on numbers only, so categories must be
coded as numbers first.
Well, this is a simple one. Let’s find the most frequent discount value given by the sports
store:
This discount value is 10%.
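Python's statistics.mode and statistics.multimode behave roughly like MODE.SNGL and MODE.MULT, so they make a convenient sketch. The values are hypothetical. One difference to keep in mind: MODE.SNGL returns #N/A when no value repeats, whereas statistics.mode simply returns the first value in that case.

```python
from statistics import mode, multimode

# Hypothetical discount rates.
discounts = [0.10, 0.20, 0.10, 0.15, 0.20, 0.10]
print(mode(discounts))   # a single most common value, like MODE.SNGL

# Hypothetical data with a tie between two values.
tied = [1, 1, 2, 2, 3]
print(multimode(tied))   # every value tied for most common, like MODE.MULT
```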
8. Standard Deviation Function
Standard Deviation is one of the ways to quantify dispersion. It is a measure of how widely
values are dispersed from the average value.
Here, we will be using the STDEV.P function which is used to calculate standard deviation
based on the entire population given as arguments:
STDEV.P(number1,[number2],…)
Note: The STDEV.P function assumes that its arguments are the entire population. If that’s not
the case, use the STDEV.S function instead. For a large sample size, the population and sample
standard deviations are approximately equal.
Previously, we calculated the mean and median to get a picture of the central tendency. Let’s
find out the standard deviation to see the level of dispersion:
As expected, the standard deviation of the quantity sold is low, meaning that the dispersion is
low, whereas the standard deviation for the cost of products is high.
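The difference between the population and sample versions comes down to the divisor: n for the population, n - 1 for a sample. A minimal Python sketch with made-up quantities, where statistics.pstdev corresponds to STDEV.P and statistics.stdev to STDEV.S:

```python
from statistics import pstdev, stdev

# Hypothetical quantities sold.
quantities = [2, 4, 4, 4, 5, 5, 7, 9]

print(pstdev(quantities))  # population formula, divides by n
print(stdev(quantities))   # sample formula, divides by n - 1
```

The sample value is always slightly larger, and the gap shrinks as the number of values grows.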
9. Quartile Function
This is yet another function with abundant applications in the industry. It helps us divide the
population into groups. QUARTILE.INC returns the quartile of a data set, based on
percentile values from 0 to 1, inclusive.
For example, you can use this function to find out the top 25% of your customer base.
QUARTILE.INC(array, quart)
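In Python, statistics.quantiles with method="inclusive" uses the same inclusive interpolation, so it can serve as a sketch of what QUARTILE.INC returns for quart values 1, 2 and 3 (quart 0 and 4 give the minimum and maximum). The data set is just an example.

```python
from statistics import quantiles

# Example data set, used only for illustration.
values = [1, 2, 4, 7, 8, 9, 10, 12]

# n=4 splits the data into four groups and returns the three cut points.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
print(q1, q2, q3)  # prints: 3.5 7.5 9.25
```

Values above q3 form the top 25% of the data.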
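10. Correlation Function
The CORREL function returns the correlation coefficient between two ranges of values:
CORREL(array1, array2)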
Let’s head to our final and most interesting question – is there any relationship between the
number of goods sold and the percentage of discount?
Well, the correlation comes out to be ~0.8, which is pretty high. It seems these are positively
related – meaning the higher the discount, the higher the quantity sold.
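For readers who want to see what sits behind a correlation coefficient, here is a minimal Python sketch of the Pearson formula. The function name correl is a hypothetical helper, and the discount and quantity values are invented (they are perfectly linear, so the result is 1 up to rounding, not the ~0.8 from the worksheet).

```python
from math import sqrt

def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("both arrays must have the same length")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

discounts = [0.05, 0.10, 0.15, 0.20, 0.25]   # hypothetical
quantities = [100, 150, 200, 250, 300]       # hypothetical, perfectly linear
print(correl(discounts, quantities))
```

The result always lies between -1 and 1; values near 1 mean the two series rise together, values near -1 mean one falls as the other rises.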
Introduction to What-If Analysis
By using What-If Analysis tools in Excel, you can use several different sets of values in one
or more formulas to explore all the various results.
For example, you can do What-If Analysis to build two budgets that each assumes a certain
level of revenue. Or, you can specify a result that you want a formula to produce, and then
determine what sets of values will produce that result. Excel provides several different tools
to help you perform the type of analysis that fits your needs.
Note that this is just an overview of those tools.
Overview
What-If Analysis is the process of changing the values in cells to see how those changes will
affect the outcome of formulas on the worksheet.
Three kinds of What-If Analysis tools come with Excel: Scenarios, Goal Seek, and Data
Tables. Scenarios and Data tables take sets of input values and determine possible results. A
Data Table works with only one or two variables, but it can accept many different values for
those variables. A Scenario can have multiple variables, but it can only accommodate up to
32 values. Goal Seek works differently from Scenarios and Data Tables in that it takes a
result and determines possible input values that produce that result.
In addition to these three tools, you can install add-ins that help you perform What-If
Analysis, such as the Solver add-in. The Solver add-in is similar to Goal Seek, but it can
accommodate more variables. You can also create forecasts by using the fill handle and
various commands that are built into Excel.
For more advanced models, you can use the Analysis ToolPak add-in.
Use scenarios to consider many different variables
A Scenario is a set of values that Excel saves and can substitute automatically in cells on a
worksheet. You can create and save different groups of values on a worksheet and then
switch to any of these new scenarios to view different results.
For example, suppose you have two budget scenarios: a worst case and a best case. You can
use the Scenario Manager to create both scenarios on the same worksheet, and then switch
between them. For each scenario, you specify the cells that change and the values to use for
that scenario. When you switch between scenarios, the result cell changes to reflect the
different changing cell values.
1. Changing cells
2. Result cell
If several people have specific information in separate workbooks that you want to use in
scenarios, you can collect those workbooks and merge their scenarios.
After you have created or gathered all the scenarios that you need, you can create a Scenario
Summary Report that incorporates information from those scenarios. A scenario report
displays all the scenario information in one table on a new worksheet.
Note: Scenario reports are not automatically recalculated. If you change the values of a
scenario, those changes will not show up in an existing summary report. Instead, you must
create a new summary report.
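The relationship between changing cells, a result cell and a summary report can be sketched in Python. Everything here is hypothetical: the budget figures, the profit formula and the name result_cell are assumptions made for illustration only.

```python
# Hypothetical budget model: profit = revenue - costs.
# Each scenario is a named set of values for the changing cells.
scenarios = {
    "Worst case": {"revenue": 50_000, "costs": 13_200},
    "Best case": {"revenue": 150_000, "costs": 26_000},
}

def result_cell(changing_cells):
    """The result cell: a formula over the changing cells."""
    return changing_cells["revenue"] - changing_cells["costs"]

# A summary is a snapshot: it does not update if a scenario is edited later.
summary = {name: result_cell(cells) for name, cells in scenarios.items()}
print(summary)
```

Editing a scenario afterwards leaves summary unchanged until it is rebuilt, which mirrors the behaviour described in the note above.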
Use Goal Seek to find out how to get a desired result
If you know the result that you want from a formula, but you're not sure what input value the
formula requires to get that result, you can use the Goal Seek feature. For example, suppose
that you need to borrow some money. You know how much money you want, how long a
period you want in which to pay off the loan, and how much you can afford to pay each
month. You can use Goal Seek to determine what interest rate you must secure in order to
meet your loan goal.
Cells B1, B2, and B3 are the values for the loan amount, term length, and interest rate.
Cell B4 displays the result of the formula =PMT(B3/12,B2,B1).
Note: Goal Seek works with only one variable input value. If you want to determine more
than one input value, for example, the loan amount and the monthly payment amount for a
loan, you should instead use the Solver add-in.
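The loan example can be sketched in Python to show what "working backwards from a result" means. This is an illustration under assumptions: the loan figures are invented, pmt reproduces the PMT formula only for the simple case of no future value and end-of-period payments, and the search uses plain bisection, which is not necessarily the method Excel's Goal Seek uses internally.

```python
def pmt(rate, nper, pv):
    """Periodic payment like PMT with fv=0 and end-of-period payments."""
    if rate == 0:
        return -pv / nper
    return -(rate * pv) / (1 - (1 + rate) ** -nper)

def goal_seek_rate(target_payment, nper, pv, lo=0.0, hi=1.0, tol=1e-10):
    """Bisect on the annual rate until the monthly payment hits the target."""
    for _ in range(200):
        mid = (lo + hi) / 2
        payment = pmt(mid / 12, nper, pv)
        if abs(payment - target_payment) < tol:
            return mid
        # Payments are negative; a higher rate makes them more negative.
        if payment < target_payment:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Hypothetical loan: borrow 100,000 over 180 months, pay 900 per month.
rate = goal_seek_rate(target_payment=-900, nper=180, pv=100_000)
print(f"{rate:.2%} per year")
```

As in the worksheet formula, the payment is negative because it is money paid out, so the target is entered as -900.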
Use Data Tables to see the effects of one or two variables on a formula
If you have a formula that uses one or two variables, or multiple formulas that all use one
common variable, you can use a Data Table to see all the outcomes in one place. Using Data
Tables makes it easy to examine a range of possibilities at a glance. Because you focus on
only one or two variables, results are easy to read and share in tabular form. If automatic
recalculation is enabled for the workbook, the data in Data Tables immediately recalculates;
as a result, you always have fresh data.
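A one-variable Data Table is essentially the same formula evaluated for a list of input values. A minimal Python sketch, assuming a hypothetical loan of 100,000 over 180 months and a simplified pmt helper (no future value, end-of-period payments):

```python
def pmt(rate, nper, pv):
    """Periodic payment like PMT with fv=0 and end-of-period payments."""
    if rate == 0:
        return -pv / nper
    return -(rate * pv) / (1 - (1 + rate) ** -nper)

# Hypothetical inputs: one variable (the annual rate) takes several values.
annual_rates = [0.05, 0.06, 0.07, 0.08]
table = [(r, round(pmt(r / 12, 180, 100_000), 2)) for r in annual_rates]

for rate, payment in table:
    print(f"{rate:.0%}  {payment:>10.2f}")
```

A two-variable table would add a second loop, for example over several loan terms, and produce a grid instead of a single column.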
Create a PivotTable
A PivotTable is a powerful tool to calculate, summarize, and analyze data that lets you see
comparisons, patterns, and trends in your data.
PivotTables work a little bit differently depending on what platform you are using to run
Excel.
Note: Your data shouldn't have any empty rows or columns. It must have only a single-row
heading.
1. To add a field to your PivotTable, select the field name checkbox in the PivotTable
Fields pane.
Note: Selected fields are added to their default areas: non-numeric fields are added to Rows,
date and time hierarchies are added to Columns, and numeric fields are added
to Values.
2. To move a field from one area to another, drag the field to the target area.
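What a PivotTable computes when fields are placed in the Rows, Columns and Values areas can be sketched as a grouped sum in Python. The sales records and field names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (category, quarter, amount).
rows = [
    ("Bats", "Q1", 2000), ("Balls", "Q1", 150),
    ("Bats", "Q2", 4000), ("Balls", "Q2", 100), ("Balls", "Q1", 50),
]

# Rows area = category, Columns area = quarter, Values area = sum of amount.
pivot = defaultdict(lambda: defaultdict(int))
for category, quarter, amount in rows:
    pivot[category][quarter] += amount

for category, by_quarter in sorted(pivot.items()):
    print(category, dict(sorted(by_quarter.items())))
```

Moving a field from one area to another corresponds to swapping which key is used for the outer and inner grouping.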
Create a chart
Note: You can select the data you want in the chart and press ALT + F1 to create a chart
immediately, but it might not be the best chart for the data. If you don’t see a chart you like,
select the All Charts tab to see all chart types.
4. Select a chart.
5. Select OK.
Add a trendline
1. Select a chart.
2. Select Design > Add Chart Element.
3. Select Trendline and then select the type of trendline you want, such
as Linear, Exponential, Linear Forecast, or Moving Average.
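A Linear trendline is a least-squares straight line through the chart's data points. The following Python sketch shows that calculation; linear_trendline is a hypothetical helper and the monthly sales figures are invented.

```python
def linear_trendline(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

months = [1, 2, 3, 4, 5]            # hypothetical
sales = [110, 130, 150, 170, 190]   # hypothetical, exactly linear
slope, intercept = linear_trendline(months, sales)
print(slope, intercept)

# Extending the line one period ahead gives a simple linear forecast.
print(slope * 6 + intercept)
```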
Apply data validation to cells
1. Select the cell(s) you want to create a rule for.
2. Select Data > Data Validation.
Now, if the user tries to enter a value that is not valid, an Error Alert appears with your
customized message.