Chapter 4&5
MEASURES OF CENTRAL
TENDENCY
• Limitations on our ability to understand complex datasets
lead us to summarize them with a few numbers. We use
over-simplified representations because our minds
cannot cope with data in their full diversity.
• A measure of central tendency is a typical value around
which other figures congregate. It is a figure that
represents the data under study.
Objectives
• Brief description: An average gives us simple and brief
description of the main features of the data under study.
• To facilitate comparisons between data.
• To make further statistical analysis (base for other
statistical analysis): other statistical devices such as
mean deviation, coefficient of variation, correlation, etc.
are defined/calculated based on the averages.
Requisites of a Good Measure of Central Tendency:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Example:
Class       x1   x2   ……..   xk
Frequency   f1   f2   ……..   fk
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
where $x_i$ is the $i$-th individual item, and
$f_i$ is the frequency of the $i$-th class
Example: Given below are marks obtained by 20 students in Basic
Statistics out of 25.
21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19,
19.
Summarize the raw data using ungrouped frequency distribution
form and find the mean mark of students. Ans: 19.65
Mark Frequency
12 1
15 2
17 3
19 5
21 3
23 4
25 2
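The frequency table and the stated answer can be checked with a short Python sketch. The names `marks`, `freq` and `frequency_mean` are illustrative choices, not part of the original slides.

```python
from collections import Counter

# Raw marks of the 20 students, as listed in the example
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# Ungrouped frequency distribution: distinct mark -> frequency
freq = Counter(marks)


def frequency_mean(freq):
    """Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    total = sum(x * f for x, f in freq.items())
    count = sum(freq.values())
    return total / count


# Print the table in ascending order of mark, then the mean
for mark in sorted(freq):
    print(mark, freq[mark])
print(frequency_mean(freq))
```

The weighted total is 393 over 20 students, which agrees with the answer of 19.65.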
Arithmetic mean for grouped frequency distribution
Class       x1   x2   ……..   xk
Frequency   f1   f2   ……..   fk
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
where $x_i$ is the class mark/mid-point of the $i$-th class, and
$f_i$ is the frequency of the $i$-th class interval
Example: The following table gives the frequency distribution of
the number of orders received each day during the past 50 days
at the office of a mail-order company. Calculate the mean.
Ans: 16.64
Number of orders Frequency
10-12 4
13-15 12
16-18 20
19-21 14
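A minimal Python sketch of the grouped-data calculation, assuming each class is represented by its mid-point (class mark). The tuple layout and the name `grouped_mean` are illustrative.

```python
# Each entry: ((lower class limit, upper class limit), frequency)
classes = [((10, 12), 4), ((13, 15), 12), ((16, 18), 20), ((19, 21), 14)]


def grouped_mean(classes):
    """Mean of grouped data using class marks: sum(f_i * x_i) / sum(f_i)."""
    total = sum(((lo + hi) / 2) * f for (lo, hi), f in classes)
    n = sum(f for _, f in classes)
    return total / n


print(grouped_mean(classes))
```

The class marks are 11, 14, 17 and 20, giving 832 / 50, which matches the stated answer of 16.64.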
Special properties of Arithmetic mean
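If $\bar{x}_1$ is the mean of $n_1$ items, $\bar{x}_2$ is the mean of $n_2$ items, ...,
$\bar{x}_k$ is the mean of $n_k$ items,
then the mean of all the items in all groups, often called
the combined mean, is given by:
$$\bar{x}_{\text{combined}} = \frac{\sum_{i=1}^{k} \bar{x}_i n_i}{\sum_{i=1}^{k} n_i}$$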
Exercise: Assume that group 1 has 25 employees with an average monthly salary
of $820, group 2 has 32 employees with an average monthly salary of $450, and
group 3 has 77 employees. If the combined average salary of the three groups is $708.6,
find the average salary of group 3.
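One way to approach this exercise is to rearrange the combined-mean relation so that the unknown group mean is isolated. The sketch below assumes that $708.6 is the combined mean salary. The helper name `missing_group_mean` is hypothetical.

```python
def missing_group_mean(combined_mean, known_groups, missing_size):
    """Solve the combined-mean relation for one unknown group mean.

    known_groups is a list of (group size, group mean) pairs.
    """
    total_n = sum(n for n, _ in known_groups) + missing_size
    known_total = sum(n * m for n, m in known_groups)
    # Total over all groups minus the known totals, spread over the missing group
    return (combined_mean * total_n - known_total) / missing_size


group3_mean = missing_group_mean(708.6, [(25, 820), (32, 450)], 77)
print(round(group3_mean, 2))
```

Under that assumption the group 3 average comes out at roughly $779.90.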
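*If a wrong figure has been used when calculating the mean, the
correct mean can be obtained without repeating the whole
process using:
$$\bar{x}_{\text{corrected}} = \bar{x}_{\text{wrong}} + \frac{\text{correct value} - \text{wrong value}}{n}$$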
Exercise: The mean value of the weekly income of 40 families is $265. But in the
calculation, the income of one family was read as $150 instead of $115. Find the
“Corrected” mean.
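The correction can be sketched in Python by adjusting the reported mean for the difference between the correct and the wrongly read value, spread over all n observations. The function name `corrected_mean` is illustrative.

```python
def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    """Adjust a mean that was computed with one misread observation."""
    return wrong_mean + (correct_value - wrong_value) / n


# 40 families, reported mean $265, one income read as $150 instead of $115
print(corrected_mean(265, 40, 150, 115))
```

Here the original total is overstated by $35, so the mean drops by 35 / 40 = 0.875 to $264.125.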
*The effect of transforming original series on the mean.
Weight   w1   w2   …….   wn
$$\bar{x}_{\text{weighted}} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
Compute the weighted mean of the price of the drinks. (Ans : $0.875)
$$GM = \sqrt[n]{x_1 x_2 x_3 \cdots x_n} = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$
Example 1: Determine the geometric mean of the following set of numbers.
1, 3, 9, 27, 81
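As a quick check of Example 1, the geometric mean can be computed as the n-th root of the product of the values. This sketch uses only the standard library; `geometric_mean` is an illustrative name.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


gm = geometric_mean([1, 3, 9, 27, 81])
print(round(gm, 6))
```

The product is 3^10 = 59049, and its fifth root is 3^2 = 9, up to floating-point rounding.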
Example 2: A fund manager tries to convince you to invest in their fund by showing you
$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
Example 1: Determine the harmonic mean of the following set of numbers.
2, 5, 10, 20, 8
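Example 1 can be verified with a short sketch that divides the number of values by the sum of their reciprocals. The name `harmonic_mean` is illustrative.

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)


hm = harmonic_mean([2, 5, 10, 20, 8])
print(round(hm, 4))
```

The reciprocals sum to 0.975, so the harmonic mean is 5 / 0.975, or about 5.13.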
Example 2: If a car travels at a speed of 80 km/hr from A to B, and the same car travels at
$$\text{Mode}\,(\hat{x}) = L_{mo} + \frac{w\,(f_{mo} - f_1)}{2 f_{mo} - f_1 - f_2}$$
Number of customer complaints per month in ABC
company is summarized below:
Class Interval Frequency
0-5 15
6-11 20
12-17 30
18-23 15
24-29 12
30-35 8
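The sketch below applies the grouped-data mode formula to the complaints table. It rests on assumptions the slides do not spell out: L_mo is taken as the lower class boundary of the modal class (0.5 below its lower limit, since adjacent classes are one unit apart), w is the class width, and f_1 and f_2 are the frequencies of the classes just before and just after the modal class. The name `grouped_mode` is illustrative.

```python
# Each entry: (lower class limit, upper class limit, frequency)
classes = [(0, 5, 15), (6, 11, 20), (12, 17, 30),
           (18, 23, 15), (24, 29, 12), (30, 35, 8)]


def grouped_mode(classes):
    """Mode of grouped data: L_mo + w * (f_mo - f1) / (2*f_mo - f1 - f2)."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))               # index of the modal class
    lo, hi, f_mo = classes[m]
    gap = classes[1][0] - classes[0][1]       # gap between adjacent class limits
    lower_boundary = lo - gap / 2             # assumed class boundary
    width = (hi - lo) + gap                   # class width w
    f1 = freqs[m - 1] if m > 0 else 0         # frequency of preceding class
    f2 = freqs[m + 1] if m + 1 < len(freqs) else 0  # frequency of following class
    return lower_boundary + width * (f_mo - f1) / (2 * f_mo - f1 - f2)


print(grouped_mode(classes))
```

With the modal class 12-17, this gives 11.5 + 6 × 10 / 25 = 13.9 complaints per month under the stated assumptions.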
$$Q_i = \text{value of the } \left[\frac{i\,(n+1)}{4}\right]^{\text{th}} \text{ observation}, \quad i = 1, 2, 3.$$
5. MEASURES OF DISPERSION/VARIATION
Dispersion/Variation is the degree to which numerical data
tend to vary about the central value (MCT).
Used to determine the scatter of values in a distribution.
[Figure: two distributions, one with larger variation and one with smaller variation]
Objectives:
• To judge the reliability of measures of central tendency
• To control variability itself.
• To compare two or more groups of numbers in terms of their
variability.
• To make further statistical analysis.
TYPES OF MEASURES OF DISPERSION/VARIATION
Range
Quartile deviation/Semi-inter-quartile range
Mean deviation
Standard deviation
Coefficient of variation
1. Range
1.1 Range for ungrouped data
Range is the difference between the largest (Max)
and smallest (Min) values.
Range = Max − Min
Example:
Find the range for the sample values: 26, 25, 35, 27,
29, 29.
1.2 Range for grouped data
Range is the difference between the upper class
limit of the highest class and lower class limit of the
lowest class.
Examples
1. The times (in minutes) taken by sampled workers of company X to arrive
at the workplace are shown below:
25, 18, 25, 8, 15, 15, 10, 35, 40, 45
Compute:
i. Range ii. Quartile deviation iii. Mean deviation about the mean iv. Sample standard
deviation v. Coefficient of variation. Ans: R = 37, QD = 11.25, MD(mean) = 10.4,
S = 12.8, CV = 54.12%
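The five measures for the arrival-time data can be cross-checked with Python's standard `statistics` module. This is a sketch under one assumption: quartiles are located at position i(n+1)/4, which corresponds to `method="exclusive"` in `statistics.quantiles`. Variable names are illustrative.

```python
import statistics

times = [25, 18, 25, 8, 15, 15, 10, 35, 40, 45]

# i. Range: largest value minus smallest value
data_range = max(times) - min(times)

# ii. Quartile deviation: half the distance between Q3 and Q1,
#     with quartiles at position i(n+1)/4 (the "exclusive" method)
q1, _, q3 = statistics.quantiles(times, n=4, method="exclusive")
quartile_deviation = (q3 - q1) / 2

# iii. Mean deviation about the mean
mean = statistics.mean(times)
mean_deviation = sum(abs(x - mean) for x in times) / len(times)

# iv. Sample standard deviation (divisor n - 1)
s = statistics.stdev(times)

# v. Coefficient of variation, in percent
cv = s / mean * 100

print(data_range, quartile_deviation, round(mean_deviation, 2),
      round(s, 2), round(cv, 2))
```

For these data the mean is 23.6 and the sample standard deviation is about 12.77, so S divided by the mean, times 100, evaluates to about 54.1%.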
2. The following distribution shows the age of 50 employees working in a
wholesale centre:
Age in years   Number of employees
45-49          7
50-54          14
55-59          11
60-64          8
65-69          6
Compute:
i. Range ii. Quartile deviation iii. Mean deviation about the median iv. Sample
standard deviation v. Coefficient of variation. Ans: R=29, QD(Exc),
2. Quartile deviation/semi-interquartile range
• It is half of the difference between the upper and lower
quartiles: $QD = (Q_3 - Q_1)/2$.
3. Mean Deviation
$$MD(c) = \frac{\sum_{i=1}^{n} |x_i - c|}{n}$$
• Mean deviation about the central value, c, for grouped data is:
$$MD(c) = \frac{\sum_{i=1}^{k} f_i\,|x_i - c|}{\sum_{i=1}^{k} f_i}$$
4. Variance
The variance is a measure that uses the mean as a point of
reference.
The variance is small when all values are close to the mean.
The variance is large when all values are spread out from the
mean.
Sample variance
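Let $x_1, x_2, \ldots, x_n$
be the sample values. The sample variance is defined by:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$
The sample standard deviation is its positive square root:
$$S = \sqrt{S^2}$$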
5. Coefficient of variation
Compares the relative variation of two variables/datasets.
Does not depend on the units of the variables (free of units).
$$C.V = \frac{S}{\bar{x}} \times 100\%$$
The relative variability in the 1st data set is larger than the relative variability in
the 2nd data set if C.V1> C.V2 (and vice versa).
CV is inversely related to the consistency of the data: the smaller the CV, the more consistent the data.
NB:
• In finance, the coefficient of variation allows investors to determine how much
volatility, or risk, is assumed in comparison to the amount of return expected from
an investment. The lower the ratio of the standard deviation to the mean return, the
better the risk-return trade-off. The volatility represents the risk of an investment and
the mean represents its reward, so the CV acts as a risk-to-reward ratio. By determining
the coefficient of variation of different investments, an investor identifies the
risk-to-reward ratio of each investment and makes an investment decision. Generally,
an investor seeks an investment with a lower coefficient of variation because it offers
lower volatility relative to the expected return.
• CV is a measure of the relative dispersion of data points around the mean.
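To illustrate how the CV compares relative variability, the sketch below uses two made-up series of annual returns (in percent). The figures for `fund_a` and `fund_b` are hypothetical and chosen only so that both series share the same mean. They do not come from the slides.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical annual returns (%), both with a mean of 10
fund_a = [8.0, 10.0, 12.0, 9.0, 11.0]
fund_b = [2.0, 18.0, 25.0, -5.0, 10.0]

cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)

print(round(cv_a, 2), round(cv_b, 2))
# The series with the smaller CV is the more consistent one
print("more consistent:", "fund_a" if cv_a < cv_b else "fund_b")
```

Because both hypothetical series have the same mean, the one with the smaller standard deviation also has the smaller CV, and it would be read as the more consistent of the two.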
Example