Chapter-3ni Kamote Chua
Measures of Variation
OBJECTIVES
Upon the completion of this chapter, the student should be able to:
INTRODUCTION
Proper allocation of resources and a clear view of how resources are distributed are just
some of the benefits an economist gains from a knowledge of measures of variation.
This chapter is helpful not just in the field of economics but also in other fields such as
education, medicine, surveying, and engineering.
This chapter presents major topics such as Normal Distribution, Skewness and Kurtosis
as well as other measures of variation including Range, Quartile Deviation, Mean Absolute
Deviation, Standard Deviation, and Coefficient of Variation.
Range, Quartile Deviation and Mean Absolute Deviation (MAD)
Three common measures of variation, or dispersion, are the Range, the Quartile Deviation, and the
Mean Absolute Deviation. Their definitions are as follows:
Range is simply the difference between the highest and lowest values. (In other contexts,
"range" can also mean the set of all output values of a function.)
Range = 𝐻 − 𝐿
Quartile Deviation is one-half of the difference between the upper and lower quartiles, that
is, between the third and first quartiles. It is also known as the semi-interquartile range.
$$QD = \frac{Q_3 - Q_1}{2}$$
where $Q_1$: first quartile value
$Q_3$: third quartile value
Mean Absolute Deviation is the average of the absolute deviations of all observations in a
given data set from their mean.
For ungrouped data, the formula given below should be used.
$$MAD = \frac{\sum d}{n}$$
where d: $|x - \bar{x}|$ (the absolute deviation of each score from the mean)
x: individual score
$\bar{x}$: mean score
n: total frequency
If the data are grouped or given as a frequency distribution, the formula is:
$$MAD = \frac{\sum f d}{n}$$
where d: absolute deviation (the difference between each class midpoint and the mean of the distribution)
f: frequency
n: number of items/total frequency
The mean absolute deviation, or MAD, shows how far the observations lie, on average, from the
mean of the distribution. It may be computed for either grouped or ungrouped data.
Applying the formulas for the Range, the Quartile Deviation, and the Mean Absolute Deviation
leads to a better understanding of how widely a given set of data is spread.
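The three definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the module's official procedure: the function name `range_qd_mad` is made up for this example, and the quartile rule (take the value at rank 25% or 75% of n, rounding a fractional rank up) is an assumption chosen to match the rank method used in the worked examples of this chapter. The data are the twelve tile prices of Example 3.2.

```python
def range_qd_mad(values):
    """Return (range, quartile deviation, mean absolute deviation)."""
    data = sorted(values)
    n = len(data)

    def value_at_percent(pct):
        # Rank = pct% of n; a fractional rank is rounded up (assumed convention).
        rank = -(-pct * n // 100)  # integer ceiling of pct * n / 100
        return data[rank - 1]

    data_range = data[-1] - data[0]                          # H - L
    qd = (value_at_percent(75) - value_at_percent(25)) / 2   # (Q3 - Q1) / 2
    mean = sum(data) / n
    mad = sum(abs(x - mean) for x in data) / n               # sum of d over n
    return data_range, qd, mad


# Tile prices from Example 3.2
prices = [3.29, 4.15, 2.97, 2.61, 5.00, 2.99, 3.75, 3.50, 2.75, 4.30, 4.45, 2.83]
r, qd, mad = range_qd_mad(prices)
print(round(r, 2), round(qd, 2), round(mad, 4))
```

Other textbooks and software interpolate between neighboring values when locating quartiles, so their quartile deviation can differ slightly from the one computed by this rank rule.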
Standard Deviation
The standard deviation measures how spread out the data are around the mean: the more spread
apart the data, the higher the standard deviation. It is useful in comparing sets of data
which may have the same mean but a different range. It has proven to be an extremely
useful measure of spread, in part because it is mathematically tractable. The standard
deviation is simply the square root of the variance.
The formula for the standard deviation for ungrouped data for an entire population is:
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
The formula for the standard deviation for ungrouped data for a sample of a population is:
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where S: standard deviation
x: each value in the population or sample
$\bar{x}$: the mean of the values
n: the number of values
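To see the difference between the two divisors, the following Python sketch computes both versions for the tile prices of Example 3.2. The function name `std_dev` and its `sample` flag are illustrative choices for this sketch, not part of the module.

```python
import math


def std_dev(values, sample=True):
    """Standard deviation of ungrouped data.

    sample=True divides by n - 1 (sample formula);
    sample=False divides by n (population formula).
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    divisor = n - 1 if sample else n
    return math.sqrt(sum_sq / divisor)


# Tile prices from Example 3.2
prices = [3.29, 4.15, 2.97, 2.61, 5.00, 2.99, 3.75, 3.50, 2.75, 4.30, 4.45, 2.83]
s_sample = std_dev(prices, sample=True)
s_population = std_dev(prices, sample=False)
print(round(s_sample, 4), round(s_population, 4))
```

The sample value is always the larger of the two because the same sum of squared deviations is divided by a smaller number.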
By grouped data we mean the situation where we have, for example, sampled many
people or situations and it is convenient to classify our data into groups and subgroups.
The formula for the standard deviation for grouped data for an entire population is:
$$S = \sqrt{\frac{\sum f x^2}{n} - \left(\frac{\sum f x}{n}\right)^2}$$
where f: frequency of each class
x: midpoint of every class
n: total number of frequency
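As a rough check of the grouped formula, the sketch below applies it to the radius table of Example 3.3. The midpoints are an assumption of this sketch: each one is taken as the average of the two class limits (for instance, 0.995 for the class 0.98 – 1.01). The sample variant shown, which rescales the variance by n / (n − 1), is likewise an assumption; it gives about 0.1319, close to the 0.1318 used for S in Example 3.3.

```python
import math

# (class midpoint, frequency) pairs built from the radius table of Example 3.3.
# Midpoints are assumed to be the average of each class's two limits.
classes = [
    (0.995, 7), (0.955, 5), (0.915, 2), (0.875, 6), (0.835, 4), (0.795, 5),
    (0.755, 3), (0.715, 8), (0.675, 4), (0.635, 2), (0.595, 1), (0.555, 3),
]

n = sum(f for _, f in classes)                 # total frequency
sum_fx = sum(f * x for x, f in classes)        # sum of f * x
sum_fx2 = sum(f * x * x for x, f in classes)   # sum of f * x^2

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2             # grouped population variance
population_sd = math.sqrt(variance)
sample_sd = math.sqrt(variance * n / (n - 1))  # assumed sample variant

print(n, round(mean, 3), round(population_sd, 4), round(sample_sd, 4))
```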
The formula for the standard deviation for grouped data for a sample of a population is:
Coefficient of Variation
The Coefficient of Variation is a statistical measure of the dispersion of data points in a
data series around the mean. It is a useful tool in statistics for comparing the degree of
variation from one data series to another, even if the means are drastically different from
each other.
The coefficient of variation represents the ratio of the standard deviation to the mean.
$$CV = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\%$$
Suppose you want to evaluate the relative dispersion of grades for two classes of
students: Class A and Class B. The coefficient of variation can be used to compare these two
groups and determine how the grade dispersion in Class A compares to the grade dispersion in
Class B.
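The comparison of two classes can be sketched as follows. The grades for Class A and Class B are hypothetical values invented for this illustration, and the sample standard deviation (divisor n − 1) is assumed.

```python
import math


def coefficient_of_variation(values):
    """CV = (sample standard deviation / mean) * 100, in percent."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return s / mean * 100


class_a = [78, 82, 85, 88, 92]    # hypothetical grades
class_b = [60, 75, 85, 95, 100]   # hypothetical grades

cv_a = coefficient_of_variation(class_a)
cv_b = coefficient_of_variation(class_b)
print(round(cv_a, 2), round(cv_b, 2))
```

With these made-up grades the two means are close (85 and 83), but Class B has the larger coefficient of variation, so its grades are relatively more dispersed.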
Skewness
Skewness gives us an idea about the shape of the curve which we can draw with the help
of the given data. A distribution is symmetric if the mean, median and mode all coincide with
each other.
But if the mean, median and mode do not have equal values, the data set is said to have a
skewed distribution. Skewness can be positive or negative, depending on whether the data
points are skewed to the right (positive skew) or to the left (negative skew).
Kinds of skewness:
a. Symmetrical distribution
It is a symmetrical distribution because the mean, median and mode have equal values.
The spread of the frequencies looks the same to the left and right of the center point.
b. Positively skewed distribution
It is a positively skewed distribution because the mean has the largest value, the mode has
the smallest, and the median lies between the two. The frequencies are spread over a greater
range of values on the right side than on the left side.
c. Negatively skewed distribution
It is a negatively skewed distribution because the mode has the largest value, the mean has
the smallest, and the median lies between the two. The frequencies are spread over a greater
range of values on the left side than on the right side.
Measures of skewness:
a. Karl Pearson's coefficient of skewness
A relative measure of skewness, independent of the units of measurement, is defined as Karl
Pearson's coefficient of skewness, $S_K$. The formula is given as:
$$S_K = \frac{\bar{x} - \hat{x}}{S}$$
where $\bar{x}$: mean
$\hat{x}$: mode
S: standard deviation
The sign of $S_K$ gives the direction of the skewness, and its magnitude gives the extent.
Symmetrical distribution $S_K = 0$
Positively skewed distribution $S_K > 0$
Negatively skewed distribution $S_K < 0$
If the mode is not given in the distribution, the empirical relation between the mean, median and
mode is used, which states that Mean − Mode = 3(Mean − Median). Hence, Karl Pearson's
coefficient of skewness is defined in terms of the median as:
$$S_K = \frac{3(\bar{x} - \tilde{x})}{S}$$
where $\bar{x}$: mean
$\tilde{x}$: median
S: standard deviation
b. Bowley's measure of skewness
According to Bowley, the quartiles are equidistant from the median in a symmetrical
distribution, but in a skewed distribution they are not. He suggested a formula based on the
relative position of the quartiles, given as:
$$S_Q = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$
c. Kelly's measure of skewness
$$S_D = \frac{D_9 - 2D_5 + D_1}{D_9 - D_1}$$
where $D_1$, $D_5$, $D_9$: first, fifth, and ninth deciles (equivalently, the 10th, 50th, and 90th percentiles)
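The four skewness formulas above translate directly into small Python functions. The function names are illustrative only; the input values are the means, medians, modes, quartiles, and percentiles quoted in Examples 3.2 to 3.5 of this chapter.

```python
def pearson_skew_mode(mean, mode, s):
    """Karl Pearson's coefficient using the mode: (mean - mode) / S."""
    return (mean - mode) / s


def pearson_skew_median(mean, median, s):
    """Karl Pearson's coefficient using the median: 3(mean - median) / S."""
    return 3 * (mean - median) / s


def bowley_skew(q1, q2, q3):
    """Bowley's measure: (Q3 - 2*Q2 + Q1) / (Q3 - Q1)."""
    return (q3 - 2 * q2 + q1) / (q3 - q1)


def kelly_skew(d1, d5, d9):
    """Kelly's measure: (D9 - 2*D5 + D1) / (D9 - D1)."""
    return (d9 - 2 * d5 + d1) / (d9 - d1)


sk_mode = pearson_skew_mode(0.807, 0.71, 0.1318)        # values from Example 3.3
sk_median = pearson_skew_median(3.5492, 3.395, 0.7768)  # values from Example 3.2
sq = bowley_skew(48.35, 57.75, 62.656)                  # values from Example 3.4
sd = kelly_skew(41.5, 59.23, 84)                        # values from Example 3.5
print(round(sk_mode, 4), round(sk_median, 4), round(sq, 4), round(sd, 4))
```

A positive result indicates a positively skewed distribution and a negative result a negatively skewed one, which is why only the Bowley value from Example 3.4 comes out below zero here.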
Kurtosis
Kurtosis is used to describe the peakedness of a curve. Frequency curves show
different degrees of flatness or peakedness, and a measure of kurtosis describes the shape of
the top of a frequency curve.
A measure of kurtosis tells us the extent to which a distribution is more peaked or more
flat-topped than the normal curve, which is symmetrical and bell-shaped.
Measure of Kurtosis:
$$K = \frac{QD}{P_{90} - P_{10}}$$
where $QD = \frac{Q_3 - Q_1}{2}$
EXAMPLES
Let us take the following examples on how to compute the range, quartile deviation,
mean absolute deviation, standard deviation, coefficient of variation, skewness and kurtosis
for the given set of data:
Example 3.1
Given the following set of data for the diameter of screw drivers.
Solution:
x       |x − x̄|     |x − x̄|²
1.1     1.89        3.5721
1.6     1.39        1.9321
1.6     1.39        1.9321
1.7     1.29        1.6641
1.8     1.19        1.4161
2.1     0.89        0.7921
2.2     0.79        0.6241
2.3     0.69        0.4761
2.3     0.69        0.4761
2.5     0.49        0.2401
2.9     0.09        0.0081
3.4     0.41        0.1681
3.5     0.51        0.2601
4.0     1.01        1.0201
4.3     1.31        1.7161
4.4     1.41        1.9881
4.5     1.51        2.2801
4.5     1.51        2.2801
4.6     1.61        2.5921
4.8     1.81        3.2761
        ∑d = 21.88  ∑d² = 28.714
RANGE
Range = 𝐻 − 𝐿
Range = 4.8 − 1.1
Range = 3.7
QUARTILE DEVIATION
Rank of Q1 = 25% (N) = 0.25 (20) = 5th rank
Q1 = 1.8
Rank of Q3 = 75% (N) = 0.75 (20) = 15th rank
Q3 = 4.3
$$QD = \frac{Q_3 - Q_1}{2}$$
$$QD = \frac{4.3 - 1.8}{2}$$
QD = 1.25
MEAN ABSOLUTE DEVIATION
$$MAD = \frac{\sum d}{n}$$
$$MAD = \frac{21.88}{20}$$
MAD = 1.094
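STANDARD DEVIATION
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$S = \sqrt{\frac{28.714}{20 - 1}}$$
$$S = \sqrt{1.5113}$$
S = 1.2293
COEFFICIENT OF VARIATION
$$CV = \frac{S}{\bar{x}} \times 100\%$$
$$CV = \frac{1.2293}{2.99} \times 100\%$$
CV = 41.1137 %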
SKEWNESS
KURTOSIS
$$K = \frac{QD}{P_{90} - P_{10}}$$
$$K = \frac{1.25}{4.5 - 1.6}$$
K = 0.4310
Example 3.2
Mike is deciding on tile for his bathroom. He has narrowed it down to 12 tiles. The prices
of the 12 tiles are 3.29, 4.15, 2.97, 2.61, 5.00, 2.99, 3.75, 3.50, 2.75, 4.30, 4.45 and 2.83 dollars.
Solution:
x       |x − x̄|      |x − x̄|²
2.61    0.9392       0.8821
2.75    0.7992       0.6387
2.83    0.7192       0.5172
2.97    0.5792       0.3355
2.99    0.5592       0.3127
3.29    0.2592       0.0672
3.50    0.0492       0.0024
3.75    0.2008       0.0403
4.15    0.6008       0.3610
4.30    0.7508       0.5637
4.45    0.9008       0.8114
5.00    1.4508       2.1048
        ∑d = 7.8084  ∑d² = 6.637
RANGE
Range = 𝐻 − 𝐿
Range = 5.00 − 2.61
Range = 2.39
QUARTILE DEVIATION
Rank of Q1 = 25% (N) = 0.25 (12) = 3rd rank
Q1 = 2.83
Rank of Q3 = 75% (N) = 0.75 (12) = 9th rank
Q3 = 4.15
$$QD = \frac{Q_3 - Q_1}{2}$$
$$QD = \frac{4.15 - 2.83}{2}$$
QD = 0.66
MEAN ABSOLUTE DEVIATION
$$\bar{x} = \frac{2.61 + 2.75 + 2.83 + 2.97 + 2.99 + 3.29 + 3.50 + 3.75 + 4.15 + 4.30 + 4.45 + 5.00}{12}$$
$$\bar{x} = 3.5492$$
$$MAD = \frac{\sum d}{n}$$
$$MAD = \frac{7.8084}{12}$$
MAD = 0.6507
STANDARD DEVIATION
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$S = \sqrt{\frac{6.637}{12 - 1}}$$
S = 0.7768
COEFFICIENT OF VARIATION
$$CV = \frac{S}{\bar{x}} \times 100\%$$
$$CV = \frac{0.7768}{3.5492} \times 100\%$$
CV = 21.8866 %
SKEWNESS
$$\tilde{x} = \frac{3.29 + 3.50}{2}$$
$$\tilde{x} = 3.395$$
$$S_K = \frac{3(\bar{x} - \tilde{x})}{S}$$
$$S_K = \frac{3(3.5492 - 3.395)}{0.7768}$$
$S_K$ = 0.5955
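KURTOSIS
Rank of P10 = 10% (N) = 0.10 (12) = 1.2, or 2nd rank
P10 = 2.75
Rank of P90 = 90% (N) = 0.90 (12) = 10.8, or 11th rank
P90 = 4.45
$$K = \frac{QD}{P_{90} - P_{10}}$$
$$K = \frac{0.66}{4.45 - 2.75}$$
K = 0.388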
Example 3.3
The following are the radius sizes of water tanks that are available on the market, with
their corresponding quantities.
Radius (dm) f
0.98 – 1.01 7
0.94 – 0.97 5
0.90 – 0.93 2
0.86 – 0.89 6
0.82 – 0.85 4
0.78 – 0.81 5
0.74 – 0.77 3
0.70 – 0.73 8
0.66 – 0.69 4
0.62 – 0.65 2
0.58 – 0.61 1
0.54 – 0.57 3
Compute the range, quartile deviation, mean absolute deviation, standard deviation,
coefficient of variation, Karl Pearson’s coefficient of skewness and the coefficient of kurtosis
from the table above.
Solution:
For the solutions of the following values, refer to the previous chapter (Page 35).
Mean: 0.807
Median: 0.807
Mode: 0.71
Q1: 0.708
Q3: 0.925
RANGE
Range = 𝐻 − 𝐿
Range = 1.01 − 0.54
Range = 0.47
QUARTILE DEVIATION
$$QD = \frac{Q_3 - Q_1}{2}$$
$$QD = \frac{0.925 - 0.708}{2}$$
QD = 0.1085
MEAN ABSOLUTE DEVIATION
$$MAD = \frac{\sum f d}{n}$$
$$MAD = \frac{5.584}{50}$$
MAD = 0.1117
STANDARD DEVIATION
COEFFICIENT OF VARIATION
$$CV = \frac{S}{\bar{x}} \times 100\%$$
$$CV = \frac{0.1318}{0.807} \times 100\%$$
CV = 16.3321 %
SKEWNESS
$$S_K = \frac{\bar{x} - \hat{x}}{S}$$
$$S_K = \frac{0.807 - 0.71}{0.1318}$$
$S_K$ = 0.736
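KURTOSIS
$$P_{10} = LL + \left(\frac{\frac{10N}{100} - F_{P_{10}}}{f_{P_{10}}}\right) i$$
$$P_{10} = 0.615 + \left(\frac{5 - 4}{2}\right) 0.04$$
P10 = 0.635
$$P_{90} = LL + \left(\frac{\frac{90N}{100} - F_{P_{90}}}{f_{P_{90}}}\right) i$$
$$P_{90} = 0.975 + \left(\frac{45 - 43}{7}\right) 0.04$$
P90 = 0.986
$$K = \frac{QD}{P_{90} - P_{10}}$$
$$K = \frac{0.1085}{0.986 - 0.635}$$
K = 0.3091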
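The percentile interpolation used above can be sketched in Python as a rough check. The function `grouped_percentile` and its `gap` parameter are hypothetical names for this sketch; `gap` is the distance between the upper limit of one class and the lower limit of the next (0.01 in this table), and half of it is subtracted from the lower limit to obtain LL. The table is entered in ascending order.

```python
def grouped_percentile(classes, k, gap):
    """k-th percentile of grouped data by linear interpolation.

    classes: (lower limit, upper limit, frequency) tuples in ascending order.
    gap: distance between one class's upper limit and the next lower limit.
    """
    n = sum(f for _, _, f in classes)
    target = k * n / 100            # kN / 100
    cumulative = 0                  # F: cumulative frequency below the class
    for lower, upper, f in classes:
        if f and cumulative + f >= target:
            boundary = lower - gap / 2      # LL: lower class boundary
            width = (upper - lower) + gap   # i: class size
            return boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("k must be between 0 and 100")


# Radius table of Example 3.3, lowest class first
radius_classes = [
    (0.54, 0.57, 3), (0.58, 0.61, 1), (0.62, 0.65, 2), (0.66, 0.69, 4),
    (0.70, 0.73, 8), (0.74, 0.77, 3), (0.78, 0.81, 5), (0.82, 0.85, 4),
    (0.86, 0.89, 6), (0.90, 0.93, 2), (0.94, 0.97, 5), (0.98, 1.01, 7),
]

p10 = grouped_percentile(radius_classes, 10, gap=0.01)
p90 = grouped_percentile(radius_classes, 90, gap=0.01)
q1 = grouped_percentile(radius_classes, 25, gap=0.01)
q3 = grouped_percentile(radius_classes, 75, gap=0.01)
k = ((q3 - q1) / 2) / (p90 - p10)   # K = QD / (P90 - P10)
print(round(p10, 3), round(p90, 3), round(q1, 4), round(q3, 3), round(k, 4))
```

Because the sketch keeps Q1 unrounded (0.7075 instead of 0.708), its K differs from the hand computation only in the fourth decimal place.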
Example 3.4
The following are the masses of the Milkis powder repacks, with the corresponding number
of repacks.
Mass f
65.5 – 67.5 15
62.5 – 64.5 8
59.5 – 61.5 17
56.5 – 58.5 6
53.5 – 55.5 4
50.5 – 52.5 11
47.5 – 49.5 5
44.5 – 46.5 3
41.5 – 43.5 4
38.5 – 40.5 7
35.5 – 37.5 3
32.5 – 34.5 2
Compute the range, quartile deviation, mean absolute deviation, standard deviation,
coefficient of variation, Bowley’s measure of skewness and the coefficient of kurtosis from the
table above.
Solution:
For the solutions of the following values, refer to the previous chapter (Page 37).
Mean: 54.889
Median: 57.75
Mode: 60.35
Q1: 48.35
Q3: 62.656
RANGE
Range = 𝐻 − 𝐿
Range = 67.5 − 32.5
Range = 35
QUARTILE DEVIATION
$$QD = \frac{Q_3 - Q_1}{2}$$
$$QD = \frac{62.656 - 48.35}{2}$$
QD = 7.153
MEAN ABSOLUTE DEVIATION
$$MAD = \frac{\sum f d}{n}$$
$$MAD = \frac{708.277}{85}$$
MAD = 8.3327
STANDARD DEVIATION
COEFFICIENT OF VARIATION
$$CV = \frac{S}{\bar{x}} \times 100\%$$
$$CV = \frac{9.7518}{54.889} \times 100\%$$
CV = 17.7664 %
SKEWNESS
Since the median and Q2 are the same value, Q2 is equal to 57.75.
$$S_Q = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$
$$S_Q = \frac{62.656 - 2(57.75) + 48.35}{62.656 - 48.35}$$
$S_Q$ = −0.3141
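KURTOSIS
$$P_{10} = LL + \left(\frac{\frac{10N}{100} - F_{P_{10}}}{f_{P_{10}}}\right) i$$
$$P_{10} = 38 + \left(\frac{8.5 - 5}{7}\right) 3$$
P10 = 39.5
$$P_{90} = LL + \left(\frac{\frac{90N}{100} - F_{P_{90}}}{f_{P_{90}}}\right) i$$
$$P_{90} = 65 + \left(\frac{76.5 - 70}{15}\right) 3$$
P90 = 66.3
$$K = \frac{QD}{P_{90} - P_{10}}$$
$$K = \frac{7.153}{66.3 - 39.5}$$
K = 0.2669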
Example 3.5
The following data are the scores of the students in the recent National Achievement Test
in Section 3 of Kabayao High School.
Scores f
91 – 96 10
85 – 90 4
79 – 84 12
73 – 78 25
67 – 72 10
61 – 66 7
55 – 60 33
49 – 54 15
43 – 48 18
37 – 42 6
31 – 36 7
25 – 30 3
Compute the range, quartile deviation, mean absolute deviation, standard deviation,
coefficient of variation, Kelly’s measure of skewness and the coefficient of kurtosis from the
table above.
Solution:
For the solutions of the following values, refer to the previous chapter (Page 40).
Mean: 62.22
Median: 59.227
Mode: 58.045
Q1: 49.9
Q3: 75.74
RANGE
Range = 𝐻 − 𝐿
Range = 96 − 25
Range = 71
QUARTILE DEVIATION
$$QD = \frac{Q_3 - Q_1}{2}$$
$$QD = \frac{75.74 - 49.9}{2}$$
QD = 12.92
MEAN ABSOLUTE DEVIATION
$$MAD = \frac{\sum f d}{n}$$
$$MAD = \frac{2118.08}{150}$$
MAD = 14.1205
STANDARD DEVIATION
COEFFICIENT OF VARIATION
$$CV = \frac{S}{\bar{x}} \times 100\%$$
$$CV = \frac{16.7353}{62.22} \times 100\%$$
CV = 26.8970 %
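SKEWNESS
$$P_{10} = LL + \left(\frac{\frac{10N}{100} - F_{P_{10}}}{f_{P_{10}}}\right) i$$
$$P_{10} = 36.5 + \left(\frac{15 - 10}{6}\right) 6$$
P10 = 41.5
$$P_{50} = LL + \left(\frac{\frac{50N}{100} - F_{P_{50}}}{f_{P_{50}}}\right) i$$
$$P_{50} = 54.5 + \left(\frac{75 - 49}{33}\right) 6$$
P50 = 59.23
$$P_{90} = LL + \left(\frac{\frac{90N}{100} - F_{P_{90}}}{f_{P_{90}}}\right) i$$
$$P_{90} = 78.5 + \left(\frac{135 - 124}{12}\right) 6$$
P90 = 84
$$S_P = \frac{P_{90} - 2P_{50} + P_{10}}{P_{90} - P_{10}}$$
$$S_P = \frac{84 - 2(59.23) + 41.5}{84 - 41.5}$$
$S_P$ = 0.1656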
KURTOSIS
$$K = \frac{QD}{P_{90} - P_{10}}$$
$$K = \frac{12.92}{84 - 41.5}$$
K = 0.304
Normal Curve
Blood pressure, height, weight, and even age are just some of the naturally occurring
phenomena that can be described by a normal distribution.
A normal distribution is generally represented by a bell-shaped curve. The area under the
curve represents probability. Since the probability of a sure event (that the value of the
variable will fall between negative and positive infinity) is always 1, the total area under
any normal curve is always equal to 1, or 100%.
The preliminary step in finding the area under the normal curve is to convert the given
value into a standardized score by applying this formula:
$$Z = \frac{X - \bar{X}}{S}$$
where X: any given value of a particular variable
Z: standardized score
$\bar{X}$: mean
S: standard deviation
To find the area, we need the Z-Distribution Chart, which lists the area under the curve for
given values of Z. It is located on the last page of this module.
EXAMPLES
Example 3.6
a. Find the area under the normal curve between $z_1 = 0$ and $z_2 = 2.4$.
Solution:
At $z_1 = 0$ the area from the mean is 0
At $z_2 = 2.4$ the area from the mean is 0.4918
Area = 0.4918 − 0
A = 0.4918
b. Find the area under the normal curve between 𝑧1 = -1.3 and 𝑧2 = 0.
Solution:
c. Find the area under the normal curve between $z_1 = -1.3$ and $z_2 = 2.4$.
Solution:
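The three areas asked for in Example 3.6 can be checked numerically without the chart. The sketch below assumes the standard normal cumulative distribution written with Python's `math.erf`; the helper names `normal_cdf` and `area_between` are illustrative. It gives approximately 0.4918 for part a, 0.4032 for part b, and 0.8950 for part c.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def area_between(z1, z2):
    """Area under the standard normal curve between z1 and z2 (z1 <= z2)."""
    return normal_cdf(z2) - normal_cdf(z1)


area_a = area_between(0, 2.4)      # part a
area_b = area_between(-1.3, 0)     # part b
area_c = area_between(-1.3, 2.4)   # part c: the sum of the two areas above
print(round(area_a, 4), round(area_b, 4), round(area_c, 4))
```

Part c equals the sum of parts a and b because the two z values lie on opposite sides of the mean.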
Example 3.7
If there were 5,000 PUPCET examinees, how many of them will score lower than 80
if M = 84 and S = 5?
Solution:
S = 5
M = 84
X = 80
$$Z = \frac{X - M}{S}$$
$$Z = \frac{80 - 84}{5}$$
Z = −0.8
Since we are after the number of examinees, we multiply the area to the left of Z by the
total number of examinees to get the number of examinees who scored lower than 80.
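The final step described above can be sketched as follows, using the same `math.erf` expression for the area to the left of Z in place of the chart (an assumption of this sketch; the variable names are illustrative). Under these assumptions the area is about 0.2119, which corresponds to roughly 1,059 of the 5,000 examinees.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


mean, sd = 84, 5      # M and S from Example 3.7
cutoff = 80           # X
examinees = 5000

z = (cutoff - mean) / sd              # standardized score
area_below = normal_cdf(z)            # proportion scoring lower than 80
expected_count = area_below * examinees
print(z, round(area_below, 4), round(expected_count))
```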
Example 3.8
Solution:
S = 2
M = 8
$X_1$ = 5.5 hours
$$Z_1 = \frac{X_1 - M}{S}$$
$$Z_1 = \frac{5.5 - 8}{2}$$
$Z_1$ = −1.25
S = 2
M = 8
$X_2$ = 10.5 hours
$$Z_2 = \frac{X_2 - M}{S}$$
$$Z_2 = \frac{10.5 - 8}{2}$$
$Z_2$ = 1.25
Because $Z_1$ and $Z_2$ have the same magnitude, the areas on either side of the mean are equal:
0.3944 each, for a total area of 0.7888. To get the probability, simply multiply the area by
100%. So the probability that Joseph will play DOTA between 6 and 10 hours a week is 78.88%.
Example 3.9
Determine the probability that a CE board examinee, who took the board exam would get
a score higher than 85 if the average score is 83 and S = 3.
Solution:
M = 83
X = 85
S = 3
$$Z = \frac{X - M}{S}$$
$$Z = \frac{85 - 83}{3}$$
Z = 0.67
Example 3.10
Determine the probability that a CE board examinee, who took the board exam would get a
score higher than 80 if the average score is 83 and S = 3.
Solution:
M = 83
S = 3
X = 80
$$Z = \frac{X - M}{S}$$
$$Z = \frac{80 - 83}{3}$$
Z = −1.0
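For Examples 3.9 and 3.10, the probability of scoring higher than a given value is the area to the right of Z. A minimal sketch, again assuming the `math.erf` form of the normal curve instead of the chart (the helper `prob_above` is a hypothetical name): it gives about 0.2525 for a score above 85 and about 0.8413 for a score above 80. Reading the chart at the rounded value Z = 0.67 gives a slightly smaller figure, about 0.2514, for the first case.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def prob_above(x, mean, sd):
    """Probability that a normally distributed score exceeds x."""
    z = (x - mean) / sd
    return 1 - normal_cdf(z)


p_above_85 = prob_above(85, 83, 3)   # Example 3.9, Z is about 0.67
p_above_80 = prob_above(80, 83, 3)   # Example 3.10, Z = -1.0
print(round(p_above_85, 4), round(p_above_80, 4))
```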
Name:_________________________ Section:_______________ Date:______________
I. UNGROUPED DATA
A. A hen lays twelve eggs. Each egg was weighed and recorded as follows: 51, 53, 54,
56, 60, 61, 68, 69, 71, 74, 76 and 79 grams respectively. Find the range, quartile
deviation, mean absolute deviation, standard deviation, coefficient of variation,
coefficient of skewness and the coefficient of kurtosis.
C. At the end of the semester, Raymundo Eugenio Panaligan Jr., a BSCE student got the
following grades. Compute for the measures of variation.
Subject Grade
Basic Economics 1.75
Basic Electrical Engineering 2.25
Construction Works IV 2.50
Buhay, Gawain at mga Sinulat ni Rizal 2.25
Integral Calculus 1.50
Team Sports – Volleyball 1.00
Logic 2.25
Sosyolohiya, Kultura at Pagpapamilya 2.00
Humanities 1.75
English III 1.75
Filipino III 2.00
Architectural Drafting III 2.25
II. GROUPED DATA
III. NORMAL CURVES
A. Determine the probability that an examinee will have a percentage score between
85 and 90 if the mean score is 87 and the standard deviation is 5.
B. Determine the probability that an examinee will have a percentage score of 88 if the
mean score is 87 and the standard deviation is 5.
C. Find the area:
a. Find the area under the normal curve between 𝑧1 = -1.51 and 𝑧2 = 0.
b. Find the area under the normal curve between 𝑧1 = 0 and 𝑧2 = 1.01.
D. If there were 1,124 board examinees in CE, how many of them will score higher
than 85 if the average is 83 and S = 3?
E. Using 83% as the mean score, 3 as the standard deviation, and N = 1,124 as
the number of examinees, how many of them will score lower than 80?