
Chapter 6 (Philoid-In)


CHAPTER

Measures of Dispersion

Studying this chapter should enable you to:
• know the limitations of averages;
• appreciate the need for measures of dispersion;
• enumerate various measures of dispersion;
• calculate the measures and compare them;
• distinguish between absolute and relative measures.

1. INTRODUCTION

In the previous chapter, you have studied how to sum up the data into a single representative value. However, that value does not reveal the variability present in the data. In this chapter you will study those measures, which seek to quantify variability of the data.

Three friends, Ram, Rahim and Maria are chatting over a cup of tea. During the course of their conversation, they start talking about their family incomes. Ram tells them that there are four members in his family and the average income per member is Rs 15,000. Rahim says that the average income is the same in his family, though the number of members is six. Maria says that there are five members in her family, out of which one is not working. She calculates that the average income in her family too, is Rs 15,000. They are a little surprised since they know that Maria’s father is earning a huge salary. They go into details and gather the following data:

2020-21
Family Incomes

Sl. No.            Ram      Rahim      Maria
1.              12,000      7,000          0
2.              14,000     10,000      7,000
3.              16,000     14,000      8,000
4.              18,000     17,000     10,000
5.                 ---     20,000     50,000
6.                 ---     22,000        ---
Total income    60,000     90,000     75,000
Average income  15,000     15,000     15,000

Do you notice that although the average is the same, there are considerable differences in individual incomes?

It is quite obvious that averages try to tell only one aspect of a distribution i.e. a representative size of the values. To understand it better, you need to know the spread of values also.

You can see that in Ram’s family, differences in incomes are comparatively lower. In Rahim’s family, differences are higher and in Maria’s family, the differences are the highest. Knowledge of only average is insufficient. If you have another value which reflects the quantum of variation in values, your understanding of a distribution improves considerably. For example, per capita income gives only the average income. A measure of dispersion can tell you about income inequalities, thereby improving the understanding of the relative standards of living enjoyed by different strata of society.

Dispersion is the extent to which values in a distribution differ from the average of the distribution.

To quantify the extent of the variation, there are certain measures namely:
(i) Range
(ii) Quartile Deviation
(iii) Mean Deviation
(iv) Standard Deviation

Apart from these measures which give a numerical value, there is a graphic method for estimating dispersion.

Range and quartile deviation measure the dispersion by calculating the spread within which the values lie. Mean deviation and standard deviation calculate the extent to which the values differ from the average.
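The point of the three-family story can be checked in a few lines. The sketch below is only an illustration, not part of the chapter: the helper names `mean` and `largest_gap` are made up here, and "largest gap from the average" is an informal stand-in for the formal measures introduced later.

```python
# Family incomes from the "Family Incomes" table (Rs per member).
incomes = {
    "Ram": [12000, 14000, 16000, 18000],
    "Rahim": [7000, 10000, 14000, 17000, 20000, 22000],
    "Maria": [0, 7000, 8000, 10000, 50000],
}

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def largest_gap(values):
    """Largest absolute distance of any value from the mean (illustrative helper)."""
    m = mean(values)
    return max(abs(v - m) for v in values)

# Same average for every family, very different spread around it.
summary = {name: (mean(v), largest_gap(v)) for name, v in incomes.items()}
for name, (avg, gap) in summary.items():
    print(f"{name}: average = {avg:.0f}, largest gap from average = {gap:.0f}")
```

All three averages come out equal, while the gap grows from Ram to Rahim to Maria, which is exactly the information the average alone hides.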

2. MEASURES BASED UPON SPREAD OF VALUES

Range
Range (R) is the difference between the
largest (L) and the smallest value (S) in
a distribution. Thus,
R=L–S
Higher value of range implies higher
dispersion and vice-versa.
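As a small illustration of R = L – S, the sketch below applies the formula to the family incomes from the Introduction. The function name `value_range` is an assumption made for this example.

```python
def value_range(values):
    """Range R = L - S: largest value minus smallest value."""
    return max(values) - min(values)

# Family incomes from the Introduction (Rs per member).
family_incomes = {
    "Ram": [12000, 14000, 16000, 18000],
    "Rahim": [7000, 10000, 14000, 17000, 20000, 22000],
    "Maria": [0, 7000, 8000, 10000, 50000],
}

ranges = {name: value_range(v) for name, v in family_incomes.items()}
for name, r in ranges.items():
    print(f"{name}: range = {r}")
```

The ordering of the three ranges matches the earlier observation that income differences are lowest in Ram's family and highest in Maria's.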

76 STATISTICS FOR ECONOMICS

Activities
Look at the following values:
20, 30, 40, 50, 200
• Calculate the Range.
• What is the Range if the value 200 is not present in the data set?
• If 50 is replaced by 150, what will be the Range?

Range: Comments
Range is unduly affected by extreme values. It is not based on all the values. As long as the minimum and maximum values remain unaltered, any change in other values does not affect range. It cannot be calculated for open-ended frequency distribution.

Notwithstanding some limitations, range is understood and used frequently because of its simplicity. For example, we see the maximum and minimum temperatures of different cities almost daily on our TV screens and form judgments about the temperature variations in them.

Open-ended distributions are those in which either the lower limit of the lowest class or the upper limit of the highest class or both are not specified.

Activity
• Collect data about 52-week high/low of shares of 10 companies from a newspaper. Calculate the range of share prices. Which company’s share is most volatile and which is the most stable?

Quartile Deviation
The presence of even one extremely high or low value in a distribution can reduce the utility of range as a measure of dispersion. Thus, you may need a measure which is not unduly affected by the outliers.

In such a situation, if the entire data is divided into four equal parts, each containing 25% of the values, we get the values of quartiles and median. (You have already read about these in Chapter 5).

The upper and lower quartiles (Q3 and Q1, respectively) are used to calculate inter-quartile range which is Q3 – Q1.

Interquartile range is based upon middle 50% of the values in a distribution and is, therefore, not affected by extreme values. Half of the inter-quartile range is called quartile deviation (Q.D.). Thus:

$Q.D. = \frac{Q_3 - Q_1}{2}$

Q.D. is therefore also called Semi-Inter Quartile Range.

Calculation of Range and Q.D. for ungrouped data

Example 1
Calculate range and Q.D. of the following observations:
20, 25, 29, 30, 35, 39, 41, 48, 51, 60 and 70
Range is clearly 70 – 20 = 50
For Q.D., we need to calculate values of Q3 and Q1.
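The positional rule used for ungrouped data in this chapter (Q1 at the (n+1)/4th value, Q3 at the 3(n+1)/4th value) can be sketched as below for the observations of Example 1. The function names are hypothetical, and the linear interpolation for fractional positions is an assumption added for completeness; it is not needed when n = 11. The sketch also assumes at least three observations.

```python
def value_at(sorted_values, position):
    """Value at a 1-based position in sorted data.

    Fractional positions are linearly interpolated between neighbours
    (an assumption of this sketch, not a rule stated in the chapter).
    """
    lower = int(position)              # whole part of the position
    frac = position - lower            # fractional part, 0 for exact positions
    base = sorted_values[lower - 1]
    if frac == 0:
        return base
    return base + frac * (sorted_values[lower] - base)

def quartile_deviation(values):
    """Return (Q1, Q3, Q.D.) using the (n+1)/4 and 3(n+1)/4 positions."""
    data = sorted(values)              # sorting handles unordered input
    n = len(data)
    q1 = value_at(data, (n + 1) / 4)
    q3 = value_at(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2

# Observations of Example 1, deliberately shuffled.
observations = [70, 20, 51, 25, 60, 29, 48, 30, 41, 35, 39]
q1, q3, qd = quartile_deviation(observations)
print(q1, q3, qd)
```

Because the function sorts its input first, it also answers the question of what to do when the values are not already in order.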

Q1 is the size of the $\frac{n+1}{4}$th value.
n being 11, Q1 is the size of 3rd value.
As the values are already arranged in ascending order, it can be seen that Q1, the 3rd value is 29. [What will you do if these values are not in an order?]

Similarly, Q3 is size of the $\frac{3(n+1)}{4}$th value; i.e. 9th value which is 51. Hence Q3 = 51

$Q.D. = \frac{Q_3 - Q_1}{2} = \frac{51 - 29}{2} = 11$

Do you notice that Q.D. is the average difference of the Quartiles from the median?

Activity
• Calculate the median and check whether the above statement is correct.

Calculation of Range and Q.D. for a frequency distribution.

Example 2
For the following distribution of marks scored by a class of 40 students, calculate the Range and Q.D.

TABLE 6.1
Class intervals    No. of students
CI                 (f)
0–10                5
10–20               8
20–40              16
40–60               7
60–90               4
Total              40

Range is just the difference between the upper limit of the highest class and the lower limit of the lowest class. So range is 90 – 0 = 90. For Q.D., first calculate cumulative frequencies as follows:

Class-Intervals    Frequencies    Cumulative Frequencies
CI                 f              c.f.
0–10                5              5
10–20               8             13
20–40              16             29
40–60               7             36
60–90               4             40
                   n = 40

Q1 is the size of the $\frac{n}{4}$th value in a continuous series. Thus, it is the size of the 10th value. The class containing the 10th value is 10–20. Hence, Q1 lies in class 10–20. Now, to calculate the exact value of Q1, the following formula is used:

$Q_1 = L + \frac{\frac{n}{4} - c.f.}{f} \times i$

Where L = 10 (lower limit of the relevant Quartile class)
c.f. = 5 (Value of c.f. for the class preceding the quartile class)
i = 10 (interval of the quartile class), and
f = 8 (frequency of the quartile class)
Thus,

$Q_1 = 10 + \frac{10 - 5}{8} \times 10 = 16.25$

Similarly, Q3 is the size of the $\frac{3n}{4}$th

value; i.e., 30th value, which lies in class 40–60. Now using the formula for Q3, its value can be calculated as follows:

$Q_3 = L + \frac{\frac{3n}{4} - c.f.}{f} \times i = 40 + \frac{30 - 29}{7} \times 20 \approx 42.857$

$Q.D. = \frac{Q_3 - Q_1}{2} = \frac{42.857 - 16.25}{2} \approx 13.30$

In individual and discrete series, Q1 is the size of the $\frac{n+1}{4}$th value, but in a continuous distribution, it is the size of the $\frac{n}{4}$th value. Similarly, for Q3 and median also, n is used in place of n+1.

If the entire group is divided into two equal halves and the median calculated for each half, you will have the median of better students and the median of weak students. These medians differ from the median of the entire group by 13.30 on an average. Similarly, suppose you have data about incomes of people of a town. Median income of all people can be calculated. Now, if all people are divided into two equal groups of rich and poor, medians of both groups can be calculated. Quartile deviation will tell you the average difference between medians of these two groups belonging to rich and poor, from the median of the entire group.

Quartile deviation can generally be calculated for open-ended distributions and is not unduly affected by extreme values.

3. MEASURES OF DISPERSION FROM AVERAGE

Recall that dispersion was defined as the extent to which values differ from their average. Range and quartile deviation are not useful in measuring how far the values are from their average. Yet, by calculating the spread of values, they do give a good idea about the dispersion. Two measures which are based upon deviation of the values from their average are Mean Deviation and Standard Deviation.

Since the average is a central value, some deviations are positive and some are negative. If these are added as they are, the sum will not reveal anything. In fact, the sum of deviations from Arithmetic Mean is always zero. Look at the following two sets of values.

Set A : 5, 9, 16
Set B : 1, 9, 20

You can see that values in Set B are farther from the average and hence more dispersed than values in Set A. Calculate the deviations from Arithmetic Mean and sum them up. What do you notice? Repeat the same with Median. Can you comment upon the quantum of variation from the calculated values?

Mean Deviation tries to overcome this problem by ignoring the signs of

deviations, i.e., it considers all deviations positive. For standard deviation, the deviations are first squared and averaged and then square root of the average is found. We shall now discuss them separately in detail.

Mean Deviation
Suppose a college is proposed for students of five towns A, B, C, D and E which lie in that order along a road. Distances of towns in kilometres from town A and number of students in these towns are given below:

Town    Distance from town A    No. of Students
A        0                       90
B        2                      150
C        6                      100
D       14                      200
E       18                       80
Total                           620

Now, if the college is situated in town A, 150 students from town B will have to travel 2 kilometres each (a total of 300 kilometres) to reach the college. The objective is to find a location so that the average distance travelled by students is minimum.

You may observe that the students will have to travel more, on an average, if the college is situated at town A or E. If on the other hand, it is somewhere in the middle, they are likely to travel less. Mean deviation is the appropriate statistical tool to estimate the average distance travelled by students. Mean deviation is the arithmetic mean of the differences of the values from their average. The average used is either the arithmetic mean or median.
(Since the mode is not a stable average, it is not used to calculate mean deviation.)

Activities
• Calculate the total distance to be travelled by students if the college is situated at town A, at town C, or town E and also if it is exactly half way between A and E.
• Decide where, in your opinion, the college should be established, if there is only one student in each town. Does it change your answer?

Calculation of Mean Deviation from Arithmetic Mean for ungrouped data.

Direct Method
Steps:
(i) The A.M. of the values is calculated
(ii) Difference between each value and the A.M. is calculated. All differences are considered positive. These are denoted as |d|
(iii) The A.M. of these differences (called deviations) is the Mean Deviation.

i.e. $M.D. = \frac{\sum |d|}{n}$

Example 3
Calculate the mean deviation of the following values; 2, 4, 7, 8 and 9.

The A.M. $= \frac{\sum X}{n} = 6$

X        |d|
2         4
4         2
7         1
8         2
9         3
Total    12

$M.D.(\bar{x}) = \frac{12}{5} = 2.4$

Mean Deviation from median for ungrouped data.

Method
Using the values in Example 3, M.D. from the Median can be calculated as follows,
(i) Calculate the median which is 7.
(ii) Calculate the absolute deviations from median, denote them as |d|.
(iii) Find the average of these absolute deviations. It is the Mean Deviation.

Example 5
X        |d| = |X – Median|
2         5
4         3
7         0
8         1
9         2
Total    11

M. D. from Median is thus,

$M.D.(Median) = \frac{\sum |d|}{n} = \frac{11}{5} = 2.2$

Mean Deviation from Mean for Continuous Distribution

TABLE 6.2
Profits of companies (Rs in lakh)    Number of
Class intervals                      Companies
10–20                                 5
20–30                                 8
30–50                                16
50–70                                 8
70–80                                 3
Total                                40

Steps:
(i) Calculate the mean of the distribution.
(ii) Calculate the absolute deviations |d| of the class midpoints from the mean.
(iii) Multiply each |d| value with its corresponding frequency to get f|d| values. Sum them up to get Σ f|d|.
(iv) Apply the following formula,

$M.D.(\bar{x}) = \frac{\sum f|d|}{\sum f}$

Mean Deviation of the distribution in Table 6.2 can be calculated as follows:

Example 6
C.I.      f     m.p.    |d|     f|d|
10–20     5     15     25.5    127.5
20–30     8     25     15.5    124.0
30–50    16     40      0.5      8.0
50–70     8     60     19.5    156.0
70–80     3     75     34.5    103.5
Total    40                    519.0

$M.D.(\bar{x}) = \frac{519}{40} = 12.975$
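The mean-deviation steps for both ungrouped and grouped data can be sketched as follows, using the values of Example 3 and Table 6.2. The function names are hypothetical; `statistics.median` from the Python standard library supplies the median.

```python
from statistics import median

def mean_deviation(values, about):
    """Average of absolute deviations of the values from the point `about`."""
    return sum(abs(v - about) for v in values) / len(values)

def grouped_mean_deviation(midpoints, freqs, about):
    """Sum of f|d| divided by sum of f, with |d| measured from class midpoints."""
    weighted = sum(f * abs(m - about) for m, f in zip(midpoints, freqs))
    return weighted / sum(freqs)

# Ungrouped data of Example 3.
values = [2, 4, 7, 8, 9]
md_about_mean = mean_deviation(values, sum(values) / len(values))
md_about_median = mean_deviation(values, median(values))

# Grouped data of Table 6.2 (class midpoints and frequencies).
midpoints = [15, 25, 40, 60, 75]
freqs = [5, 8, 16, 8, 3]
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)
md_grouped = grouped_mean_deviation(midpoints, freqs, grouped_mean)

print(md_about_mean, md_about_median, grouped_mean, md_grouped)
```

For the ungrouped values the deviation about the median is smaller than the deviation about the mean, which is a general property of the median.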

Mean Deviation from Median

TABLE 6.3
Class intervals    Frequencies
20–30               5
30–40              10
40–60              20
60–80               9
80–90               6
Total              50

The procedure to calculate mean deviation from the median is the same as it is in case of M.D. from mean, except that deviations are to be taken from the median as given below:

Example 7
C.I.      f     m.p.    |d|    f|d|
20–30     5     25      25     125
30–40    10     35      15     150
40–60    20     50       0       0
60–80     9     70      20     180
80–90     6     85      35     210
Total    50                    665

$M.D.(Median) = \frac{\sum f|d|}{\sum f} = \frac{665}{50} = 13.3$

Mean Deviation: Comments
Mean deviation is based on all values. A change in even one value will affect it. Mean deviation is the least when calculated from the median i.e., it will be higher if calculated from the mean. However, it ignores the signs of deviations and cannot be calculated for open-ended distributions.

Standard Deviation
Standard Deviation is the positive square root of the mean of squared deviations from mean. So if there are five values x1, x2, x3, x4 and x5, first their mean is calculated. Then deviations of the values from mean are calculated. These deviations are then squared. The mean of these squared deviations is the variance. Positive square root of the variance is the standard deviation.
(Note that standard deviation is calculated on the basis of the mean only).

Calculation of Standard Deviation for ungrouped data
Four alternative methods are available for the calculation of standard deviation of individual values. All these methods result in the same value of standard deviation. These are:
(i) Actual Mean Method
(ii) Assumed Mean Method
(iii) Direct Method
(iv) Step-Deviation Method

Actual Mean Method:
Suppose you have to calculate the standard deviation of the following values:
5, 10, 25, 30, 50
First step is to calculate

$\bar{X} = \frac{5+10+25+30+50}{5} = \frac{120}{5} = 24$

Example 8
X        d = (X – X̄)     d²
5         –19           361
10        –14           196
25         +1             1
30         +6            36
50        +26           676
Total       0          1270

Then the following formula is used:

$\sigma = \sqrt{\frac{\sum d^2}{n}} = \sqrt{\frac{1270}{5}} = \sqrt{254} = 15.937$

Do you notice the value from which deviations have been calculated in the above example? Is it the Actual Mean?

Assumed Mean Method
For the same values, deviations may be calculated from any arbitrary value $A\bar{x}$ such that $d = X - A\bar{x}$. Taking $A\bar{x} = 25$, the computation of the standard deviation is shown below:

Example 9
X        d = (X – A x̄)    d²
5         –20           400
10        –15           225
25          0             0
30         +5            25
50        +25           625
Total      –5          1275

Formula for Standard Deviation

$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$

$\sigma = \sqrt{\frac{1275}{5} - \left(\frac{-5}{5}\right)^2} = \sqrt{254} = 15.937$

Note that the sum of deviations from a value other than actual mean will not be equal to zero. Standard deviation is not affected by the value of the constant from which deviations are calculated. The value of the constant does not figure in the standard deviation formula. Thus, Standard deviation is Independent of Origin.

Direct Method
Standard Deviation can also be calculated from the values directly, i.e., without taking deviations, as shown below:

Example 10
X        X²
5         25
10       100
25       625
30       900
50      2500
Total
120     4150

(This amounts to taking deviations from zero)
Following formula is used.

$\sigma = \sqrt{\frac{\sum X^2}{n} - (\bar{X})^2}$

or $\sigma = \sqrt{\frac{4150}{5} - (24)^2}$

or $\sigma = \sqrt{254} = 15.937$

Step-deviation Method
If the values are divisible by a common factor, they can be so divided and standard deviation can be calculated from the resultant values as follows:

Example 11
Since all the five values are divisible by a common factor 5, we divide and get the following values:

x        x'      d' = (x' – x̄')    d'²
5         1       –3.8            14.44
10        2       –2.8             7.84
25        5       +0.2             0.04
30        6       +1.2             1.44
50       10       +5.2            27.04
Total                0            50.80

In the above table,

$x' = \frac{x}{c}$

where c = common factor
First step is to calculate

$\bar{X}' = \frac{1+2+5+6+10}{5} = \frac{24}{5} = 4.8$

The following formula is used to calculate standard deviation:

$\sigma = \sqrt{\frac{\sum d'^2}{n}} \times c$

Substituting the values,

$\sigma = \sqrt{\frac{50.80}{5}} \times 5$

$\sigma = \sqrt{10.16} \times 5$

$\sigma = 15.937$

Alternatively, instead of dividing the values by a common factor, the deviations can be calculated and then divided by a common factor. Standard deviation can be calculated as shown below:

Example 12
x        d = (x – 25)    d' = (d/5)    d'²
5         –20             –4           16
10        –15             –3            9
25          0              0            0
30         +5             +1            1
50        +25             +5           25
Total                     –1           51

Deviations have been calculated from an arbitrary value 25. Common factor of 5 has been used to divide deviations.

$\sigma = \sqrt{\frac{\sum d'^2}{n} - \left(\frac{\sum d'}{n}\right)^2} \times c = \sqrt{\frac{51}{5} - \left(\frac{-1}{5}\right)^2} \times 5$

$\sigma = \sqrt{10.16} \times 5 = 15.937$

Standard deviation is not independent of scale. Thus, if the values or deviations are divided by a common factor, the value of the common factor is used in the formula to get the value of standard deviation.
Standard Deviation in Continuous frequency distribution:
Like ungrouped data, S.D. can be calculated for grouped data by any of the following methods:
(i) Actual Mean Method
(ii) Assumed Mean Method
(iii) Step-Deviation Method

Actual Mean Method
For the values in Table 6.2, Standard Deviation can be calculated as follows:

Example 13
(1)      (2)    (3)    (4)     (5)       (6)        (7)
CI        f      m     fm       d        fd         fd²
10–20     5     15     75     –25.5    –127.5     3251.25
20–30     8     25    200     –15.5    –124.0     1922.00
30–50    16     40    640      –0.5      –8.0        4.00
50–70     8     60    480     +19.5    +156.0     3042.00
70–80     3     75    225     +34.5    +103.5     3570.75
Total    40          1620                  0     11790.00

Following steps are required:
1. Calculate the mean of the distribution.

$\bar{x} = \frac{\sum fm}{\sum f} = \frac{1620}{40} = 40.5$

2. Calculate deviations of mid-values from the mean so that $d = m - \bar{x}$ (Col. 5)
3. Multiply the deviations with their corresponding frequencies to get ‘fd’ values (Col. 6) [Note that Σ fd = 0]
4. Calculate ‘fd²’ values by multiplying ‘fd’ values with ‘d’ values. (Col. 7). Sum up these to get Σ fd².
5. Apply the formula as under:

$\sigma = \sqrt{\frac{\sum fd^2}{n}} = \sqrt{\frac{11790}{40}} = 17.168$

Assumed Mean Method
For the values in example 13, standard deviation can be calculated by taking deviations from an assumed mean (say 40) as follows:

Example 14
(1)      (2)    (3)    (4)     (5)      (6)
CI        f      m      d      fd       fd²
10–20     5     15    –25    –125     3125
20–30     8     25    –15    –120     1800
30–50    16     40      0       0        0
50–70     8     60    +20    +160     3200
70–80     3     75    +35    +105     3675
Total    40                   +20    11800

The following steps are required:
1. Calculate mid-points of classes (Col. 3)
2. Calculate deviations of mid-points from an assumed mean such that d = m – A (Col. 4). Assumed Mean = 40.
3. Multiply values of ‘d’ with corresponding frequencies to get ‘fd’ values (Col. 5). (Note that the total of this column is not zero since deviations have been taken from assumed mean.)
4. Multiply ‘fd’ values (Col. 5) with ‘d’ values (Col. 4) to get fd² values (Col. 6). Find Σ fd².
5. Standard Deviation can be calculated by the following formula.
$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2}$

or $\sigma = \sqrt{\frac{11800}{40} - \left(\frac{20}{40}\right)^2}$

or $\sigma = \sqrt{294.75} = 17.168$

Step-deviation Method
In case the values of deviations are divisible by a common factor, the calculations can be simplified by the step-deviation method as in the following example.

Example 15
(1)      (2)    (3)    (4)    (5)    (6)     (7)
CI        f      m      d     d'     fd'     fd'²
10–20     5     15    –25     –5    –25     125
20–30     8     25    –15     –3    –24      72
30–50    16     40      0      0      0       0
50–70     8     60    +20     +4    +32     128
70–80     3     75    +35     +7    +21     147
Total    40                          +4     472

Steps required:
1. Calculate class mid-points (Col. 3) and deviations from an arbitrarily chosen value, just like in the assumed mean method. In this example, deviations have been taken from the value 40. (Col. 4)
2. Divide the deviations by a common factor denoted as ‘c’. c = 5 in the above example. The values so obtained are ‘d'’ values (Col. 5).
3. Multiply ‘d'’ values with corresponding ‘f’ values (Col. 2) to obtain ‘fd'’ values (Col. 6).
4. Multiply ‘fd'’ values with ‘d'’ values to get ‘fd'²’ values (Col. 7)
5. Sum up values in Col. 6 and Col. 7 to get Σ fd' and Σ fd'² values.
6. Apply the following formula.

$\sigma = \sqrt{\frac{\sum fd'^2}{\sum f} - \left(\frac{\sum fd'}{\sum f}\right)^2} \times c$

or $\sigma = \sqrt{\frac{472}{40} - \left(\frac{4}{40}\right)^2} \times 5$

or $\sigma = \sqrt{11.8 - 0.01} \times 5$

or $\sigma = \sqrt{11.79} \times 5$

$\sigma = 17.168$

Standard Deviation: Comments
Standard Deviation, the most widely used measure of dispersion, is based on all values. Therefore a change in even one value affects the value of standard deviation. It is independent of origin but not of scale. It is also useful in certain advanced statistical problems.

4. ABSOLUTE AND RELATIVE MEASURES OF DISPERSION

All the measures, described so far, are absolute measures of dispersion. They calculate a value which, at times, is difficult to interpret. For example, consider the following two data sets:

Set A         500         700        1000
Set B    1,00,000    1,20,000    1,30,000

Suppose the values in Set A are the daily sales recorded by an ice-cream vendor, while Set B has the daily sales of a big departmental store. Range for Set A is 500 whereas for Set B, it is

30,000. The value of Range is much higher in Set B. Can you say that the variation in sales is higher for the departmental store? It can be easily observed that the highest value in Set A is double the smallest value, whereas for the Set B, it is only 30% higher. Thus, absolute measures may give misleading ideas about the extent of variation specially when the averages differ significantly.

Another weakness of absolute measures is that they give the answer in the units in which original values are expressed. Consequently, if the values are expressed in kilometres, the dispersion will also be in kilometres. However, if the same values are expressed in metres, an absolute measure will give the answer in metres and the value of dispersion will appear to be 1000 times as large.

To overcome these problems, relative measures of dispersion can be used. Each absolute measure has a relative counterpart. Thus, for range, there is coefficient of range which is calculated as follows:

$\text{Coefficient of Range} = \frac{L - S}{L + S}$

where L = Largest value
S = Smallest value

Similarly, for Quartile Deviation, it is Coefficient of Quartile Deviation which can be calculated as follows:

$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

where Q3 = 3rd Quartile
Q1 = 1st Quartile

For Mean Deviation, it is Coefficient of Mean Deviation.

$\text{Coefficient of Mean Deviation} = \frac{M.D.(\bar{x})}{\bar{x}}$ or $\frac{M.D.(Median)}{Median}$

Thus, if Mean Deviation is calculated on the basis of the Mean, it is divided by the Mean. If Median is used to calculate Mean Deviation, it is divided by the Median.

For Standard Deviation, the relative measure is called Coefficient of Variation, calculated as below:

$\text{Coefficient of Variation} = \frac{\text{Standard Deviation}}{\text{Arithmetic Mean}} \times 100$

It is usually expressed in percentage terms and is the most commonly used relative measure of dispersion. Since relative measures are free from the units in which the values have been expressed, they can be compared even across different groups having different units of measurement.

5. LORENZ CURVE

The measures of dispersion discussed so far give a numerical value of dispersion. A graphical measure called Lorenz Curve is available for estimating inequalities in distribution. You may have heard of statements like ‘top 10% of the people of a country earn 50% of the national income while top 20% account for 80%’. An idea about income disparities is given by such figures. Lorenz Curve uses the information expressed in a cumulative manner to indicate the degree of

inequality. For example, Lorenz Curve of income gives a relationship between percentage of population and its share of income in total income. It is specially useful in comparing the variability of two or more distributions by drawing two or more Lorenz curves on the same axis.

Construction of the Lorenz curve

Following steps are required.
1. Calculate class midpoints to obtain Col. (2) of Table 6.4.
2. Calculate the estimated total income of employees in each class by multiplying the midpoint of the class by the frequency in the class. Thus obtain Col. (4) of Table 6.4.
3. Express frequency in each class as a percentage (%) of total frequency. Thus, obtain Col. (5) of Table 6.4.
4. Express total income of each class as a percentage (%) of the grand total income of all classes together. Thus obtain Col. (6) of Table 6.4.
5. Prepare the ‘less than’ cumulative frequency and cumulative income table (Table 6.5).
6. Col. (2) of Table 6.5 shows the cumulative frequency of employees.
7. Col. (3) of Table 6.5 shows the cumulative income going to these persons.
8. Draw a line joining co-ordinate (0,0) with (100,100). This is called the line of equal distribution, shown as line ‘OE’ in figure 6.1.
9. Plot the cumulative percentages of employees on the horizontal axis and cumulative income on the vertical axis. We will thus get the Lorenz curve.

Given below are the monthly incomes of employees of a company:

TABLE 6.4
Income         Midpoint    Frequency    Total income     % of         % of Total
class          (X)         (f)          of class (fX)    frequency    income
(1)            (2)         (3)          (4)              (5)          (6)
0–5000          2500        5            12500           10            1.29
5000–10000      7500       10            75000           20            7.71
10000–20000    15000       18           270000           36           27.76
20000–40000    30000       10           300000           20           30.85
40000–50000    45000        7           315000           14           32.39
Total                      50           972500          100
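The construction steps can be sketched in code by deriving the cumulative percentages of employees and of income from the midpoints and frequencies of Table 6.4. The helper name `cumulative_percentages` is made up for this sketch, and plotting is left out so that the example needs only the standard library.

```python
# Midpoints (X) and frequencies (f) from Table 6.4.
midpoints = [2500, 7500, 15000, 30000, 45000]
freqs = [5, 10, 18, 10, 7]

def cumulative_percentages(amounts):
    """Running total of `amounts`, expressed as a percentage of the grand total."""
    total = sum(amounts)
    running, result = 0, []
    for amount in amounts:
        running += amount
        result.append(100 * running / total)
    return result

# Estimated total income of each class: midpoint times frequency.
class_income = [m * f for m, f in zip(midpoints, freqs)]

cum_employees = cumulative_percentages(freqs)        # horizontal axis
cum_income = cumulative_percentages(class_income)    # vertical axis

# Points of the Lorenz curve, starting from the origin O = (0, 0).
lorenz_points = [(0.0, 0.0)] + list(zip(cum_employees, cum_income))
for x, y in lorenz_points:
    print(f"{x:6.2f}% of employees receive {y:6.2f}% of income")
```

On the line of equal distribution every point would have equal coordinates; here each income percentage lies below the employee percentage until both reach 100, and the second cumulative frequency works out to 30 per cent.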

TABLE 6.5
‘Less Than’ Cumulative Frequency and Income
‘Less Than’    Cumulative       Cumulative
(Rs)           frequency (%)    Income (%)
 5,000          10                1.29
10,000          30                9.00
20,000          66               36.76
40,000          86               67.61
50,000         100              100.00

Studying the Lorenz Curve
OE is called the line of equal distribution, since it would imply a situation like, top 20% people earn 20% of total income and top 60% earn 60% of the total income. The farther the curve OABCDE from this line, the greater is the inequality present in the distribution. If there are two or more curves on the same axes, the one which is the farthest from line OE has the highest inequality.

6. CONCLUSION

Although Range is the simplest to calculate and understand, it is unduly affected by extreme values. QD is not affected by extreme values as it is based on only middle 50% of the data. However, it is more difficult to interpret. M.D. and S.D. are both based upon deviations of values from their average. M.D. calculates average of deviations from the average but ignores signs of deviations and therefore appears to be unmathematical. Standard deviation attempts to calculate average deviation from mean. Like M.D., it is based on all values and is also applied in more advanced statistical problems. It is the most widely used measure of dispersion.

Recap
• A measure of dispersion improves our understanding about the
behaviour of an economic variable.
• Range and Quartile Deviation are based upon the spread of values.
• M.D. and S.D. are based upon deviations of values from the average.
• Measures of dispersion could be Absolute or Relative.
• Absolute measures give the answer in the units in which data are
expressed.
• Relative measures are free from these units, and consequently
can be used to compare different variables.
• A graphic method, which estimates the dispersion from shape
of a curve, is called Lorenz Curve.


EXERCISES

1. A measure of dispersion is a good supplement to the central value in
understanding a frequency distribution. Comment.
2. Which measure of dispersion is the best and how?
3. Some measures of dispersion depend upon the spread of values whereas
some are estimated on the basis of the variation of values from a central
value. Do you agree?
4. In a town, 25% of the persons earned more than Rs 45,000 whereas
75% earned more than Rs 18,000. Calculate the absolute and relative values
of dispersion.
5. The yield of wheat and rice per acre for 10 districts of a state is as
under:
District 1 2 3 4 5 6 7 8 9 10
Wheat 12 10 15 19 21 16 18 9 25 10
Rice 22 29 12 23 18 15 12 34 18 12
Calculate for each crop,
(i) Range
(ii) Q.D.
(iii) Mean deviation about Mean
(iv) Mean deviation about Median
(v) Standard deviation
(vi) Which crop has greater variation?
(vii)Compare the values of different measures for each crop.
6. In the previous question, calculate the relative measures of variation
and indicate the value which, in your opinion, is more reliable.
7. A batsman is to be selected for a cricket team. The choice is between X
and Y on the basis of their scores in five previous tests which are:
X 25 85 40 80 120
Y 50 70 65 45 80
Which batsman should be selected if we want,
(i) a higher run getter, or
(ii) a more reliable batsman in the team?
8. To check the quality of two brands of lightbulbs, their life in burning
hours was estimated as under for 100 bulbs of each brand.
Life No. of bulbs
(in hrs) Brand A Brand B
0–50 15 2
50–100 20 8
100–150 18 60
150–200 25 25
200–250 22 5
100 100

(i) Which brand gives higher life?
(ii) Which brand is more dependable?
9. Average daily wage of 50 workers of a factory was Rs 200 with a standard
deviation of Rs 40. Each worker is given a raise of Rs 20. What is the
new average daily wage and standard deviation? Have the wages become
more or less uniform?
10. If in the previous question, each worker is given a hike of 10 % in wages,
how are the mean and standard deviation values affected?
11. Calculate the mean deviation using mean and Standard Deviation for
the following distribution.
Classes Frequencies
20–40 3
40–80 6
80–100 20
100–120 12
120–140 9
50
12. The sum of 10 values is 100 and the sum of their squares is 1090. Find
out the coefficient of variation.
