Measure of Variation

Measures of variation (dispersion) describe how spread out or concentrated data values are around a central tendency like the mean. The document discusses several measures of variation including range, variance, standard deviation, and coefficient of variation. It also covers measures of position such as z-scores, percentiles, quartiles, and outliers which identify where a particular data point sits within the overall data distribution. Understanding measures of variation and position is important for applications like quality control and measuring economic disparities.

Uploaded by

Nighi Zara
Copyright
© Attribution Non-Commercial (BY-NC)

Measures of Variation

(Dispersion)
Variation (Dispersion)

• Variation describes how the observations in a data set are spread about an average value (mean, median, or mode).
• If the observations lie close to the average, the variation is small; if they are spread far from the centre, the variation is large.
Explanation

Suppose three groups of students obtained the following marks in a test:

Group A: 46 48 50 52 54    (mean = 50)
Group B: 30 40 50 60 70    (mean = 50)
Group C: 40 50 60 70 80    (mean = 60)
Explanation

Groups A and B have the same mean, but in group A the observations are concentrated near the centre, while in group B they are not close to the centre (mean): one observation is as small as 30 and another is as large as 70. Groups B and C have the same variation but different means. Groups A and C differ in both variation and mean.
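The comparison above can be checked numerically. The following is a minimal sketch that uses the distance between the highest and lowest mark as a rough stand-in for "variation"; the helper names `mean` and `spread` are illustrative and do not come from the slides.

```python
# Marks for the three groups from the slide.
groups = {
    "A": [46, 48, 50, 52, 54],
    "B": [30, 40, 50, 60, 70],
    "C": [40, 50, 60, 70, 80],
}

def mean(marks):
    """Arithmetic mean of the marks."""
    return sum(marks) / len(marks)

def spread(marks):
    """Distance between the highest and the lowest mark."""
    return max(marks) - min(marks)

# Map each group name to (mean, spread).
summary = {name: (mean(marks), spread(marks)) for name, marks in groups.items()}

for name, (m, s) in summary.items():
    print(f"Group {name}: mean = {m}, spread = {s}")
```

Groups A and B share a mean of 50 but differ in spread (8 versus 40), while groups B and C share a spread of 40 but differ in mean.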
IMPORTANCE

(1) Measures of variation help us keep the wages of factory workers consistent, which contributes to worker satisfaction.
(2) Measures of variation let us quantify economic disparity.
(3) Variance is important for making predictions about commodity prices, the standard of living of different peoples, and the distribution of wealth, land, and so on.
How Can We Measure Variability?

The following measures are used to quantify variation:
– Range
– Variance
– Standard Deviation
– Coefficient of Variation
(1) RANGE

• The range is the difference between the maximum (Xmax) and minimum (Xmin) values in a data set, i.e.,
    Range = Xmax - Xmin
• For grouped data:
    Range = (upper boundary of the highest class) - (lower boundary of the lowest class)
Application

(1) The range is the simplest measure of dispersion.
(2) It is applied in quality control methods, which are used to maintain the quality of products made in factories: the quality of a product must be kept within a certain range of values.
EXAMPLE

Two experimental brands of outdoor paint are tested to see how long each will last before fading. Six cans of each brand constitute a small population. The results (in months) are shown. Find the mean and range of each brand.
EXAMPLE

Brand A   Brand B
   10        35
   60        45
   50        30
   30        35
   40        40
   20        25

Brand A:
$$\mu = \frac{\sum X}{N} = \frac{210}{6} = 35, \qquad R = 60 - 10 = 50$$

Brand B:
$$\mu = \frac{\sum X}{N} = \frac{210}{6} = 35, \qquad R = 45 - 25 = 20$$
EXAMPLE

• The mean is the same for both brands, but the range for Brand A is much greater than the range for Brand B.
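As a quick check of the paint example, the mean and range can be computed directly. This is a small sketch; the function names `mean` and `value_range` are illustrative choices, not part of the slides.

```python
# Lifetimes in months for the six cans of each paint brand.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(name, "mean =", mean(data), "range =", value_range(data))
```

Both brands average 35 months, but Brand A has a range of 50 months against 20 for Brand B.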
(2) Variance & Standard Deviation

• The variance is the average of the squared distances of each value from the mean.
• The standard deviation is the square root of the variance.
• The standard deviation measures how spread out the data are.
Uses of the Variance and Standard Deviation

• To determine the spread of the data.
• To determine the consistency of a variable.
• To determine the number of data values that fall within a specified interval in a distribution.
• They are used in inferential statistics.
FORMULAS

• The population variance is
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
• The population standard deviation is
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
EXAMPLE

Find the variance and standard deviation for the Brand A paint data set:
10, 60, 50, 30, 40, 20.
Solution

Months, X    µ    X - µ    (X - µ)²
   10       35     -25       625
   60       35      25       625
   50       35      15       225
   30       35      -5        25
   40       35       5        25
   20       35     -15       225
                   Total    1750

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} \approx 291.7$$
$$\sigma = \sqrt{\frac{1750}{6}} \approx 17.1$$
Variance & Standard Deviation
(Sample Theoretical Model)

The sample variance is
$$s^2 = \frac{n \sum X^2 - \left(\sum X\right)^2}{n(n-1)}$$
The sample standard deviation is
$$s = \sqrt{s^2}$$
EXAMPLE

Find the variance and standard deviation for the amount of European auto sales for a sample of 6 years. The data are in millions of dollars:
11.2, 11.9, 12.0, 12.8, 13.4, 14.3
Solution

   X        X²
 11.2     125.44
 11.9     141.61
 12.0     144.00
 12.8     163.84
 13.4     179.56
 14.3     204.49
 75.6     958.94   (column totals: ΣX and ΣX²)

$$s^2 = \frac{n \sum X^2 - \left(\sum X\right)^2}{n(n-1)} = \frac{6(958.94) - (75.6)^2}{6(5)} \approx 1.28$$
$$s = \sqrt{1.28} \approx 1.13$$
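A short sketch of the shortcut formula follows, applied to the auto sales sample. The function name `sample_variance_shortcut` is illustrative and not from the slides.

```python
import math

def sample_variance_shortcut(values):
    """Sample variance via [n*sum(X^2) - (sum X)^2] / [n*(n - 1)]."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]  # millions of dollars

s2 = sample_variance_shortcut(sales)
s = math.sqrt(s2)

print(round(s2, 2))  # 1.28
print(round(s, 2))   # 1.13
```

The shortcut form is algebraically the same as summing squared deviations from the sample mean and dividing by n - 1; it only avoids computing the mean first.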
Coefficient of Variation

• The coefficient of variation is the standard deviation divided by the mean, expressed as a percentage:
$$\text{CVar} = \frac{s}{\bar{X}} \cdot 100\%$$
• Use CVar to compare standard deviations when the units are different.
Example

The mean number of car sales over a 3-month period is 87, and the standard deviation is 5. The mean of the commissions is $5225, and the standard deviation is $773. Compare the variations of the two.
Solution

Sales:
$$\text{CVar} = \frac{5}{87} \cdot 100\% \approx 5.7\%$$

Commissions:
$$\text{CVar} = \frac{773}{5225} \cdot 100\% \approx 14.8\%$$

Commissions are more variable than sales.
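The comparison can be reproduced with a one-line helper. This is a minimal sketch; `coefficient_of_variation` is an illustrative name.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_sales = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(773, 5225)

print(round(cv_sales, 1))        # 5.7
print(round(cv_commissions, 1))  # 14.8
```

Because the result is a unit-free percentage, a count of cars and an amount in dollars can be compared directly.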


Practice problems

Exercise 3-3

Page # 126, Q # 10, 11, 12.
Page # 128, Q # 28-31.
Measures of Position

• Measures of position identify where a data value sits within a data set, using measures such as percentiles, deciles, and quartiles.
How Can We Measure Position?

• Standard scores (z-scores)
• Percentiles
• Quartiles
• Outliers
Standard scores OR Z-scores
• A z-score, or standard score, for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation:
$$z = \frac{X - \bar{X}}{s} \ \text{(sample)} \qquad z = \frac{X - \mu}{\sigma} \ \text{(population)}$$
• The z-score represents the number of standard deviations that a data value falls above or below the mean.
EXAMPLE

A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; he scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare his relative positions on the two tests.
Solution

Calculus:
$$z = \frac{X - \bar{X}}{s} = \frac{65 - 50}{10} = 1.5$$

History:
$$z = \frac{X - \bar{X}}{s} = \frac{30 - 25}{5} = 1.0$$

Since his z-score in calculus is higher, he has a higher relative position in the calculus class than in the history class.
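The two z-scores can be checked with a small helper. This is a sketch only; the name `z_score` is illustrative.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

z_calculus = z_score(65, 50, 10)
z_history = z_score(30, 25, 5)

print(z_calculus)  # 1.5
print(z_history)   # 1.0
```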
NOTE

• If the z-score is positive, the score is above the mean. If the z-score is zero, the score is the same as the mean. If the z-score is negative, the score is below the mean.
Percentiles

• Percentiles separate the data set into 100 equal groups.
• The percentile rank of a datum represents the percentage of data values below that datum.
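FORMULA

$$\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100\%$$

The position c of the value corresponding to a given percentile p, in a data set of n values, is
$$c = \frac{n \cdot p}{100}$$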
Example 1

A teacher gives a 20-point test to 10 students. Find the percentile rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
Solution

Sort in ascending order:
2, 3, 5, 6, 8, 10, 12, 15, 18, 20

Six values lie below 12, so
$$\text{Percentile} = \frac{6 + 0.5}{10} \cdot 100\% = 65\%$$

A student whose score was 12 did better than 65% of the class.
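The percentile rank formula translates directly into code. The following is a minimal sketch; `percentile_rank` is an illustrative name.

```python
def percentile_rank(values, x):
    """(number of values below x + 0.5) / total number of values, as a percentage."""
    below = sum(1 for v in values if v < x)
    return 100 * (below + 0.5) / len(values)

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
rank_12 = percentile_rank(scores, 12)

print(rank_12)  # 65.0
```

Sorting is not needed here, because counting the values below the score does not depend on their order.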
EXAMPLE 2

A teacher gives a 20-point test to 10 students. Find the value corresponding to the 25th percentile.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
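SOLUTION

Sort in ascending order:
2, 3, 5, 6, 8, 10, 12, 15, 18, 20

$$c = \frac{n \cdot p}{100} = \frac{10 \cdot 25}{100} = 2.5 \approx 3$$

Since c is not a whole number, round it up to 3. The third value in the ordered list is 5, so the value 5 corresponds to the 25th percentile.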
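A sketch of this lookup is given below. The slides only show the case where c is not a whole number (round up). The handling of a whole-number c, which averages the c-th and (c+1)-th values, is an assumption borrowed from a common textbook convention and is not stated in the slides. The name `value_at_percentile` is illustrative.

```python
import math

def value_at_percentile(values, p):
    """Value at the p-th percentile using the position rule c = n*p/100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    c = n * p / 100
    if c != int(c):
        # Not a whole number: round up and take the value at that position.
        position = math.ceil(c)
        return ordered[position - 1]
    # Assumption (not in the slides): for a whole-number c,
    # average the c-th and (c+1)-th ordered values.
    c = int(c)
    return (ordered[c - 1] + ordered[c]) / 2

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
p25 = value_at_percentile(scores, 25)

print(p25)  # 5
```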
Quartile
• Quartiles separate the data set into 4 equal groups: Q1 = P25, Q2 = MD (the median), Q3 = P75.

   25%    |    25%    |    25%    |    25%
          Q1          Q2          Q3
PROCEDURE

• Arrange the data in order from lowest to highest.
• Find the median of the data values. This is the value of Q2.
• Find the median of the data values that fall below Q2. This is the value of Q1.
• Find the median of the data values that fall above Q2. This is the value of Q3.
EXAMPLE

Find Q1, Q2, and Q3 for the data set
15, 13, 6, 5, 12, 50, 22, 18.

Solution:
Step 1: Arrange the data in order: 5, 6, 12, 13, 15, 18, 22, 50

Step 2: Find the median (Q2). With eight values, the median is the average of the two middle values, 13 and 15:
$$Q_2 = \frac{13 + 15}{2} = 14$$
EXAMPLE

Step 3: Find the median of the data values less than 14.
5, 6, 12, 13
$$Q_1 = \frac{6 + 12}{2} = 9$$
EXAMPLE

Step 4: Find the median of the data values greater than 14.
15, 18, 22, 50
$$Q_3 = \frac{18 + 22}{2} = 20$$
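The four steps can be sketched as follows, using the median-of-halves procedure described in the slides. The function name `quartiles_by_halves` is illustrative. Other quartile conventions exist and may give different values for the same data, and this sketch assumes there is at least one value on each side of the median.

```python
import statistics

def quartiles_by_halves(values):
    """Q1, Q2, Q3 via the median of the values below and above the median."""
    ordered = sorted(values)                    # Step 1: order the data
    q2 = statistics.median(ordered)             # Step 2: median of all values
    lower = [v for v in ordered if v < q2]      # values below Q2
    upper = [v for v in ordered if v > q2]      # values above Q2
    q1 = statistics.median(lower)               # Step 3: median of lower part
    q3 = statistics.median(upper)               # Step 4: median of upper part
    return q1, q2, q3

data = [15, 13, 6, 5, 12, 50, 22, 18]
q1, q2, q3 = quartiles_by_halves(data)

print(q1, q2, q3)  # 9.0 14.0 20.0
```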
OUTLIER

• An outlier is an extremely high or extremely low data value when compared with the rest of the data values.
• The interquartile range is IQR = Q3 – Q1.
Procedure to find an OUTLIER

• A data value less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR) can be considered an outlier.
EXAMPLE

• Check the following data set for outliers.
5, 6, 12, 13, 15, 18, 22, 50

Solution: We have found Q1 = 9 and Q3 = 20.
Interquartile range (IQR) = Q3 – Q1 = 20 – 9 = 11
Q1 – 1.5(IQR) = 9 – 1.5(11) = -7.5
Q3 + 1.5(IQR) = 20 + 1.5(11) = 36.5
Now check the data set for any values that fall outside the interval [-7.5, 36.5]. The only such value is 50, so it is considered an outlier.
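The outlier check can be sketched as below, taking Q1 = 9 and Q3 = 20 from the quartile example as given inputs. The function name `find_outliers` is illustrative.

```python
def find_outliers(values, q1, q3):
    """Return the lower fence, upper fence, and values outside the fences."""
    iqr = q3 - q1
    low = q1 - 1.5 * iqr    # lower fence
    high = q3 + 1.5 * iqr   # upper fence
    outliers = [v for v in values if v < low or v > high]
    return low, high, outliers

data = [5, 6, 12, 13, 15, 18, 22, 50]
low, high, outliers = find_outliers(data, q1=9, q3=20)

print(low, high)  # -7.5 36.5
print(outliers)   # [50]
```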
Practice Problems

• Page # 141: Q#10, Q#12, Q#14, Q#15.
• Page # 142: Q#22, Q#24, Q#26, Q#28.
SKEWNESS

• Skewness measures the degree of asymmetry exhibited by the data:
$$\text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$$
NOTE

• The histogram is an effective graphical technique for showing both the skewness and the kurtosis of a data set.
Positive skewness vs negative skewness

• Positive skewness
– There are more observations below the mean than above it
– The mean is greater than the median
• Negative skewness
– There are a small number of low observations and a large number of high ones
– The median is greater than the mean
Positive skewness vs negative skewness

• If the skewness equals zero, the histogram is symmetric about the mean.
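The skewness formula can be sketched in code as follows. The slides do not say which standard deviation s denotes here; this sketch assumes the sample standard deviation (n - 1 in the denominator), matching how s is defined earlier in the deck. The function names are illustrative.

```python
import math

def sample_std(values):
    """Sample standard deviation s (assumed: n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def skewness(values):
    """Sum of cubed deviations from the mean, divided by n * s**3."""
    n = len(values)
    mean = sum(values) / n
    s = sample_std(values)
    return sum((x - mean) ** 3 for x in values) / (n * s ** 3)

symmetric = [46, 48, 50, 52, 54]               # Group A marks
right_tailed = [5, 6, 12, 13, 15, 18, 22, 50]  # data set from the outlier example

print(skewness(symmetric))         # 0.0 for perfectly symmetric data
print(skewness(right_tailed) > 0)  # True: a long tail of high values
```

For the second data set the mean (17.625) exceeds the median (14), which agrees with the positive sign of the skewness.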
Kurtosis

• Kurtosis measures how peaked the histogram is:
$$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} - 3$$
• With this definition, the kurtosis of a normal distribution is 0.
• Kurtosis characterizes the relative peakedness or flatness of a distribution compared to the normal distribution.
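A sketch of the kurtosis formula follows, again assuming that s is the sample standard deviation (n - 1 in the denominator), which the slides do not state explicitly. The function name `kurtosis` is illustrative.

```python
import math

def kurtosis(values):
    """Sum of fourth-power deviations divided by n * s**4, minus 3."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance s**2 (assumed: n - 1 in the denominator).
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)
    fourth = sum((x - mean) ** 4 for x in values)
    return fourth / (n * s2 ** 2) - 3

group_b = [30, 40, 50, 60, 70]  # evenly spread marks from Group B

k = kurtosis(group_b)
print(round(k, 3))  # -1.912
```

The negative value indicates that these evenly spread marks are flatter than a normal distribution, for which this measure is 0.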
