Statistical Data
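The arithmetic mean of n measurements is
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
For the measurements 2, 9, 11, 5, 6:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{2 + 9 + 11 + 5 + 6}{5} = \frac{33}{5} = 6.6 \]
For the measurements 3.35, 3.37, 3.28, 3.34, 3.30:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{16.64}{5} \approx 3.33 \]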
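The mean calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` and the variable names are illustrative assumptions, and the data are the two sets of measurements from the worked examples.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Measurements from the two worked examples.
first_sample = [2, 9, 11, 5, 6]
second_sample = [3.35, 3.37, 3.28, 3.34, 3.30]

first_mean = arithmetic_mean(first_sample)
second_mean = arithmetic_mean(second_sample)

print(first_mean)
print(round(second_mean, 2))
```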
Weighted Mean
The weighted mean of the positive real numbers
$x_1, x_2, \ldots, x_n$ with weights $w_1, w_2, \ldots, w_n$ is defined
to be
\[ \bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]
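A short sketch of the weighted mean in Python follows. The function name `weighted_mean` and the scores and weights are hypothetical values chosen only for illustration; they do not come from the slides.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical data: three scores with weights 2, 3 and 5.
scores = [80, 90, 70]
score_weights = [2, 3, 5]

result = weighted_mean(scores, score_weights)
print(result)  # (160 + 270 + 350) / 10
```

With equal weights the function reduces to the ordinary arithmetic mean.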
Example
Geometric Mean
The geometric mean is defined as the positive n-th root of the
product of the n observations. Symbolically,
\[ GM = (x_1 x_2 x_3 \cdots x_n)^{1/n} \]
Find the geometric mean of the following rates of growth: 34, 27, 45, 55, 22, 34
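One way to work this exercise is sketched below. Averaging the logarithms and exponentiating is mathematically the same as taking the n-th root of the product, and it avoids very large intermediate products; the function name `geometric_mean` is an illustrative choice.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed as exp(mean of the logs)."""
    log_sum = sum(math.log(x) for x in values)
    return math.exp(log_sum / len(values))


growth_rates = [34, 27, 45, 55, 22, 34]
gm = geometric_mean(growth_rates)
print(round(gm, 2))
```

The result lies between 34 and 35, slightly below the arithmetic mean of the same rates.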
Harmonic Mean
The harmonic mean is the number of observations divided
by the sum of the reciprocals of the observations.
\[ HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
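The definition translates directly into code. The sketch below uses two hypothetical values, 40 and 60, which are not from the slides; the function name `harmonic_mean` is likewise an assumption.

```python
def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)


# Hypothetical data: two positive observations.
observations = [40, 60]
hm = harmonic_mean(observations)
print(round(hm, 2))  # 2 / (1/40 + 1/60)
```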
[Figure: relative frequency histogram of quarts (0 to 5) for n = 25 measurements, with the median (m = 2) and the mode (2) marked.]
Extreme Values
The mean is more easily affected by extremely
large or small values than the median.
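A small sketch illustrates the point. It uses the petal counts 5, 12, 6, 8, 14 and a modified copy in which the largest value is replaced by 140; that modified copy is a hypothetical data set introduced only for this comparison.

```python
from statistics import mean, median

petals = [5, 12, 6, 8, 14]
# Hypothetical variant: the largest value is replaced by an extreme one.
petals_with_extreme = [5, 12, 6, 8, 140]

print(mean(petals), median(petals))
print(mean(petals_with_extreme), median(petals_with_extreme))
```

The mean moves from 9 to 34.2, while the median stays at 8 in both cases.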
Range
Difference between maximum and minimum values
Interquartile Range
[Figure: two data sets, one labeled "Variability" and one labeled "No Variability".]
The Range
• The range, R, of a set of n measurements is the
difference between the largest and smallest
measurements.
• Example: A botanist records the number of
petals on 5 flowers:
5, 12, 6, 8, 14
• The range is R = 14 – 5 = 9.
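Quartiles
The quartiles Q1, Q2, and Q3 divide the ordered measurements into four equal parts.
Interquartile Range:
IQR = Q3 – Q1
• The position of the p-th percentile in the ordered data is (p/100)(n + 1).
Example
The prices ($) of 18 brands of walking shoes:
40 60 65 65 65 68 68 70 70
70 70 70 70 74 75 75 90 95
• The position of Q1 (the 25th percentile) is 0.25(18 + 1) = 4.75, so Q1 is 3/4 of the way between the 4th and 5th ordered
measurements, or Q1 = 65 + 0.75(65 – 65) = 65.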
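The (n + 1) position rule can be sketched in Python for the walking-shoe prices. The helper name `percentile_by_position` is an assumption, and the handling of positions that fall outside the data (clamping to the smallest or largest value) is a choice made here, not something the slides specify.

```python
def percentile_by_position(sorted_values, p):
    """Value at 1-based position (p/100)(n + 1), interpolating linearly."""
    n = len(sorted_values)
    position = p / 100 * (n + 1)
    lower = int(position)          # whole part of the position
    fraction = position - lower    # how far toward the next measurement
    if lower < 1:                  # assumption: clamp below the first value
        return sorted_values[0]
    if lower >= n:                 # assumption: clamp above the last value
        return sorted_values[-1]
    low_value = sorted_values[lower - 1]
    high_value = sorted_values[lower]
    return low_value + fraction * (high_value - low_value)


prices = sorted([40, 60, 65, 65, 65, 68, 68, 70, 70,
                 70, 70, 70, 70, 74, 75, 75, 90, 95])

q1 = percentile_by_position(prices, 25)
q3 = percentile_by_position(prices, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Q3 sits at position 0.75(19) = 14.25, one quarter of the way between the 14th and 15th ordered prices (74 and 75). Library routines often use other interpolation conventions, so their quartiles may differ slightly from this rule.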
For the petal data, $\bar{x} = \frac{45}{5} = 9$.
[Figure: the petal counts plotted on a number line from 4 to 14.]
The Variance
• The variance of a population of N measurements
is the average of the squared deviations of the
measurements about their mean $\mu$.
\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \]
• The variance of a sample of n measurements is the sum
of the squared deviations of the measurements about their
mean, divided by (n – 1).
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]
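The two definitions differ only in the divisor, which the following sketch makes explicit using the petal counts 5, 12, 6, 8, 14. The function names are illustrative assumptions.

```python
def population_variance(values):
    """Average squared deviation about the mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations about the mean, divided by (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


petals = [5, 12, 6, 8, 14]
sigma_squared = population_variance(petals)  # 60 / 5
s_squared = sample_variance(petals)          # 60 / 4
print(sigma_squared, s_squared)
```

Treating the five counts as a whole population gives 12, while treating them as a sample gives 15.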
The Standard Deviation
• In calculating the variance, we squared all of
the deviations, and in doing so changed the
scale of the measurements.
• To return this measure of variability to the
original units of measure, we calculate the
standard deviation, the positive square root of
the variance.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$
$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
5         -4                  16
12        3                   9
6         -3                  9
8         -1                  1
14        5                   25
Sum 45    0                   60

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{60}{4} = 15 \]
\[ s = \sqrt{s^2} = \sqrt{15} = 3.87 \]
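Two Ways to Calculate the Sample Variance
Use the calculation formula:
\[ s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} \]

$x_i$     $x_i^2$
5         25
12        144
6         36
8         64
14        196
Sum 45    465

\[ s^2 = \frac{465 - \frac{45^2}{5}}{4} = \frac{465 - 405}{4} = 15 \]
\[ s = \sqrt{s^2} = \sqrt{15} = 3.87 \]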
Example- ungrouped data
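Sample: The moisture contents (%) of six kraft paper specimens are
6.7, 6.0, 6.4, 6.4, 5.9, and 5.8.
Here $\sum x_i = 37.2$ and $\sum x_i^2 = 231.26$, so
\[ s = \sqrt{\frac{231.26 - \frac{(37.2)^2}{6}}{6 - 1}} = \sqrt{\frac{0.62}{5}} \approx 0.35 \]
Sample standard deviation, s = 0.35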
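The calculation formula can be sketched as follows and applied to both the petal counts and the moisture data. The function name `sample_variance_shortcut` is an assumption; `statistics.stdev` from the Python standard library is used only as a cross-check.

```python
import math
from statistics import stdev


def sample_variance_shortcut(values):
    """(sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)


petals = [5, 12, 6, 8, 14]
moisture = [6.7, 6.0, 6.4, 6.4, 5.9, 5.8]

petal_variance = sample_variance_shortcut(petals)
moisture_sd = math.sqrt(sample_variance_shortcut(moisture))

print(petal_variance)
print(round(moisture_sd, 2))
print(round(stdev(moisture), 2))  # library cross-check
```

The shortcut subtracts two nearly equal quantities, so for data with a large mean and a small spread the definition-based formula is usually the numerically safer choice.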
Using Measures of Center and Spread:
The Empirical Rule
Given a distribution of measurements
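\[ z\text{-score} = \frac{x - \bar{x}}{s} \]
Suppose s = 2 and $\bar{x} = 5$. Then x = 9 is $x - \bar{x} = 4$ units above the mean, so
\[ z = \frac{9 - 5}{2} = 2 \]
x = 9 lies z = 2 standard deviations from the mean.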
z-Scores
• z-scores between –2 and 2 are not unusual. z-scores
should rarely exceed 3 in absolute value; a z-score
larger than 3 in absolute value indicates a
possible outlier.
[Figure: z-score axis from -3 to 3, with a region labeled "Somewhat unusual".]
Example of z-Scores
Data set 1              Data set 2
X     z-Score           X     z-Score
10    -1.28244          10    -0.29204
15    0.625954          500   3.473714
10    -1.28244          10    -0.29204
16    1.007634          16    -0.24593
11    -0.90076          11    -0.28435
17    1.389313          17    -0.23824
14    0.244275          14    -0.2613
13    -0.1374           13    -0.26898
10    -1.28244          10    -0.29204
16    1.007634          16    -0.24593
11    -0.90076          11    -0.28435
17    1.389313          17    -0.23824
14    0.244275          14    -0.2613
13    -0.1374           13    -0.26898
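The z-scores in the table can be reproduced approximately with the sketch below, which assumes the sample standard deviation (divisor n – 1) is used. The function name `z_scores` and the cutoff constant are illustrative; small differences from the table in the third decimal place are consistent with the table having been computed from a rounded mean and standard deviation.

```python
from statistics import mean, stdev


def z_scores(values):
    """(x - mean) / sample standard deviation for every measurement."""
    x_bar = mean(values)
    s = stdev(values)
    return [(x - x_bar) / s for x in values]


first_set = [10, 15, 10, 16, 11, 17, 14, 13, 10, 16, 11, 17, 14, 13]
second_set = [10, 500, 10, 16, 11, 17, 14, 13, 10, 16, 11, 17, 14, 13]

OUTLIER_CUTOFF = 3  # |z| larger than 3 suggests a possible outlier

first_z = z_scores(first_set)
second_z = z_scores(second_set)

flagged = [x for x, z in zip(second_set, second_z) if abs(z) > OUTLIER_CUTOFF]
print(flagged)
```

Only the value 500 in the second data set is flagged. It also inflates the mean and standard deviation so much that every other z-score in that set is pulled close to zero.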
Coefficient of Variation(CV)
When comparing between data sets with different units
or widely different means, one should use the
coefficient of variation for comparison instead of the
standard deviation.
The coefficient of variation can be written as
\[ CV = \frac{s}{\bar{x}} \]
We express CV as a percentage by multiplying by 100.
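As a rough illustration, the sketch below compares two data sets measured in different units: the petal counts 5, 12, 6, 8, 14 and the kraft paper moisture percentages. The function name `coefficient_of_variation` is an assumption, and the sample standard deviation is used for s.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100


petals = [5, 12, 6, 8, 14]                      # counts
moisture = [6.7, 6.0, 6.4, 6.4, 5.9, 5.8]       # percent moisture

petal_cv = coefficient_of_variation(petals)
moisture_cv = coefficient_of_variation(moisture)
print(round(petal_cv, 1), round(moisture_cv, 1))
```

Relative to its mean, the petal data are far more variable than the moisture data, a comparison the raw standard deviations (3.87 petals versus 0.35 percentage points) cannot make directly.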
Example: Page 181
Skewness
Skewness measures the degree of asymmetry exhibited
by the data.
Mathematically,
\[ \text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3} \]
Kurtosis
Kurtosis measures the peakedness of a distribution.
Mathematically,
\[ \text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} \]
Leptokurtic: high and thin
Mesokurtic: normal in shape
Platykurtic: flat and spread out
[Figure: leptokurtic, mesokurtic, and platykurtic curves.]
Skewness and Kurtosis
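Both measures can be sketched directly from their formulas. The code below assumes that s is the sample standard deviation (divisor n – 1), which the slides do not state explicitly; the function names and the symmetric example data are illustrative.

```python
from statistics import mean, stdev


def skewness(values):
    """Sum of cubed deviations divided by n * s^3."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return sum((x - x_bar) ** 3 for x in values) / (n * s ** 3)


def kurtosis(values):
    """Sum of fourth-power deviations divided by n * s^4."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return sum((x - x_bar) ** 4 for x in values) / (n * s ** 4)


symmetric = [1, 2, 3, 4, 5]   # hypothetical, perfectly symmetric data
petals = [5, 12, 6, 8, 14]

print(skewness(symmetric))
print(round(skewness(petals), 2), round(kurtosis(petals), 3))
```

Symmetric data give a skewness of zero, and the petal counts give a small positive value, indicating a slightly longer right tail. Statistical packages often apply different small-sample corrections, so their values may not match these formulas exactly.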