CHAPTER 3
NUMERICAL DESCRIPTIVE MEASURES
OBJECTIVES
In this chapter, you learn to:
• Describe the properties of central tendency, variation, and shape in
numerical variables.
• Construct and interpret a boxplot.
• Compute descriptive summary measures for a population.
• Calculate the covariance and the coefficient of correlation.
1/15/2024
SUMMARY DEFINITIONS
• The central tendency is the extent to which the values of a numerical
variable group around a typical or central value.
• The shape is the pattern of the distribution of values from the lowest
value to the highest value.
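• Sample mean (arithmetic mean):
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
where $n$ is the sample size and the $X_i$ are the observed values.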
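Example: the mean is affected by extreme values.
$$\frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13 \qquad \frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14$$
[Figure: two dot plots on an 11-to-20 scale, marked Mean = 13 and Mean = 14.]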
• In an ordered array, the median is the “middle” number (50% above, 50%
below).
[Figure: two dot plots on an 11-to-20 scale, both marked Median = 13.]
• Less sensitive than the mean to extreme values.
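The contrast between the mean and the median can be checked with a short Python sketch. It assumes the two five-value data sets implied by the sums 65/5 and 70/5 in the mean example, and uses only the standard library.

```python
from statistics import mean, median

base = [11, 12, 13, 14, 15]
with_extreme = [11, 12, 13, 14, 20]  # same data, largest value replaced by 20

# The mean shifts when one value becomes extreme; the median does not.
print(mean(base), median(base))                  # 13 13
print(mean(with_extreme), median(with_extreme))  # 14 13
```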
[Figure: two dot plots. Left, on a 0-to-14 scale: Mode = 9. Right, on a 0-to-6 scale: No Mode.]
• Geometric mean:
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$
• Geometric mean rate of return
• Measures the status of an investment over time.
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
• Where $R_i$ is the rate of return in time period $i$.
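• Arithmetic mean rate of return (misleading result):
$$\bar{X} = \frac{(-.5) + (1)}{2} = .25 = 25\%$$
• Geometric mean rate of return (more representative result):
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 = [(1 + (-.5)) \times (1 + (1))]^{1/2} - 1 = [(.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\%$$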
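The two rates of return in this example (-50% followed by +100%) can be reproduced with a small Python sketch. The variable names are illustrative, and `math.prod` assumes Python 3.8 or later.

```python
import math

returns = [-0.5, 1.0]  # -50% in period 1, +100% in period 2

# Arithmetic mean rate of return: simple average of the period returns.
arithmetic = sum(returns) / len(returns)

# Geometric mean rate of return: n-th root of the compounded growth, minus 1.
growth = math.prod(1 + r for r in returns)
geometric = growth ** (1 / len(returns)) - 1

print(f"{arithmetic:.0%}")  # 25%
print(f"{geometric:.0%}")   # 0%
```

The compounded growth factor is 0.5 × 2 = 1, so the value is unchanged after two periods. The geometric mean reports this as 0%, while the arithmetic mean reports 25%.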
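Central Tendency
• Arithmetic mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
• Median: middle value in the ordered array.
• Mode: most frequently observed value.
• Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$, the rate of change of a variable over time.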
MEASURES OF VARIATION
[Figure: two distributions with the same center but different variation.]
Example (range = largest value - smallest value):
[Figure: dot plot on a 0-to-14 scale; the smallest value is 1 and the largest is 13.]
Range = 13 - 1 = 12
[Figure: two dot plots on a 7-to-12 scale whose values are distributed differently; both have Range = 12 - 7 = 5.]
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
• Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
• Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
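Sample data ($X_i$): 10 12 14 15 17 18 18 24 (n = 8, mean = $\bar{X}$ = 16)
$$S = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$
A measure of the “average” scatter around the mean.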
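The calculation for this sample can be traced step by step in Python. This is a minimal sketch with illustrative variable names, following the sample (n - 1) formulas.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                         # sample mean: 16.0
sum_sq = sum((x - x_bar) ** 2 for x in data)  # sum of squared deviations: 130.0
variance = sum_sq / (n - 1)                   # sample variance S^2 = 130 / 7
std_dev = math.sqrt(variance)                 # sample standard deviation S

print(round(std_dev, 4))  # 4.3095
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 denominator and should give matching results.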
[Figure: dot plot of Data A on an 11-to-21 scale; Mean = 15.5, S = 3.338.]
• The more the data are spread out, the greater the range, variance, and standard
deviation.
• The more the data are concentrated, the smaller the range, variance, and
standard deviation.
• If the values are all the same (no variation), all these measures will be zero.
• None of these measures are ever negative.
• Coefficient of variation:
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
MEASURES OF VARIATION:
COMPARING COEFFICIENTS OF VARIATION
• Stock A:
• Mean price last year = $50.
• Standard deviation = $5.
$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
• Stock B:
• Mean price last year = $100.
• Standard deviation = $5.
$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its mean price.
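The stock comparison can be expressed as a small helper function. The name `coefficient_of_variation` is an illustrative choice, not a library function.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage: (S / X-bar) * 100."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A: S = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: S = $5, mean = $100

print(cv_a, cv_b)  # 10.0 5.0
```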
• Z-score:
$$Z = \frac{X - \bar{X}}{S}$$
where $X$ represents the data value,
$\bar{X}$ is the sample mean, and
$S$ is the sample standard deviation.
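As a hedged illustration, the Z-score formula can be applied to every value in a sample. The data below reuse the eight-value sample from the standard deviation example earlier in the chapter. That choice is an assumption made for illustration, not part of the original Z-score slide.

```python
from statistics import mean, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]  # assumed illustration data

x_bar = mean(data)  # sample mean
s = stdev(data)     # sample standard deviation (n - 1 denominator)

# Z-score: how many standard deviations each value lies from the mean.
z_scores = [(x - x_bar) / s for x in data]

print(round(z_scores[-1], 2))  # 1.86 (Z-score of the value 24)
```

Values below the mean get negative Z-scores and values above it get positive ones, so the Z-scores of a sample sum to zero up to rounding.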
SHAPE OF A DISTRIBUTION
• Describes how data are distributed.
• Kurtosis:
• Kurtosis measures the peakedness of the curve of the distribution—that is, how sharply
the curve rises approaching the center of the distribution.
• Sharper peak than bell-shaped: kurtosis > 0.
• Bell-shaped: kurtosis = 0.
• Flatter than bell-shaped: kurtosis < 0.
• The quartiles.
• Constructing a boxplot.
QUARTILE MEASURES
• Quartiles split the ranked data into 4 segments with an equal number of values per
segment.
[Figure: ranked data divided into four equal segments by Q1, Q2, and Q3.]
The first quartile, Q1, is the value for which 25% of the
values are smaller and 75% are larger.
Q2 is the same as the median (50% of the values are
smaller and 50% are larger).
Only 25% of the values are greater than the third quartile.
Example (n = 9): Q1 is at position (n + 1)/4 = (9 + 1)/4 = 2.5 of the ranked data,
halfway between the 2nd and 3rd ranked values (12 and 13), so Q1 = (12 + 13)/2 = 12.5.
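The position rule in this example can be sketched in Python. The slide's data values are not reproduced in the text, so the nine-value array below is hypothetical, chosen so that its 2nd and 3rd ranked values are 12 and 13. The handling of positions that are neither whole nor halfway (round to the nearest position) is an assumed convention. The function also assumes n is at least 3.

```python
def quartile(ranked, q):
    """Return quartile q (1, 2, or 3) using the q(n + 1)/4 position rule."""
    n = len(ranked)
    pos = q * (n + 1) / 4          # 1-based position in the ranked data
    if pos == int(pos):            # whole-number position: take that value
        return ranked[int(pos) - 1]
    if pos * 2 == int(pos * 2):    # halfway position: average the two neighbors
        lower = ranked[int(pos) - 1]
        upper = ranked[int(pos)]
        return (lower + upper) / 2
    return ranked[round(pos) - 1]  # assumed convention: nearest position

data = sorted([11, 12, 13, 16, 16, 17, 18, 21, 22])  # hypothetical data, n = 9

print(quartile(data, 1))  # 12.5
print(quartile(data, 2))  # 16
print(quartile(data, 3))  # 19.5
```

Statistical software often uses other interpolation rules, so library functions such as `statistics.quantiles` may return slightly different quartiles for the same data.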
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Interquartile range = Q3 – Q1 = 57 – 30 = 27
• The five numbers that help describe the center, spread and shape
of data are:
• Xsmallest.
• First Quartile (Q1).
• Median (Q2).
• Third Quartile (Q3).
• Xlargest.
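[Table: relationships among the five-number summary values for left-skewed (>), symmetric (≈), and right-skewed (<) distributions.]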
Example:
• If data are symmetric around the median then the box and central line are
centered between the endpoints.
[Figure: three boxplots, each marked with Q1, Q2, and Q3.]
BOXPLOT EXAMPLE
[Figure: boxplot with Xsmallest = 0, Q1 = 2, Median = 3, Q3 = 5, Xlargest = 27.]
• The data are right skewed, as the plot depicts.
• Population mean:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Where μ = population mean
N = population size
$X_i$ = ith value of the variable X
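• Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
• Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$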
• In a bell-shaped distribution, approximately 68% of the values fall within µ ± 1σ.
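• In a bell-shaped distribution, approximately 95% of the values fall within μ ± 2σ, and approximately 99.7% fall within μ ± 3σ.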
• Suppose that the variable Math SAT scores is bell-shaped with a mean of 500
and a standard deviation of 90. Then:
• Approximately 68% of all test takers scored between 410 and 590, (500 ± 90).
• Approximately 95% of all test takers scored between 320 and 680, (500 ± 180).
• Approximately 99.7% of all test takers scored between 230 and 770, (500 ± 270).
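The three intervals in the Math SAT example follow directly from μ ± kσ. A minimal Python sketch, with an illustrative helper name, is:

```python
def empirical_interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)

mean, sd = 500, 90  # Math SAT example from the text

# Approximate shares for a bell-shaped distribution at k = 1, 2, 3.
for k, share in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    low, high = empirical_interval(mean, sd, k)
    print(f"about {share} of scores fall between {low} and {high}")
```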
CHEBYSHEV’S RULE
• Regardless of how the data are distributed, at least $(1 - 1/k^2) \times 100\%$ of the
values will fall within k standard deviations of the mean (for k > 1).
• Examples:
At least                                   Within
$(1 - 1/2^2) \times 100\% = 75\%$          k = 2 (μ ± 2σ)
$(1 - 1/3^2) \times 100\% = 88.89\%$       k = 3 (μ ± 3σ)
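The bound can be evaluated for any k > 1 with a short Python sketch. The function name is illustrative.

```python
def chebyshev_lower_bound(k):
    """Return the minimum percentage of values within k standard deviations."""
    if k <= 1:
        raise ValueError("Chebyshev's rule applies only for k > 1")
    return (1 - 1 / k**2) * 100

print(chebyshev_lower_bound(2))            # 75.0
print(round(chebyshev_lower_bound(3), 2))  # 88.89
```

These are lower bounds that hold for any distribution. For bell-shaped data the actual percentages are higher, about 95% and 99.7% for k = 2 and k = 3.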
• Sample covariance:
$$\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
INTERPRETING COVARIANCE
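• Sample coefficient of correlation:
$$r = \frac{\mathrm{cov}(X, Y)}{S_X S_Y}$$
• Where,
$$\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$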
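The sample covariance and the coefficient of correlation, taken here in its standard form r = cov(X, Y) / (S_X · S_Y), can be sketched in plain Python. The data and function names below are hypothetical and chosen only for illustration.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Coefficient of correlation: covariance scaled by both standard deviations."""
    s_x = math.sqrt(sample_cov(xs, xs))  # cov(X, X) is the sample variance of X
    s_y = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (s_x * s_y)

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]  # hypothetical data

print(sample_cov(x, y))             # 1.5
print(round(sample_corr(x, y), 4))  # 0.7746
```

The covariance depends on the units of X and Y, while r is unit-free and lies between -1 and +1, which makes it easier to interpret.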
[Figure: scatter plots of Y versus X illustrating r = -1, r = -.6, r = +1, r = +.3, and r = 0.]
ETHICAL CONSIDERATIONS
CHAPTER SUMMARY
In this chapter we have discussed:
• Describing the properties of central tendency, variation, and
shape in numerical variables.
• Constructing and interpreting a boxplot.
• Computing descriptive summary measures for a population.
• Calculating the covariance and the coefficient of correlation.