Numerical Descriptive Techniques (6 Hours)
Learning Objectives
In this chapter you will learn:
1. Measures of centre and location
2. Measures of dispersion and variation
3. Measures of correlation
Definitions
Central Tendency (Location)
The central tendency is the extent to which the values of a numerical variable group around a typical or central value.
Variation (Dispersion)
The variation is the variability of the set of measurements, that is, the spread of the data.
Measures of Central Tendency:
The Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Where n = sample size
Xi = observed values
Measures of Central Tendency:
The Mean (con’t)
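[Figure: two dot plots on an axis from 11 to 20, one with Mean = 13 and one with Mean = 14]
Data 11, 12, 13, 14, 15: Mean = (11 + 12 + 13 + 14 + 15) / 5 = 65 / 5 = 13
Data 11, 12, 13, 14, 20: Mean = (11 + 12 + 13 + 14 + 20) / 5 = 70 / 5 = 14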
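The two calculations can be checked with a short Python sketch. The function name `mean` is chosen here for illustration; the data are the two five-value sets from this slide.

```python
def mean(values):
    """Arithmetic mean: sum of the observed values divided by the sample size n."""
    return sum(values) / len(values)

first = [11, 12, 13, 14, 15]
second = [11, 12, 13, 14, 20]  # same data with 15 replaced by 20

print(mean(first))   # 13.0
print(mean(second))  # 14.0
```

Changing a single value from 15 to 20 moves the mean from 13 to 14.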
Numerical Descriptive
Measures for a Population
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Where μ = population mean
N = population size
Xi = ith value of the variable X
Example
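Arithmetic mean:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Geometric mean:
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$
Geometric mean rate of return:
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$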
Where Ri is the rate of return in time period i
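A minimal Python sketch of the geometric mean and the geometric mean rate of return follows. The function names and the sample rates (+50% followed by -50%) are hypothetical values chosen for illustration, not data from the slides.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

def geometric_mean_return(rates):
    """Geometric mean rate of return for per-period rates R_i (as decimals)."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

rates = [0.5, -0.5]                    # hypothetical: +50% then -50%
arithmetic = sum(rates) / len(rates)   # 0.0
geometric = geometric_mean_return(rates)

print(arithmetic)
print(round(geometric, 4))             # about -0.134
```

With these assumed rates the arithmetic mean return is 0, yet an investment would have lost a quarter of its value over the two periods; the geometric mean rate of return reflects that loss.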
Measures of Central Tendency:
The Median
[Figure: two dot plots on an axis from 11 to 20; Median = 13 in both]
The location of the median when the values are in numerical order (smallest to largest):
$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$
If the number of values is odd, the median is the middle number
If the number of values is even, the median is the average of the two middle numbers
Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of the median in the ordered data
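The position rule can be sketched in Python as below. The function name `median` and the three test lists are illustrative assumptions; the first two reuse the five-value sets from the mean slides.

```python
def median(values):
    """Median found via the (n + 1) / 2 position in the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position, not the median value itself
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take the middle value.
        return ordered[int(position) - 1]
    # Even n: the position falls halfway between the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median([11, 12, 13, 14, 15]))  # 13
print(median([11, 12, 13, 14, 20]))  # 13
print(median([11, 12, 13, 14]))      # 12.5
```

Unlike the mean, the median stays at 13 when the largest value changes from 15 to 20.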
Measures of Central Tendency:
The Mode
[Figure: two dot plots, one on an axis from 0 to 14 with Mode = 9 and one on an axis from 0 to 6 with No Mode]
Measures of Central Tendency:
Which Measure to Choose?
Skewness statistic: < 0 (left-skewed), 0 (symmetric), > 0 (right-skewed)
Measures of Variation
Example:
[Figure: dot plot on an axis from 0 to 14; smallest value 1, largest value 13]
Range = 13 - 1 = 12
Measures of Variation:
Why The Range Can Be Misleading
[Figure: two dot plots, each on an axis from 7 to 12]
Range = 12 - 7 = 5 in both cases
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
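The effect of the single outlier can be reproduced with a short Python sketch. The helper name `value_range` is chosen here; the two lists are the data shown above.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

base = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = base[:-1] + [120]  # last value 5 replaced by 120

print(value_range(base))          # 4
print(value_range(with_outlier))  # 119
```

Only one of the 25 values changed, but the range grew from 4 to 119.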
Measures of Variation:
The Sample Variance
Average (approximately) of squared deviations
of values from the mean
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Where $\bar{X}$ = arithmetic mean
n = sample size
Xi = ith value of the variable X
Measures of Variation:
The Sample Standard Deviation
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Measures of Variation:
The Standard Deviation
Sample Data (Xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16
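The sample standard deviation for these eight values can be worked through in Python as a sketch. Variable names are chosen here; the data are the values on this slide.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)
mean = sum(data) / n  # 128 / 8 = 16.0

# Squared deviations of each value from the mean: 36, 16, 4, 1, 1, 4, 4, 64
squared_deviations = [(x - mean) ** 2 for x in data]

sample_variance = sum(squared_deviations) / (n - 1)  # 130 / 7
sample_std = math.sqrt(sample_variance)

print(round(sample_variance, 4))  # 18.5714
print(round(sample_std, 4))       # 4.3095
```

The divisor is n - 1 = 7 rather than n = 8 because the data are treated as a sample.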
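Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$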
Sample statistics versus
population parameters
Measure              Population parameter   Sample statistic
Mean                 μ                      X̄
Variance             σ²                     S²
Standard deviation   σ                      S
Interpreting Standard
Deviation: Empirical Rule
Interquartile Range
1. Measure of dispersion
2. Also called midspread
3. Difference between upper and lower quartiles:
   Interquartile Range = QU – QL
4. Spread in middle 50%
5. Not affected by extreme values
Thinking Challenge
You’re a financial analyst for Prudential-Bache
Securities. You have collected the following
closing stock prices of new stock issues: 17,
16, 21, 18, 13, 16, 12, 11.
What are the quartiles, Q1 and Q3, and the
interquartile range?
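One way to check an answer is the sketch below, which uses `statistics.quantiles` from the Python standard library with `method="exclusive"`. That method interpolates at positions p(n + 1) in the ordered data. This is an assumption about the quartile convention; textbooks differ, and a rule that rounds the positions 2.25 and 6.75 to the nearest ranked values would give Q1 = 12 and Q3 = 18 instead.

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

# Ordered data: 11, 12, 13, 16, 16, 17, 18, 21
ordered = sorted(prices)

# Quartiles by linear interpolation at positions p * (n + 1).
q1, q2, q3 = statistics.quantiles(prices, n=4, method="exclusive")
iqr = q3 - q1

print(ordered)
print(q1, q3, iqr)  # 12.25 17.75 5.5
```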
Box Plot
[Figure: box plot on an axis from 4 to 12]
Sample covariance:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
Only concerned with the strength of the relationship
No causal effect is implied
Interpreting Covariance
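Sample correlation coefficient:
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$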
Features of
Correlation Coefficient, r
Unit free
Ranges between –1 and 1
The closer to –1, the stronger the negative linear
relationship
The closer to 1, the stronger the positive linear
relationship
The closer to 0, the weaker the linear relationship
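A minimal Python sketch of the sample covariance and the correlation coefficient r = cov(X, Y) / (SX SY) follows. The function names and the five (x, y) pairs are hypothetical, made up for illustration.

```python
import math

def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """r = cov(X, Y) / (S_X * S_Y); cov(X, X) is the sample variance of X."""
    s_x = math.sqrt(sample_covariance(x, x))
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]  # hypothetical data

print(sample_covariance(x, y))             # 1.5
print(round(sample_correlation(x, y), 4))  # 0.7746
```

The covariance of 1.5 depends on the units of X and Y, while r is unit free and always lies between -1 and 1.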
Scatter Plots of Data with Various
Correlation Coefficients
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0 in the top row and r = +1, r = +.3, r = 0 in the bottom row]
Applications of standard deviation
[Figure: control chart over time showing the process average, with UCL at +3σ and LCL at -3σ]
Control Chart Basics
Common Cause Variation: range of expected variability
[Figure: control chart over time showing the process mean, with UCL at +3σ and LCL at -3σ]
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
Process Variability
Special Cause of Variation:
A measurement this far from the process average is very
unlikely if only expected variation is present
±3σ → 99.7% of process values should be in this range
[Figure: control chart over time showing UCL, process mean, and LCL]
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
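The two limit formulas can be sketched in Python as below. The function names, the process mean of 50, the standard deviation of 2, and the five measurements are all hypothetical values chosen for illustration.

```python
def control_limits(process_mean, std_dev):
    """Return (LCL, UCL) as the process mean minus/plus 3 standard deviations."""
    ucl = process_mean + 3 * std_dev
    lcl = process_mean - 3 * std_dev
    return lcl, ucl

def out_of_control(measurements, process_mean, std_dev):
    """Measurements that fall outside the control limits."""
    lcl, ucl = control_limits(process_mean, std_dev)
    return [m for m in measurements if m < lcl or m > ucl]

# Hypothetical process: mean 50, standard deviation 2.
lcl, ucl = control_limits(50.0, 2.0)
flagged = out_of_control([49.0, 51.0, 57.0, 50.0, 43.0], 50.0, 2.0)

print(lcl, ucl)  # 44.0 56.0
print(flagged)   # [57.0, 43.0]
```

Points flagged this way are candidates for special cause variation, since values beyond ±3σ are very unlikely under common cause variation alone.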
Using Control Charts
[Figure: control chart over time showing UCL, process mean, and LCL]
Process Not in Control
$$E(R_i) = R_f + \beta_i \, (E(R_m) - R_f)$$
• Volatility: the standard deviation of the continuously compounded rate of return over 1 year
Uncorrected sample standard deviation (standard deviation of the sample):
$$\sqrt{\frac{1}{n} \sum_{i=1}^{n} (R_i - E(R))^2}$$
Corrected sample standard deviation:
$$\sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (R_i - E(R))^2}$$
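The two estimators differ only in the divisor, as the Python sketch below shows. The function name `std_dev`, the `corrected` flag, and the five returns are hypothetical; E(R) is estimated here by the sample mean of the returns, which is an assumption.

```python
import math

def std_dev(returns, corrected=True):
    """Standard deviation of returns; divisor n - 1 if corrected, else n."""
    n = len(returns)
    mean_return = sum(returns) / n  # sample estimate of E(R)
    sum_sq = sum((r - mean_return) ** 2 for r in returns)
    divisor = n - 1 if corrected else n
    return math.sqrt(sum_sq / divisor)

returns = [0.02, -0.01, 0.03, 0.00, 0.01]  # hypothetical per-period returns

uncorrected = std_dev(returns, corrected=False)
corrected = std_dev(returns, corrected=True)

print(round(uncorrected, 6))  # about 0.014142
print(round(corrected, 6))    # about 0.015811
```

The corrected value is always the larger of the two, and the gap shrinks as n grows.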
Two Assets With Same Expected Return But
Different (Continuous) Probability Distributions
[Figure: probability density against Return % (axis from 0 to 15) for Stock 1 and Stock 2]
Return and Risk of a portfolio
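$$R_P = R_1 w_1 + R_2 w_2$$
$$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \operatorname{cov}(R_1, R_2)$$
$$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho_{12} \sigma_1 \sigma_2$$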
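The two-asset formulas, R_P = R1 w1 + R2 w2 and portfolio variance w1²σ1² + w2²σ2² + 2 w1 w2 ρ12 σ1 σ2, can be sketched in Python as below. The function names and all input values (weights, returns, standard deviations, correlation) are hypothetical.

```python
import math

def portfolio_return(r1, r2, w1, w2):
    """Weighted average of the two asset returns."""
    return r1 * w1 + r2 * w2

def portfolio_variance(w1, w2, sigma1, sigma2, rho12):
    """Two-asset portfolio variance, with cov(R1, R2) = rho12 * sigma1 * sigma2."""
    cov12 = rho12 * sigma1 * sigma2
    return w1**2 * sigma1**2 + w2**2 * sigma2**2 + 2 * w1 * w2 * cov12

# Hypothetical inputs: 60/40 weights, returns of 10% and 5%,
# standard deviations of 20% and 10%, correlation 0.5.
r_p = portfolio_return(0.10, 0.05, 0.6, 0.4)
var_p = portfolio_variance(0.6, 0.4, 0.20, 0.10, 0.5)
sigma_p = math.sqrt(var_p)

print(round(r_p, 4))      # 0.08
print(round(var_p, 4))    # 0.0208
print(round(sigma_p, 4))  # 0.1442
```

With these assumed inputs the portfolio standard deviation (about 14.4%) is below the weighted average of the two standard deviations (16%), because the correlation is less than 1.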
The Question Being Asked in VaR