U3 IntroSummaryStatistics
Statistics
• The collection, evaluation, and interpretation of
data
Mean Central Tendency
$\mu = \frac{\sum x_i}{N}$
$\mu$ = mean value
$x_i$ = individual data value
$\sum x_i$ = summation of all data values
$N$ = # of data values in the data set
Mean Central Tendency
• Data Set
3 7 12 17 21 21 23 27 32 36 44
• Sum of the values = 243
• Number of values = 11
Mean = $\mu = \frac{\sum x_i}{N} = \frac{243}{11} = 22.09$
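The mean calculation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`data`, `total`, `n_values`, `mean`) are illustrative choices and not part of the original slides.

```python
# Data set from the mean example.
data = [3, 7, 12, 17, 21, 21, 23, 27, 32, 36, 44]

total = sum(data)        # sum of all data values
n_values = len(data)     # number of data values, N
mean = total / n_values  # mean = sum / N

print(f"Sum = {total}, N = {n_values}, mean = {mean:.2f}")
```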
Mode Central Tendency
Data Set:
27 17 12 7 21 44 23 3 36 32 21
Sorted Data Set:
3 7 12 17 21 21 23 27 32 36 44
Mode = M = 21 (the value that occurs most often)
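One possible way to find the mode in code is to count how often each value occurs and keep the value (or values) with the highest count. The sketch below uses `collections.Counter` from the Python standard library; the names `counts`, `highest`, and `modes` are illustrative assumptions.

```python
from collections import Counter

# Unsorted data set from the mode example.
data = [27, 17, 12, 7, 21, 44, 23, 3, 36, 32, 21]

counts = Counter(data)          # value -> number of occurrences
highest = max(counts.values())  # largest number of occurrences
# A data set can have more than one mode, so collect every value that ties.
modes = sorted(value for value, count in counts.items() if count == highest)

print("Sorted data:", sorted(data))
print("Mode(s):", modes)
```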
Median Central Tendency
Data Set:
3 7 12 17 21 21 23 27 32 36 44
Median = 21 (the middle value, the 6th of the 11 sorted values)
Range Variation
Data Set:
3 7 12 17 21 21 23 27 32 36 44
Range = R = maximum – minimum = 44 – 3 = 41
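The median is the middle value of the sorted data, and the range is the largest value minus the smallest. A minimal sketch of both, assuming a hypothetical helper named `median` that also handles an even number of values by averaging the two middle values:

```python
data = [3, 7, 12, 17, 21, 21, 23, 27, 32, 36, 44]

def median(values):
    """Return the middle value of the sorted data (illustrative helper)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values

median_value = median(data)
data_range = max(data) - min(data)  # range = maximum - minimum

print("Median:", median_value)
print("Range:", data_range)
```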
Standard Deviation Variation
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
$\sigma$ = standard deviation
$x_i$ = individual data value ($x_1, x_2, x_3, \ldots$)
$\mu$ = mean
$N$ = size of population
Standard Deviation Variation
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Procedure:
1. Calculate the mean, $\mu$.
2. Subtract the mean from each value and then square each difference.
3. Sum all squared differences.
4. Divide the summation by the size of the population (number of data values), $N$.
5. Calculate the square root of the result.
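The five steps of the procedure map directly onto code. The function below is a sketch rather than a reference implementation; the name `population_std` and the small data set used to exercise it are hypothetical and were chosen only because the arithmetic is easy to follow by hand.

```python
import math

def population_std(values):
    """Population standard deviation, following the five-step procedure."""
    n = len(values)
    mu = sum(values) / n                             # 1. calculate the mean
    squared_diffs = [(x - mu) ** 2 for x in values]  # 2. subtract the mean and square
    total = sum(squared_diffs)                       # 3. sum all squared differences
    variance = total / n                             # 4. divide by the population size N
    return math.sqrt(variance)                       # 5. take the square root

# Hypothetical data: mean is 5, squared differences sum to 32, and 32 / 8 = 4.
sigma = population_std([2, 4, 4, 4, 5, 5, 7, 9])
print(sigma)
```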
Standard Deviation Variation
Calculate the standard deviation for the data array
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1. Calculate the mean. $\mu = \frac{\sum x_i}{N} = \frac{524}{11} = 47.64$
2. Subtract the mean from each data value and square each difference, $(x_i - \mu)^2$.
(2 - 47.64)² = 2083.01     (59 - 47.64)² = 129.05
(5 - 47.64)² = 1818.17     (60 - 47.64)² = 152.77
(48 - 47.64)² = 0.13       (62 - 47.64)² = 206.21
(49 - 47.64)² = 1.85       (63 - 47.64)² = 235.93
(55 - 47.64)² = 54.17      (63 - 47.64)² = 235.93
(58 - 47.64)² = 107.33
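Standard Deviation Variation
3. Sum all squared differences. $\sum (x_i - \mu)^2 = 5024.55$
4. Divide the summation by the number of data values. $\frac{5024.55}{11} = 456.78$
5. Calculate the square root of the result. $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{456.78} = 21.4$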
A Note about Standard Deviation
• Two distinct calculations
– Population Standard Deviation
• The measure of the spread of data within a
population.
• Used when you have a data value for every
member of the entire population of interest.
– Sample Standard Deviation
• An estimate of the spread of data within a larger
population.
• Used when you do not have a data value for every
member of the entire population of interest.
• Uses a subset (sample) of the data to generalize
the results to the larger population.
A Note about Standard Deviation
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
Procedure:
1. Calculate the sample mean, $\bar{x} = \frac{\sum x_i}{n}$. (Essentially the same calculation as the population mean.)
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
$\bar{x}$ = sample mean
Calculate the sample standard deviation, treating the following data array as a sample.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1. Calculate the sample mean. $\bar{x} = \frac{\sum x_i}{n} = \frac{524}{11} = 47.64$
2. Subtract the sample mean from each data value and square the difference, $(x_i - \bar{x})^2$.
(2 - 47.64)² = 2083.01     (59 - 47.64)² = 129.05
(5 - 47.64)² = 1818.17     (60 - 47.64)² = 152.77
(48 - 47.64)² = 0.13       (62 - 47.64)² = 206.21
(49 - 47.64)² = 1.85       (63 - 47.64)² = 235.93
(55 - 47.64)² = 54.17      (63 - 47.64)² = 235.93
(58 - 47.64)² = 107.33
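Sample Standard Deviation Variation
3. Sum all squared differences. $\sum (x_i - \bar{x})^2 = 5024.55$
4. Divide the summation by the sample size minus one. $\frac{5024.55}{11 - 1} = 502.46$
5. Calculate the square root of the result. $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{502.46} = 22.4$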
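The population and sample calculations differ only in the divisor, which a short script makes easy to see. This sketch computes both by hand for the data array used in the worked examples and cross-checks them against `statistics.pstdev` and `statistics.stdev` from the Python standard library; the variable names are illustrative.

```python
import math
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

n = len(data)
x_bar = sum(data) / n                         # mean (same calculation in both cases)
sum_sq = sum((x - x_bar) ** 2 for x in data)  # sum of squared differences

sigma = math.sqrt(sum_sq / n)    # population formula: divide by N
s = math.sqrt(sum_sq / (n - 1))  # sample formula: divide by n - 1

print(f"mean = {x_bar:.2f}")
print(f"population standard deviation = {sigma:.1f}")
print(f"sample standard deviation     = {s:.1f}")

# Cross-check against the standard library.
print(statistics.pstdev(data), statistics.stdev(data))
```

Because the sample formula divides by a smaller number, it always gives a result at least as large as the population formula for the same data.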
A Note about Standard Deviation
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
As n → N, s → σ
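One part of this statement can be illustrated numerically. When both formulas are applied to the same n values, the sums of squared differences are identical, so the two results differ only by the factor $\sqrt{n / (n - 1)}$. The sketch below (with a hypothetical helper name) shows that factor shrinking toward 1 as n grows. It illustrates only the effect of the divisor, not the sampling process itself.

```python
import math

def ratio_sample_to_population(n):
    """Ratio s / sigma when both formulas are applied to the same n values."""
    return math.sqrt(n / (n - 1))

ratios = {n: ratio_sample_to_population(n) for n in (2, 11, 100, 10_000)}

for n, ratio in ratios.items():
    print(f"n = {n:>6}: s / sigma = {ratio:.5f}")
```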
A Note about Standard Deviation
Population Standard Deviation
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
$\sigma$ = population standard deviation
$x_i$ = individual data value ($x_1, x_2, x_3, \ldots$)
$\mu$ = population mean
$N$ = size of population
Example: Given the ACT score of every student in your class, use the population standard deviation formula to find the standard deviation of the ACT scores in your class.
Sample Standard Deviation
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
$s$ = sample standard deviation
$x_i$ = individual data value ($x_1, x_2, x_3, \ldots$)
$\bar{x}$ = sample mean
$n$ = size of sample
Example: Given the ACT score of every student in your class, use the sample standard deviation formula to estimate the standard deviation of the ACT scores of all students at your school.
Histogram Distribution
[Figure: histogram axis divided into the class intervals -16 to -6, -5 to 5, and 6 to 16]
Histogram Distribution
Example
Data set: 1, 7, 15, 4, 8, 8, 5, 12, 10
Sorted: 1, 4, 5, 7, 8, 8, 10, 12, 15
[Figure: histogram of Frequency over the class intervals 1 to 5, 6 to 10, and 11 to 15]
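Building a histogram comes down to counting how many data values fall into each class interval. A minimal sketch for the example data, assuming each interval includes both of its endpoints (the names `class_intervals` and `frequencies` are illustrative):

```python
data = [1, 7, 15, 4, 8, 8, 5, 12, 10]
class_intervals = [(1, 5), (6, 10), (11, 15)]  # inclusive lower and upper bounds

# Count the data values that fall inside each class interval.
frequencies = {
    (low, high): sum(1 for x in data if low <= x <= high)
    for low, high in class_intervals
}

# Print a simple text histogram, one row per class interval.
for (low, high), count in frequencies.items():
    print(f"{low:>2} to {high:<2} | {'#' * count} ({count})")
```

For this data the counts are 3, 4, and 2, which together account for all nine values.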
Histogram Distribution
[Figure: histogram of Frequency (0 to 3) versus Length (in.), with class intervals running from the minimum of 0.745 in. to the maximum of 0.760 in. in steps of 0.001 in.]
Dot Plot Distribution
Data set:
 0  3 -1 -3
 3  2  1  0
-1 -1  2  1
 0  1 -1 -2
 1  2  1  0
-2 -4  0  0
[Figure: dot plot showing the Frequency of each data value on an axis from -6 to 6]
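A dot plot stacks one dot above the axis for every occurrence of a value. The sketch below tallies the 24 data values with `collections.Counter` and prints a rough text version of the plot; the helper `dot_plot_rows` and its layout choices are illustrative assumptions, not a reproduction of the original figure.

```python
from collections import Counter

data = [
    0, 3, -1, -3,
    3, 2, 1, 0,
    -1, -1, 2, 1,
    0, 1, -1, -2,
    1, 2, 1, 0,
    -2, -4, 0, 0,
]

counts = Counter(data)  # value -> frequency

def dot_plot_rows(counts, low=-6, high=6):
    """Build the text rows of a dot plot, tallest level first, axis last."""
    axis = range(low, high + 1)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        # Place a dot wherever the value occurs at least `level` times.
        marks = ["•" if counts[v] >= level else " " for v in axis]
        rows.append("".join(f"{m:>4}" for m in marks))
    rows.append("".join(f"{v:>4}" for v in axis))  # axis labels
    return rows

rows = dot_plot_rows(counts)
print("\n".join(rows))
```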
Normal Distribution Distribution
[Figure: bell-shaped curve of Frequency over Data Elements from -6 to 6, with the Mean Value marked at the peak]
• Does the greatest frequency of the data values occur at about the mean value?
• Does the curve decrease on both sides away from the mean?
• Is the curve symmetric about the mean?
What if things are not equal?
Mean = $\bar{x}$ = 0.083
Standard Deviation = s = 1.77 (sample)
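These two values can be reproduced from the 24 dot plot data values. The sketch below uses `statistics.mean` and `statistics.stdev` (the sample standard deviation) from the Python standard library; the variable names are illustrative.

```python
import statistics

# The 24 data values from the dot plot.
data = [
    0, 3, -1, -3,
    3, 2, 1, 0,
    -1, -1, 2, 1,
    0, 1, -1, -2,
    1, 2, 1, 0,
    -2, -4, 0, 0,
]

x_bar = statistics.mean(data)  # sample mean
s = statistics.stdev(data)     # sample standard deviation (divides by n - 1)

print(f"mean = {x_bar:.3f}")
print(f"s    = {s:.2f}")
```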
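Normal Distribution Distribution
For normally distributed data, about 68 % of the data values lie within one standard deviation (s = 1.77) of the mean (0.08):
0.08 + -1.77 = -1.69
0.08 + 1.77 = 1.85
[Figure: normal curve over Data Elements with the central 68 % region shaded from -1.69 to 1.85]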
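Normal Distribution Distribution
For normally distributed data, about 95 % of the data values lie within two standard deviations (2s = 3.54) of the mean (0.08):
0.08 + -3.54 = -3.46
0.08 + 3.54 = 3.62
[Figure: normal curve over Data Elements with the central 95 % region shaded from -3.46 to 3.62]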
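The 68 % and 95 % figures describe an ideal normal distribution, so a real data set will only approximate them. As a rough check, the sketch below counts what share of the 24 dot plot values falls within one and two sample standard deviations of the mean. The helper `share_within` is a hypothetical name introduced for this illustration.

```python
import statistics

# The 24 data values from the dot plot.
data = [
    0, 3, -1, -3,
    3, 2, 1, 0,
    -1, -1, 2, 1,
    0, 1, -1, -2,
    1, 2, 1, 0,
    -2, -4, 0, 0,
]

x_bar = statistics.mean(data)
s = statistics.stdev(data)

def share_within(values, center, half_width):
    """Fraction of values inside [center - half_width, center + half_width]."""
    inside = [v for v in values if center - half_width <= v <= center + half_width]
    return len(inside) / len(values)

within_1s = share_within(data, x_bar, s)
within_2s = share_within(data, x_bar, 2 * s)

print(f"within 1 s: {within_1s:.1%}")
print(f"within 2 s: {within_2s:.1%}")
```

For these values, 15 of 24 fall within one standard deviation and 23 of 24 within two, which is close to, but not exactly, the 68 % and 95 % of the ideal curve.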