6 Sampling Distribution
Learning objectives
At the end of this topic, students will be able to:
• Describe sampling and sampling distribution
12 March 2025 2
• A sampling distribution is a distribution
of all possible values of a statistic
computed from samples of the same
size randomly selected from the same
population.
• Serves to answer probability questions
about sample statistics.
• When sampling a discrete, finite
population, a sampling distribution can
be constructed.
Example:
• Age of individuals is a random variable.
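Population values: 18, 20, 22, 24 (N = 4)

\[ \mu = \frac{\sum x_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21 \]

\[ \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = 2.236 \]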
Now consider all possible samples of size n = 2 (with replacement).

• 16 possible samples

1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24

• 16 sample means

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
Sample mean (x̄)  Freq  P(x̄)
18 1 0.0625
19 2 0.1250
20 3 0.1875
21 4 0.2500
22 3 0.1875
23 2 0.1250
24 1 0.0625
Sampling distribution of all sample means
[Figure: the 16 sample means (by 1st and 2nd observation) beside a bar chart of the sample means distribution, P(x̄) against x̄ = 18, 19, ..., 24]
Summary measures of this sampling distribution: Add
the 16 sample means & divide by 16. Also calculate
the SD of the sample means.
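\[ \mu_{\bar{x}} = \frac{\sum \bar{x}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21 \]

\[ \sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x}_i - \mu_{\bar{x}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58 \]

(Here N = 16 is the number of sample means, not the population size.)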
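The summary measures above can be reproduced by brute force. The following is a minimal sketch that enumerates every ordered sample of size 2 drawn with replacement from the four ages; the variable names are illustrative and not from the slides.

```python
from itertools import product
from math import sqrt

population = [18, 20, 22, 24]  # ages from the example
n = 2                          # sample size

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Mean and SD of the 16 sample means (divide by the number of samples).
mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
)

print(len(samples), mean_of_means, round(sd_of_means, 2))  # 16 21.0 1.58
```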
Comparing the population with its
sampling distribution
Population (N = 4): μ = 21, σ = 2.236
Sample means distribution (n = 2): μ_x̄ = 21, σ_x̄ = 1.58

[Figure: P(x) for the population (x = 18, 20, 22, 24) beside P(x̄) for the sample means (x̄ = 18, 19, ..., 24)]
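• We note that the mean of the sampling
distribution of x̄ has the same value as
the mean of the original population.
• When sampling without replacement from a finite
population, the variance of the sampling distribution is
\[ \sigma_{\bar{x}}^{2} = \frac{\sigma^{2}}{n} \cdot \frac{N - n}{N - 1} \]
– Finite population correction: (N−n)/(N−1)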
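As a rough check of the finite population correction, the sketch below enumerates all ordered samples of size 2 drawn without replacement from the four ages and compares the variance of their means with (σ²/n)(N−n)/(N−1). Sampling without replacement is an assumption of this sketch; the names are illustrative.

```python
from itertools import permutations

population = [18, 20, 22, 24]
N, n = len(population), 2

mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance = 5.0

# All ordered samples of size n drawn without replacement (4 * 3 = 12 samples).
means = [sum(s) / n for s in permutations(population, n)]
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# Formula with the finite population correction (N - n) / (N - 1).
var_formula = (sigma2 / n) * (N - n) / (N - 1)

print(round(var_of_means, 4), round(var_formula, 4))  # both 1.6667
```

The mean of the sample means is still 21, but the variance is smaller than σ²/n = 2.5 because no unit can be drawn twice.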
Sampling Error
• Sample statistics are used to estimate
population parameters
e.g., x̄ is an estimate of the population mean, μ
• Problems:
– Different samples provide different estimates of
the population parameter
– Sample results have potential variability, thus
sampling error exists
Standard deviation vs. standard error
Calculating sampling error
• Sampling error:
The difference between a value (a statistic)
computed from a sample and the corresponding
value (a parameter) computed from a population
[Figure: population (all Tigre live births, 2002) and a sample of 100 (AUH)]
Could the difference of 0.23 kg (= 3.5 kg − 3.27 kg) be real, or
could it be purely due to chance in sampling?
The 'apparent' difference between the population mean and the
random sample mean that is purely due to chance in
sampling is called the sampling error.
Sampling error does not mean that a mistake has been made
in the process of sampling; it is the variation that arises from
the process of sampling itself.
Sampling error reflects the difference between
the value derived from the sample and the true
population value
The only way to eliminate sampling error is to
enumerate the entire population
Note:
• The sampling error may be positive or
negative (x̄ may be greater than or
less than μ)
• The expected sampling error decreases
as the sample size increases
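The second note can be illustrated with a small simulation. This is only a sketch: the population below is hypothetical (10,000 simulated values, not the birth-weight data), and the seed, sample sizes, and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values, used only for illustration.
population = [rng.gauss(3.3, 0.5) for _ in range(10_000)]
mu = mean(population)

def average_abs_error(n, reps=500):
    """Average absolute sampling error |sample mean - mu| over `reps` samples of size n."""
    errors = []
    for _ in range(reps):
        sample = rng.sample(population, n)  # simple random sample, without replacement
        errors.append(abs(mean(sample) - mu))
    return mean(errors)

for n in (4, 25, 100):
    print(n, round(average_abs_error(n), 3))
```

Individual errors can be positive or negative, which is why the sketch averages their absolute values; the average shrinks as n grows.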
Properties of sampling distribution of mean
\[ \mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
b. The mean, μ_x̄, of the distribution of the sample
mean is equal to the mean of the
population from which the samples were
drawn
c. The variance of the distribution of the sample
mean is equal to the variance of the
population divided by the sample size: σ²_x̄ = σ²/n
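As a quick check against the age example earlier in this topic (σ = 2.236, σ² = 5, n = 2, sampling with replacement), the formula gives the same standard deviation as the one obtained from the 16 sample means:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236}{\sqrt{2}} \approx 1.58,
\qquad
\sigma_{\bar{x}}^{2} = \frac{\sigma^{2}}{n} = \frac{5}{2} = 2.5
```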
Properties of normal distribution
[Figure: normal curve with an area of 0.34 on each side of the mean]
• Unimodal and symmetrical, i.e. one half of the distribution is the mirror
image of the other half
• Probability distribution: the area under the normal curve is 1
• For a normal distribution with mean μ and standard deviation σ:
– μ ± 1σ contains approximately 68% of the area under the normal curve
– μ ± 1.96σ contains approximately 95% of the area under the normal curve
– μ ± 2.58σ contains approximately 99% of the area under the normal curve
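The three areas quoted above can be checked numerically. A minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 1.96, 2.58):
    print(k, round(area_within(k), 4))
```

Because any normal distribution can be standardized, the same areas apply to every mean and standard deviation.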
B. Sampling from non-normally distributed populations
• When the sampling is done from a non-normally
distributed population, the central limit theorem is used.
• The larger the sample size, the better will be the normal
approximation to the sampling distribution of the mean.
• We can apply the Central Limit Theorem:
– Even if the population is not normal, sample means
from the population will be approximately normal as
long as the sample size is large enough.
Then, the sampling distribution will have
\[ \mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
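A small simulation can make the Central Limit Theorem concrete. The sketch below assumes an exponential population (strongly skewed, with μ = 1 and σ = 1) purely for illustration; the seed, n = 36, and the number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)  # fixed seed for a repeatable run

def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from an exponential population (mu = 1, sigma = 1)."""
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

n = 36
means = sample_means(n)

print(round(mean(means), 3))    # should be close to mu = 1
print(round(pstdev(means), 3))  # should be close to sigma / sqrt(n) = 1/6
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.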
As the sample size gets large enough (n↑), the sampling distribution
becomes almost normal regardless of the shape of the population.
If the population is not normal
Sampling distribution properties:
• Central tendency: μ_x̄ = μ
• Variation: σ_x̄ = σ/√n
[Figure: population distribution above sampling distributions of x̄ for a smaller and a larger sample size; the sampling distribution becomes normal as n increases]
Below is a graph of results from a sampling activity. Samples were taken at
increasing sizes, from 4 cases to 98 cases. You can see that as sample size
increases, not only do the sample means come closer to the population
mean, but the fluctuations in the sample means also become smaller.
• Generally, as n increases, the sample
mean x̄ and sample variance S² approach
the values of the true population
parameters µ and σ², respectively.
• The average of the sample means based
on repeated samples of size n approaches
the population mean µ as the number of
samples selected gets large:
E(x̄) = µ
• The estimator x̄ is said to be unbiased
How large is large enough?
• For most distributions, n > 30 will give a sampling
distribution that is nearly normal
Then
\[ z = \frac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} \]
• When the population is much larger than
the sample, the difference between σ²/n
and (σ²/n)[(N−n)/(N−1)] will be negligible.
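To see how quickly the correction becomes negligible, the sketch below evaluates the factor (N−n)/(N−1) for a fixed sample size and growing population sizes. The values of n and N are hypothetical and chosen only for illustration.

```python
def fpc(N, n):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

n = 50  # hypothetical sample size
for N in (100, 1_000, 100_000):
    print(N, round(fpc(N, n), 4))
```

When the factor is close to 1, multiplying σ²/n by it changes the variance very little, which is the point of the bullet above.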
SE = σ/√n
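• To convert to the SND (standard normal distribution), we use the formula
\[ z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} \]
where
\[ \mu_{\hat{p}} = p \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \]
(where p = population proportion)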
z-Value for Proportions
Standardize p̂ to a z value with the formula:
\[ z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \]
• If sampling is without replacement and n > 5% of
the population size, then σ_p̂ must use the FPC
(Finite Population Correction) factor:
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}} \]
Example 1
• According to a recent estimate, 19.4% of the
adult male population was obese. What is the
probability that in a random sample of size 150
from this population fewer than 15% will be
obese?
Note: npq = 150 × 0.194 × 0.806 ≈ 23.5 > 5.
• n = 150, p = .194; find P(p̂ < .15)
• Find the z score
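The remaining steps of Example 1 can be sketched as follows, assuming the normal approximation to the sampling distribution of the sample proportion with no continuity correction and no finite population correction. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 150, 0.194   # values given in Example 1
p_hat = 0.15        # threshold: "fewer than 15%"

se = sqrt(p * (1 - p) / n)   # standard error of the sample proportion
z = (p_hat - p) / se         # z score for p_hat = 0.15
prob = NormalDist().cdf(z)   # P(p_hat < 0.15) under the normal approximation

print(round(se, 4), round(z, 2), round(prob, 4))
```

Under these assumptions the z score is about −1.36 and the probability is roughly 0.09.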
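Find σ_p̂:
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464 \]
Convert to standard normal:
\[ P(.40 \le \hat{p} \le .45) = P\left(\frac{.40 - .40}{.03464} \le z \le \frac{.45 - .40}{.03464}\right) = P(0 \le z \le 1.44) \]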
Use standard normal table: P(0 ≤ z ≤ 1.44) = .4251
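The table lookup can be cross-checked in code. This sketch uses the values that appear in the calculation (p = .40, n = 200) and rounds the upper z score to two decimals, as a printed table would; that rounding is a choice made here to match the slide's figure.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.40
se = sqrt(p * (1 - p) / n)            # about 0.03464
z_low = (0.40 - p) / se               # 0.0
z_high = round((0.45 - p) / se, 2)    # rounded to 1.44, as in a table lookup

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(se, 5), z_high, round(prob, 4))  # 0.03464 1.44 0.4251
```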
[Figure: sampling distribution of p̂ standardized to the standard normal distribution, with the area .4251 shaded between z = 0 and z = 1.44]