Seminar Week 3 - In-Class - Fullpage
Week 3
Sampling Distribution & Confidence Interval Estimation
Charanjit Kaur
Learning Objectives
Textbook references:
Berenson et al., Basic Business Statistics, 5th edition:
Chapter 6 Sections 6.1-6.2
Chapter 7, Sections 7.1-7.5.
Chapter 8, Sections 8.1-8.3
Probability Distribution
• The histogram is a picture of the probability density in crude form
• Probability density describes how “dense” the distribution is over a data range
• Allows us to calculate the probability related to the variable of interest
• In statistics, we use a smooth mathematical function to model the probability density function
(pdf)
• The area under the curve represents the probability
The basics of Sampling Distribution
• Sample statistic is only an estimate of the truth
• Since samples vary, a sample statistic is not exact and has variation/error
around it. The smaller the error, the greater the accuracy.
• Assume we take data samples repeatedly, and compute the sample mean as the
statistic for each sample. Then we would have the sampling distribution
of the sample mean to portray its variability.
$\bar{x} \sim N\left(\text{Mean},\ \frac{\text{Std deviation}}{\sqrt{\text{sample size}}}\right)$
• This is true regardless of the shape of the population distribution
In other words, regardless of the distribution of the original population, the distribution of the sample
mean is approximately normal, IF the sample size is large enough.
This approximation gets better as the sample size increases. [Still assuming ‘large’ population and simple
random sampling.]
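The claim above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the exponential population (mean 1, standard deviation 1), the helper name `sample_means`, and the sample sizes are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def sample_means(n, reps=5000):
    """Draw `reps` samples of size n from a skewed population; return their means."""
    # Assumed population: exponential with mean 1 and standard deviation 1.
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]


for n in (2, 10, 50):
    means = sample_means(n)
    print(f"n={n:>2}  mean of sample means={statistics.fmean(means):.3f}  "
          f"sd of sample means={statistics.stdev(means):.3f}  "
          f"sigma/sqrt(n)={1 / n ** 0.5:.3f}")
```

Even though the population is strongly skewed, the sample means centre on the population mean, and their spread should track the standard deviation divided by the square root of the sample size.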
The following are based on all possible samples of size n.
Sampling Distribution of the Sample Mean
Key takeaways:
• The sample mean $\bar{x}$ is centred around the true mean
• Its uncertainty is measured by the standard error
$SE(\bar{x}) = \frac{s}{\sqrt{n}}$
• The standard error is smaller than the standard deviation whenever n > 1
• Large sample size → $\bar{x}$ is a more precise estimate of the true population mean
Example on Purchase Amounts
Suppose a random sample of 15 customers who entered this popular retail store was taken.
c) Calculate the probability that their mean purchase amount is not more than $30.
Example on Weekly Income of Graduates
A recent report states that the average weekly income of graduates, one year after
graduation, is $600. Suppose the distribution of weekly income is non-normal and has
standard deviation $100. Calculate the probability that 50 randomly selected graduates
have a mean weekly income of less than $580.
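One way to work this example is sketched below, using only Python's standard library. Because n = 50 is reasonably large, the sample mean is treated as approximately normal even though the population is non-normal. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 600      # population mean weekly income ($), from the example
sigma = 100   # population standard deviation ($), from the example
n = 50        # sample size, from the example

standard_error = sigma / sqrt(n)      # spread of the sample mean
z = (580 - mu) / standard_error       # standardised value of $580
prob = NormalDist(mu=mu, sigma=standard_error).cdf(580)

print(f"SE = {standard_error:.2f}, z = {z:.2f}, P(mean < 580) = {prob:.4f}")
```

Under these assumptions the standard error is about $14.14, z is about -1.41, and the probability is about 0.079.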
Estimation of the true population mean
There are two types of estimates:
1) Point Estimate
A single value that estimates a population parameter
2) Interval Estimate
A range of values within which the population parameter probably lies. This range is
known as a confidence interval estimate
Confidence interval = plausible range for the unknown population mean at a given level of
confidence
Confidence Interval: Basic Format
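$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$ OR $\bar{x} \pm Z\frac{\sigma}{\sqrt{n}}$
• $\bar{X}$: Point estimate (estimate $\mu$ by $\bar{X}$)
• $Z$ ($Z_{crit}$): A value that embodies the desired level of confidence
• $\frac{\sigma}{\sqrt{n}}$: Standard error - how much the sample mean varies from its average value in repeated experiments (how the sample mean varies from sample to sample)
• $Z\frac{\sigma}{\sqrt{n}}$: Margin of error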
Width of the Confidence Interval
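[Figure: area $1-\alpha$ between the lower limit $\bar{x} - Z\frac{\sigma}{\sqrt{n}}$ and the upper limit $\bar{x} + Z\frac{\sigma}{\sqrt{n}}$]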
Note:
• (1-𝛼) is referred to as the level of confidence
• 𝛼 is referred to as the level of significance. It is the probability left in the two tails outside the confidence interval (𝛼/2 in each tail)
E.g. for a 95% confidence interval, 𝛼 = 1 − 0.95 = 0.05
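The link between the level of confidence and the critical value can be sketched as follows. This is an illustrative helper, not part of the original slides. The function name `z_critical` is an assumption, and the critical value is taken from the standard normal distribution.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a confidence level (1 - alpha)."""
    alpha = 1 - confidence
    # alpha/2 sits in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: alpha = {1 - level:.2f}, "
          f"Z = {z_critical(level):.3f}")
```

For 95% confidence this gives the familiar value of about 1.96.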
Confidence Interval in Repeated Sampling Context
(Source: Lind, Marchal and Wathen, Statistical Techniques in Business and Economics, 2021, 18th edition)
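The repeated-sampling interpretation can be illustrated with a short simulation. This is a minimal sketch under assumed values: the normal population (mean 50, standard deviation 10), the sample size of 30, and the number of trials are hypothetical and chosen only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is reproducible

# Assumed population parameters and sample size (illustrative only).
MU, SIGMA, N = 50.0, 10.0, 30
Z = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence


def interval_covers_mu():
    """Draw one sample, build the 95% interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    margin = Z * SIGMA / sqrt(N)
    return x_bar - margin < MU < x_bar + margin


trials = 2000
coverage = sum(interval_covers_mu() for _ in range(trials)) / trials
print(f"{coverage:.1%} of {trials} intervals contain the true mean")
```

Each individual interval either contains the true mean or it does not. The 95% describes the long-run share of intervals that do, so the printed share should be close to 95%.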
Confidence Interval Estimation
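$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$ Estimate the true proportion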
Factors that affect the width of a Confidence Interval Estimate
If the standard deviation ($\sigma$) ↑, the spread of the distribution is larger, standard error $\frac{\text{std deviation}}{\sqrt{n}}$ ↑, width ↑, estimate is less precise (assuming all other factors remain unchanged).
If the sample size (n) ↑, standard error $\frac{\text{std deviation}}{\sqrt{n}}$ ↓, width ↓, estimate is more precise (assuming all other factors remain unchanged).
The bigger the sample, the more information we have to increase the precision of the interval estimate of the population mean, and the narrower the interval.
If the level of confidence (1-α) ↑, the critical value $Z_{\alpha/2}$ ↑, width ↑, the estimate is less precise (assuming all other factors remain unchanged).
The more confident we want to be, the more values we need to include in our confidence interval, and the wider the interval.
$\bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
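The three effects can be seen numerically with a small sketch. The function name `ci_width` and the baseline values (standard deviation 10, n = 25, 95% confidence) are hypothetical, chosen only to show the direction of each change.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, confidence):
    """Full width of the interval x_bar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)


base = ci_width(sigma=10, n=25, confidence=0.95)
print(f"baseline (sigma=10, n=25, 95%): {base:.2f}")
print(f"larger sigma (20):              {ci_width(20, 25, 0.95):.2f}")
print(f"larger n (100):                 {ci_width(10, 100, 0.95):.2f}")
print(f"higher confidence (99%):        {ci_width(10, 25, 0.99):.2f}")
```

Doubling the standard deviation doubles the width, while quadrupling the sample size halves it, because the sample size enters through its square root.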
Example 1
During an audit of a company, the auditor wishes to estimate the mean
outstanding account balance using a sample of 80 accounts. The sample of 80
accounts has a mean value of $86.05 and a standard deviation of $22.38.
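A possible interval estimate for this example is sketched below. Two assumptions are made that the example does not state: a 95% confidence level, and the use of the normal critical value as a large-sample approximation. Since the standard deviation here is a sample value, a t critical value with n - 1 degrees of freedom would be the more usual choice and gives a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

n = 80          # accounts sampled, from the example
x_bar = 86.05   # sample mean balance ($), from the example
s = 22.38       # sample standard deviation ($), from the example

confidence = 0.95  # ASSUMPTION: the example does not state a confidence level

standard_error = s / sqrt(n)
# ASSUMPTION: normal critical value used as a large-sample approximation.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"SE = {standard_error:.2f}, margin of error = {margin:.2f}")
print(f"{confidence:.0%} interval: ${lower:.2f} to ${upper:.2f}")
```

Under these assumptions the interval runs from roughly $81.15 to $90.95.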
More than half of our potential market has tried our frozen food product