Seminar Week 3 - In-Class - Fullpage

ETX1100/5900 Business Statistics
Week 3
Sampling Distribution & Confidence Interval Estimation
Charanjit Kaur
Learning Objectives
Describe the role of statistical inference and apply inference methods to single populations.
Textbook references:
Berenson et al., Basic Business Statistics, 5th edition:
Chapter 6, Sections 6.1-6.2
Chapter 7, Sections 7.1-7.5
Chapter 8, Sections 8.1-8.3
Probability Distribution
• The histogram is a picture of the probability density in crude form
• Probability density describes how “dense” the distribution is over a data range
• The probability density allows us to calculate probabilities for the variable of interest
• In statistics, we use a smooth mathematical function, the probability density function (pdf), to model the probability density
• The area under the curve represents the probability
The basics of Sampling Distribution
• A sample statistic is only an estimate of the truth
• Since samples vary, any sample statistic is not exact and has variation/error around it. The smaller the error, the greater the accuracy.
• Assume we take data samples repeatedly and compute the sample mean of each sample. The distribution of these sample means is the sampling distribution of the sample mean, which portrays its variability.
A sampling distribution is the probability distribution of a sample statistic.

The Central Limit Theorem
Statistical theory gives us a result (Central Limit Theorem):
• If the sample size \(n\) is large,
\[ \bar{x} \sim N\left(\text{Mean},\ \frac{\text{Std deviation}}{\sqrt{\text{sample size}}}\right) \]
• This is true regardless of the shape of the population distribution
In other words, regardless of the distribution of the original population, the distribution of the sample
mean is approximately normal, IF the sample size is large enough.
This approximation gets better as the sample size increases. [Still assuming ‘large’ population and simple
random sampling.]
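The Central Limit Theorem can be checked with a small simulation. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the sample size of 50 and the number of repetitions are assumptions chosen for the demonstration, not values from the seminar.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

def sample_means(n, reps):
    """Draw `reps` samples of size `n` from a skewed (exponential) population
    with mean 1 and standard deviation 1, and return the sample means."""
    means = []
    for _ in range(reps):
        sample = [random.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

n = 50
means = sample_means(n, reps=5000)

centre = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to 1 / sqrt(n)

print(f"mean of sample means:    {centre:.3f}")
print(f"std dev of sample means: {spread:.3f} (theory: {1 / sqrt(n):.3f})")
```

Even though the population is strongly skewed, the sample means cluster around the population mean with a spread close to the standard deviation divided by the square root of the sample size.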
The following are based on all possible samples of size n.
Sampling Distribution of the Sample Mean
Key takeaways:
• The sample mean \(\bar{x}\) is centred around the true mean
• Its uncertainty is measured by the standard error
\[ SE(\bar{x}) = \frac{s}{\sqrt{n}} \]
• The standard error is always smaller than the standard deviation (for \(n > 1\))
• Large sample size → \(\bar{x}\) is a more precise estimate of the true population mean
Example on Purchase Amounts
Suppose it is known that the distribution of purchase amounts by customers entering a popular retail store is normal with mean $25 and standard deviation $8.
a) Calculate the probability that a randomly selected customer spends less than $35 at
this store.
b) Find the dollar amount such that 75% of all customers spend no more than this
amount.
Suppose a random sample of 15 customers who entered this popular retail store was taken.
c) Calculate the probability that their mean purchase amount is not more than $30.
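One possible way to check parts (a) to (c) numerically is sketched below with Python's standard-library `statistics.NormalDist`. The variable names are illustrative. Part (c) uses the fact that the population is normal, so the sample mean is normal with standard error 8/√15.

```python
from math import sqrt
from statistics import NormalDist

# Population of purchase amounts: normal, mean $25, standard deviation $8
purchase = NormalDist(mu=25, sigma=8)

# (a) P(X < 35) for one randomly selected customer
p_a = purchase.cdf(35)

# (b) dollar amount x such that P(X <= x) = 0.75
x_b = purchase.inv_cdf(0.75)

# (c) P(sample mean <= 30) for n = 15 customers;
#     the sample mean has standard error sigma / sqrt(n)
n = 15
mean_of_15 = NormalDist(mu=25, sigma=8 / sqrt(n))
p_c = mean_of_15.cdf(30)

print(f"(a) P(X < 35)      = {p_a:.4f}")
print(f"(b) 75th percentile = ${x_b:.2f}")
print(f"(c) P(x-bar <= 30) = {p_c:.4f}")
```

The results are roughly 0.894 for (a), $30.40 for (b) and 0.992 for (c). The probability in (c) is higher than in (a) even though the cut-off is lower, because the mean of 15 customers varies much less than a single purchase.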
Example on Weekly Income of Graduates
A recent report states that the average weekly income of graduates, one year after
graduation, is $600. Suppose the distribution of weekly income is non-normal and has
standard deviation $100. Calculate the probability that 50 randomly selected graduates
have a mean weekly income of less than $580.
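Because the population is non-normal, the calculation relies on the Central Limit Theorem with n = 50 treated as large. A minimal sketch of the computation, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 50

# By the Central Limit Theorem the sample mean is approximately normal
# with mean mu and standard error sigma / sqrt(n)
se = sigma / sqrt(n)
z = (580 - mu) / se
p = NormalDist().cdf(z)   # P(x-bar < 580), approximate

print(f"standard error = {se:.2f}")
print(f"z = {z:.2f}")
print(f"P(x-bar < 580) = {p:.4f}")
```

The standard error is about $14.14, giving z of about -1.41 and a probability of roughly 0.079.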
Estimation of the true population mean
There are two types of estimates:
1) Point Estimate
A single value that estimates a population parameter.
2) Interval Estimate
A range of values within which the population parameter probably lies. This range is known as a confidence interval estimate.
Point estimates do not indicate uncertainty (sampling error).
Better approach: give a range of values within which the unknown population parameter is thought likely to lie. We refer to this range of plausible values as a confidence interval.
Confidence interval = plausible range for the unknown population mean at a stated level of confidence
Confidence Interval: Basic Format
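\[ \bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}} \quad \text{OR} \quad \bar{x} \pm Z\frac{\sigma}{\sqrt{n}} \]
• Point Estimate \(\bar{X}\): estimate \(\mu\) by \(\bar{X}\)
• \(Z\) (Zcrit): a value that embodies the desired level of confidence
• Standard error \(\sigma/\sqrt{n}\): how much the sample mean varies from its average value in repeated experiments (how the sample mean varies from sample to sample)
• Margin of error: \(Z\,\sigma/\sqrt{n}\)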
Width of the Confidence Interval
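Lower Confidence Limit / Lower boundary (LL): \(\bar{x} - Z\frac{\sigma}{\sqrt{n}}\)
Upper Confidence Limit / Upper boundary (UL): \(\bar{x} + Z\frac{\sigma}{\sqrt{n}}\)
The width of the confidence interval is the distance between the lower and upper limits; the area between them is \(1-\alpha\).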
The width of a confidence interval indicates the precision of the estimate.
Note:
• (1-𝛼) is referred to as the level of confidence
• 𝛼 is referred to as the level of significance. It is the total probability left in the two tails outside the confidence interval (𝛼/2 in each tail)
E.g. for a 95% confidence interval, 𝛼 = 1 − 0.95 = 0.05
Confidence Interval in Repeated Sampling Context
We select a sample of 𝑛 observations repeatedly, and for each sample we construct a 95% confidence interval for the population mean.
We could expect 95% of these intervals to contain the population mean, while 5% of the intervals would not contain it.
(Source: Lind, Marchal and Wathen, Statistical Techniques in Business Economics, 2021, 18th edition)
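The repeated-sampling interpretation can be illustrated with a simulation. In the sketch below, the population values (mean 600, standard deviation 100), the sample size and the number of repetitions are hypothetical choices made for illustration. The population standard deviation is treated as known, so each interval uses the Z critical value.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(3)  # fixed seed so the run is repeatable

MU, SIGMA = 600, 100    # hypothetical population mean and standard deviation
N, REPS = 50, 2000      # sample size and number of repeated samples

z = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
margin = z * SIGMA / sqrt(N)      # margin of error (sigma assumed known)

hits = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin < MU < x_bar + margin:
        hits += 1

coverage = hits / REPS
print(f"{coverage:.1%} of the {REPS} intervals contain the population mean")
```

The proportion of intervals that capture the population mean should be close to 95%, although it will not be exactly 95% in any finite run.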
Confidence Interval Estimation
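Estimate the true mean
\[ \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
\[ \bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \]
Estimate the true proportion
\[ p - Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]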
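As an illustration of the proportion formula, the sketch below wraps it in a small helper. The function name `proportion_ci` and the counts passed to it (120 out of 200) are hypothetical and are not taken from the seminar data file.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Confidence interval for a population proportion using the Z formula."""
    p = successes / n                          # sample proportion
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # Z critical value
    margin = z * sqrt(p * (1 - p) / n)         # margin of error
    return p - margin, p + margin

# Hypothetical counts for illustration only
lower, upper = proportion_ci(successes=120, n=200)
print(f"95% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```

With these illustrative counts the interval is roughly (0.532, 0.668). Since the whole interval lies above 0.5, such a result would support a claim that the true proportion exceeds one half.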
Factors that affect the width of a Confidence Interval Estimate
If the standard deviation (𝜎) ↑, the spread of the distribution is larger: standard error \(\left(\frac{\text{std deviation}}{\sqrt{n}}\right)\) ↑, width ↑, estimate is less precise (assuming all other factors remain unchanged).
If the sample size (n) ↑, standard error \(\left(\frac{\text{std deviation}}{\sqrt{n}}\right)\) ↓, width ↓, estimate is more precise (assuming all other factors remain unchanged).
The bigger the sample, the more information we have to increase the precision of the interval estimate of the population mean, and the narrower the interval.
If the level of confidence (1-α) ↑, the critical value ↑, width ↑, the estimate is less precise (assuming all other factors remain unchanged).
The more confident we want to be, the more values we need to include in our confidence interval, and the wider the interval.
\[ \bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
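The three effects can be seen by changing one input at a time in the width formula. The helper `ci_width` and the baseline values (𝜎 = 10, n = 100, 95% confidence) below are hypothetical, chosen only to make the comparison easy to read.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of the Z-based confidence interval: 2 * Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

base = ci_width(sigma=10, n=100, confidence=0.95)         # baseline
wider_sigma = ci_width(sigma=20, n=100, confidence=0.95)  # std deviation doubled
bigger_n = ci_width(sigma=10, n=400, confidence=0.95)     # sample size quadrupled
higher_conf = ci_width(sigma=10, n=100, confidence=0.99)  # confidence raised

print(f"baseline width:          {base:.2f}")
print(f"sigma doubled:           {wider_sigma:.2f}")
print(f"sample size quadrupled:  {bigger_n:.2f}")
print(f"99% instead of 95%:      {higher_conf:.2f}")
```

Doubling the standard deviation doubles the width, while the sample size must be quadrupled to halve it, because the width shrinks with the square root of n.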
Example 1
During an audit of a company, the auditor wishes to estimate the mean outstanding account balance using a sample of 80 accounts. The sample of 80 accounts has a mean value of $86.05 and a standard deviation of $22.38.
Calculate a 95% confidence interval estimate of the mean outstanding account balance. Show your working and interpret your result.
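A rough numerical check of Example 1 is sketched below. Strictly, the population standard deviation is unknown, so the t formula with 79 degrees of freedom applies. The sketch uses the Z critical value as a large-sample approximation, because Python's standard library has no t quantile function. That substitution is an assumption of this sketch, not the seminar's method, and the t-based interval would be slightly wider.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 80, 86.05, 22.38   # sample size, sample mean, sample std deviation
confidence = 0.95

se = s / sqrt(n)                                      # standard error of the mean
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # approx. critical value (Z, not t)
margin = z * se                                       # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error  = {se:.3f}")
print(f"margin of error = {margin:.3f}")
print(f"95% CI (Z approximation): (${lower:.2f}, ${upper:.2f})")
```

Under this approximation the interval is roughly $81.15 to $90.95: we are 95% confident that the mean outstanding account balance lies in this range.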
Example 2: Average house prices
Refer to the data file provided for Seminar 3 (in week 3 on Moodle)
Claim: Houses in the school zone are more expensive, on average, than houses outside the school zone.
Let’s use the concepts of confidence interval to validate/invalidate this claim.
Example 3: Confidence Interval for proportions– Marketing Survey
Refer to the data file provided for Seminar 3 (in week 3 on Moodle)
Claim: More than half of our potential market has tried our frozen food product.
Let’s use the concepts of confidence interval for proportions to validate/invalidate this claim.
