
Tutorial Confidence Interval

Uploaded by

zqweo23
Copyright
© All Rights Reserved

The Confidence Intervals

Tutorial
Instructor: Prof. Jize ZHANG
Tutors: Wenjun JIANG
(wjiangbb)

CIVL 2160 – Modelling Systems with Uncertainties


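Confidence intervals

[Figure: a number line (ticks 0 and 10) showing the guess $\hat{\theta}$, the unknown value $\theta$, and the interval from $\hat{\theta} - \Delta$ to $\hat{\theta} + \Delta$, which extends a distance $\Delta$ on each side of $\hat{\theta}$. The figure carries the label "> 50%".]

$\theta$: unknown value
$\hat{\theta}$: the value we guess
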
Confidence intervals
A plausible range of values for the population parameter is called a confidence interval.

An example to better understand:


• Using only a sample statistic to estimate a parameter is like fishing in a murky lake
with a spear, and using a confidence interval is like fishing with a net.
• We can throw a spear where we saw a fish, but we will probably miss. If we toss a net
in that area, we have a good chance of catching the fish.

A confidence interval consists of a lower limit and an upper limit. It comes with a level of
confidence, a number that tells us how often intervals built this way would contain the true
value if the sampling were repeated.

General formula (Population mean)
Assumptions:
• Population is normally distributed
• Population standard deviation σ is known
• If population is not normal, use large sample (n > 30)

Then a $(1 - \alpha)100\%$ confidence interval for the mean $\mu$ is:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Source: Statistics How To

where $\bar{x}$ is the point estimate, also the sample mean.

→ A “95% confidence interval” for estimating the population mean is $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$.

Example 1
A sample of 400 circuits from a large normal population has a mean resistance of 2.20
ohms. We know from past testing that the population standard deviation is 0.35 ohms.
Determine a 95% confidence interval for the true mean resistance of the population.

What do we know?
• Mean resistance ➔ quantitative
• $n = 400$
• $\bar{x} = 2.20$ ohms and $\sigma = 0.35$ ohms
• 95% confidence ➔ $z = 1.96$

Solution:
$$95\%\,\text{CI} = \bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.0175) = 2.20 \pm 0.0343 = (2.1657 \text{ ohms},\ 2.2343 \text{ ohms})$$

https://poe.com/s/WRuXCG5Yss5j8eCmPYlc
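
The arithmetic in Example 1 can be checked with a short Python sketch. The function name `z_interval` is made up for illustration, and the critical value is taken from the standard library's `statistics.NormalDist` rather than rounded to 1.96, so the last digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # margin of error
    return mean - margin, mean + margin


# Example 1: n = 400 circuits, sample mean 2.20 ohms, sigma = 0.35 ohms
low, high = z_interval(mean=2.20, sigma=0.35, n=400)
print(f"95% CI: ({low:.4f}, {high:.4f}) ohms")  # 95% CI: (2.1657, 2.2343) ohms
```

Changing `confidence` to 0.90 or 0.99 shows how the interval narrows or widens for the same data.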
Confidence intervals

• Population mean
  - $\sigma$ known: Solved
  - $\sigma$ unknown: What can we do?
• Population proportion

Student t-distribution (𝜎 unknown)
Assumptions:
• Population standard deviation σ is unknown (we can substitute the sample standard
deviation, s)
• Population is normally distributed
• If population is not normal, use large sample (n > 30)

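Then a $(1 - \alpha)100\%$ confidence interval for the mean $\mu$ is:

$$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$$

The term after the ± sign is the margin of error. Compare with the $\sigma$-known case, $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.

The $t_{\alpha/2,\,n-1}$ value depends on the degrees of freedom (d.o.f.):

$$\text{d.o.f.} = n - 1$$
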
Example 2
Suppose the annual maximum stream flow of a given river has been observed for 10
years yielding the following statistics:
Sample mean = $\bar{x} = 10000$ cfs
Sample variance = $s^2 = 9 \times 10^6$ cfs²
a) Establish the two-sided 90% confidence interval on the mean annual maximum
stream flow. Assume a normal population.
b) If it is desired to estimate the mean annual maximum stream flow to within
± 1000 cfs with 90% confidence, how many additional years of observation will be
required? Assume the sample (not the true value) variance based on the new set of
data will be approximately $9 \times 10^6$ cfs².

Example 2 (cont.)
a) Establish the two-sided 90% confidence interval on the mean annual maximum
stream flow. Assume a normal population.

What do we know?
• Sample mean ➔ quantitative
• $n = 10$ ➔ t-distribution ➔ d.o.f. = 9
• $\bar{x} = 10^4$ cfs and $s^2 = 9 \times 10^6$ cfs²
• 90% confidence ➔ $t_{0.05,\,9} = 1.833$

Solution:
$$90\%\,\text{CI} = \bar{x} \pm 1.833 \frac{s}{\sqrt{n}} = 10000 \pm 1.833\,(948.683) = 10000 \pm 1738.936 = (8261 \text{ cfs},\ 11739 \text{ cfs})$$

https://poe.com/s/EKIlXkDsOQLs3zCBzVrv
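
Part (a) can be reproduced in a few lines of Python. This is a minimal sketch: the helper name `t_interval` is an assumption for illustration, and the critical value 1.833 is the table value quoted above rather than something the code computes.

```python
from math import sqrt


def t_interval(mean, s, n, t_crit):
    """Two-sided confidence interval for a mean when sigma is unknown.

    t_crit is t_{alpha/2, n-1}, read from a t-table.
    """
    margin = t_crit * s / sqrt(n)  # margin of error
    return mean - margin, mean + margin


s = sqrt(9e6)  # sample standard deviation, 3000 cfs
low, high = t_interval(mean=10_000, s=s, n=10, t_crit=1.833)
print(f"90% CI: ({low:.0f}, {high:.0f}) cfs")  # 90% CI: (8261, 11739) cfs
```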
Example 2 (cont.)
b) If it is desired to estimate the mean annual maximum stream flow to within
± 1000 cfs with 90% confidence, how many additional years of observation will be
required? Assume the sample (not the true value) variance based on the new set of
data will be approximately $9 \times 10^6$ cfs².

What do we know?
• Sample mean ➔ quantitative
• $\bar{x} = 10^4$ cfs and $s^2 = 9 \times 10^6$ cfs², so $s = 3000$ cfs
• 90% confidence ➔ $t_{0.05,\,n-1}$
• Margin of error within ±1000 cfs

Solution:
$$E = t_{0.05,\,n-1} \frac{s}{\sqrt{n}} \le 1000 \;\Rightarrow\; \frac{t_{0.05,\,n-1}}{\sqrt{n}} \le \frac{1000}{3000} = \frac{1}{3}$$

Because $n$ appears both in the critical value and under the square root, there is no closed-form solution. We can use ChatGPT or refer to the table to get $n$.

https://poe.com/s/9YCYT7PCHBcKK1DsiJQj Oooops!
Example 2 (cont.)
Refer to table:
We see that a sample size of 27 will do, hence an additional (27 − 10) = 17
years of observation will be required.

Coding:
https://colab.research.google.com/drive/1JvV5jXdIhFdqPjsi2c8bQrbzibqX11_x?usp=sharing

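
The table lookup in part (b) can also be done as a small search over n. The sketch below is not the linked notebook, whose contents are not shown here. It uses only the Python standard library and obtains the t critical value numerically, with Simpson's rule for the CDF and bisection for the quantile. All function names are assumptions for illustration; a statistics package would normally supply the quantile directly.

```python
import math


def t_cdf(x, df, steps=400):
    """CDF of Student's t at x >= 0, by Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3  # the density is symmetric about 0


def t_quantile(p, df):
    """Value t with CDF(t) = p, by bisection (assumes p >= 0.5, t < 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_n(s, margin, confidence=0.90, n_max=1000):
    """Smallest n with t_{alpha/2, n-1} * s / sqrt(n) <= margin."""
    p = 1 - (1 - confidence) / 2
    for n in range(2, n_max + 1):
        if t_quantile(p, n - 1) * s / math.sqrt(n) <= margin:
            return n
    raise ValueError("no n up to n_max meets the margin")


n = required_n(s=3000, margin=1000)
print(n, "years in total,", n - 10, "additional")  # 27 years in total, 17 additional
```

A linear search is enough here because the ratio $t_{0.05,\,n-1}/\sqrt{n}$ shrinks as n grows, so the first n that meets the bound is the smallest one.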
Confidence intervals

• Population mean
  - $\sigma$ known: Solved
  - $\sigma$ unknown: Solved
• Population proportion: What can we do?

General formula (Population proportion)
Upper and lower confidence limits for the population proportion are calculated with the
formula

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

where $\hat{p}$ is the sample proportion, approximately normal with mean $\mu_{\hat{p}} = p$.

Note: must have $np > 10$ and $n(1 - p) > 10$.

→ A “95% confidence interval” for estimating the proportion is $\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.

Determining sample size

For the mean

Margin of error:
$$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Necessary sample size:
$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2}$$

For the t-distribution, the procedure is similar, with $t_{\alpha/2,\,n-1}$.

For the proportion

Margin of error:
$$E = z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}}$$

Necessary sample size:
$$n = \frac{(z_{\alpha/2})^2\, p(1 - p)}{E^2}$$

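
For reference, each sample-size formula follows from its margin-of-error formula by solving for $n$. The steps below are a sketch of that algebra. Since $n$ must be a whole number, the result is rounded up to the next integer.

```latex
% Mean: solve E = z_{\alpha/2} \sigma / \sqrt{n} for n
\begin{align*}
E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
  \;\Rightarrow\; \sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E}
  \;\Rightarrow\; n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2}
\end{align*}

% Proportion: square both sides of E = z_{\alpha/2} \sqrt{p(1-p)/n}, then solve for n
\begin{align*}
E^2 = (z_{\alpha/2})^2\,\frac{p(1-p)}{n}
  \;\Rightarrow\; n = \frac{(z_{\alpha/2})^2\, p(1-p)}{E^2}
\end{align*}
```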
Example 3
a) A national survey of 900 women golfers was conducted to learn how women golfers
view their treatment at golf courses in the United States. The survey found that 396 of
the women golfers were satisfied with the availability of tee times. Suppose one
wants to develop a 95% confidence interval estimate for the proportion of the
population of women golfers satisfied with the availability of tee times.

What do we know?
• $n = 900$
• $\hat{p} = 396/900 = 0.44$
• 95% confidence ➔ $z = 1.96$

Solution:
$$95\%\,\text{CI} = \hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.44 \pm 1.96\,(0.0165) = 0.44 \pm 0.0324 = (0.4076,\ 0.4724)$$

https://poe.com/s/rBy47vWyeRMRoLQI0TAY
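
Part (a) can be checked with the following Python sketch. The function name `proportion_interval` is hypothetical. Because the true $p$ is unknown, the normality check uses $\hat{p}$ in its place, which is an assumption of this sketch and not something the slides state.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Two-sided confidence interval for a population proportion."""
    p_hat = successes / n
    # Normal-approximation check, using p_hat as a stand-in for the unknown p
    if n * p_hat <= 10 or n * (1 - p_hat) <= 10:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin


# Example 3(a): 396 of 900 golfers satisfied
low, high = proportion_interval(successes=396, n=900)
print(f"95% CI: ({low:.4f}, {high:.4f})")  # 95% CI: (0.4076, 0.4724)
```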
Example 3 (cont.)
b) Suppose the survey director wants to estimate the population proportion with a
margin of error of 0.025 at 95% confidence. How large a sample size is needed to
meet the required precision? (A previous sample of similar units yielded 0.44 for the
sample proportion.)

What do we know?
• $\hat{p} = 0.44$
• 95% confidence ➔ $z = 1.96$
• Margin of error = 0.025

Solution:
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \;\Rightarrow\; n = \frac{(z_{\alpha/2})^2\, \hat{p}(1 - \hat{p})}{E^2} = \frac{1.96^2\,(0.44)(0.56)}{0.025^2} = 1514.5 \;\Rightarrow\; 1515$$

https://poe.com/s/wUGCGx7EiHMoDkq6Gz15
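
The same calculation as a small Python sketch, with the result rounded up because a sample size must be a whole number. The function name `sample_size_for_proportion` and the parameter name `p_guess` are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_guess, margin, confidence=0.95):
    """Smallest n whose margin of error is at most `margin`.

    p_guess is a planning value for the proportion, e.g. from an earlier sample.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


# Example 3(b): planning value 0.44, margin of error 0.025
n = sample_size_for_proportion(p_guess=0.44, margin=0.025)
print(n)  # 1515
```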
Supplementary examples to help understand
A random sample of 50 college students were asked how many exclusive relationships
they have been in so far. This sample yielded a mean of 3.2. Estimate the true average
number of exclusive relationships using this sample (Assume the population standard
deviation is 1.74).

Solution:
$$95\%\,\text{CI} = \bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 3.2 \pm 1.96 \frac{1.74}{\sqrt{50}} = (2.7,\ 3.7)$$

Judge whether the following conclusions are correct based on the calculations.

Supplementary examples to help understand (cont.)
• True or False and explain: We are 95% confident that the average number of
exclusive relationships college students in this sample have been in is between 2.7
and 3.7.
False. The confidence interval $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$ definitely (100%) contains the
sample mean $\bar{x}$, not just with probability 95%.

• True or False and explain: 95% of college students have been in 2.7 to 3.7 exclusive
relationships.
False. The confidence interval is for covering the population mean $\mu$, not for
covering 95% of the entire population. If 95% of college students had been in
2.7 to 3.7 exclusive relationships, the standard deviation would not be as large as
1.74.
Supplementary examples to help understand (cont.)
• True or False and explain: There is 0.95 probability that the true mean number of
exclusive relationships of college students falls in the interval (2.7, 3.7).
• True or False and explain: The interval (2.7, 3.7) has probability of 0.95 of enclosing
the true mean number of exclusive relationships of college students.
Both are False. The population mean 𝜇 is a fixed number, not random. It is either
in the interval (2.7, 3.7), or not in the interval. There is no uncertainty involved.
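
One way to see what the 95% does describe is to repeat the sampling many times and count how often the interval covers $\mu$. The simulation below is a sketch under stated assumptions: the population is taken to be normal, and the "true" mean is set to 3.2 purely for illustration, since in practice $\mu$ is unknown. The function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample of size n and build its interval
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


# Assumed values: mu = 3.2 (hypothetical), sigma = 1.74 and n = 50 as in the example
rate = coverage(mu=3.2, sigma=1.74, n=50)
print(f"fraction of intervals covering mu: {rate:.3f}")
```

Each simulated interval either contains $\mu$ or does not. The fraction that do should be close to 0.95, which is a property of the procedure and not of any single interval such as (2.7, 3.7).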
