Interval Estimation: Part 1
(Module 3)
Statistics (MAST20005) & Elements of Statistics (MAST90058)
Semester 2, 2018
Contents
1 The need to quantify uncertainty
2 Standard error
3 Confidence intervals
3.1 Introduction
3.2 Definition
3.3 Important distributions
3.4 Pivots
3.5 Common scenarios
We have learnt how to do basic inference, using point estimates. What’s next?
1 The need to quantify uncertainty
How useful are point estimates?
Example: surveying Melbourne residents as part of a disability study. The results will be used to set a budget for
disability support.
Estimate from survey: 5% of residents are disabled
What can we conclude?
Estimate from a second survey: 2% of residents are disabled
What can we now conclude?
What other information would be useful to know?
2 Standard error
Report sd(Θ̂)? Usually we can't compute it exactly, since it depends on unknown parameters. Instead, estimate sd(Θ̂)!
Standard error
The standard error of an estimate is the estimated standard deviation of the estimator.
Notation:
• Parameter: θ
• Estimator: Θ̂
• Estimate: θ̂
• Standard deviation of the estimator: sd(Θ̂)
• Standard error of the estimate: se(θ̂)
Note: some people also refer to the standard deviation of the estimator as the standard error. This is potentially
confusing, so it is best avoided.
More info:
• First survey: 5% ± 4%
• Second survey: 2% ± 0.1%
What would we now conclude?
What result should we use for setting the disability support budget?
3 Confidence intervals
3.1 Introduction
Interval estimates
Example
Random sample (iid): X1 , . . . , Xn ∼ N(µ, 1). Then √n(X̄ − µ) ∼ N(0, 1), so
Pr(−1.96 < √n(X̄ − µ) < 1.96) = 0.95
[Figure: standard normal pdf, with the central region between −1.96 and 1.96 shaded.]
Rearranging gives:
Pr(X̄ − 1.96/√n < µ < X̄ + 1.96/√n) = 0.95
This says that the interval (X̄ − 1.96/√n, X̄ + 1.96/√n) has probability 0.95 of containing the parameter µ.
We use this as an interval estimator.
The resulting interval estimate, (x̄ − 1.96/√n, x̄ + 1.96/√n), is called a 95% confidence interval for µ.
Example
• Can write it more formally as a bivariate normal distribution:
(L, U)ᵀ ∼ N₂((µ − 1.96/√n, µ + 1.96/√n)ᵀ, Σ), where every entry of the 2 × 2 covariance matrix Σ is 1/n
Interpretation
• This interval estimator is a random interval and is calculable from our sample. The parameter is fixed and
unknown.
• Before the sample is taken, the probability the random interval contains µ is 95%.
• After the sample is taken, we have a realised interval. It no longer has a probabilistic interpretation; it either
contains µ or it doesn’t.
• This makes the interpretation somewhat tricky. We argue simply that it would be unlucky if our interval did
not contain µ.
• In this example, the interval happens to be of the form, est ± error. This will be the case for many of the
confidence intervals we derive.
Random sample (iid): X1 , . . . , Xn ∼ N(µ, σ 2 ), and assume that we know the value of σ 2 .
The sampling distribution of the sample mean is X̄ ∼ N(µ, σ²/n). Let Φ⁻¹(1 − α/2) = c, so we can write:
Pr(−c < (X̄ − µ)/(σ/√n) < c) = 1 − α
or, equivalently,
Pr(µ − c·σ/√n < X̄ < µ + c·σ/√n) = 1 − α
Rearranging gives:
Pr(X̄ − c·σ/√n < µ < X̄ + c·σ/√n) = 1 − α
The following random interval contains µ with probability 1 − α:
(X̄ − c·σ/√n, X̄ + c·σ/√n)
Observe x̄ and construct the interval. This gives a 100 · (1 − α)% confidence interval for the population mean µ.
Worked example
Suppose X ∼ N(µ, 36²) represents the lifetime of a light bulb, in hours. Test 27 bulbs, observe x̄ = 1478.
Let c = Φ⁻¹(0.975) = 1.96. A 95% confidence interval for µ is:
x̄ ± c·σ/√n = 1478 ± 1.96 × 36/√27 = [1464, 1492]
In other words, we have good evidence that the mean lifetime for a light bulb is approximately 1,460–1,490 hours.
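This calculation is easy to reproduce in R. The following is a minimal sketch of the computation above (the variable names are ours, not from the notes):

# 95% CI for a normal mean when sigma is known: xbar +/- z * sigma / sqrt(n)
n     <- 27    # number of bulbs tested
xbar  <- 1478  # observed mean lifetime (hours)
sigma <- 36    # known population standard deviation
z <- qnorm(0.975)                      # 0.975 quantile of N(0, 1), approximately 1.96
xbar + c(-1, 1) * z * sigma / sqrt(n)  # approximately [1464, 1492]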
[Figure: standard normal pdf, with the central region between −1.645 and 1.645 shaded; the shaded probability is 90%.]
3.2 Definition
Definitions
• An interval estimate is a pair of statistics defining an interval that aims to convey an estimate (of a parameter)
with uncertainty.
• A confidence interval is an interval estimate constructed such that the corresponding interval estimator has a
specified probability, known as the confidence level, of containing the true value of the parameter being estimated.
• We often use the abbreviation CI for ‘confidence interval’.
General technique for deriving a CI
• Start with an estimator, T , whose sampling distribution is known
• Write the central probability interval based on its sampling distribution, Pr(a < T < b) = 1 − α
• Rearrange the inequalities to isolate the parameter, then substitute the observed value of the statistic
Take a random sample of size n from an exponential distribution with rate parameter λ.
1. Derive an exact 95% confidence interval for λ.
2. Suppose your sample is of size 9 and has sample mean 3.93.
(a) What is your 95% confidence interval for λ?
(b) What is your 95% confidence interval for the population mean?
3. Repeat the above using the CLT approximation (rather than an exact interval).
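One way to check your answers is numerically in R. The sketch below uses the standard fact that 2λΣXᵢ ∼ χ²(2n) for an iid exponential sample, which provides the exact pivot; the last line uses the CLT approximation instead.

# Exact CI for an exponential rate, via the pivot 2 * lambda * sum(X) ~ chi-squared(2n)
n    <- 9
xbar <- 3.93
total <- n * xbar                        # observed sum of the sample
ci_lambda <- qchisq(c(0.025, 0.975), df = 2 * n) / (2 * total)  # exact 95% CI for lambda
ci_mean   <- rev(1 / ci_lambda)          # exact 95% CI for the mean, 1/lambda
# CLT approximation: Xbar is approximately N(mu, mu^2 / n), with mu estimated by xbar
ci_mean_clt <- xbar + c(-1, 1) * qnorm(0.975) * xbar / sqrt(n)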
Recap
• A point estimate is a single number that is our ‘best guess’ at the true parameter value. In other words, it is
meant to be the ‘most plausible’ value for the parameter, given the data.
• However, this doesn’t allow us to adequately express our uncertainty about this estimate.
• An interval estimate aims to provide a range of values that are plausible based on the observed data. This
allows us to more adequately express our uncertainty about the estimate, by giving an indication of the various
plausible alternative true values.
• The most common type of interval estimate is a confidence interval.
Width of CIs
Interpreting CIs
• Narrower width usually indicates stronger/greater evidence about the plausible true values for the parameter
being estimated
• Very wide CI ⇒ usually cannot conclude much other than that we have insufficient data
• Moderately wide CI ⇒ conclusions often depend on the location of the interval
• Narrow CI ⇒ more confident about the possible true values, often can be more conclusive
• What constitutes ‘wide’ or ‘narrow’, and how conclusive/useful the CI actually is, will depend on the context of
the study question
3.3 Important distributions
Chi-squared distribution
• Also written as χ²-distribution
• Single parameter: k > 0, known as the degrees of freedom
• Notation: T ∼ χ²ₖ or T ∼ χ²(k)
• The pdf is:
f(t) = t^(k/2 − 1) e^(−t/2) / (2^(k/2) Γ(k/2)), t ≥ 0
• When sampling from a normal distribution, the scaled sample variance follows a χ²-distribution:
(n − 1)S²/σ² ∼ χ²(n − 1)
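A quick simulation can make this result concrete. The following R sketch (with arbitrary illustrative values of n, µ and σ) compares simulated quantiles of (n − 1)S²/σ² with the corresponding χ²(n − 1) quantiles:

# Simulate (n - 1) * S^2 / sigma^2 for normal samples; compare with chi-squared(n - 1)
set.seed(1)
n <- 10; mu <- 5; sigma <- 2            # arbitrary illustrative values
stat <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  (n - 1) * var(x) / sigma^2
})
quantile(stat, c(0.05, 0.5, 0.95))      # empirical quantiles
qchisq(c(0.05, 0.5, 0.95), df = n - 1)  # theoretical quantiles; should closely match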
Student’s t-distribution
• Also known simply as the t-distribution
• Single parameter: k > 0, the degrees of freedom (same as for χ²)
• Notation: T ∼ tk or T ∼ t(k)
• The pdf is:
f(t) = [Γ((k + 1)/2) / (√(kπ) Γ(k/2))] (1 + t²/k)^(−(k+1)/2), −∞ < t < ∞
• Moments: E(T) = 0, if k > 1; var(T) = k/(k − 2), if k > 2
• The t-distribution is similar to a standard normal but with heavier tails
• As k → ∞, tk → N(0, 1)
• If Z ∼ N(0, 1) and U ∼ χ²(r) are independent, then
T = Z/√(U/r) ∼ t(r)
• This arises when considering the sampling distributions of statistics from a normal distribution, in particular:
T = [(X̄ − µ)/(σ/√n)] / √[((n − 1)S²/σ²)/(n − 1)] = (X̄ − µ)/(S/√n) ∼ t(n − 1)
F-distribution
• Also known as the Fisher-Snedecor distribution
• Parameters: m, n > 0, the degrees of freedom (same as before)
• Notation: W ∼ Fm,n or W ∼ F(m, n)
• If U ∼ χ²(m) and V ∼ χ²(n) are independent then
F = (U/m)/(V/n) ∼ F(m, n)
3.4 Pivots
Pivots
Recall our general technique that starts with a probability interval using a statistic with a known sampling distribution: Pr(a < T < b) = 1 − α.
The easiest way to make this technique work is by finding a function of the data and the parameters, Q(X1 , . . . , Xn ; θ),
whose distribution does not depend on the parameters. In other words, it is a random variable that has the same
distribution regardless of the value of θ.
The quantity Q(X1 , . . . , Xn ; θ) is called a pivot or a pivotal quantity.
Remarks about pivots
• The value of the pivot can depend on the parameters, but its distribution cannot.
• Since pivots are a function of the parameters as well as the data, they are usually not statistics.
• If a pivot is also a statistic, then it is called an ancillary statistic.
Examples of pivots
• We have already seen the following result for sampling from a normal distribution with known variance:
Z = (X̄ − µ)/(σ/√n) ∼ N(0, 1).
3.5 Common scenarios
Normal distribution:
• Inference for a single mean
– Known σ
– Unknown σ
• Comparison of two means
– Known σ
– Unknown σ
– Paired samples
• Inference for a single variance
• Comparison of two variances
Proportions:
• Inference for a single proportion
• Comparison of two proportions
Normal, single mean, known σ
Random sample (iid): X1 , . . . , Xn ∼ N(µ, σ²), and assume that we know the value of σ.
We’ve seen this scenario already in previous examples.
Use the pivot:
Z = (X̄ − µ)/(σ/√n) ∼ N(0, 1).
Normal, single mean, unknown σ
Use the pivot:
T = (X̄ − µ)/(S/√n) ∼ t(n − 1)
Let c be the 1 − α/2 quantile of t(n − 1), so that Pr(−c < T < c) = 1 − α. Rearranging gives:
Pr(X̄ − c·S/√n < µ < X̄ + c·S/√n) = 1 − α
X ∼ N(µ, σ²) is the amount of butterfat produced by a cow. Examining n = 20 cows results in x̄ = 507.5 and
s = 89.75. Let c be the 0.95 quantile of t(19), which gives c = 1.729. Therefore, a 90% confidence interval for µ is:
507.50 ± 1.729 × 89.75/√20 = [472.80, 542.20]
> butterfat
[1] 481 537 513 583 453 510 570 500 457 555 618 327
[13] 350 643 499 421 505 637 599 392
> t.test(butterfat, conf.level = 0.90)
data: butterfat
t = 25.2879, df = 19, p-value = 4.311e-16
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
472.7982 542.2018
sample estimates:
mean of x
507.5
> sd(butterfat)
[1] 89.75082
[Figure: normal QQ plot of the butterfat data (sample quantiles against theoretical quantiles); the points lie reasonably close to a straight line.]
Remarks
• CIs based on a t-distribution (or a normal distribution) are of the form:
est ± c × se
for an appropriate quantile, c, which depends on the sample size (n) and the confidence level (1 − α).
• The t-distribution is appropriate if the sample is from a normally distributed population.
• Can check using a QQ plot (in this example, looks adequate).
• If not normal but n is large, can construct approximate CIs using the normal distribution (as we did in a previous
example). This is usually okay if the distribution is continuous, symmetric and unimodal (i.e. has a single ‘mode’,
or maximum value).
• If not normal and n small, distribution-free methods can be used. We will cover these later in the semester.
Normal, two means, known σ
Suppose we have two populations, with means µX and µY , and want to know how much they differ.
Random samples (iid) from each population: X1 , . . . , Xn ∼ N(µX , σ²X) and Y1 , . . . , Ym ∼ N(µY , σ²Y)
The two samples must be independent of each other.
Assume σ²X and σ²Y are known. Then we have the following pivot (why?):
(X̄ − Ȳ − (µX − µY)) / √(σ²X/n + σ²Y/m) ∼ N(0, 1)
Normal, two means, unknown σ, large samples
What if we don’t know σ²X and σ²Y?
If n and m are large, we can just replace σX and σY by estimates, e.g. the sample standard deviations SX and SY .
Rationale: these will be good estimates when the sample size is large.
The (approximate) pivot is then:
(X̄ − Ȳ − (µX − µY)) / √(S²X/n + S²Y/m) ≈ N(0, 1)
Normal, two means, unknown common σ, small samples
What if n and m are small? If we can assume the two populations have a common variance, σ²X = σ²Y = σ², we can use the pivot:
T = (X̄ − Ȳ − (µX − µY)) / (SP √(1/n + 1/m)) ∼ t(n + m − 2)
where
SP = √[((n − 1)S²X + (m − 1)S²Y) / (n + m − 2)]
is the pooled estimate of the common variance.
Note that the unknown σ has disappeared (cancelled out), therefore making T a pivot (why?).
We can now find the quantile c so that
Pr(−c < T < c) = 1 − α
and rearranging as usual gives a 100 · (1 − α)% confidence interval for µX − µY :
x̄ − ȳ ± c · sP √(1/n + 1/m)
where
sP = √[((n − 1)s²X + (m − 1)s²Y) / (n + m − 2)]
Example (normal, two means, unknown common variance)
Two independent groups of students take the same test. Assume the scores are normally distributed and have a
common unknown population variance.
We have sample sizes n = 9 and m = 15, and get the following summary statistics: x̄ = 81.31, ȳ = 78.61, s²x = 60.76,
s²y = 48.24.
The pivot has 9 + 15 − 2 = 22 degrees of freedom. Using the 0.975 quantile of t(22), which is 2.074, the 95% confidence
interval is:
81.31 − 78.61 ± 2.074 × √[(8 × 60.76 + 14 × 48.24)/22] × √(1/9 + 1/15) = [−3.65, 9.05]
[Figure: t density with 22 df; the central region between −2.074 and 2.074 is shaded, with probability 95%.]
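The interval can be reproduced in R directly from the summary statistics; a minimal sketch:

# Pooled two-sample t interval for mu_X - mu_Y, from summary statistics
n <- 9; m <- 15
xbar <- 81.31; ybar <- 78.61
s2x  <- 60.76; s2y  <- 48.24
sp <- sqrt(((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2))  # pooled SD estimate
cc <- qt(0.975, df = n + m - 2)                            # approximately 2.074
(xbar - ybar) + c(-1, 1) * cc * sp * sqrt(1/n + 1/m)       # approximately [-3.65, 9.05]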
Normal, two means, unknown and unequal σ
What if the sample sizes are small and we are pretty sure that σ²X ≠ σ²Y? Then we can use Welch’s approximation:
W = (X̄ − Ȳ − (µX − µY)) / √(S²X/n + S²Y/m) ≈ t(r)
where the degrees of freedom r is estimated from the data (R does this automatically; it is typically not a whole number).
We measure the force required to pull wires apart for two types of wire, X and Y . We take 20 measurements for each
wire.
1 2 3 4 5 6 7 8 9 10
X 28.8 24.4 30.1 25.6 26.4 23.9 22.1 22.5 27.6 28.1
Y 14.1 12.2 14.0 14.6 8.5 12.6 13.7 14.8 14.1 13.2
11 12 13 14 15 16 17 18 19 20
X 20.8 27.7 24.4 25.1 24.6 26.3 28.2 22.2 26.3 24.4
Y 12.1 11.4 10.1 14.2 13.6 13.1 11.9 14.8 11.1 13.5
[Figure: side-by-side box plots of the X and Y measurements, showing clearly different centres and possibly different spreads.]
Welch (the default in R):
> t.test(X, Y, conf.level = 0.95)
t = 18.8003
df = 33.086
95% CI: 11.23214 13.95786
Pooled variance:
> t.test(X, Y,
+ conf.level = 0.95,
+ var.equal = TRUE)
t = 18.8003
df = 38
95% CI: 11.23879 13.95121
Remarks
• From the box plots: the population means look very different, and possibly also the spreads
• The Welch approximate t-distribution is appropriate, so a 95% confidence interval is 11.23–13.96
• If we assumed equal variances, the confidence interval becomes slightly narrower, 11.24–13.95
• Not a big difference!
Example (normal, paired samples)
The reaction times (in seconds) to a red or green light for 8 people are given in the following table. Find a 95% CI for the mean
difference in reaction time.
Red (X) Green (Y ) D =X −Y
1 0.30 0.24 0.06
2 0.43 0.27 0.16
3 0.23 0.36 −0.13
4 0.32 0.41 −0.09
5 0.41 0.38 0.03
6 0.58 0.38 0.20
7 0.53 0.51 0.02
8 0.46 0.61 −0.15
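The calculation for this example is not shown above. A sketch of the standard approach in R, which reduces the paired problem to a one-sample t interval on the differences:

# Paired-sample CI: a one-sample t interval applied to the differences D = X - Y
red   <- c(0.30, 0.43, 0.23, 0.32, 0.41, 0.58, 0.53, 0.46)  # reaction times to red
green <- c(0.24, 0.27, 0.36, 0.41, 0.38, 0.38, 0.51, 0.61)  # reaction times to green
t.test(red, green, paired = TRUE, conf.level = 0.95)$conf.int
# Equivalent manual computation:
d <- red - green
mean(d) + c(-1, 1) * qt(0.975, df = length(d) - 1) * sd(d) / sqrt(length(d))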
Normal, single variance
Random sample (iid): X1 , . . . , Xn ∼ N(µ, σ²). Use the pivot (n − 1)S²/σ² ∼ χ²(n − 1).
Now we need the α/2 and 1 − α/2 quantiles of χ²(n − 1). Call these a and b. In other words,
Pr(a < (n − 1)S²/σ² < b) = 1 − α
Rearranging gives
1 − α = Pr(a/((n − 1)S²) < 1/σ² < b/((n − 1)S²))
      = Pr((n − 1)S²/b < σ² < (n − 1)S²/a)
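In R, this interval is just two χ² quantiles away; a sketch with hypothetical inputs:

# CI for a normal variance: ((n - 1) s^2 / b, (n - 1) s^2 / a)
n  <- 13     # hypothetical sample size
s2 <- 10.7   # hypothetical sample variance
alpha <- 0.05
a <- qchisq(alpha / 2, df = n - 1)      # lower chi-squared quantile
b <- qchisq(1 - alpha / 2, df = n - 1)  # upper chi-squared quantile
(n - 1) * s2 * c(1 / b, 1 / a)          # 95% CI for sigma^2
sqrt((n - 1) * s2 * c(1 / b, 1 / a))    # corresponding CI for sigma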
Normal, two variances
Now we wish to compare the variances of two normally distributed populations. Random samples (iid) from each
population: X1 , . . . , Xn ∼ N(µX , σ²X) and Y1 , . . . , Ym ∼ N(µY , σ²Y)
We will compute a confidence interval for σ²X/σ²Y. Start by defining:
(S²Y/σ²Y) / (S²X/σ²X) = [((m − 1)S²Y/σ²Y) / (m − 1)] / [((n − 1)S²X/σ²X) / (n − 1)]
This is the ratio of independent χ² random variables divided by their degrees of freedom and hence has an F(m − 1, n − 1)
distribution. This doesn’t depend on the parameters and is thus a pivot.
We now need the α/2 and 1 − α/2 quantiles of F(m − 1, n − 1). Call these c and d. In other words,
1 − α = Pr(c < (S²Y/σ²Y)/(S²X/σ²X) < d) = Pr(c · S²X/S²Y < σ²X/σ²Y < d · S²X/S²Y)
Rearranging gives the 100 · (1 − α)% confidence interval for σ²X/σ²Y as
(c · s²x/s²y, d · s²x/s²y)
Continuing from the previous example, n = 13 and 12s²x = 128.41. A sample of m = 9 seeds from a second strain
gave 8s²y = 36.72.
The 0.01 and 0.99 quantiles of F(8, 12) are 0.176 and 4.50.
Then a 98% confidence interval for σ²X/σ²Y is
(0.176 × (128.41/12)/(36.72/8), 4.50 × (128.41/12)/(36.72/8)) = [0.41, 10.49]
Single proportion
• Observe n Bernoulli trials with unknown probability p of success
• We want a confidence interval for p
• Recall that the sample proportion of successes, p̂ = X̄ = X/n (where X is the total number of successes), is the
maximum likelihood estimator for p and is unbiased for p
• The central limit theorem shows for large n,
(X − np)/√(np(1 − p)) = (p̂ − p)/√(p(1 − p)/n) ≈ N(0, 1)
• Rearranging the corresponding probability statement as usual and estimating p by p̂ gives the approximate
100 · (1 − α)% confidence interval as
p̂ ± c √(p̂(1 − p̂)/n)
Example (single proportion)
• In the Newspoll of 3rd April 2017, 36% of 1,708 voters sampled said they would vote for the Government first
if an election were held on that day. What is a 95% confidence interval for the population proportion of voters
who would vote for the Government first?
• The sample proportion has an approximate normal distribution since the sample size is large, so the required
confidence interval is:
0.36 ± 1.96 √(0.36 × 0.64/1708) = [0.337, 0.383]
• It might be nice to round to the nearest percentage for this example. This gives us the final interval: 34%–38%
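A one-line check in R:

# Approximate 95% CI for a single proportion (normal approximation)
phat <- 0.36; n <- 1708
phat + c(-1, 1) * qnorm(0.975) * sqrt(phat * (1 - phat) / n)  # approximately [0.337, 0.383]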
Two proportions
• We now wish to compare proportions between two different samples: Y1 ∼ Bi(n1 , p1 ), Y2 ∼ Bi(n2 , p2 )
• Use the approximate pivot
(p̂1 − p̂2 − (p1 − p2)) / √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≈ N(0, 1)
Example (two proportions)
Two detergents: the first was successful in 63 out of 91 trials, the second in 42 out of 79.
Summary statistics: p̂1 = 0.692, p̂2 = 0.532
90% confidence interval for the difference in proportions is:
0.692 − 0.532 ± 1.645 × √(0.692 × 0.308/91 + 0.532 × 0.468/79) = [0.038, 0.282]
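And a minimal R sketch reproducing this interval:

# Approximate 90% CI for the difference of two proportions
p1 <- 63 / 91; n1 <- 91
p2 <- 42 / 79; n2 <- 79
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # estimated standard error
(p1 - p2) + c(-1, 1) * qnorm(0.95) * se              # approximately [0.038, 0.282]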