Lecture 7 With Solutions1
Kavitha Neelangol 1
Lecture 7
Topics covered
Hypothesis Testing
Introduction
Pretend that Snickers (or some other chocolate bar brand) publicly
claims that their product contains, on average, 6 peanuts per bar. Of
course, there is some inherent error at the manufacturing plant — so not
all bars will contain exactly 6.
Say you opened up a bar and counted only 5 peanuts. This might make
you want to investigate whether or not their claim is true.
So you buy 20 more chocolate bars. If you found that the average
number of peanuts was 4 in that sample, you would probably decide that
Snickers is advertising falsely.
What if the average was 5.9 peanuts, or 5.8, or 5.7, 5.6, 5.5...at what
average would we become concerned that Snickers was lying?
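To decide, we ask: if the claim were true (an average of 6 peanuts per bar),
what is the probability of observing a sample average as low as the one we found?
The lower that probability, the more evidence that they might be lying.
The higher that probability, the less evidence that they might be lying.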
General Example of a Hypothesis Test
Suppose it is assumed that the mean of something is 8, but we want to
disprove that and check whether the mean is actually 25. Thus, we wish to
test:
H0 : µ = 8 against HA : µ = 25
So if H0 is true then our observations come from:
The key intuition is: how far away from 8 must the observations (x∗ ) be
for us to reject the null hypothesis that the true value of µ is 8?
Suppose our sample gives us x̄= 12. In this case the data is more likely to
come from the first histogram (µ = 8), rather than the one with µ = 25.
Similarly, if x̄= 55 then the data is more likely to have come from the
second histogram (µ = 25), rather than the one with µ = 8.
The procedure is to choose a point c (called critical value) and then reject
H0 if our observations are more extreme than c, and not reject H0 if our
observations are less extreme than c.
The area beyond c is known as the Rejection Region.
Now let’s think: for a given c, what is the probability that we reject H0 by
mistake (i.e., when the true value of µ was actually 8)?
In jargon, rejecting H0 by mistake is called a “Type I error”, and the
probability of making it is the level of significance, denoted by the letter α.
Our problem of finding a meaningful critical value c has been solved:
first, choose a significance level α. Then find the point c such that the
area beyond c equals α. Having c, one only needs to compare it to the observation
(test statistic) x∗ from the data. If x∗ > c, then we reject H0 : µ = 8,
knowing that we might be wrong in doing so with probability α.
The most frequent values used for α are 0.01, 0.05 and 0.1.
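As a rough illustration of how c follows from α, here is a minimal Python sketch. It assumes the test statistic is normally distributed under H0 with mean 8 and a standard error of 5; the value 5 and the names MU_0, STD_ERROR and critical_value are hypothetical choices for this sketch, not part of the lecture.

```python
from statistics import NormalDist

# Assumptions for illustration only: under H0 the test statistic is
# Normal with mean 8 and a hypothetical standard error of 5.
MU_0 = 8.0
STD_ERROR = 5.0  # hypothetical value, not from the lecture


def critical_value(alpha: float) -> float:
    """Upper-tail cutoff c such that P(X > c | H0 true) = alpha."""
    return NormalDist(MU_0, STD_ERROR).inv_cdf(1.0 - alpha)


# The three most common significance levels.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}  ->  c = {critical_value(alpha):.2f}")
```

Under these assumptions a smaller α pushes c further away from 8, so stronger evidence is needed before H0 is rejected.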
Recall that if x∗ > c then we reject H0, and if x∗ < c then we do not reject
H0.
Another way of making this decision is to find the area beyond x∗ and
compare it with α.
The area beyond x∗ is called the p-value.
If p-value < α then we reject H0.
If p-value > α then we do not reject H0.
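The two decision rules always agree. The sketch below checks this for the sample value 12 from the general example, again assuming (hypothetically) that the test statistic is Normal(8, 5) under H0; the standard error of 5 is an illustrative choice only.

```python
from statistics import NormalDist

# Hypothetical setup: under H0 the test statistic is Normal(mean 8, sd 5).
null_dist = NormalDist(mu=8.0, sigma=5.0)
alpha = 0.05
x_star = 12.0  # observed test statistic from the general example

c = null_dist.inv_cdf(1.0 - alpha)      # critical value
p_value = 1.0 - null_dist.cdf(x_star)   # area beyond x*

reject_by_c = x_star > c        # rejection-region rule
reject_by_p = p_value < alpha   # p-value rule

print(f"c = {c:.2f}, p-value = {p_value:.4f}")
print("reject H0 (critical value rule):", reject_by_c)
print("reject H0 (p-value rule):       ", reject_by_p)
```

Both rules give the same answer because x∗ > c holds exactly when the area beyond x∗ is smaller than the area beyond c, which is α.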
General Steps of a Hypothesis Test
Step 2: Collect data, and calculate the appropriate test statistic (x∗ ).
Each type of test (i.e., for a mean, a difference of means, a proportion, etc.)
has a different test statistic.
The Conjecture
When hypothesis testing we are interested in making an inference about the
population.
Thus we will always be making conjectures about a population parameter (we
will use θ for the general case).
H0 : θ = θ0
The alternative hypothesis depends on what you wish to test. In this course
we cover 3 different types of alternative hypothesis:
HA : θ ≠ θ0 (two-tailed), HA : θ > θ0 (upper-tailed), or HA : θ < θ0 (lower-tailed).
The Test Statistic
As we go through different types of tests, I will show you the formula for the
corresponding test statistic.
Usually we use a Z ∗ or T ∗ to denote the test statistic (depending on which
distribution the estimator follows.)
The general form of a test statistic is:
\[ \text{test statistic} = \frac{\text{relevant statistic} - \text{hypothesized parameter}}{\text{standard error of the relevant statistic}} = \frac{\text{relevant statistic} - \theta_0}{\text{standard error of the relevant statistic}} \]
Decision Rule: Rejection Region
If the test statistic follows a Z distribution, it is denoted Z∗, and:
Test Tail    HA         Rejection Region
two          θ ≠ θ0     |Z∗| > zα/2
upper        θ > θ0     Z∗ > zα
lower        θ < θ0     Z∗ < −zα
zα/2, zα, or −zα are the cut off points for the RR, so they are called the critical
value of the test.
If the test statistic follows a Student-t distribution, it is denoted T∗, and:
Test Tail    HA         Rejection Region
two          θ ≠ θ0     |T∗| > tα/2
upper        θ > θ0     T∗ > tα
lower        θ < θ0     T∗ < −tα
tα/2, tα, or −tα are our cut off points for the RR, also known as the critical
value of the test.
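The Z table above can be sketched as a small Python function. The function name in_rejection_region and its tail labels are hypothetical; only the Z case is shown, because the Python standard library has no Student-t quantile function (a t-table or a statistics package would be needed for T∗).

```python
from statistics import NormalDist


def in_rejection_region(z_star: float, alpha: float, tail: str) -> bool:
    """Return True if Z* falls in the rejection region for the given tail."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "two":
        # Reject when |Z*| > z_{alpha/2}
        return abs(z_star) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    if tail == "upper":
        # Reject when Z* > z_alpha
        return z_star > std_normal.inv_cdf(1.0 - alpha)
    if tail == "lower":
        # Reject when Z* < -z_alpha
        return z_star < -std_normal.inv_cdf(1.0 - alpha)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


print(in_rejection_region(2.44, 0.05, "upper"))
print(in_rejection_region(-1.63, 0.05, "lower"))
```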
Decision Rule: P Value
Hypothesis Tests for a Single Mean - when σ is known
Here our parameter of interest is µ, which will be used in H0 and HA.
H0 : µ = µ0 vs HA : µ ≠ µ0 or µ > µ0 or µ < µ0
Hypothesis Tests for a Single Mean
Example 8.1: MADD claims that the mean blood alcohol concentration
(BAC) of drunk drivers is over 0.1. To test this claim, a random sample of 36
drivers arrested for drinking and driving was taken and yielded a mean BAC
of 0.19. The population variance is 0.049. Use a 5% significance level to test
the claim with a rejection region.
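A minimal Python sketch of the rejection-region calculation for Example 8.1 follows. It assumes the hypotheses H0 : µ = 0.1 vs HA : µ > 0.1 and the usual known-σ statistic Z∗ = (x̄ − µ0)/(σ/√n); neither is shown explicitly in the surviving slide text, so treat both as assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.1
n = 36                 # sample size
x_bar = 0.19           # sample mean BAC
mu_0 = 0.10            # hypothesized mean under H0 (assumed H0: mu = 0.1)
sigma = sqrt(0.049)    # population sd, from the population variance
alpha = 0.05

# Assumed test statistic for a single mean with sigma known.
z_star = (x_bar - mu_0) / (sigma / sqrt(n))

# Upper-tail test (HA: mu > 0.1), so the critical value is z_alpha.
z_alpha = NormalDist().inv_cdf(1.0 - alpha)
reject_h0 = z_star > z_alpha

print(f"Z* = {z_star:.2f}, z_alpha = {z_alpha:.3f}, reject H0: {reject_h0}")
```

Here Z∗ rounds to 2.44, which exceeds z0.05 = 1.645, so the test statistic falls in the rejection region.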
Example 8.2: Using the same info as Example 8.1 test the claim with a
p-value.
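Since this is an upper tail test, with Z∗ = (0.19 − 0.1)/(√0.049/√36) = 2.44, our p-value is:
P(Z > Z∗ ) = P(Z > 2.44) = 1 − P(Z < 2.44) = 1 − 0.9927 = 0.0073
p-value = 0.0073 < 0.05 = α
So our results lead us to Reject H0.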
Thus, there is sufficient evidence to conclude that the mean BAC exceeds 0.1.
Example 8.1(b): Using the same info as Example 8.1 test the claim that the
mean differs from 0.2.
Example 8.1(c): Using the same info as Example 8.1 test the claim that the
mean is below 0.25.
Since this is a lower tail test, our p-value is: P(Z < Z∗ ) = P(Z < −1.63) = 0.0516
p-value = 0.0516 > 0.05 = α
So our results lead us to Fail to Reject H0.
Thus, there is insufficient evidence to conclude that the mean BAC is below
0.25.
Note that Examples 8.1(b) and 8.1(c) serve only to demonstrate the different types
of hypothesis tests (two-tailed, lower-tailed, upper-tailed). It is not good
practice to perform multiple tests on different hypotheses about one
parameter.
Hypothesis Tests for a Single Mean - when σ is unknown
Here our parameter of interest is µ, which will be used in H0 and HA.
H0 : µ = µ0 vs HA : µ ≠ µ0 or µ > µ0 or µ < µ0
If all three of the above conditions are satisfied then the test statistic is:
\[ T^{*} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]
which follows a Student-t with n − 1 degrees of freedom.
Example 8.3: The lengths of femur bones in newborns are assumed to be
normally distributed. It is claimed that the mean length is 75mm. A sample of
16 newborns had a mean femur length of 74.6mm and a standard deviation of
0.81mm. Test at the 1% significance level whether the true mean differs from
75mm, using a rejection region.
Solution: Step 1: H0 : µ = 75 HA : µ ≠ 75
Step 2: Since n < 30 and σ is unknown, our test stat is:
\[ T^{*} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{74.6 - 75}{0.81 / \sqrt{16}} = -1.98 \]
Step 3: Since this is a two-tailed test, our critical value is:
tα/2,n−1 = t0.01/2,16−1 = t0.005,15 = 2.947.
|T∗| = |−1.98| = 1.98 < 2.947 = tα/2,n−1
So T∗ is not in the RR, and our results lead us to Fail to Reject H0.
Thus, there is insufficient evidence to conclude that the mean differs from
75mm.
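The arithmetic in Example 8.3 can be checked with a few lines of Python. This is only a sketch: the critical value 2.947 is copied from the t-table value quoted in the example rather than computed, since the Python standard library has no Student-t quantile function.

```python
from math import sqrt

# Data from Example 8.3
n = 16          # sample size
x_bar = 74.6    # sample mean femur length (mm)
s = 0.81        # sample standard deviation (mm)
mu_0 = 75.0     # hypothesized mean (mm)

# Table value t_{0.005, 15} quoted in the example (not computed here).
t_critical = 2.947

# Test statistic for a single mean with sigma unknown.
t_star = (x_bar - mu_0) / (s / sqrt(n))

# Two-tailed test: reject when |T*| exceeds the critical value.
reject_h0 = abs(t_star) > t_critical

print(f"T* = {t_star:.2f}, reject H0: {reject_h0}")
```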
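For the p-value approach, the t-table row for 15 degrees of freedom gives
t0.05 = 1.753 and t0.025 = 2.131. Since 1.98 lies between these two values,
drawing this out with a bell curve we can see that P(T > 1.98) must be
between 0.025 and 0.05 (i.e., 0.025 < P(T > 1.98) < 0.05).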
Hypothesis Tests for a Proportion
H0 : p = p0 vs HA : p ≠ p0 or p > p0 or p < p0
The test statistic when making inference on the population proportion:
\[ Z^{*} = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} \]
Example 8.4: It is believed that less than 20% of engineering students are
female. A random sample of 508 engineering students had 110 female
students. Use a 5% significance level to test if less than 20% of engineering
students are female.
Since this is a lower tail test, our p-value is: P(Z < Z∗ ) = P(Z < 0.93) = 0.8238.
p-value = 0.8238 > 0.05 = α
Since p-value > α, our results lead us to Fail to Reject H0.
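The numbers in Example 8.4 can be reproduced with the proportion test statistic. The sketch below assumes H0 : p = 0.2 vs HA : p < 0.2; the variable names are illustrative. Because it does not round Z∗ to two decimals before looking up the probability, its p-value differs slightly from the table-based 0.8238.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.4
n = 508          # engineering students sampled
females = 110    # female students in the sample
p_0 = 0.20       # hypothesized proportion under H0
alpha = 0.05

p_hat = females / n  # sample proportion

# Test statistic for a single proportion.
z_star = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Lower-tail test (HA: p < 0.2): the p-value is the area below Z*.
p_value = NormalDist().cdf(z_star)
reject_h0 = p_value < alpha

print(f"p_hat = {p_hat:.4f}, Z* = {z_star:.2f}, p-value = {p_value:.4f}")
print("reject H0:", reject_h0)
```

The sample proportion is above 20%, so Z∗ is positive and the lower-tail p-value is large; the data give no support for the claim that fewer than 20% are female.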
Errors of Hypothesis Tests
             H0 true         HA true
Reject H0    Type I error    no error
FTR H0       no error        Type II error
A common way to think about this is with the court system's belief of
“innocent until proven guilty.”
Here we assume the defendant is innocent. (H0 : innocent)
Then we run a test (i.e., hold a trial) to try to show that there is enough
evidence to reject H0 and conclude that the person is guilty. (HA : guilty)
                                 H0 true               HA true
                                 (Actually innocent)   (Actually guilty)
Reject H0 (Evidence of guilt)    Type I error          no error
FTR H0 (Not enough evidence)     no error              Type II error
It is believed that sending an innocent person to jail is far worse than letting a
guilty person go free (for non-violent crimes).
Here, a Type I error is sending an innocent person to jail,
and a Type II error is setting a guilty person free.
Power is the probability of sending a guilty person to jail, i.e., of rejecting H0 when HA is true.
H0 : µ = 8 HA : µ = 25
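For this pair of hypotheses, the error probabilities can be computed once a critical value is fixed. The Python sketch below assumes the test statistic is normal with a standard error of 5 under both hypotheses; that value is hypothetical and chosen only to make the illustration concrete.

```python
from statistics import NormalDist

# Hypothetical sampling distributions of the test statistic
# (the standard error of 5 is an assumption, not from the lecture).
h0_dist = NormalDist(mu=8.0, sigma=5.0)    # if H0: mu = 8 is true
ha_dist = NormalDist(mu=25.0, sigma=5.0)   # if HA: mu = 25 is true

alpha = 0.05
c = h0_dist.inv_cdf(1.0 - alpha)  # reject H0 when the statistic exceeds c

type_1 = 1.0 - h0_dist.cdf(c)  # P(reject H0 | H0 true), equals alpha
beta = ha_dist.cdf(c)          # P(fail to reject H0 | HA true), Type II error
power = 1.0 - beta             # P(reject H0 | HA true)

print(f"c = {c:.2f}")
print(f"P(Type I error)  = {type_1:.4f}")
print(f"P(Type II error) = {beta:.4f}")
print(f"Power            = {power:.4f}")
```

Moving c to the right lowers the Type I error probability but raises the Type II error probability, which is the trade-off behind choosing α.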
Hypothesis Tests using a CI
Example 8.5: A sample of 64 test marks was taken. The sample had a mean of
73, and the population standard deviation is 6. Test whether the true mean is
different from 75 at the 5% significance level (using a CI).
Solution: H0 : µ = 75 HA : µ ≠ 75
Since this is a two-sided test, we can use a CI.
α = 0.05, so we will build a 100(1 − α)% = 95% CI for µ.
Recall, from Ch 6, since n = 64 > 30 the 95% CI for µ is:
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 73 \pm z_{0.05/2} \frac{6}{\sqrt{64}} = 73 \pm 1.96 \times 0.75 = (71.53, 74.47) \]
Since 75 is not in the confidence interval, we are 95% confident that
the true mean is not 75. Thus we reject H0 and conclude that there is sufficient
evidence that the true mean is not 75.
Example 8.5(b): Use the info in Example 8.5 to test whether the true mean is
different from 74 at the 5% significance level (using a CI).
Solution: H0 : µ = 74 HA : µ ≠ 74
Since this is a two-sided test, we can use a CI.
α = 0.05, so we will build a 100(1 − α)% = 95% CI for µ.
Recall, from Ch 6, since n = 64 > 30 the 95% CI for µ is:
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 73 \pm z_{0.05/2} \frac{6}{\sqrt{64}} = 73 \pm 1.96 \times 0.75 = (71.53, 74.47) \]
Since 74 is in the confidence interval, we are not 95% confident that
the true mean is not 74. Thus we fail to reject H0 and conclude that there is
insufficient evidence that the true mean is not 74.
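Both parts of Example 8.5 can be sketched in Python: build the interval once, then check whether each hypothesized mean falls inside it. The helper names z_confidence_interval and reject_h0 are hypothetical, introduced only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, alpha):
    """100(1 - alpha)% CI for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def reject_h0(mu_0, low, high):
    """Two-sided test: reject H0: mu = mu_0 when mu_0 is outside the CI."""
    return not (low <= mu_0 <= high)


# Data from Example 8.5
low, high = z_confidence_interval(x_bar=73, sigma=6, n=64, alpha=0.05)
print(f"95% CI: ({low:.2f}, {high:.2f})")

for mu_0 in (75, 74):
    print(f"H0: mu = {mu_0} -> reject H0: {reject_h0(mu_0, low, high)}")
```

This shortcut applies only to two-sided tests, where the rejection region at level α matches the region outside the 100(1 − α)% interval.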
Hypothesis Tests with CIs