Lecture 7 With Solutions1


Stats 2B03 - Statistical Methods for Science

Kavitha Neelangol

Department of Mathematics and Statistics, McMaster University

Lecture 7

1 / 42
Topics covered

Today we will go through the following topics:

Hypothesis testing introduction
Test statistics
Rejection region
P-value
Hypothesis testing for mean (σ known)
Hypothesis testing for mean (σ unknown)
Hypothesis testing for proportion
Type I error
Type II error
Testing using confidence interval

2 / 42
Hypothesis Testing

3 / 42
Introduction

Pretend that Snickers (or some other chocolate bar brand) publicly
claims that their product contains, on average, 6 peanuts per bar. Of
course, there is some inherent error at the manufacturing plant — so not
all bars will contain exactly 6.

Say you opened up a bar and counted only 5 peanuts. This might make
you want to investigate whether or not their claim is true.

So you buy 20 more chocolate bars. If you found that the average
number of peanuts was 4 in that sample, you would probably decide that
Snickers is providing false advertising.

What if the average was 5.9 peanuts, or 5.8, or 5.7, 5.6, 5.5...at what
average would we become concerned that Snickers was lying?

4 / 42
Introduction

A probabilistic way of looking at it:


Assuming that Snickers is not lying, what is the probability that I would
find an average of only 4 peanuts in a sample of 20 bars?

The lower that probability, the more evidence that they might be lying.

The higher that probability, the less evidence that they might be lying.

This is the essence of hypothesis testing.

5 / 42
General Example of a Hypothesis Test
Suppose it is assumed that the mean of something is 8, but we want to
disprove that and check whether the mean is actually 25. Thus, we wish to
test:
H0 : µ = 8 against HA : µ = 25
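So if H0 is true then our observations come from:
[Figure: histogram of observations centred at µ = 8]
And if HA is true then our observations come from:
[Figure: histogram of observations centred at µ = 25]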

6 / 42
General Example of a Hypothesis Test

The key intuition is: how far away from 8 must the observation (x∗) be
for us to reject the null hypothesis that the true value of µ is 8?
Suppose our sample gives us x̄ = 12. In this case the data are more likely to
come from the first histogram (µ = 8) than from the one with µ = 25.
Similarly, if x̄ = 55 then the data are more likely to have come from the
second histogram (µ = 25) than from the one with µ = 8.

7 / 42
General Example of a Hypothesis Test

The procedure is to choose a point c (called the critical value) and then
reject H0 if our observation is more extreme than c, and not reject H0 if our
observation is less extreme than c.
The area beyond c is known as the Rejection Region.

8 / 42
General Example of a Hypothesis Test

Now let’s think: for a given c, what is the probability that we reject H0 by
mistake (i.e., when the true value of µ was actually 8)?

9 / 42
General Example of a Hypothesis Test
In jargon, rejecting H0 by mistake is called a “Type I error”, and the
probability of doing so is the level of significance, denoted by the letter α.
Our problem of finding a meaningful critical value c has been solved:
first, choose a significance level α. Then find the point c for which the
area beyond c, under the distribution assumed by H0, equals α. Having c, one
only needs to compare it to the observation (test statistic) x∗ from the data.
If x∗ > c, then we reject H0 : µ = 8, knowing that when H0 is true we make
this mistake with probability α.
The most frequent values used for α are 0.01, 0.05 and 0.1.
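
As a rough illustration of how a critical value follows from α, the sketch below assumes the test statistic has been standardized so that it follows N(0, 1) under H0 (an assumption, since the slide does not state a distribution). The function name upper_critical_value is hypothetical; NormalDist comes from Python's standard library.

```python
from statistics import NormalDist


def upper_critical_value(alpha: float) -> float:
    """Return the point c with area alpha to its right under N(0, 1)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1 - alpha)


# The three significance levels mentioned on the slide.
for alpha in (0.01, 0.05, 0.10):
    c = upper_critical_value(alpha)
    print(f"alpha = {alpha:.2f} -> reject H0 if the standardized x* > {c:.3f}")
```

A smaller α pushes c further into the tail, so stronger evidence is needed before H0 is rejected.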

10 / 42
General Example of a Hypothesis Test
Recall if x∗ > c then we reject H0, and if x∗ < c then we do not reject H0.
Another way of making this decision is to find the area beyond x∗ and
compare it with α.
The area beyond x∗ is called the p-value.
If p-value < α then we reject H0.
If p-value > α then we do not reject H0.

11 / 42
General Steps of a Hypothesis Test

Step 1: Make a conjecture.


Here we write out our null (H0) and alternative (HA) hypotheses.
H0 is what is assumed to be true.
HA is what we would like to test.

Step 2: Collect data, and calculate the appropriate test statistic (x∗).
Each type of test (i.e., for a mean, a difference of means, a proportion, etc.)
has a different test statistic.

Step 3: Based on the calculated test statistic, use a decision rule to see if
we reject H0 or fail to reject H0.
The decision rule may be expressed as a rejection region or a p-value.

12 / 42
The Conjecture
In hypothesis testing we are interested in making an inference about the
population.
Thus we will always be making conjectures about a population parameter (we
will use θ for the general case).

The null hypothesis is always that θ equals some assumed value, θ0.

H0 : θ = θ0

The alternative hypothesis depends on what you wish to test. In this course
we cover 3 different types of alternative hypothesis:

Option 1: HA : θ ≠ θ0 (two-tailed test)
Option 2: HA : θ > θ0 (upper-tailed test)
Option 3: HA : θ < θ0 (lower-tailed test)

13 / 42
The Test Statistic

As we go through different types of tests, I will show you the formula for the
corresponding test statistic.
Usually we use Z∗ or T∗ to denote the test statistic (depending on which
distribution the estimator follows).
The general form of a test statistic is:

$$\text{test statistic} = \frac{\text{relevant statistic} - \text{hypothesized parameter}}{\text{standard error of the relevant statistic}} = \frac{\text{relevant statistic} - \theta_0}{\text{standard error of the relevant statistic}}$$
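
The general form above can be sketched as a small helper. The function name standardized_statistic and the numbers in the usage line are illustrative assumptions: x̄ = 12 and θ0 = 8 echo the general example, while the standard error of 2 is made up.

```python
def standardized_statistic(estimate: float, theta0: float, standard_error: float) -> float:
    """(relevant statistic - hypothesized parameter) / standard error."""
    if standard_error <= 0:
        raise ValueError("the standard error must be positive")
    return (estimate - theta0) / standard_error


# Hypothetical inputs: estimate 12, hypothesized value 8, standard error 2.
value = standardized_statistic(estimate=12.0, theta0=8.0, standard_error=2.0)
print(f"test statistic = {value:.2f}")
```

The result says how many standard errors the estimate lies from θ0. Whether it is then called Z∗ or T∗ depends on which distribution the estimator follows.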

14 / 42
Decision Rule: Rejection Region

Every time we do a hypothesis test, we need to make a conclusion about the
conjecture.
This decision can be made using either a Rejection Region or a p-value.
Here we calculate a rejection region (RR), and:
if our test statistic lies in the RR then we reject H0
if our test statistic lies outside of the RR then we fail to reject H0
The rejection region can be thought of as the region of outcomes that would be
very unlikely if H0 is true. To find the rejection region we need to know:

The distribution of the test statistic (Z or t).
The tail of the test (determined from HA, i.e., upper, lower or two-tailed).

15 / 42
Decision Rule: Rejection Region
If the test statistic follows a Z distribution, it is denoted by Z∗, and:

Test Tail   HA        Rejection Region
two         θ ≠ θ0    |Z∗| > zα/2
upper       θ > θ0    Z∗ > zα
lower       θ < θ0    Z∗ < −zα

zα/2, zα, and −zα are the cut-off points for the RR, so they are called the
critical values of the test.

If the test statistic follows a Student-t distribution, it is denoted by T∗, and:

Test Tail   HA        Rejection Region
two         θ ≠ θ0    |T∗| > tα/2
upper       θ > θ0    T∗ > tα
lower       θ < θ0    T∗ < −tα

tα/2, tα, and −tα are the cut-off points for the RR, also known as the critical
values of the test.
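
The Z rows of the table can be expressed as a small decision function. This is only a sketch: the name rejects_h0 and the tail labels "two", "upper" and "lower" are assumptions chosen to mirror the table. The Student-t rows follow the same logic but need t quantiles, which Python's standard library does not provide.

```python
from statistics import NormalDist


def rejects_h0(z_star: float, alpha: float, tail: str) -> bool:
    """Return True when z_star falls in the rejection region for a Z test."""
    std_normal = NormalDist()  # N(0, 1)
    if tail == "two":
        return abs(z_star) > std_normal.inv_cdf(1 - alpha / 2)  # |Z*| > z_{alpha/2}
    if tail == "upper":
        return z_star > std_normal.inv_cdf(1 - alpha)           # Z* > z_alpha
    if tail == "lower":
        return z_star < -std_normal.inv_cdf(1 - alpha)          # Z* < -z_alpha
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


# Illustrative statistic values at the 5% level.
print(rejects_h0(2.44, 0.05, "upper"))
print(rejects_h0(-0.27, 0.05, "two"))
print(rejects_h0(-1.63, 0.05, "lower"))
```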
16 / 42
Decision Rule: P Value

We can also make a decision by calculating a p-value and comparing it with
our significance level α.
Here we calculate a p-value, and:
if p-value < α then we reject H0
if p-value > α then we fail to reject H0
The p-value is the probability of seeing a test statistic at least as extreme
as the one observed, if H0 is true.
To find the p-value we need to know:
The distribution of the test statistic (Z or t).
The tail of the test (determined from HA, i.e., upper, lower or two-tailed).

17 / 42
Decision Rule: P Value

Here are the p-value formulas for each case.

If the test statistic follows a Z distribution, it is denoted by Z∗:

Test Tail   HA        P-value
two         θ ≠ θ0    p-val = 2 × P(Z > |Z∗|)
upper       θ > θ0    p-val = P(Z > Z∗)
lower       θ < θ0    p-val = P(Z < Z∗)

If the test statistic follows a t distribution, it is denoted by T∗:

Test Tail   HA        P-value
two         θ ≠ θ0    p-val = 2 × P(t > |T∗|)
upper       θ > θ0    p-val = P(t > T∗)
lower       θ < θ0    p-val = P(t < T∗)
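
The three Z p-value formulas translate directly into code. The sketch below is a minimal version under the assumption that the statistic really is standard normal under H0; the function name z_p_value and the tail labels are hypothetical.

```python
from statistics import NormalDist


def z_p_value(z_star: float, tail: str) -> float:
    """p-value of a Z test statistic for a two-, upper- or lower-tailed test."""
    cdf = NormalDist().cdf  # P(Z < z) for Z ~ N(0, 1)
    if tail == "two":
        return 2 * (1 - cdf(abs(z_star)))  # 2 x P(Z > |Z*|)
    if tail == "upper":
        return 1 - cdf(z_star)             # P(Z > Z*)
    if tail == "lower":
        return cdf(z_star)                 # P(Z < Z*)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


alpha = 0.05
p = z_p_value(2.44, "upper")
print(f"p-value = {p:.4f}; reject H0: {p < alpha}")
```

Values computed this way can differ from table look-ups in the last digit, because the tables round Z∗ to two decimal places.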

18 / 42
Hypothesis Tests for a Single Mean - when σ is known
Here our parameter of interest is µ, which will be used in H0 and HA.

H0 : µ = µ0 vs HA : µ ≠ µ0 or µ > µ0 or µ < µ0

Assumptions when using a Z test statistic:

1. The sample is a random sample.
2. The value of the population standard deviation (σ) is known.
3. Either or both of these conditions are satisfied: the population is
normally distributed, or n > 30.
If all three of the above conditions are satisfied then the test statistic is:

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

which follows a standard normal distribution, N(0, 1).

19 / 42
Hypothesis Tests for a Single Mean

Example 8.1: MADD claims that the mean blood alcohol concentration
(BAC) of drunk drivers is over 0.1. To test this claim, a random sample of 36
drivers arrested for drinking and driving was taken and yielded a mean BAC
of 0.19. The population variance is 0.049. Use a 5% significance level to test
the claim with a rejection region.

Solution: Step 1: H0 : µ = 0.1 HA : µ > 0.1

Step 2: Since n ≥ 30, our test statistic is:

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.19 - 0.1}{\sqrt{0.049} / \sqrt{36}} = 2.44$$

Step 3: Since this is an upper-tailed test, our critical value is:
zα = z0.05 = 1.645.
Z∗ = 2.44 > 1.645 = zα
Then Z∗ is in the RR. So our results lead us to Reject H0.
Thus, there is sufficient evidence to conclude that the mean BAC exceeds 0.1.

20 / 42
Hypothesis Tests for a Single Mean

Example 8.2: Using the same info as Example 8.1, test the claim with a
p-value.

Solution: Step 1: H0 : µ = 0.1 HA : µ > 0.1

Step 2: Since n ≥ 30, our test statistic is:

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.19 - 0.1}{\sqrt{0.049} / \sqrt{36}} = 2.44$$

Step 3: Since this is an upper-tailed test, our p-value is:

p-value = P(Z > Z∗) = P(Z > 2.44) = 1 − P(Z < 2.44)
        = 1 − 0.9927 = 0.0073
p-value = 0.0073 < 0.05 = α
So our results lead us to Reject H0.
Thus, there is sufficient evidence to conclude that the mean BAC exceeds 0.1.
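
The arithmetic in Examples 8.1 and 8.2 can be checked numerically. The sketch below uses only the numbers given in the example; the variable names are my own. Because it does not round Z∗ to two decimals before the look-up, the p-value may differ from the table-based 0.0073 in the last digit.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.1.
x_bar = 0.19      # sample mean BAC
mu0 = 0.1         # hypothesized mean under H0
variance = 0.049  # population variance, so sigma = sqrt(0.049)
n = 36
alpha = 0.05

sigma = sqrt(variance)
z_star = (x_bar - mu0) / (sigma / sqrt(n))

# Rejection-region approach (Example 8.1): upper-tailed critical value.
z_alpha = NormalDist().inv_cdf(1 - alpha)
in_rejection_region = z_star > z_alpha

# p-value approach (Example 8.2): area to the right of Z*.
p_value = 1 - NormalDist().cdf(z_star)

print(f"Z* = {z_star:.2f}, critical value = {z_alpha:.3f}")
print(f"in RR: {in_rejection_region}, p-value = {p_value:.4f}")
```

Both approaches lead to the same decision, which is always the case for a given α.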

21 / 42
Hypothesis Tests for a Single Mean
Example 8.1(b): Using the same info as Example 8.1, test the claim that the
mean differs from 0.2.

Solution: Step 1: H0 : µ = 0.2 HA : µ ≠ 0.2

Step 2: Since n ≥ 30, our test statistic is:

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.19 - 0.2}{\sqrt{0.049} / \sqrt{36}} = -0.27$$

Step 3: Since this is a two-tailed test, our p-value is:

p-value = 2 × P(Z > |Z∗|) = 2 × P(Z > |−0.27|)
        = 2 × P(Z > 0.27) = 2 × (1 − P(Z < 0.27))
        = 2 × (1 − 0.6064) = 0.7872
p-value = 0.7872 > 0.05 = α
So our results lead us to Fail to Reject H0.
Thus, there is insufficient evidence to conclude that the mean BAC differs
from 0.2.
22 / 42
Hypothesis Tests for a Single Mean

Example 8.1(c): Using the same info as Example 8.1, test the claim that the
mean is below 0.25.

Solution: Step 1: H0 : µ = 0.25 HA : µ < 0.25

Step 2: Since n ≥ 30, our test statistic is:

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.19 - 0.25}{\sqrt{0.049} / \sqrt{36}} = -1.63$$

Step 3: Since this is a lower-tailed test, our p-value is:

p-value = P(Z < Z∗) = P(Z < −1.63)
        = 0.0516
p-value = 0.0516 > 0.05 = α
So our results lead us to Fail to Reject H0.
Thus, there is insufficient evidence to conclude that the mean BAC is below
0.25.
23 / 42
Hypothesis Tests for a Single Mean

Note: When constructing a hypothesis test you should have H0 and HA
defined prior to collecting data (i.e., before finding x̄). Otherwise, it may
look like you are fishing for significant results.

Thus, Examples 8.1(b) and 8.1(c) are only to demonstrate the different types
of hypothesis tests (two-tailed, lower-tailed, upper-tailed). It is not good
practice to perform multiple tests on different hypotheses about one
parameter.

24 / 42
Hypothesis Tests for a Single Mean

Try on your own:


Exercises:
Redo Examples 8.1 & 8.2 but test that the mean is different from 0.1.
(Ans: Z∗ = 2.44, p-val = 0.0146)
Redo Examples 8.1 & 8.2 but test that the mean is below 0.15.
(Ans: Z∗ = 1.08, p-val = 0.8599)
Redo Examples 8.1 & 8.2 but test that the mean is above 0.05.
(Ans: Z∗ = 3.79, p-val = 0.0001)
Redo Examples 8.1(b) and 8.1(c) with rejection regions instead of
p-values.

25 / 42
Hypothesis Tests for a Single Mean - when σ is unknown
Here our parameter of interest is µ, which will be used in H0 and HA.

H0 : µ = µ0 vs HA : µ ≠ µ0 or µ > µ0 or µ < µ0

Assumptions when using a T test statistic:

1. The sample is a random sample.
2. The value of the population standard deviation (σ) is NOT known.
3. Either or both of these conditions are satisfied: the population is
normally distributed, or n > 30.

If all three of the above conditions are satisfied then the test statistic is:

$$T^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

which follows a Student-t distribution with n − 1 degrees of freedom.
26 / 42
Hypothesis Tests for a Single Mean
Example 8.3: The length of femur bones in newborns is assumed to be
normally distributed. It is claimed that the mean length is 75mm. A sample of
16 newborns had a mean femur length of 74.6mm and a standard deviation of
0.81mm. Test at 1% significance whether the true mean differs from 75mm,
using a rejection region.

Solution: Step 1: H0 : µ = 75 HA : µ ≠ 75
Step 2: Since n < 30 and σ is unknown, our test statistic is:

$$T^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{74.6 - 75}{0.81 / \sqrt{16}} = -1.98$$

Step 3: Since this is a two-tailed test, our critical value is:
tα/2,n−1 = t0.01/2,16−1 = t0.005,15 = 2.947.
|T∗| = |−1.98| = 1.98 < 2.947 = tα/2,n−1
Then T∗ is not in the RR. So our results lead us to Fail to Reject H0.
Thus, there is insufficient evidence to conclude that the mean differs from
75mm.
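
Example 8.3 can be reproduced in a few lines. Python's standard library has no Student-t quantile function, so this sketch takes the critical value 2.947 straight from the example (a t-table look-up) rather than computing it; the variable names are my own.

```python
from math import sqrt

# Data from Example 8.3.
x_bar = 74.6  # sample mean femur length (mm)
mu0 = 75.0    # hypothesized mean under H0
s = 0.81      # sample standard deviation (mm)
n = 16

t_star = (x_bar - mu0) / (s / sqrt(n))

# Critical value t_{0.005, 15}, copied from the example's t-table look-up.
t_critical = 2.947

# Two-tailed rejection region: |T*| > t_{alpha/2, n-1}.
reject_h0 = abs(t_star) > t_critical

print(f"T* = {t_star:.2f}, |T*| > {t_critical}: {reject_h0}")
```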
27 / 42
Hypothesis Tests for a Single Mean

Let’s approximate the p-value of the hypothesis in Example 8.3.

This is a two-tailed test, with T ∗ = −1.98 and n = 16


The p-value is defined to be 2 × P(T > |T ∗ |) with n − 1 = 15 df.

p − val = 2 × P(T > | − 1.98|) = 2 × P(T > 1.98)


To estimate P(T > 1.98) we look in the t-table in the row with 15 df and find
the 2 numbers that 1.98 falls between. Here we get 1.753 (one-tail area = 0.05,
so P(T > 1.753) = 0.05) and 2.131 (one-tail area = 0.025, so
P(T > 2.131) = 0.025).

28 / 42
Hypothesis Tests for a Single Mean

Drawing this out with a bell curve, we can see that P(T > 1.98) must be
between 0.025 and 0.05 (i.e., 0.025 < P(T > 1.98) < 0.05).

p-val = 2 × P(T > 1.98)

2 × 0.025 < p-val < 2 × 0.05
0.05 < p-val < 0.1
Thus p-val > 0.05, which is larger than α = 0.01; therefore we Fail to
Reject H0.
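
The table look-up above can be mimicked in code. This sketch hard-codes the 15 df row of a standard t-table: the entries 1.753, 2.131 and 2.947 appear in the slides, and 1.341 and 2.602 are the usual table values for one-tail areas 0.10 and 0.01. The function name bracket_one_tail is hypothetical.

```python
# One row of a t-table (15 df): (one-tail area, critical value).
T_TABLE_DF15 = [(0.10, 1.341), (0.05, 1.753), (0.025, 2.131),
                (0.01, 2.602), (0.005, 2.947)]


def bracket_one_tail(t_abs: float, row: list) -> tuple:
    """Return (lower, upper) bounds on P(T > t_abs) for t_abs >= 0."""
    lower, upper = 0.0, 0.5  # P(T > t) is at most 0.5 when t >= 0
    for tail_area, critical in row:
        if t_abs >= critical:
            upper = min(upper, tail_area)  # beyond this cut-off: less area remains
        else:
            lower = max(lower, tail_area)  # short of this cut-off: more area remains
    return lower, upper


low, high = bracket_one_tail(1.98, T_TABLE_DF15)
print(f"{low} < P(T > 1.98) < {high}")
print(f"two-tailed: {2 * low} < p-val < {2 * high}")
```

The bounds are only as tight as the table's columns. Statistical software would report the exact tail area instead.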

29 / 42
Hypothesis Tests for a Single Mean

Try on your own:


Exercises:
Redo Example 8.3 but test that the mean is below 75.
(Ans: T∗ = −1.98, 0.025 < p-val < 0.05)
Redo Example 8.3 but test that the mean is above 74, and the sample
variance is 0.81.
(Ans: T∗ = 2.6667, 0.005 < p-val < 0.01)

30 / 42
Hypothesis Tests for a Proportion

Here our parameter of interest is p (the population proportion).

H0 : p = p0

vs

HA : p ≠ p0 or p > p0 or p < p0
The test statistic when making inference on the population proportion is:

$$Z^* = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$$

which follows a standard normal distribution, N(0, 1),

where p̂ is the sample proportion, p̂ = X/n.

31 / 42
Hypothesis Tests for a Proportion

Example 8.4: It is believed that less than 20% of engineering students are
female. A random sample of 508 engineering students had 110 female
students. Use a 5% significance level to test if less than 20% of engineering
students are female.

Solution: H0 : p = 0.2 HA : p < 0.2

The test statistic is:

$$Z^* = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} = \frac{110/508 - 0.2}{\sqrt{\frac{0.2 (1 - 0.2)}{508}}} = 0.93172$$

Since this is a lower-tailed test, our p-value is:
P(Z < Z∗) = P(Z < 0.93) = 0.8238.
p-val = 0.8238 > 0.05 = α
Since p-val > α, our results lead us to Fail to Reject H0.

Thus, there is insufficient evidence to conclude that the proportion of female
engineering students is below 20%.
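
A numerical check of Example 8.4, using only the numbers in the example (the variable names are my own). The code does not round Z∗ to 0.93 before evaluating the normal CDF, so its p-value differs slightly from the table-based 0.8238.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.4.
x = 110   # number of female students in the sample
n = 508   # sample size
p0 = 0.2  # hypothesized proportion under H0
alpha = 0.05

p_hat = x / n
z_star = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Lower-tailed test: p-value is the area to the left of Z*.
p_value = NormalDist().cdf(z_star)
reject_h0 = p_value < alpha

print(f"p_hat = {p_hat:.4f}, Z* = {z_star:.5f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Note that the standard error uses p0, not p̂, because the test statistic is computed under the assumption that H0 is true.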
32 / 42
Hypothesis Tests for a Proportion

Try on your own:


Exercises:
Redo Example 8.4 but with a RR approach.
(Ans: Z∗ = 0.932 > −1.645 = −zα, FTR)
Use the sample in Example 8.4 to test if the proportion of female
engineering students differs from 0.25. (Hint: HA : p ≠ 0.25)
(Ans: Z∗ = −1.74, p-val = 0.0819, FTR)
Use the sample in Example 8.4 to test if the proportion of female
engineering students is below 0.3.
(Ans: Z∗ = −4.11, p-val = 0.0001, Reject)
Use the sample in Example 8.4 to test if the proportion of female
engineering students is above 0.18.
(Ans: Z∗ = 2.14, p-val = 0.0162, Reject)

33 / 42
Errors of Hypothesis Tests

A Type I error occurs when we reject H0 when H0 is actually true.
A Type II error occurs when we fail to reject H0 when HA is actually true.
P(Type I error) = α
P(Type II error) = β

             H0 true         HA true
Reject H0    Type I error    no error
FTR H0       no error        Type II error

Power: the power of a test is the probability of correctly rejecting H0
(i.e., rejecting H0 when HA is true).

Power = 1 − β

34 / 42
Errors of Hypothesis Tests

35 / 42
Errors of Hypothesis Tests

A common way to think about this is with the court system's belief of
“innocent until proven guilty.”
Here we assume the defendant is innocent. (H0 : innocent)
Then we run a test (i.e., have a trial) to try to show that there is enough
evidence to reject H0 and conclude that the person is guilty. (HA : guilty)

                                 H0 true               HA true
                                 (actually innocent)   (actually guilty)
Reject H0 (evidence of guilt)    Type I error          no error
FTR H0 (not enough evidence)     no error              Type II error

36 / 42
Errors of Hypothesis Tests

                                 H0 true               HA true
                                 (actually innocent)   (actually guilty)
Reject H0 (evidence of guilt)    Type I error          no error
FTR H0 (not enough evidence)     no error              Type II error

It is believed that sending an innocent person to jail is far worse than
letting a guilty person go free (for non-violent crimes).
Here, a Type I error is sending an innocent person to jail.
A Type II error is setting a guilty person free.
Power is the probability of sending a guilty person to jail.

37 / 42
Errors of Hypothesis Tests

Think back to our “General Example of a Hypothesis Test”:

A Type I error occurs if µ = 8, but we get x∗ that is larger than c.
A Type II error occurs if µ = 25, but we get x∗ that is less than c.
Recall:

H0 : µ = 8 HA : µ = 25
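
To put numbers on α, β and power for this example, the sketch below makes two assumptions that are not in the slides: x∗ is normally distributed, and its standard error is 5 (a made-up value chosen only for illustration). Only µ = 8 and µ = 25 come from the example.

```python
from statistics import NormalDist

MU_H0 = 8.0       # mean under H0 (from the example)
MU_HA = 25.0      # mean under HA (from the example)
STD_ERROR = 5.0   # hypothetical standard error of x*, assumed for illustration
ALPHA = 0.05

null_dist = NormalDist(MU_H0, STD_ERROR)  # distribution of x* if H0 is true
alt_dist = NormalDist(MU_HA, STD_ERROR)   # distribution of x* if HA is true

# Critical value c: the area beyond c under H0 equals alpha.
c = null_dist.inv_cdf(1 - ALPHA)

type_1 = 1 - null_dist.cdf(c)  # P(x* > c | mu = 8)  = alpha
beta = alt_dist.cdf(c)         # P(x* < c | mu = 25) = P(Type II error)
power = 1 - beta               # P(x* > c | mu = 25)

print(f"c = {c:.2f}, alpha = {type_1:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions, lowering α moves c to the right, which raises β. The two error probabilities trade off against each other for a fixed sample.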

38 / 42
Hypothesis Tests using a CI

If we are performing a two-tailed test, then we can also use a confidence
interval (CI) for our decision rule.
So if we are testing:
H0 : θ = θ0 HA : θ ≠ θ0
then we can build a 100(1 − α)% CI for θ.
If θ0 is in the CI, then θ0 is a plausible value of θ at the 100(1 − α)%
confidence level, thus we fail to reject (FTR) H0.
If θ0 is NOT in the CI, then we are 100(1 − α)% confident that θ is not θ0,
thus we would reject H0.
Note: A CI can only be used for a two-tailed test.

39 / 42
Hypothesis Tests using a CI

Example 8.5: A sample of 64 test marks was taken. The sample mean was 73
and the population standard deviation is 6. Test whether the true mean is
different from 75, using 5% significance (using a CI).
Solution: H0 : µ = 75 HA : µ ≠ 75
Since this is a two-sided test, we can use a CI.
α = 0.05, so we will build a 100(1 − α)% = 95% CI for µ.
Recall, from Ch 6, since n = 64 > 30 the 95% CI for µ is:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 73 \pm z_{0.05/2} \frac{6}{\sqrt{64}} = 73 \pm 1.96 \times 0.75 = (71.53, 74.47)$$

Since 75 is not in the confidence interval, we are 95% confident that
the true mean is not 75. Thus we reject H0 and conclude that there is sufficient
evidence that the true mean is not 75.
40 / 42
Hypothesis Tests using a CI

Example 8.5(b): Use the info in Example 8.5 to test whether the true mean is
different from 74, using 5% significance (using a CI).
Solution: H0 : µ = 74 HA : µ ≠ 74
Since this is a two-sided test, we can use a CI.
α = 0.05, so we will build a 100(1 − α)% = 95% CI for µ.
Recall, from Ch 6, since n = 64 > 30 the 95% CI for µ is:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 73 \pm z_{0.05/2} \frac{6}{\sqrt{64}} = 73 \pm 1.96 \times 0.75 = (71.53, 74.47)$$

Since 74 is in the confidence interval, we cannot be 95% confident that
the true mean is not 74. Thus we fail to reject H0 and conclude that there is
insufficient evidence that the true mean is not 74.
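
Examples 8.5 and 8.5(b) share one interval, so a single sketch can check both decisions. The numbers come from the examples; the helper name rejects_h0 is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.5.
x_bar = 73.0  # sample mean
sigma = 6.0   # population standard deviation
n = 64
alpha = 0.05

z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
margin = z_half_alpha * sigma / sqrt(n)
ci_low, ci_high = x_bar - margin, x_bar + margin


def rejects_h0(mu0: float) -> bool:
    """Two-tailed decision: reject H0 when mu0 lies outside the CI."""
    return not (ci_low <= mu0 <= ci_high)


print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"reject H0: mu = 75? {rejects_h0(75.0)}")
print(f"reject H0: mu = 74? {rejects_h0(74.0)}")
```

The interval is built once and can then be compared against any hypothesized mean, which is convenient. As the earlier note says, though, hypotheses should still be fixed before looking at the data.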

41 / 42
Hypothesis Tests with CIs

Try on your own:


Exercises: Use a CI to test the following.
Redo Example 8.1 but test that the mean is different from 0.1.
Redo Example 8.3.
Use Example 8.4 to test if the proportion of female engineering students
differs from 0.25.

42 / 42
