
Lecture 10

This document provides a summary of key concepts from a lecture on two-sample hypothesis tests. It discusses comparing two means for independent samples when the population variances are known or unknown. It also covers comparing two proportions and two variances. The key tests covered include the z-test for means when variances are known, the pooled t-test when variances are unknown but assumed equal, and Welch's t-test when variances are unknown and unequal. Examples are provided to demonstrate how to calculate test statistics and interpret results.

Uploaded by

22002809

LECTURE 10

Two-sample Hypothesis Tests

Lecturer: Nguyen Thi Thu Van


Email: van.nguyen@isb.edu.vn
Content
• Comparing Two Means
• Confidence Interval for the Difference of Two Means
• Comparing Two Means: Independent Samples
• Comparing Two Means: Paired Samples
• Comparing Two Proportions
• Confidence Interval for the Difference of Two Proportions
• Comparing Two Variances


Two-sample Tests
• A two-sample test is performed on the data of two random samples, each independently obtained from a different population.
• The aim is to determine whether the difference between these two populations is statistically significant, for example whether μ1 ≠ μ2.

[Figure: two populations with means μ1 and μ2 and their sample means X̄1 and X̄2]
Two-Sample Tests
• Population means, independent samples
• Population means, related samples
• Population proportions
• Population variances
Comparing Two Means
Difference Between Two Means & Independent Samples
Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2, where the point estimate for the difference is X̄1 – X̄2.

                  Lower-tail test      Upper-tail test      Two-tail test
H0:               μ1 ≥ μ2              μ1 ≤ μ2              μ1 = μ2
H1:               μ1 < μ2              μ1 > μ2              μ1 ≠ μ2

meaning

H0:               μ1 – μ2 ≥ 0          μ1 – μ2 ≤ 0          μ1 – μ2 = 0
H1:               μ1 – μ2 < 0          μ1 – μ2 > 0          μ1 – μ2 ≠ 0

Rejection area:   α (lower tail)       α (upper tail)       α/2 (each tail)
Hypothesis Tests for µ1 - µ2 with σ1 and σ2 known

Population means, independent samples:
• σ1 and σ2 known (this case)
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal

Assumptions
• Samples are randomly and independently drawn.
• Populations are normally distributed or both sample sizes are at least 30.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 known
• The test statistic:

$$Z_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

• The confidence interval for μ1 – μ2:

$$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown, assumed equal

Population means, independent samples:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal (this case)
• σ1 and σ2 unknown, not assumed equal

Assumptions
• Samples are randomly and independently drawn.
• Populations are normally distributed or both sample sizes are at least 30.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and assumed equal
• The pooled variance:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

• The test statistic:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where tSTAT has d.f. = n1 + n2 – 2.

• The confidence interval:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where tα/2 has d.f. = n1 + n2 – 2.
Pooled-Variance t Test Example
• You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                NYSE     NASDAQ
Number          21       25
Sample mean     3.27     2.53
Sample SD       1.30     1.16

• Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example: Calculating the Test Statistic
H0: μ1 - μ2 = 0
H1: μ1 - μ2 ≠ 0

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)\,1.30^2 + (25 - 1)\,1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
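Pooled-Variance t Test Example: Hypothesis Test Solution

α = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ±2.0154 (area 0.025 in each tail)

Test statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

[Figure: t distribution with rejection regions below -2.0154 and above 2.0154; tSTAT = 2.040 falls in the upper rejection region]

Decision: since 2.040 > 2.0154, tSTAT lies in the rejection region.
Conclusion: Reject H0 at α = 0.05. There is evidence of a difference in means.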
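The pooled-variance calculation above can be reproduced with a short script. This is only a sketch using the summary statistics from the example; the function name pooled_t_test is made up for illustration, and the critical value 2.0154 is copied from the slide rather than computed.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_err
    return pooled_var, t_stat, df

# Summary data from the NYSE / NASDAQ example
pooled_var, t_stat, df = pooled_t_test(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRIT = 2.0154  # two-tail critical value for alpha = 0.05 and 44 d.f., taken from the slide
reject_h0 = abs(t_stat) > T_CRIT

print(f"Sp^2 = {pooled_var:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Keeping the critical value as a constant avoids any dependency beyond the standard library; a statistics package could supply it instead.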
Pooled-Variance t Test Example: Confidence Interval for µ1 - µ2
Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?

95% Confidence Interval for µNYSE - µNASDAQ:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009,\ 1.471)$$

Since the entire interval lies above 0, we can be 95% confident that µNYSE > µNASDAQ.
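As a rough check of the interval above, the same arithmetic can be written out in a few lines. The variable names are illustrative, and the critical value 2.0154 is again taken from the slide, not computed.

```python
import math

# Summary data from the NYSE / NASDAQ example
n1, mean1, sd1 = 21, 3.27, 1.30
n2, mean2, sd2 = 25, 2.53, 1.16
T_CRIT = 2.0154  # t_{0.025} with 44 d.f., taken from the slide

pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

diff = mean1 - mean2            # point estimate of mu1 - mu2
margin = T_CRIT * std_err       # half-width of the interval
lower, upper = diff - margin, diff + margin

# A two-tail test at alpha = 0.05 rejects H0 exactly when 0 is outside this interval
interval_excludes_zero = lower > 0 or upper < 0

print(f"95% CI: ({lower:.3f}, {upper:.3f}), excludes 0: {interval_excludes_zero}")
```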
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown, not assumed equal

Population means, independent samples:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal (this case)

Assumptions
• Samples are randomly and independently drawn.
• Populations are normally distributed or both sample sizes are at least 30.
• Population variances are unknown and cannot be assumed to be equal.
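Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and not assumed equal
• The test statistic:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

where tSTAT has d.f. ν (Welch's adjusted degrees of freedom):

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$

A conservative quick rule that can be used in place of ν:
df = min(n1 – 1, n2 – 1)

• The confidence interval:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$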
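The unequal-variance formulas can be sketched as follows. The lecture gives no numerical example for this case, so the script reuses the NYSE / NASDAQ summary data from the pooled example purely as an illustration; the function name welch_t is hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (t statistic, adjusted d.f. nu, quick-rule d.f.) for unequal variances."""
    v1 = sd1**2 / n1   # S1^2 / n1
    v2 = sd2**2 / n2   # S2^2 / n2
    t_stat = ((mean1 - mean2) - hypothesized_diff) / math.sqrt(v1 + v2)
    # Adjusted degrees of freedom (the formula for nu)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Quick rule: the smaller of the two individual d.f.
    quick_df = min(n1 - 1, n2 - 1)
    return t_stat, nu, quick_df

# Illustration only: summary data borrowed from the pooled-variance example
t_stat, nu, quick_df = welch_t(3.27, 1.30, 21, 2.53, 1.16, 25)

print(f"t = {t_stat:.3f}, nu = {nu:.2f}, quick-rule df = {quick_df}")
```

The quick rule never gives more degrees of freedom than ν, so it leads to a slightly larger critical value and a more cautious test.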
Difference Between Two Means & Paired Samples
• The average score of subjects on the posttest is different from the average of those same subjects on the pretest.
• People will listen longer to a female telephone marketer than the very same people will listen to a male telephone marketer.
• On average, soldiers weighed less after they completed basic training than they weighed before they started.
• And so forth.
Difference Between Two Means & Paired Samples
• Tests means of 2 related populations
   - Paired samples
   - Repeated measures (before/after)
• Use the difference between paired values: Di = X1i - X2i
• Assumptions:
   - Both populations are normally distributed
   - Or, if not normal, use large samples
Related Populations - Paired Difference Test
• The ith paired difference: Di = X1i - X2i
• The point estimate for the paired difference population mean μD:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

• The sample standard deviation:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

• n is the number of pairs in the paired sample
• The test statistic for μD:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

where tSTAT has n - 1 d.f.
Paired Difference Test: Possible Hypotheses & CI

                  Lower-tail test      Upper-tail test      Two-tail test
H0:               μD ≥ 0               μD ≤ 0               μD = 0
H1:               μD < 0               μD > 0               μD ≠ 0
Rejection area:   α (lower tail)       α (upper tail)       α/2 (each tail)
Critical value:   -tα                  tα                   -tα/2 and tα/2
Reject H0 if:     tSTAT < -tα          tSTAT > tα           tSTAT < -tα/2 or tSTAT > tα/2

where tSTAT has n - 1 d.f.

The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
Paired Difference Test: Example
• Assume you send your salespeople to a "customer service" training workshop. Has the training made a difference in the number of complaints? You collect the following data:

                Number of complaints
Salesperson     Before     After     Difference Di
A               6          4         -2
B               20         6         -14
C               3          2         -1
D               0          0         0
F               4          0         -4
Sum                                  -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0

α = 0.01, D̄ = -4.2
d.f. = n - 1 = 4
Critical values: ±t0.005 = ±4.604

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

[Figure: t distribution with rejection regions (α/2 each) below -4.604 and above 4.604; tSTAT = -1.66 lies between them]

Conclusion: Do not reject H0. There is insufficient evidence of a significant change in the number of complaints.
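The paired test above can be sketched in Python from the raw before/after counts. The differences follow the slide's table (after minus before), the critical value 4.604 is copied from the slide, and the variable names are illustrative. Because the script does not round SD to 5.67 first, the statistic matches the slide only to about two decimals.

```python
import math
import statistics

# Complaints per salesperson, from the example table
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, matching the table's sign convention (after - before)
diffs = [a - b for b, a in zip(before, after)]
n = len(diffs)

mean_d = statistics.mean(diffs)    # point estimate of mu_D
sd_d = statistics.stdev(diffs)     # sample SD, divisor n - 1
std_err = sd_d / math.sqrt(n)
t_stat = (mean_d - 0) / std_err    # H0: mu_D = 0

T_CRIT = 4.604  # t_{0.005} with 4 d.f., taken from the slide
reject_h0 = abs(t_stat) > T_CRIT

# 99% confidence interval for mu_D, using the same critical value
lower, upper = mean_d - T_CRIT * std_err, mean_d + T_CRIT * std_err

print(f"mean D = {mean_d:.1f}, SD = {sd_d:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"99% CI for mu_D: ({lower:.2f}, {upper:.2f})")
```

The interval contains 0, which agrees with the decision not to reject H0.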
Comparing Two Proportions
Two Population Proportions
• Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed to those surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations, smokers and non-smokers, differ significantly in their responses?
Assumptions about Normality
• We have assumed a normal distribution for the statistic p1 – p2.
• For a test of two proportions, the criterion for normality is np ≥ 10 and n(1 − p) ≥ 10 for each sample.
• If either sample proportion is not normal, their difference cannot safely be assumed normal.
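The normality criterion is easy to check mechanically. The sketch below uses the fact that n·p is the count of items of interest and n·(1 − p) is the count of the rest; the helper name normality_ok is made up, and the counts are taken from the Proposition A example used later in this lecture.

```python
def normality_ok(successes, n, threshold=10):
    """Check n*p >= threshold and n*(1 - p) >= threshold for one sample.

    With p = successes / n, n*p is the number of successes and
    n*(1 - p) is the number of failures.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Counts from the Proposition A example: 36 of 72 men, 35 of 50 women voting "Yes"
men_ok = normality_ok(36, 72)
women_ok = normality_ok(35, 50)

# The difference p1 - p2 is treated as normal only if both samples pass
both_ok = men_ok and women_ok
print(f"men: {men_ok}, women: {women_ok}, both: {both_ok}")
```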
Two Population Proportions Test for Zero Difference

                  Lower-tail test      Upper-tail test      Two-tail test
H0:               π1 – π2 ≥ 0          π1 – π2 ≤ 0          π1 – π2 = 0
H1:               π1 – π2 < 0          π1 – π2 > 0          π1 – π2 ≠ 0
Rejection area:   α (lower tail)       α (upper tail)       α/2 (each tail)
Critical value:   -zα                  zα                   -zα/2 and zα/2
Reject H0 if:     ZSTAT < -Zα          ZSTAT > Zα           ZSTAT < -Zα/2 or ZSTAT > Zα/2

The point estimate for the difference is p1 – p2.


Two Population Proportions Test for Zero Difference
• The pooled estimate for the overall proportion:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

where X1 and X2 are the numbers of items of interest in samples 1 and 2.

• The sample proportions:

$$p_1 = \frac{X_1}{n_1}, \qquad p_2 = \frac{X_2}{n_2}$$

• The test statistic:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Hypothesis Test Example: Two Population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
• In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote "Yes".
• Test at the .05 level of significance.


Hypothesis Test Example: Two Population Proportions
• Hypotheses
   H0: π1 – π2 = 0
   H1: π1 – π2 ≠ 0
• The sample proportions
   - Men: p1 = 36/72 = 0.50
   - Women: p2 = 35/50 = 0.70
• The pooled estimate for the overall proportion

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 35}{72 + 50} = \frac{71}{122} = 0.582$$
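The test statistic:

$$z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(0.50 - 0.70) - 0}{\sqrt{0.582\,(1 - 0.582)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -2.20$$

Critical values = ±1.96 for α = .05

[Figure: standard normal distribution with rejection regions (.025 each) below -1.96 and above 1.96; zSTAT = -2.20 falls in the lower rejection region]

Conclusion: Reject H0. There is evidence of a difference between men and women in the proportion who will vote yes.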
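The Proposition A calculation can be sketched with the standard library alone, since the critical value comes from the standard normal distribution. The variable names are illustrative; the p-value is an extra that the slides do not report.

```python
import math
from statistics import NormalDist

# Counts from the Proposition A example
x1, n1 = 36, 72   # men voting "Yes"
x2, n2 = 35, 50   # women voting "Yes"

p1, p2 = x1 / n1, x2 / n2
p_bar = (x1 + x2) / (n1 + n2)   # pooled estimate of the overall proportion

std_err = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
z_stat = ((p1 - p2) - 0) / std_err   # H0: pi1 - pi2 = 0

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # upper critical value, two-tail test
p_value = 2 * NormalDist().cdf(-abs(z_stat))    # two-tail p-value (not shown on the slides)
reject_h0 = abs(z_stat) > z_crit

print(f"p_bar = {p_bar:.3f}, z = {z_stat:.2f}, z_crit = {z_crit:.2f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```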
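Two Population Proportions Test for Difference

                  Lower-tail test      Upper-tail test      Two-tail test
H0:               π1 – π2 ≥ D0         π1 – π2 ≤ D0         π1 – π2 = D0
H1:               π1 – π2 < D0         π1 – π2 > D0         π1 – π2 ≠ D0

The test statistic, with π1 – π2 set to the hypothesized value D0:

$$Z_{STAT} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$

The confidence interval for π1 – π2:

$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$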
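A possible sketch of these unpooled formulas follows. The lecture does not work a numerical example here, so the Proposition A counts are reused as an illustration; the function names and the choice D0 = 0 in the usage lines are assumptions made for the demo.

```python
import math
from statistics import NormalDist

def unpooled_std_err(x1, n1, x2, n2):
    """Standard error of p1 - p2 using each sample's own proportion."""
    p1, p2 = x1 / n1, x2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def two_proportion_z(x1, n1, x2, n2, d0):
    """Z statistic for H0: pi1 - pi2 = d0."""
    diff = x1 / n1 - x2 / n2
    return (diff - d0) / unpooled_std_err(x1, n1, x2, n2)

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2."""
    diff = x1 / n1 - x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * unpooled_std_err(x1, n1, x2, n2)
    return diff - margin, diff + margin

# Illustration only: counts borrowed from the Proposition A example, with D0 = 0
z_stat = two_proportion_z(36, 72, 35, 50, d0=0.0)
lower, upper = two_proportion_ci(36, 72, 35, 50)

print(f"z = {z_stat:.3f}, 95% CI for pi1 - pi2: ({lower:.4f}, {upper:.4f})")
```

Unlike the zero-difference test, nothing is pooled here: when D0 is not 0 the two population proportions are not assumed equal, so each sample keeps its own variance term.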
Comparing Two Variances
Comparing Two Variances

Hypothesis tests for variances:
• Test for a single population variance: chi-square test statistic
• Test for two population variances: F-test statistic
F - Statistic
• For variance tests, we use the F-test to determine whether two variances are different by considering

$$F = \frac{\sigma_1^2}{\sigma_2^2}$$

• The test statistic is the ratio between the two sample variances:

$$F_{STAT} = \frac{s_1^2}{s_2^2}$$

• Degrees of freedom for the numerator: df1 = n1 − 1
• Degrees of freedom for the denominator: df2 = n2 − 1


Example of Comparing Two Variances
• You are a financial analyst for a brokerage firm. Is there a difference in the variances between the NYSE and NASDAQ at the α = 0.05 level?

                NYSE     NASDAQ
Number          21       25
Sample mean     3.27     2.53
Sample SD       1.30     1.16
Solution.
• Form the hypothesis test:
   H0: σ1² = σ2² (no difference between variances)
   H1: σ1² ≠ σ2² (a difference between variances)
• Significance level α = 0.05
• Numerator d.f. = n1 – 1 = 21 – 1 = 20
• Denominator d.f. = n2 – 1 = 25 – 1 = 24
• FR = F.025, 20, 24 = 2.33 (FINV(0.025, 20, 24))
• FL = 1 / F.025, 24, 20 = 0.41 (or FINV(0.975, 20, 24))
• The test statistic:

$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

[Figure: F distribution with α/2 = .025 in each tail; reject H0 below FL = 0.41 or above FR = 2.33, do not reject H0 in between]

• FSTAT = 1.256 is not in the rejection region, so we do not reject H0.

Conclusion: There is insufficient evidence of a difference in variances at α = .05.
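The F-test decision can be sketched in a few lines. The two critical values are copied from the slides (they correspond to the FINV calls shown in the solution) rather than computed, and the variable names are illustrative.

```python
# Summary data from the NYSE / NASDAQ example
sd1, n1 = 1.30, 21   # NYSE
sd2, n2 = 1.16, 25   # NASDAQ

f_stat = sd1**2 / sd2**2   # ratio of the two sample variances
df_num = n1 - 1            # numerator degrees of freedom
df_den = n2 - 1            # denominator degrees of freedom

# Two-tail critical values for alpha = 0.05, taken from the slides
F_LEFT, F_RIGHT = 0.41, 2.33

# Reject H0 only if F_STAT falls in either tail
reject_h0 = f_stat < F_LEFT or f_stat > F_RIGHT

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) d.f., reject H0: {reject_h0}")
```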
-- The End of Topic --
Thank You!
