Hypothesis Testing (two populations)

To conduct a hypothesis test for two populations (the difference between two population means), we can use
the Data Analysis Toolkit, which offers convenient options for this test. As learned in the
lecture, this hypothesis test falls into one of three cases:

• CASE I: Population standard deviations 𝜎1 and 𝜎2 are known
• CASE II: Population standard deviations 𝜎1 and 𝜎2 are unknown but are assumed equal (𝜎1 = 𝜎2)
• CASE III: Population standard deviations 𝜎1 and 𝜎2 are unknown and cannot be assumed equal (𝜎1 ≠ 𝜎2)
In all three cases, we will assume that we are conducting a hypothesis test on the difference between the
two population means (𝜇1 − 𝜇2) and that the alternative hypothesis states that this difference is
less than (<), greater than (>), or not equal to (≠) a hypothesized difference 𝑑0.
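
For reference, the three possible pairs of competing hypotheses can be written as follows. This is one common textbook convention; the labels H_0 and H_A and the composite form of the null hypotheses are assumptions, and the lecture's notation may differ.

```latex
\[
\begin{array}{lll}
\text{Left-tailed:}  & H_0: \mu_1 - \mu_2 \ge d_0, & H_A: \mu_1 - \mu_2 < d_0 \\
\text{Right-tailed:} & H_0: \mu_1 - \mu_2 \le d_0, & H_A: \mu_1 - \mu_2 > d_0 \\
\text{Two-tailed:}   & H_0: \mu_1 - \mu_2 = d_0,   & H_A: \mu_1 - \mu_2 \ne d_0
\end{array}
\]
```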

CASE I: Population standard deviations 𝜎1 and 𝜎2 are known


To perform this test, we will open the Data Analysis Toolkit and,
1. Choose the “z-Test: Two Sample For Means” option from the menu.
2. Select in “Variable 1 Range” and “Variable 2 Range” the data for variables 1 and 2, respectively.
3. In the “Hypothesized Mean Difference” box enter 𝑑0 .
4. In the “Variable 1 Variance (known)” and “Variable 2 Variance (known)” boxes enter the known
population variances 𝜎1² and 𝜎2² (the squares of the standard deviations), respectively.
5. Check the “Labels” box if the data you selected in step 2 contained the headers (labels).
6. In the Alpha box specify the required significance level of the test.
7. Specify the location of the output, and,
8. Hit Enter.
This will print the output at your specified location which contains:

• The sample averages (𝑥̅1 and 𝑥̅2)
• The known population variances (𝜎1² and 𝜎2²)
• The sample sizes (𝑛1 and 𝑛2 )
• The hypothesized difference (𝑑0 )
• The test statistic (𝑧)
• The 𝑃(𝑍 ≤ 𝑧) and critical 𝑧 for one-tail and two-tailed tests.
Note: The two P(Z<=z) values provided are the p-values for the one-tailed and two-tailed tests.
Note: we did not specify whether our test is one-tailed or two-tailed. The Data Analysis tool provides the
p-value for both. Depending on our own test, we will use the relevant one.
Note: if your test is two-tailed, you do not need to multiply the probability provided under the two-tailed
test result. It is already multiplied by 2. The probability provided is the p-value for the two-tailed test.
Note: if the way you set up your competing hypotheses results in a hypothesized difference (𝑑0 ) that is
negative, you’d need to change the order of the populations so that you will end up with a positive
hypothesized difference (𝑑0 ). The Data Analysis Tool only accepts positive 𝑑0 .
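
To cross-check the Excel output outside the spreadsheet, the Case I calculation can be sketched in Python. This is a minimal sketch using only the standard library, not the tool's own implementation. The function name `two_sample_z_test` and the sample values are hypothetical, and the one-tail value is computed as the tail area beyond |z|, which is an assumption about how the tool reports it.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sample_z_test(sample1, sample2, var1, var2, d0=0.0):
    """Two-sample z-test with known population variances var1 and var2.

    Returns (z, one-tailed p-value, two-tailed p-value).
    """
    n1, n2 = len(sample1), len(sample2)
    # Standard error of the difference between the two sample means
    std_error = sqrt(var1 / n1 + var2 / n2)
    z = (mean(sample1) - mean(sample2) - d0) / std_error
    # Tail area beyond |z| under the standard normal curve (assumed convention)
    p_one_tail = 1.0 - NormalDist().cdf(abs(z))
    # The two-tailed p-value is already doubled; do not multiply it again
    p_two_tail = 2.0 * p_one_tail
    return z, p_one_tail, p_two_tail


# Hypothetical data, for illustration only
z, p_one, p_two = two_sample_z_test([10, 12, 14], [9, 11, 13], var1=4.0, var2=4.0)
print(f"z = {z:.4f}, one-tail p = {p_one:.4f}, two-tail p = {p_two:.4f}")
```

Note that the function takes variances, not standard deviations, to match the “Variance (known)” boxes in the dialog.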

Prepared by: Dr. Behrouz Bakhtiari DeGroote School of Business


CASE II: Population standard deviations 𝜎1 and 𝜎2 are unknown but are
assumed equal (𝜎1 = 𝜎2 )
To perform this test, we will open the Data Analysis Toolkit and,
1. Choose the “t-Test: Two-Sample Assuming Equal Variances” option from the menu.
2. Select in “Variable 1 Range” and “Variable 2 Range” the data for variables 1 and 2, respectively.
3. In the “Hypothesized Mean Difference” box enter 𝑑0 .
4. Check the “Labels” box if the data you selected in step 2 contained the headers (labels).
5. In the Alpha box specify the required significance level of the test.
6. Specify the location of the output, and,
7. Hit Enter.
This will print the output at your specified location which contains:

• The sample averages (𝑥̅1 and 𝑥̅2)
• The sample variances (𝑠1² and 𝑠2²)
• The sample sizes (𝑛1 and 𝑛2)
• The pooled variance (𝑠𝑝²)
• The hypothesized difference (𝑑0)
• The degrees of freedom for the test statistic (𝑑𝑓)
• The test statistic (𝑡)
• The 𝑃(𝑇 ≤ 𝑡) and critical 𝑡 for one-tail and two-tailed tests.
Note: The two P(T<=t) values provided are the p-values for the one-tailed and two-tailed tests.
Note: we did not specify whether our test is one-tailed or two-tailed. The Data Analysis tool provides the
p-value for both. Depending on our own test, we will use the relevant one.
Note: if your test is two-tailed, you do not need to multiply the probability provided under the two-tailed
test result. It is already multiplied by 2. The probability provided is the p-value for the two-tailed test.
Note: if the way you set up your competing hypotheses results in a hypothesized difference (𝑑0 ) that is
negative, you’d need to change the order of the populations so that you will end up with a positive
hypothesized difference (𝑑0 ). The Data Analysis Tool only accepts positive 𝑑0 .
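
The pooled variance, degrees of freedom, and test statistic reported for Case II can be reproduced with a short Python sketch. It uses only the standard library, and the function name `pooled_t_statistic` and the sample values are hypothetical. The p-value is left out because it needs the t distribution, which the Python standard library does not provide; it can be read from the Excel output instead.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2, d0=0.0):
    """Equal-variance (pooled) two-sample t statistic.

    Returns (t, degrees of freedom, pooled variance).
    """
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the sample variance (divides by n - 1)
    s1_sq, s2_sq = variance(sample1), variance(sample2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_error = sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2) - d0) / std_error
    return t, df, sp_sq


# Hypothetical data, for illustration only
t, df, sp_sq = pooled_t_statistic([10, 12, 14], [9, 11, 13])
print(f"t = {t:.4f}, df = {df}, pooled variance = {sp_sq:.4f}")
```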

CASE III: Population standard deviations 𝜎1 and 𝜎2 are unknown and
cannot be assumed equal (𝜎1 ≠ 𝜎2)
To perform this test, we will open the Data Analysis Toolkit and,
1. Choose the “t-Test: Two-Sample Assuming Unequal Variances” option from the menu.
2. Select in “Variable 1 Range” and “Variable 2 Range” the data for variables 1 and 2, respectively.
3. In the “Hypothesized Mean Difference” box enter 𝑑0 .
4. Check the “Labels” box if the data you selected in step 2 contained the headers (labels).
5. In the Alpha box specify the required significance level of the test.
6. Specify the location of the output, and,
7. Hit Enter.
This will print the output at your specified location which contains:

• The sample averages (𝑥̅1 and 𝑥̅2)
• The sample variances (𝑠1² and 𝑠2²)
• The sample sizes (𝑛1 and 𝑛2)
• The hypothesized difference (𝑑0)
• The degrees of freedom for the test statistic (𝑑𝑓)
• The test statistic (𝑡)
• The 𝑃(𝑇 ≤ 𝑡) and critical 𝑡 for one-tail and two-tailed tests.
Note: The two P(T<=t) values provided are the p-values for the one-tailed and two-tailed tests.
Note: we did not specify whether our test is one-tailed or two-tailed. The Data Analysis tool provides the
p-value for both. Depending on our own test, we will use the relevant one.
Note: if your test is two-tailed, you do not need to multiply the probability provided under the two-tailed
test result. It is already multiplied by 2. The probability provided is the p-value for the two-tailed test.
Note: if the way you set up your competing hypotheses results in a hypothesized difference (𝑑0 ) that is
negative, you’d need to change the order of the populations so that you will end up with a positive
hypothesized difference (𝑑0 ). The Data Analysis Tool only accepts positive 𝑑0 .
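
For Case III, the test statistic and degrees of freedom can be sketched in Python as well. This sketch assumes the usual Welch-Satterthwaite approximation for 𝑑𝑓, and the function name `unequal_variance_t_statistic` and the sample values are hypothetical. The formula generally gives a non-integer 𝑑𝑓, while the Excel tool is understood to report a rounded value, so small differences from the spreadsheet are possible. As in Case II, the p-value is omitted because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance


def unequal_variance_t_statistic(sample1, sample2, d0=0.0):
    """Unequal-variance two-sample t statistic.

    Returns (t, approximate degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    # Each term is a sample variance divided by its sample size
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2) - d0) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed); usually not an integer
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data, for illustration only
t, df = unequal_variance_t_statistic([10, 12, 14], [1, 11, 21])
print(f"t = {t:.4f}, df = {df:.4f}")
```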

Exercises
You can download the Excel file for this example at this link.
We will practice performing hypothesis tests for two populations using data in Excel. See the Excel file
“Hypothesis Testing (two populations)” for the data.
Question 1
A consumer advocate researches the length of life of two brands of refrigerators, Brand A and
Brand B. He collects data (measured in years) on the longevity of 40 refrigerators for Brand A and repeats
the sampling for Brand B. The data is provided in the tab “Longevity”. Suppose the researcher wants to
determine if the average life differs between Brand A and Brand B.
a) Specify the competing hypotheses.
b) Calculate the value of the test statistic as well as the p-value assuming 𝜎𝐴² = 4.4 and 𝜎𝐵² = 5.2.
c) At 5% significance level, what is the conclusion of the test?

Question 2
The “See Me” marketing agency wants to determine whether the time of day of a television advertisement
influences website searches for a product. They have extracted the number of website searches occurring during a
one-hour period after an advertisement was aired for a random sample of 30 day and 30 evening
advertisements. The data is provided in the tab by the name “Searches”. Suppose an analyst wants to
determine if the average number of website searches differs between the day and evening advertisements.
a) Specify the competing hypotheses.
b) Calculate the value of the test statistic and the p-value, assuming that the population variances are
equal.
c) What is the conclusion of the test at 5% significance level?
Question 3
In Question 2, you concluded that the mean number of searches differs between Day and Evening
advertisements. It is not hard to see that the mean number of searches after Evening advertisements
(population 2) is higher. You would like to determine if the difference between the two averages is higher than
10,000 searches. In other words, you would like to investigate if the mean number of searches after
Evening advertisements is higher than that of the Day advertisements by more than 10,000. Conduct this
test in the “Searches Version 2” tab.
a) Specify the competing hypotheses.
b) Calculate the value of the test statistic and the p-value, assuming that the population variances are
equal.
c) What is the conclusion of the test at 5% significance level?
