
Chi-Square Test


Chi-Square Test of Independence


Chi-Square Test for Independence


This lesson explains how to conduct a chi-square test for independence. The test is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.

For example, in an election survey, voters might be classified by gender (male or female) and voting preference (Democrat, Republican, or Independent). We could use a chi-square test for independence to determine whether gender is related to voting preference. The sample problem at the end of the lesson considers this example.

When to Use Chi-Square Test for Independence


The test procedure described in this lesson is appropriate when the following conditions are met:
The sampling method is simple random sampling.
The variables under study are each categorical.
If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

http://stattrek.com/chi-square-test/independence.aspx?Tutorial=AP[24-Apr-16 10:59:41 PM]


State the Hypotheses


Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that knowing the level of Variable A does not help you predict the level of Variable B. That is, the variables are independent.

H0: Variable A and Variable B are independent.

Ha: Variable A and Variable B are not independent.

The alternative hypothesis is that knowing the level of Variable A can help you predict the level of Variable B.

Note: Support for the alternative hypothesis suggests that the variables are related; but the relationship is not necessarily causal, in the sense that one variable "causes" the other.
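
The two hypotheses can also be written in terms of cell probabilities. The notation below is an assumption introduced for illustration and does not appear in the lesson: $p_{ij}$ is the probability that an observation falls at level i of Variable A and level j of Variable B, while $p_{i\cdot}$ and $p_{\cdot j}$ are the corresponding marginal probabilities.

```latex
H_0:\; p_{ij} = p_{i\cdot}\, p_{\cdot j} \quad \text{for all } i = 1, \dots, r \text{ and } j = 1, \dots, c

H_a:\; p_{ij} \neq p_{i\cdot}\, p_{\cdot j} \quad \text{for at least one pair } (i, j)
```

Written this way, independence means every joint probability factors into the product of its two marginals, which is the sense in which knowing one variable does not help predict the other.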

Formulate an Analysis Plan


The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the following elements.
Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
Test method. Use the chi-square test for independence to determine whether there is a significant relationship between two categorical variables.

Analyze Sample Data


Using sample data, find the degrees of freedom, expected frequencies, test statistic, and the P-value associated with the
test statistic.
The approach described in this section is illustrated in the sample problem at the end of this lesson.
Degrees of freedom. The degrees of freedom (DF) is equal to:

DF = (r - 1) * (c - 1)

where r is the number of levels for one categorical variable, and c is the number of levels for the other categorical variable.
Expected frequencies. The expected frequency counts are computed separately for each level of one categorical variable at each level of the other categorical variable. Compute r * c expected frequencies, according to the following formula.

$E_{r,c} = (n_r \cdot n_c) / n$

where $E_{r,c}$ is the expected frequency count for level r of Variable A and level c of Variable B, $n_r$ is the total number of sample observations at level r of Variable A, $n_c$ is the total number of sample observations at level c of Variable B, and n is the total sample size.
Test statistic. The test statistic is a chi-square random variable ($\chi^2$) defined by the following equation.

$\chi^2 = \sum \left[ (O_{r,c} - E_{r,c})^2 / E_{r,c} \right]$

where $O_{r,c}$ is the observed frequency count at level r of Variable A and level c of Variable B, and $E_{r,c}$ is the expected frequency count at level r of Variable A and level c of Variable B.
P-value. The P-value is the probability, assuming the null hypothesis is true, of observing a sample statistic at least as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.
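
The first three quantities above (degrees of freedom, expected frequencies, and the test statistic) can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own tool: the function name `chi_square_independence` and the list-of-rows table layout are assumptions made here, and the P-value is still left to the calculator. The table passed in at the end is the one from the sample problem later in this lesson.

```python
from typing import List, Tuple


def chi_square_independence(
    observed: List[List[float]],
) -> Tuple[int, List[List[float]], float]:
    """Return (DF, expected counts, chi-square statistic) for an r x c table.

    `observed` is a list of rows; each row holds the observed counts for one
    level of Variable A across the c levels of Variable B.
    """
    r = len(observed)
    c = len(observed[0])

    # Marginal totals: n_r for each row, n_c for each column, and n overall.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(r)) for j in range(c)]
    n = sum(row_totals)

    # DF = (r - 1) * (c - 1)
    df = (r - 1) * (c - 1)

    # E[r][c] = (n_r * n_c) / n
    expected = [
        [row_totals[i] * col_totals[j] / n for j in range(c)] for i in range(r)
    ]

    # Sum of (O - E)^2 / E over every cell of the table.
    statistic = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(r)
        for j in range(c)
    )
    return df, expected, statistic


# Rows: Male, Female. Columns: Republican, Democrat, Independent.
df, expected, statistic = chi_square_independence(
    [[200, 150, 50], [250, 300, 50]]
)
print(df, round(statistic, 1))  # 2 16.2
```

Returning the expected counts alongside the statistic makes it easy to check the "at least 5 per cell" condition before trusting the result.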

Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding


Problem
A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were classified by gender (male or female) and by voting preference (Republican, Democrat, or Independent). Results are shown in the contingency table below.

                        Voting Preferences
                Republican   Democrat   Independent   Row total
Male                   200        150            50         400
Female                 250        300            50         600
Column total           450        450           100        1000

Is there a gender gap? Do the men's voting preferences differ significantly from the women's preferences? Use a 0.05 level of significance.
Solution
The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:
State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.

H0: Gender and voting preferences are independent.

Ha: Gender and voting preferences are not independent.

Formulate an analysis plan. For this analysis, the significance level is 0.05. Using sample data, we will conduct a chi-square test for independence.
Analyze sample data. Applying the chi-square test for independence to sample data, we compute the degrees of freedom, the expected frequency counts, and the chi-square test statistic. Based on the chi-square statistic and the degrees of freedom, we determine the P-value.

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

$E_{r,c} = (n_r \cdot n_c) / n$

$E_{1,1} = (400 \cdot 450) / 1000 = 180000/1000 = 180$

$E_{1,2} = (400 \cdot 450) / 1000 = 180000/1000 = 180$

$E_{1,3} = (400 \cdot 100) / 1000 = 40000/1000 = 40$

$E_{2,1} = (600 \cdot 450) / 1000 = 270000/1000 = 270$

$E_{2,2} = (600 \cdot 450) / 1000 = 270000/1000 = 270$

$E_{2,3} = (600 \cdot 100) / 1000 = 60000/1000 = 60$

$\chi^2 = \sum \left[ (O_{r,c} - E_{r,c})^2 / E_{r,c} \right]$

$\chi^2 = (200 - 180)^2/180 + (150 - 180)^2/180 + (50 - 40)^2/40 + (250 - 270)^2/270 + (300 - 270)^2/270 + (50 - 60)^2/60$

$\chi^2 = 400/180 + 900/180 + 100/40 + 400/270 + 900/270 + 100/60$

$\chi^2 = 2.22 + 5.00 + 2.50 + 1.48 + 3.33 + 1.67 = 16.2$

where DF is the degrees of freedom, r is the number of levels of gender, c is the number of levels of the voting preference, $n_r$ is the number of observations from level r of gender, $n_c$ is the number of observations from level c of voting preference, n is the number of observations in the sample, $E_{r,c}$ is the expected frequency count when gender is level r and voting preference is level c, and $O_{r,c}$ is the observed frequency count when gender is level r and voting preference is level c.
The P-value is the probability that a chi-square statistic having 2 degrees of freedom is more extreme than 16.2. We use the Chi-Square Distribution Calculator to find $P(\chi^2 > 16.2) = 0.0003$.
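
The calculator's result can be cross-checked by hand. The lesson itself relies on the Chi-Square Distribution Calculator, so the sketch below is a swapped-in alternative, not the lesson's method. It uses a closed form for the upper-tail probability that holds only when the degrees of freedom are even, which is the case here (DF = 2). The function name `chi_square_sf_even_df` is hypothetical.

```python
import math


def chi_square_sf_even_df(x: float, df: int) -> float:
    """P(chi-square with `df` degrees of freedom > x), for even df only.

    For even df the upper tail has the closed form
        exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    Odd df needs the incomplete gamma function and is not handled here.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


# Values from the worked example: chi-square = 16.2 with DF = 2.
p_value = chi_square_sf_even_df(16.2, df=2)
print(round(p_value, 4))  # 0.0003
```

With DF = 2 the sum has a single term, so the P-value reduces to exp(-16.2 / 2), which rounds to the 0.0003 reported above.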
Interpret results. Since the P-value (0.0003) is less than the significance level (0.05), we reject the null hypothesis. Thus, we conclude that there is a relationship between gender and voting preference.
Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the variables under study were categorical, and the expected frequency count was at least 5 in each cell of the contingency table.
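
The expected-count condition in the note can be verified directly from the row and column totals of the contingency table. The sketch below is one way to do that check. The helper name `min_expected_count` is an assumption made here, and the threshold of 5 is taken from the conditions listed earlier in the lesson.

```python
from typing import List


def min_expected_count(row_totals: List[int], col_totals: List[int]) -> float:
    """Smallest expected cell count, using E = (row total * column total) / n."""
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


# Totals from the sample problem: rows (Male, Female),
# columns (Republican, Democrat, Independent).
smallest = min_expected_count([400, 600], [450, 450, 100])
print(smallest, smallest >= 5)  # 40.0 True
```

Only the smallest expected count needs checking, because every other cell is at least as large.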

* AP and Advanced Placement Program are registered trademarks of the College Board, which was not involved in the production of, and does not endorse, this web site.

The contents of this webpage are copyright 2016 StatTrek.com. All Rights Reserved.