Chi Square
Chi-Square tests:
1. The Chi-Square Goodness of Fit Test – Used to determine whether or not a categorical variable
follows a hypothesized distribution.
2. The Chi-Square Test of Independence – Used to determine whether or not there is a significant
association between two categorical variables.
Note that both of these tests are only appropriate when you’re working with categorical
variables, that is, variables that take on names or labels and can fit into categories. Examples
include gender, political party preference, and marital status.
The Chi-Square Goodness of Fit Test
Example 1: Rolling a Die
Suppose a researcher would like to know if a die is fair. She decides to roll it 50 times and record
the number of times it lands on each number.
She can use a Chi-Square Goodness of Fit Test to determine whether the observed counts are
consistent with the theoretical distribution in which each of the six values is equally likely.
Example 2: Distribution of M&M Colors
Suppose we want to know whether the colors of M&M’s in a bag follow this distribution: 20%
yellow, 30% blue, 30% red, 20% other. To test this, we open a random bag of M&M’s and count
how many of each color appear.
We can use a Chi-Square Goodness of Fit Test to determine whether the observed distribution of
colors is consistent with the distribution we specified.
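The goodness of fit calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `chi_square_gof` and the bag counts are hypothetical (the text gives no actual counts), while the color proportions are the ones stated above. The critical value 7.815 is the standard table value for df = 3 at the 0.05 level.

```python
def chi_square_gof(observed, expected_proportions):
    """Return (statistic, degrees_of_freedom) for a goodness of fit test."""
    if len(observed) != len(expected_proportions):
        raise ValueError("observed and expected_proportions must have the same length")
    if abs(sum(expected_proportions) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    total = sum(observed)
    # Expected count per category = sample size * hypothesized proportion.
    expected = [total * p for p in expected_proportions]
    # Sum of (observed - expected)^2 / expected over all categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothesized M&M color shares from the text: yellow, blue, red, other.
hypothesized = [0.20, 0.30, 0.30, 0.20]
# Hypothetical counts from one bag of 100 candies (illustration only).
bag_counts = [22, 28, 31, 19]

mm_statistic, mm_df = chi_square_gof(bag_counts, hypothesized)
CRITICAL_VALUE_DF3 = 7.815  # table value for df = 3 at the 0.05 level
print(f"chi-square = {mm_statistic:.4f}, df = {mm_df}")
print("reject H0" if mm_statistic > CRITICAL_VALUE_DF3 else "fail to reject H0")
```

For the die example, the same function could be called with the six observed counts and `[1/6] * 6` as the hypothesized proportions.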
The Chi-Square Test of Independence
You should use the Chi-Square Test of Independence when you want to
determine whether or not there is a significant association between two
categorical variables.
Here are some examples of when you might use this test:
Example 1: Voting Preference & Gender
Researchers want to know if gender is associated with political party
preference in a certain town so they survey 500 voters and record their
gender and political party preference.
They can perform a Chi-Square Test of Independence to determine if
there is a statistically significant association between voting preference
and gender.
Example 2: Favorite Color & Favorite Sport
Researchers want to know if a person’s favorite color is associated with their
favorite sport so they survey 100 people and ask them about their preferences
for both.
They can perform a Chi-Square Test of Independence to determine if there is a
statistically significant association between favorite color and favorite sport.
Example 3: Education Level & Marital Status
Researchers want to know if education level and marital status are associated
so they collect data about these two variables on a simple random sample of
2,000 people.
They can perform a Chi-Square Test of Independence to determine if there is a
statistically significant association between education level and marital status.
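As a rough sketch of how a test of independence is computed: the expected count in each cell is (row total × column total) / grand total, and the degrees of freedom are (rows – 1) × (columns – 1). The function name `chi_square_independence` and the table of 500 voters below are hypothetical and only illustrate the arithmetic; they are not data from the examples above.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts.

    `table` is a list of rows; each row holds the counts for one level of the
    first variable across the levels of the second variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical survey of 500 voters (illustration only).
# Rows: two gender groups; columns: three party preferences.
votes = [
    [120, 90, 40],
    [110, 95, 45],
]
vote_statistic, vote_df = chi_square_independence(votes)
CRITICAL_VALUE_DF2 = 5.99  # table value for df = 2 at the 0.05 level
print(f"chi-square = {vote_statistic:.4f}, df = {vote_df}")
print("reject H0" if vote_statistic > CRITICAL_VALUE_DF2 else "fail to reject H0")
```

With these made-up counts the statistic falls below the critical value, so the sketch would report no significant association.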
A survey on cars conducted in 2011 determined that 60% of car
owners have only one car, 28% have two cars, and 12% have three or
more. Suppose you conduct your own survey and collect the data
below. Determine whether your data support the results of the
study.
Use a significance level of 0.05 (the critical value for a 0.05 significance
level with df = 2 is 5.99). In your survey of 129 car owners, 73 had one
car and 38 had two cars, which leaves 18 with three or more.
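With 129 owners, the expected counts are about 77.4, 36.1, and 15.5 (129 × 0.60, 129 × 0.28,
and 129 × 0.12), against observed counts of 73, 38, and 18.
Let’s compare the resulting Chi-Square statistic to the critical value for the 0.05 significance level.
The degrees of freedom = 3 – 1 = 2.
The critical value for a 0.05 significance level with df = 2 is 5.99.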
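That means that if the null hypothesis is true (car ownership follows the
2011 study), about 95 out of 100 samples will have a χ² value of 5.99 or less.
The Chi-Square statistic is only 0.7533, well below 5.99, so we fail to reject
the null hypothesis: our data are consistent with the results of the study.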
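The car ownership calculation can be reproduced with a short script. This is a sketch using only the figures given in the problem; the variable names are illustrative. Computed without rounding the expected counts, the statistic comes out near 0.758. Rounding the expected counts to one decimal place (77.4, 36.1, 15.5) gives a value close to the 0.7533 reported in the text. The conclusion is the same either way.

```python
# Observed counts from the survey of 129 car owners.
observed = {"one car": 73, "two cars": 38, "three or more": 129 - 73 - 38}
# Shares reported by the 2011 study (the null hypothesis).
study_share = {"one car": 0.60, "two cars": 0.28, "three or more": 0.12}

n_owners = sum(observed.values())
car_statistic = 0.0
for category, count in observed.items():
    expected = n_owners * study_share[category]
    contribution = (count - expected) ** 2 / expected
    car_statistic += contribution
    print(f"{category:>14}: observed={count:3d} "
          f"expected={expected:6.2f} contribution={contribution:.4f}")

car_df = len(observed) - 1   # 3 categories - 1 = 2
CRITICAL_VALUE = 5.99        # df = 2, significance level 0.05
reject_null = car_statistic > CRITICAL_VALUE
print(f"chi-square = {car_statistic:.4f}, df = {car_df}")
print("reject H0" if reject_null else "fail to reject H0")
```

Printing the per-category contributions shows that no single category drives the result.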
Use a significance level of 0.05 (the critical value for a
0.05 significance level with df = 4 is 9.488).
Here, we have 5 categories. We haven’t estimated any parameters from the data; the only
constraint is that the expected counts add up to the same total as the observed counts.
So the number of degrees of freedom to use is 5 – 1 = 4.
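The degrees of freedom rule and the final decision can be written as two small helpers. This is a sketch under the usual convention that df = (number of categories) – 1 – (number of parameters estimated from the data). The function names are hypothetical, and the statistic 10.0 in the usage lines is a made-up value for illustration only.

```python
def degrees_of_freedom(n_categories, n_estimated_parameters=0):
    """Degrees of freedom for a goodness of fit test."""
    df = n_categories - 1 - n_estimated_parameters
    if df < 1:
        raise ValueError("not enough categories for the number of estimated parameters")
    return df


def decide(statistic, critical_value):
    """Compare a chi-square statistic with the critical value."""
    return "reject H0" if statistic > critical_value else "fail to reject H0"


df_five = degrees_of_freedom(5)      # 5 categories, no estimated parameters
print(df_five)
print(decide(0.7533, 5.99))          # the car ownership example (df = 2)
print(decide(10.0, 9.488))           # hypothetical statistic with df = 4
```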