Chi Square Test
• Q: Can you find the mean value of colour (red, blue, green,
yellow)? Yes/No
1- Define Hypothesis
• The χ² test assumes that the data for the study are obtained through random selection, i.e.
they are randomly picked from the population.
• The categories are mutually exclusive, i.e. each subject fits in only one category. For
example, from our above example, the number of people who lunched in your restaurant
on Monday can’t be counted in the Tuesday category.
• The data should be in the form of frequencies or counts for each category,
not percentages.
• The data should not consist of paired samples or groups, i.e. the
observations should be independent of each other.
Example:
Here the independent variable is C.G.P.A, with the categories 9-10, 8-9, 7-8,
6-7, and below 6.
Question:
The statistical question here is: are the observed frequencies
of placed students equally distributed across the C.G.P.A
categories (so that our theoretical frequency distribution contains the
same number of students in each of the C.G.P.A categories)?
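The question can be written as a pair of hypotheses. This is a sketch in notation introduced here, not taken from the slides: p_i is the probability that a placed student falls in C.G.P.A category i, N is the total number of placed students, k is the number of categories, and E_i is the expected frequency of category i.

```latex
H_0:\; p_1 = p_2 = \dots = p_k = \tfrac{1}{k}
\qquad
H_1:\; \text{at least one } p_i \neq \tfrac{1}{k}

E_i = \frac{N}{k}, \qquad k = 5
```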
Contingency table
Calculate Chi Square Value.
1.Step 1: Subtract the expected frequency from the observed frequency for each category, i.e. (O-E).
2.Step 2: Square each value obtained in step 1, i.e. (O-E)². For example, for the C.G.P.A category
9-10, the value obtained in step 1 is 10. It becomes 100 on squaring. Apply the same operation to all
the categories.
3.Step 3: Divide each value obtained in step 2 by the corresponding expected frequency, i.e. (O-E)²/E.
For example, for the C.G.P.A category 9-10, the value obtained in step 2 is 100. Dividing it by
the corresponding expected frequency, which is 20, gives 5. Apply the same operation to all the
categories.
4.Step 4: Add all the values obtained in step 3 to get the chi-square value. In this case, the
chi-square value comes out to be 32.5.
5.Step 5: Once we have calculated the chi-square value, the next task is to compare it with the
critical chi-square value. We can find this in the chi-square table below, against the degrees of
freedom (number of categories - 1) and the level of significance:
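The calculation steps can be sketched in a few lines of Python. The observed counts below are an assumption: the contingency table is not reproduced in this text, so the values are illustrative counts chosen to agree with the figures quoted in the steps (expected frequency 20, a difference of 10 for the 9-10 category, and a total of 32.5). The function name `chi_square_statistic` is hypothetical.

```python
# ASSUMPTION: illustrative observed counts of placed students per C.G.P.A
# category; the original contingency table is not available in this text.
observed = {"9-10": 30, "8-9": 35, "7-8": 20, "6-7": 10, "below 6": 5}


def chi_square_statistic(observed_counts):
    """Goodness-of-fit statistic against equal expected frequencies."""
    counts = list(observed_counts)
    # Under the null hypothesis every category has the same expected count.
    expected = sum(counts) / len(counts)
    # Difference (O-E), square it, divide by E, then sum over all categories.
    return sum((o - expected) ** 2 / expected for o in counts)


statistic = chi_square_statistic(observed.values())
degrees_of_freedom = len(observed) - 1  # number of categories - 1
print(statistic, degrees_of_freedom)
```

With these assumed counts the expected frequency is 100 / 5 = 20 per category, which matches the value used in step 3.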
Chi Square Table
Finally.
• In this case, the degrees of freedom are 5-1 = 4. So, the critical value
at the 5% level of significance is 9.49.
• Our obtained value of 32.5 is much larger than the critical value of
9.49. Therefore, we can say that the observed frequencies are
significantly different from the expected frequencies. In other
words, C.G.P.A is related to the number of placements that occur in
the department of statistics.
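The same decision can be checked numerically. A minimal sketch, assuming only the figures quoted above (statistic 32.5, 4 degrees of freedom, critical value 9.49). The helper `chi2_sf_even_df` is a hypothetical name; it uses the closed-form tail probability of the chi-square distribution, which holds only when the degrees of freedom are even.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is valid only for positive even df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


statistic = 32.5       # chi-square value from step 4
df = 4                 # number of categories - 1
critical_value = 9.49  # table value at the 5% level for df = 4
alpha = 0.05

reject_null = statistic > critical_value
p_value = chi2_sf_even_df(statistic, df)
print(reject_null, p_value)
```

Comparing the statistic with the critical value and comparing the p-value with the significance level are two views of the same decision, so both should agree.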
Chi Square Distribution.
The chi-square test helps with feature selection in machine
learning by testing the relationship between categorical features
(for example, between a feature and the target variable).
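For feature selection the relevant variant is the chi-square test of independence between one categorical feature and the target. A minimal sketch in plain Python follows. The function name `chi_square_independence` and the toy data are hypothetical and only illustrate the calculation.

```python
from collections import Counter


def chi_square_independence(feature, target):
    """Chi-square statistic and degrees of freedom for two categorical lists."""
    n = len(feature)
    pair_counts = Counter(zip(feature, target))  # observed cell counts
    feature_totals = Counter(feature)            # row totals
    target_totals = Counter(target)              # column totals

    statistic = 0.0
    for f, f_total in feature_totals.items():
        for t, t_total in target_totals.items():
            # Expected count if feature and target were independent.
            expected = f_total * t_total / n
            observed = pair_counts.get((f, t), 0)
            statistic += (observed - expected) ** 2 / expected

    dof = (len(feature_totals) - 1) * (len(target_totals) - 1)
    return statistic, dof


# Hypothetical toy data: a two-level feature and a yes/no target.
feature = ["a"] * 30 + ["b"] * 70
target = ["yes"] * 10 + ["no"] * 20 + ["yes"] * 30 + ["no"] * 40

statistic, dof = chi_square_independence(feature, target)
print(statistic, dof)
```

A feature whose statistic stays below the critical value for its degrees of freedom (3.84 at the 5% level for 1 degree of freedom) shows no significant relationship with the target, so it is a candidate for removal.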
Degrees of freedom:
• EXCEL FORMULA: CHISQ.TEST(actual_range, expected_range), which returns the p-value of the test