Chi Square Test: Comparing Observed and Expected Frequencies with Chi Square

1. What Is the Chi-Square Test and Why Is It Useful?

One of the most common statistical tests in data analysis is the chi-square test. It is used to compare the observed frequencies of different categories or groups with the expected frequencies based on some hypothesis or assumption. The chi-square test can help us answer questions such as:

- Is there a significant difference between the preferences of men and women for a certain product?

- Does the distribution of blood types in a sample match the expected distribution for the population?

- Are two variables, such as smoking and lung cancer, independent or associated with each other?

The chi-square test is based on the following idea: if the observed frequencies are close to the expected frequencies, the data are consistent with the hypothesis or assumption. If the observed frequencies are far from the expected frequencies, the data cast doubt on it. The chi-square test quantifies the degree of discrepancy between the observed and expected frequencies by calculating a statistic called the chi-square value, denoted by $$\chi^2$$.

The chi-square value is computed by summing up the squared differences between the observed and expected frequencies, divided by the expected frequencies, for each category or group. The formula is:

$$\chi^2 = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i}$$

Where $$k$$ is the number of categories or groups, $$O_i$$ is the observed frequency for the $$i$$-th category or group, and $$E_i$$ is the expected frequency for the $$i$$-th category or group.

The larger the chi-square value, the greater the discrepancy between the observed and expected frequencies, and the stronger the evidence against the hypothesis or assumption. The smaller the chi-square value, the smaller the discrepancy, and the more consistent the data are with the hypothesis or assumption. A small value does not prove the hypothesis true; it only means the data give no reason to reject it.

To determine whether the chi-square value is large enough to reject the hypothesis or assumption, we need to compare it with a critical value that depends on the level of significance and the degrees of freedom of the test. The level of significance, denoted by $$\alpha$$, is the probability of rejecting the hypothesis or assumption when it is actually true. The degrees of freedom, denoted by $$df$$, equal the number of categories or groups minus one when a single categorical variable is compared with a specified distribution; for a contingency table of two variables, they equal the number of rows minus one times the number of columns minus one. The critical value, denoted by $$\chi^2_{\alpha, df}$$, is the value that the chi-square value must exceed to reject the hypothesis or assumption at the given level of significance and degrees of freedom. The critical value can be obtained from a chi-square distribution table or a calculator.

The steps to perform a chi-square test are:

1. State the hypothesis or assumption and the alternative hypothesis. For example, the hypothesis or assumption could be that the preferences of men and women for a certain product are the same, and the alternative hypothesis could be that they are different.

2. Collect the data and construct a contingency table that shows the observed frequencies for each category or group. For example, the contingency table could show the number of men and women who prefer product A, B, or C.

3. Calculate the expected frequencies for each category or group based on the hypothesis or assumption. For example, the expected frequency for men who prefer product A could be obtained by multiplying the total number of men by the overall proportion of respondents (men and women combined) who prefer product A.

4. Calculate the chi-square value using the formula above.

5. Choose a level of significance, such as 0.05, and find the critical value from a chi-square distribution table or a calculator using the degrees of freedom.

6. Compare the chi-square value with the critical value and draw a conclusion. If the chi-square value is greater than the critical value, reject the hypothesis or assumption and accept the alternative hypothesis. If the chi-square value is less than or equal to the critical value, do not reject the hypothesis or assumption.

To illustrate the chi-square test with an example, suppose we want to test whether the distribution of blood types in a sample of 100 people matches the expected distribution for the population. The expected distribution for the population is:

| Blood Type | A | B | AB | O |
|---|---|---|---|---|
| Percentage | 40 | 10 | 5 | 45 |

The observed frequencies for the sample are:

| Blood Type | A | B | AB | O |
|---|---|---|---|---|
| Frequency | 36 | 14 | 8 | 42 |

The hypothesis or assumption is that the sample follows the population distribution, so with 100 people the expected frequencies are 40, 10, 5, and 45. The alternative hypothesis is that the sample does not follow this distribution. The chi-square value is calculated as:

$$\chi^2 = \frac{(36 - 40)^2}{40} + \frac{(14 - 10)^2}{10} + \frac{(8 - 5)^2}{5} + \frac{(42 - 45)^2}{45} = 0.4 + 1.6 + 1.8 + 0.2 = 4.0$$

The degrees of freedom are $$4 - 1 = 3$$. If we choose a level of significance of 0.05, the critical value from a chi-square distribution table or a calculator is $$\chi^2_{0.05, 3} = 7.815$$. Since the chi-square value (4.0) is less than the critical value, we do not reject the hypothesis or assumption: the distribution of blood types in the sample is consistent with the expected distribution for the population.
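The arithmetic in this blood-type example can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted above; the variable names are illustrative, and the critical value is copied from the text rather than computed.

```python
# Population percentages and sample counts from the blood-type example.
expected_pct = {"A": 40, "B": 10, "AB": 5, "O": 45}
observed = {"A": 36, "B": 14, "AB": 8, "O": 42}

n = sum(observed.values())  # sample size

# Expected count per category under the hypothesis being tested.
expected = {k: n * pct / 100 for k, pct in expected_pct.items()}

# Per-category contribution (O - E)^2 / E, then their sum.
contributions = {
    k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
}
chi_square = sum(contributions.values())

df = len(observed) - 1
critical_value = 7.815  # alpha = 0.05, df = 3, as quoted in the text

print(contributions)
print(round(chi_square, 3), df, chi_square > critical_value)
```

Printing the per-category contributions makes it easy to see which categories drive the statistic; here types B and AB contribute the most.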

2. Goodness-of-Fit, Test of Independence, and Test of Homogeneity

Chi-square tests are a type of statistical analysis that can be used to compare observed and expected frequencies of categorical data. There are three main types of chi-square tests, each with a different purpose and application. In this section, we will discuss the following types of chi-square tests:

1. Goodness-of-fit test: This test is used to determine whether a sample of categorical data fits a certain distribution or proportion. For example, we can use a goodness-of-fit test to check if a six-sided die is fair by comparing the observed frequency of each outcome with the frequency expected when each outcome has probability 1/6.

2. Test of independence: This test is used to determine whether two categorical variables are independent or associated. For example, we can use a test of independence to check if there is a relationship between gender and political preference by comparing the observed frequencies of each combination of gender and preference with the expected frequencies based on the marginal totals.

3. Test of homogeneity: This test is used to determine whether two or more populations have the same distribution of a categorical variable. For example, we can use a test of homogeneity to check if the preference for a certain brand of soda is the same across different age groups by comparing the observed frequencies of each combination of age group and soda preference with the expected frequencies based on the overall proportion of each preference.

All three types of chi-square tests use the same formula to calculate the test statistic, which is:

$$\chi^2 = \sum \frac{(O-E)^2}{E}$$

Where O is the observed frequency and E is the expected frequency. The expected frequency is calculated differently depending on the type of test. The test statistic follows a chi-square distribution with a certain degree of freedom, which is also determined by the type of test. The p-value of the test is the probability of obtaining a test statistic as extreme or more extreme than the observed one, assuming the null hypothesis is true. The null hypothesis for each type of test is:

- Goodness-of-fit test: The sample data fits the specified distribution or proportion.

- Test of independence: The two categorical variables are independent.

- Test of homogeneity: The populations have the same distribution of the categorical variable.

The alternative hypothesis for each type of test is the negation of the null hypothesis. To perform a chi-square test, we need to set a significance level, which is the maximum probability of making a type I error (rejecting the null hypothesis when it is true). A common choice for the significance level is 0.05. If the p-value of the test is less than or equal to the significance level, we reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis. If the p-value of the test is greater than the significance level, we fail to reject the null hypothesis and conclude that there is not enough evidence to support the alternative hypothesis.
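The p-value decision rule can be sketched without any external library. The helper below is an assumption-laden illustration, not a library API: `chi2_sf` and `decide` are hypothetical names, and the function handles only whole-number degrees of freedom by starting from the closed forms for 1 and 2 degrees of freedom and stepping up two at a time. In practice a statistics package would normally supply this probability.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        k = 1
        q = math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k = 2
        q = math.exp(-half)  # closed form for df = 2
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


def decide(statistic, df, alpha=0.05):
    """Return the p-value and the decision at significance level alpha."""
    p = chi2_sf(statistic, df)
    return p, ("reject H0" if p <= alpha else "fail to reject H0")


p_value, decision = decide(4.0, 3)
print(round(p_value, 4), decision)
```

A quick sanity check is that the familiar table values 3.841 (1 degree of freedom) and 7.815 (3 degrees of freedom) both give a tail probability close to 0.05.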

3. Assumptions and Conditions for Chi-Square Tests

Before applying the chi-square test to compare observed and expected frequencies, it is important to check some assumptions and conditions that ensure the validity and reliability of the test. These are:

1. Independence: The observations or counts in each category must be independent of each other. This means that the outcome of one observation does not affect the outcome of another observation. For example, if we are testing the hypothesis that the proportion of male and female students in a class is equal, we must assume that the gender of each student is independent of the gender of any other student in the class.

2. Randomness: The observations or counts in each category must be obtained from a random sample or a randomized experiment. This means that the sample or the experiment is representative of the population or the phenomenon of interest and is not biased by any external factors. For example, if we are testing the hypothesis that the proportion of defective and non-defective products in a factory is equal, we must ensure that the sample of products we test is randomly selected from the entire production line and not influenced by any quality control measures or human intervention.

3. Sample size: The observations or counts in each category must be large enough to ensure that the chi-square distribution is a good approximation of the sampling distribution of the test statistic. A common rule of thumb is that the expected frequency in each category should be at least 5. For example, if we are testing the hypothesis that the distribution of blood types in a population is the same as the distribution of blood types in a reference population, we must ensure that the sample size is large enough to have at least 5 expected counts for each blood type.

If these assumptions and conditions are met, we can proceed to calculate the chi-square test statistic and compare it with the critical value or the p-value to draw a conclusion about the hypothesis. However, if any of these assumptions and conditions are violated, we may need to use a different test or adjust the test to account for the violation. For example, if the expected frequency in some categories is less than 5, we may need to combine some categories or use a different test such as Fisher's exact test.
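The sample-size rule of thumb is easy to automate. The sketch below uses a hypothetical helper, `small_expected_cells`, that lists every expected count below a chosen minimum; the sample size of 60 is an invented illustration, combined with the blood-type percentages 40, 10, 5, and 45.

```python
def small_expected_cells(expected, minimum=5):
    """Return (row, column, value) for every expected count below `minimum`."""
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < minimum
    ]


# Hypothetical sample of 60 people with population shares 40%, 10%, 5%, 45%.
n = 60
shares = [0.40, 0.10, 0.05, 0.45]
expected = [[n * s for s in shares]]  # a single row of expected counts

flagged = small_expected_cells(expected)
print(flagged)  # cells that may call for merged categories or an exact test
```

With these assumed numbers only the rarest category falls below 5, which is the situation where combining categories or switching to an exact test is worth considering.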

4. Steps and Examples

One of the most common applications of the chi-square test is to compare the observed and expected frequencies of categorical data. This can help us to determine whether there is a significant difference between the actual distribution of the data and the expected distribution under a certain hypothesis. For example, we can use the chi-square test to see if the gender ratio of a sample is different from the population, or if the preferences of different groups are independent of each other.

To perform a chi-square test, we need to follow these steps:

1. Define the null and alternative hypotheses. The null hypothesis is usually that there is no difference between the observed and expected frequencies, or that the variables are independent of each other. The alternative hypothesis is the opposite of the null hypothesis.

2. Calculate the expected frequencies. The expected frequencies are based on the assumption that the null hypothesis is true. They can be calculated by multiplying the row and column totals of the contingency table and dividing by the grand total. For example, if we have a 2x2 table with row totals of 60 and 40, and column totals of 50 and 50, the expected frequency for the top left cell is (60 x 50) / 100 = 30.

3. Calculate the chi-square statistic. The chi-square statistic is the sum of the squared differences between the observed and expected frequencies, divided by the expected frequencies. The formula is:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where O is the observed frequency and E is the expected frequency. For example, if the observed frequency for the top left cell is 35, that cell contributes (35 - 30)^2 / 30 = 0.833 to the chi-square statistic. The contributions of all cells are then added together.

4. Determine the degrees of freedom. The degrees of freedom are the number of categories that can vary independently. They can be calculated by subtracting one from the number of rows and columns of the contingency table, and multiplying them together. For example, if we have a 2x2 table, the degrees of freedom are (2 - 1) x (2 - 1) = 1.

5. Find the p-value. The p-value is the probability of obtaining a chi-square statistic as extreme or more extreme than the one we calculated, assuming that the null hypothesis is true. We can find the p-value by using a chi-square distribution table or a calculator. In a table, we look for the row that matches our degrees of freedom and the column whose value is closest to our chi-square statistic; a calculator or software gives the exact value. For example, a chi-square statistic of 0.833 with 1 degree of freedom gives a p-value of about 0.361.

6. Draw a conclusion. We need to compare the p-value with a significance level, usually 0.05 or 0.01. If the p-value is less than or equal to the significance level, we reject the null hypothesis and conclude that there is a significant difference between the observed and expected frequencies, or that the variables are not independent of each other. If the p-value is greater than the significance level, we fail to reject the null hypothesis: the data do not provide enough evidence of a difference between the observed and expected frequencies, or of an association between the variables.

Let's see an example of how to apply these steps. Suppose we want to test whether the gender of a person affects their preference for chocolate or vanilla ice cream. We have a sample of 100 people, and we ask them to choose their favorite flavor. The results are shown in the table below:

| | Chocolate | Vanilla | Total |
|---|---|---|---|
| Male | 25 | 15 | 40 |
| Female | 15 | 45 | 60 |
| Total | 40 | 60 | 100 |

We can perform a chi-square test to see if the gender and the flavor preference are independent of each other. Here are the steps:

1. Define the null and alternative hypotheses. The null hypothesis is that the gender and the flavor preference are independent of each other, meaning that the proportion of males and females who prefer chocolate or vanilla is the same. The alternative hypothesis is that the gender and the flavor preference are not independent of each other, meaning that the proportion of males and females who prefer chocolate or vanilla is different.

2. Calculate the expected frequencies. Based on the null hypothesis, the expected frequencies are calculated by multiplying the row and column totals and dividing by the grand total. The table below shows the expected frequencies for each cell:

| | Chocolate | Vanilla | Total |
|---|---|---|---|
| Male | 16 | 24 | 40 |
| Female | 24 | 36 | 60 |
| Total | 40 | 60 | 100 |

3. Calculate the chi-square statistic. The chi-square statistic is the sum of the squared differences between the observed and expected frequencies, divided by the expected frequencies. The table below shows the contribution of each cell, (O - E)^2 / E, together with row and column sums:

| | Chocolate | Vanilla | Total |
|---|---|---|---|
| Male | 5.0625 | 3.375 | 8.4375 |
| Female | 3.375 | 2.25 | 5.625 |
| Total | 8.4375 | 5.625 | 14.0625 |

The chi-square statistic for the whole table is the sum of the four cell contributions, 5.0625 + 3.375 + 3.375 + 2.25 = 14.0625, or about 14.06.

4. Determine the degrees of freedom. The degrees of freedom are the number of categories that can vary independently. Since we have a 2x2 table, the degrees of freedom are (2 - 1) x (2 - 1) = 1.

5. Find the p-value. The p-value is the probability of obtaining a chi-square statistic as extreme or more extreme than the one we calculated, assuming that the null hypothesis is true. We can use a chi-square distribution table or a calculator to find the p-value. For 1 degree of freedom and a chi-square statistic of 14.062, the p-value is less than 0.001.

6. Draw a conclusion. We need to compare the p-value with a significance level, usually 0.05 or 0.01. Since the p-value is less than 0.001, which is much smaller than 0.05 or 0.01, we reject the null hypothesis and conclude that gender and flavor preference are not independent of each other. In this sample, the proportion of males who prefer chocolate (25 of 40) differs from the proportion of females who prefer chocolate (15 of 60). Note that the test establishes an association, not a causal effect of gender on preference.
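The six steps for the ice cream table can be reproduced in plain Python. This is a sketch under stated assumptions: the variable names are illustrative, and the p-value line uses a closed form that is valid only for 1 degree of freedom, which is all a 2x2 table needs.

```python
import math

# Observed counts: rows are Male, Female; columns are Chocolate, Vanilla.
observed = [[25, 15],
            [15, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability for df = 1 only: P(X >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(expected)
print(chi_square, df, p_value < 0.001)
```

No continuity correction is applied here, so the statistic matches the hand calculation in the steps above.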

5. Chi-Square Statistic, P-Value, and Effect Size

After conducting a chi-square test, you may wonder what the results mean and how to report them. In this section, we will explain how to interpret the three main components of a chi-square test output: the chi-square statistic, the p-value, and the effect size. We will also provide some examples to illustrate these concepts.

- The chi-square statistic is a measure of how much the observed frequencies deviate from the expected frequencies under the null hypothesis. The larger the chi-square statistic, the more evidence there is against the null hypothesis. The chi-square statistic follows a chi-square distribution with a certain number of degrees of freedom, which depends on the number of categories and the type of chi-square test. For example, if you perform a chi-square test of independence on a 2x2 contingency table, the degrees of freedom are equal to (2-1)x(2-1) = 1. You can use a chi-square table or a calculator to find the critical value of the chi-square statistic for a given level of significance (usually 0.05 or 0.01) and degrees of freedom. If the chi-square statistic is greater than or equal to the critical value, you can reject the null hypothesis and conclude that there is a significant association between the variables.

- The p-value is the probability of obtaining a chi-square statistic as extreme or more extreme than the one observed, assuming that the null hypothesis is true. The smaller the p-value, the more evidence there is against the null hypothesis. You can use a p-value calculator or a software program to find the exact p-value of the chi-square statistic. If the p-value is less than or equal to the level of significance (usually 0.05 or 0.01), you can reject the null hypothesis and conclude that there is a significant association between the variables.

- The effect size is a measure of how strong the association between the variables is. Unlike the p-value, it is not inflated simply by collecting more data, and it does not depend on the level of significance. There are different ways to calculate the effect size for a chi-square test, depending on the type of chi-square test and the number of categories. One of the most common effect size measures is Cramér's V, which ranges from 0 to 1, where 0 indicates no association and 1 indicates a perfect association. Cramér's V is the square root of the chi-square statistic divided by the product of the sample size and the smaller of the number of rows minus one and the number of columns minus one: $V = \sqrt{\frac{\chi^2}{n \cdot \min(r-1,\, c-1)}}$. For example, if you perform a chi-square test of independence on a 2x2 contingency table with a sample size of 100 and a chi-square statistic of 25, Cramér's V is equal to $\sqrt{\frac{25}{100 \times (2-1)}} = 0.5$. You can use a table or a rule of thumb to interpret the value of Cramér's V. For example, a commonly cited convention for 2x2 tables treats a value of about 0.1 as a weak association, about 0.3 as a moderate association, and 0.5 or more as a strong association.
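A short function makes the Cramér's V calculation concrete. The sketch below assumes the standard definition, in which the divisor min(rows - 1, columns - 1) sits inside the square root together with the sample size; for a 2x2 table that divisor is 1, so its placement does not change the result. The function name is illustrative.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


# Example from the text: chi-square of 25, sample size 100, 2x2 table.
v = cramers_v(25, 100, 2, 2)
print(v)
```

For larger tables the placement matters: with the same statistic and sample size, a 3x3 table gives the square root of 25 / 200 rather than 0.5 divided by 2.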

To illustrate these concepts, let's look at an example. Suppose you want to test whether there is a significant association between gender and preference for chocolate or vanilla ice cream. You collect data from 50 randomly selected students and obtain the following contingency table:

| | Chocolate | Vanilla | Total |
|---|---|---|---|
| Male | 12 | 8 | 20 |
| Female | 18 | 12 | 30 |
| Total | 30 | 20 | 50 |

You perform a chi-square test of independence and obtain the following results:

- The chi-square statistic is equal to 0, because every observed count equals its expected count (for example, the expected count for males who prefer chocolate is 20 x 30 / 50 = 12).

- The degrees of freedom are equal to (2-1)x(2-1) = 1.

- The p-value is equal to 1.

- The effect size (Cramér's V) is equal to $\sqrt{\frac{0}{50 \times (2-1)}} = 0$.

To interpret these results, you can compare the chi-square statistic and the p-value to the critical value and the level of significance, respectively. Using a chi-square table or a calculator, you can find that the critical value of the chi-square statistic for a level of significance of 0.05 and 1 degree of freedom is equal to 3.84. Since the chi-square statistic (0) is less than the critical value (3.84), you cannot reject the null hypothesis, and you conclude that there is no significant association between gender and ice cream preference. Alternatively, you can compare the p-value (1) to the level of significance (0.05). Since the p-value is greater than the level of significance, you reach the same conclusion.

To interpret the effect size, you can use a table or a rule of thumb to evaluate the strength of the association. A Cramér's V of 0 indicates no association between gender and ice cream preference in this sample: males and females chose chocolate in exactly the same proportion (12 of 20 and 18 of 30, or 60% each).

6. APA Style and Tables

After conducting a chi-square test, you need to report the results in a clear and concise manner. The APA style provides guidelines for formatting and presenting statistical results in academic papers. There are two main components of reporting a chi-square test: the text and the table.

- In the text, you should include the following information:

1. The type of chi-square test you performed (goodness-of-fit, test of independence, or test of homogeneity).

2. The degrees of freedom (df), which is the number of categories minus one for a goodness-of-fit test, or the product of the number of rows minus one and the number of columns minus one for a test of independence or homogeneity.

3. The chi-square value ($\chi^2$), which is the sum of the squared differences between the observed and expected frequencies divided by the expected frequencies.

4. The p-value, which is the probability of obtaining a chi-square value equal to or more extreme than the one observed, assuming the null hypothesis is true.

5. The effect size, which is a measure of the strength of the relationship between the variables. One common effect size for chi-square tests is Cramér's V, which ranges from 0 to 1, with higher values indicating stronger associations.

6. The interpretation of the results, which is based on the p-value and the effect size. You should state whether the results are statistically significant at your chosen level (for example, p < 0.05) or not (p ≥ 0.05), and whether the association between the variables is weak, moderate, or strong.

- In the table, you should display the following information:

1. The observed frequencies, which are the actual counts of the data in each category or cell.

2. The expected frequencies, which are the theoretical counts of the data in each category or cell, based on the null hypothesis.

3. The residuals, which are the differences between the observed and expected frequencies.

4. The standardized residuals, which are the residuals divided by the square root of the expected frequencies. These values indicate how far each cell deviates from the expected value, with larger absolute values indicating greater discrepancies.

5. The table title, which should describe the variables and the type of chi-square test.

6. The table number, which should be in Arabic numerals and follow the order of appearance in the text.

7. The table notes, which should provide any additional information or explanations about the table, such as the source of the data, the symbols used, or the definitions of the variables.

- For example, suppose you conducted a chi-square test of independence to examine the relationship between gender and preference for chocolate or vanilla ice cream among 100 students. You obtained the following results:

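Observed frequencies:

| Gender | Chocolate | Vanilla | Total |
|---|---|---|---|
| Male | 30 | 20 | 50 |
| Female | 25 | 25 | 50 |
| Total | 55 | 45 | 100 |

Expected frequencies:

| Gender | Chocolate | Vanilla |
|---|---|---|
| Male | 27.5 | 22.5 |
| Female | 27.5 | 22.5 |

Residuals (observed minus expected):

| Gender | Chocolate | Vanilla |
|---|---|---|
| Male | 2.5 | -2.5 |
| Female | -2.5 | 2.5 |

Standardized residuals (residual divided by the square root of the expected frequency):

| Gender | Chocolate | Vanilla |
|---|---|---|
| Male | 0.48 | -0.53 |
| Female | -0.48 | 0.53 |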

You could report the results as follows:

- Text: A chi-square test of independence was performed to examine the relationship between gender and preference for chocolate or vanilla ice cream. The results showed that there was no significant association between the two variables, $\chi^2(1) = 1.01$, p = 0.31, V = 0.10. The effect size was small, indicating a weak relationship between gender and ice cream preference.

- Table:

Table 1. Observed and Expected Frequencies, Residuals, and Standardized Residuals for Gender and Ice Cream Preference

| Gender | Measure | Chocolate | Vanilla |
|---|---|---|---|
| Male | Observed (expected) | 30 (27.5) | 20 (22.5) |
| | Residual | 2.5 | -2.5 |
| | Standardized residual | *0.48* | *-0.53* |
| Female | Observed (expected) | 25 (27.5) | 25 (22.5) |
| | Residual | -2.5 | 2.5 |
| | Standardized residual | *-0.48* | *0.53* |

Note. Values in parentheses are expected frequencies. Values in italics are standardized residuals.
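The quantities that go into the text and the table can be computed directly from the observed counts. The following sketch is one way to do it in plain Python; the variable names and the layout of the report string are illustrative choices, no continuity correction is applied, and the p-value formula holds only for 1 degree of freedom.

```python
import math

# Observed counts: rows are Male, Female; columns are Chocolate, Vanilla.
observed = [[30, 20],
            [25, 25]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

expected = [[r * c / n for c in col_totals] for r in row_totals]

# Residual = observed - expected; standardized residual = residual / sqrt(expected).
residuals = [[o - e for o, e in zip(o_row, e_row)]
             for o_row, e_row in zip(observed, expected)]
std_residuals = [[res / math.sqrt(e) for res, e in zip(r_row, e_row)]
                 for r_row, e_row in zip(residuals, expected)]

# The chi-square statistic is the sum of the squared standardized residuals.
chi_square = sum(z ** 2 for row in std_residuals for z in row)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.erfc(math.sqrt(chi_square / 2))  # df = 1 only
v = math.sqrt(chi_square / (n * min(len(observed) - 1, len(observed[0]) - 1)))

report = f"chi-square({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.2f}, V = {v:.2f}"
print(report)
```

Because the statistic equals the sum of the squared standardized residuals, the residual table and the reported chi-square value can be cross-checked against each other.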

7. Common Mistakes and Limitations of Chi-Square Tests

The chi-square test is a powerful and widely used statistical tool to compare observed and expected frequencies of categorical data. However, like any other statistical method, it has some common mistakes and limitations that need to be considered before applying it to real-world data. In this segment, we will discuss some of these pitfalls and how to avoid or overcome them.

Some of the common mistakes and limitations of chi-square tests are:

- 1. Assuming that the data are independent and identically distributed (i.i.d.). The chi-square test requires that the data are i.i.d., meaning that each observation is independent of the others and follows the same probability distribution. However, this assumption may not hold in some cases, such as when the data are collected from repeated measurements, clustered samples, or longitudinal studies. In such cases, the chi-square test may produce inaccurate or misleading results, as it does not account for the correlation or heterogeneity among the data. To address this issue, one may need to use alternative methods, such as generalized linear models, mixed-effects models, or multilevel models, that can handle dependent or non-identical data.

- 2. Ignoring the effect of sample size and cell counts. The chi-square test is sensitive to the sample size and the cell counts of the contingency table. A large sample size may lead to a significant chi-square value even if the difference between the observed and expected frequencies is small or negligible. On the other hand, a small sample size may result in a non-significant chi-square value even if the difference is large or meaningful. Similarly, low cell counts may cause the chi-square test to lose power or violate the validity conditions. A rule of thumb is that the expected frequency for each cell should be at least 5, and the total sample size should be at least 20. To deal with this problem, one may adjust the chi-square value for a 2x2 table using Yates's continuity correction, combine sparse categories, or use an exact method such as Fisher's exact test, which does not rely on the large-sample approximation. The G-test (likelihood-ratio test) is another alternative to the Pearson statistic, although it is also a large-sample method.

- 3. Misinterpreting the p-value and the effect size. The p-value of the chi-square test is the probability of observing a chi-square value as large as or larger than the one obtained from the data, assuming that the null hypothesis of no difference between the observed and expected frequencies is true. However, the p-value does not tell us how large or important the difference is, or whether it is practically or clinically significant. To measure the magnitude of the difference, one may calculate an effect size, such as the phi coefficient, Cramér's V, or the odds ratio, which quantifies the strength of the association between the categorical variables. Moreover, the p-value and the effect size may not point the same way: a small p-value may accompany a small effect size when the sample is large, and a large effect size may accompany a large p-value when the sample is small. Therefore, one should report and interpret both the p-value and the effect size, and consider the context and the research question when drawing conclusions from the chi-square test.

- 4. Overlooking the assumptions and the validity conditions. The chi-square test is based on some assumptions and validity conditions that need to be checked before applying it to the data. One of the main assumptions is that the data are randomly sampled from the population of interest, and that the population follows a multinomial distribution. Another assumption is that the categories of the variables are mutually exclusive and exhaustive, meaning that each observation belongs to one and only one category, and that all possible categories are included in the analysis. Furthermore, the validity conditions of the chi-square test depend on the type and the number of the variables involved. For example, if the test is used to compare the proportions of a single categorical variable across two or more groups, then the groups should be independent of each other, and the variable should have the same number of categories for each group. If the test is used to examine the relationship between two categorical variables, then the variables should be independent of any other variables that may affect their frequencies, and the contingency table should have at least two rows and two columns. If any of these assumptions or conditions are violated, the chi-square test may not be appropriate or reliable, and one may need to use other methods, such as the Cochran-Mantel-Haenszel test, the log-linear model, or the chi-square test of homogeneity, that can accommodate different types or structures of the data.
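To show what the continuity correction mentioned in item 2 does, the sketch below computes the chi-square statistic for a 2x2 table with and without Yates's correction, which subtracts 0.5 from each absolute difference before squaring. The function name and its `yates` flag are illustrative, and the counts (30, 20, 25, 25) are reused from the reporting example in the previous section.

```python
def chi_square_2x2(observed, yates=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates's correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            e = r * c / n
            diff = abs(observed[i][j] - e)
            if yates:
                # Shrink each |O - E| by 0.5, but never below zero.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / e
    return statistic


table = [[30, 20], [25, 25]]
plain = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(round(plain, 3), round(corrected, 3))
```

The corrected statistic is always the smaller of the two, so the correction makes the test more conservative. Some software applies it to 2x2 tables by default, which is worth checking when a reported value differs from a hand calculation.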

These are some of the common mistakes and limitations of chi-square tests that one should be aware of and avoid when conducting or interpreting this type of analysis. By following these guidelines, one can ensure that the chi-square test is used correctly and effectively to compare observed and expected frequencies of categorical data.

8. Fisher's Exact Test, Cochran-Mantel-Haenszel Test, and Log-Linear Models

The chi-square test is a widely used method for testing the association between two categorical variables. However, it has some limitations and assumptions that may not always be met in practice. For example, the chi-square test requires that the expected frequencies in each cell of the contingency table are at least 5, and that the sample size is large enough to ensure the validity of the chi-square approximation. Moreover, the chi-square test only measures the overall association between the variables, but does not provide any information about the nature or direction of the relationship. Therefore, depending on the research question and the data characteristics, some alternatives and extensions of the chi-square test may be more appropriate or informative. In this section, we will discuss three of them: Fisher's exact test, Cochran-Mantel-Haenszel test, and log-linear models.

- Fisher's exact test is a non-parametric test that can be used when the expected frequencies are too small for the chi-square test. It is based on the hypergeometric distribution, which models the probability of obtaining a given contingency table under the null hypothesis of no association. The test calculates the exact p-value by summing the probabilities of all tables that are as extreme or more extreme than the observed one. Fisher's exact test is more conservative than the chi-square test, meaning that it is less likely to reject the null hypothesis when it is true. However, it is also more computationally intensive, especially for large tables. An example of using Fisher's exact test is to test the association between gender and color preference in a sample of 20 people, where the expected frequencies are less than 5.

- The Cochran-Mantel-Haenszel test is an extension of the chi-square test that can be used when the data are stratified by a third variable. It is a way of controlling for the confounding effect of the stratifying variable on the association between the two main variables. The test calculates a summary chi-square statistic that combines the information from all strata, and tests whether the two main variables are associated after adjusting for the stratifying variable. It assumes that the association is roughly the same in every stratum; whether that holds is a separate question, usually checked with a homogeneity test such as the Breslow-Day test. The method also provides an estimate of the common odds ratio, which measures the strength of the association. An example of using the Cochran-Mantel-Haenszel test is to test the association between smoking and lung cancer in a sample of 1000 people, stratified by age group.

- Log-linear models are a generalization of the chi-square test that can be used to model the relationship between two or more categorical variables. They are based on the assumption that the cell counts in the contingency table follow a Poisson distribution, and that the logarithm of the expected counts is a linear function of the variables and their interactions. Log-linear models can test various hypotheses about the association, such as independence, homogeneity, symmetry, or higher-order interactions. They can also estimate the parameters of the model, such as the main effects and the interaction effects, and provide confidence intervals and goodness-of-fit measures. An example of using log-linear models is to analyze the association between gender, education, and income category in a sample of 500 people, and to examine how each variable and their interactions affect the expected cell counts. Note that all variables are treated symmetrically: none of them plays the role of a response.
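To make the first two alternatives more concrete, here is a minimal Python sketch using only the standard library. The function names `fisher_exact_2x2` and `mantel_haenszel` and all counts are hypothetical and chosen for illustration. The two-sided Fisher p-value assumes the common convention of summing every table that is no more probable than the observed one, and the Cochran-Mantel-Haenszel statistic is computed without a continuity correction. Statistical packages may make different choices, so treat this as a sketch rather than a reference implementation.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more probable than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mantel_haenszel(strata):
    """Return (CMH statistic, common odds ratio) for a list of 2x2 strata.

    Each stratum is a tuple (a, b, c, d) for the table [[a, b], [c, d]].
    Assumes every stratum has more than one observation and that at least
    one stratum has b * c > 0. No continuity correction is applied.
    """
    diff = var = num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        r1, r2, c1, c2 = a + b, c + d, a + c, b + d
        diff += a - r1 * c1 / n                      # observed minus expected
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))  # hypergeometric variance
        num += a * d / n
        den += b * c / n
    return diff * diff / var, num / den


# Hypothetical 2x2 table: gender by color preference, 20 people.
p_value = fisher_exact_2x2(8, 2, 1, 9)
print(f"Fisher exact p-value: {p_value:.4f}")

# Hypothetical strata: exposure by outcome within two age groups.
cmh_stat, common_or = mantel_haenszel([(10, 20, 20, 10), (12, 18, 22, 8)])
print(f"CMH statistic: {cmh_stat:.3f}, common odds ratio: {common_or:.3f}")
```

Under the null hypothesis, the CMH statistic is compared with a chi-square distribution with 1 degree of freedom. A common odds ratio far from 1 suggests an association that persists after adjusting for the strata.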

9. Summary and Key Takeaways

In this article, we have explored the chi-square test, a statistical method that compares the observed and expected frequencies of categorical data. The chi-square test can be used to test various hypotheses, such as whether two variables are independent, whether a variable follows a certain distribution, or whether there is a difference between two or more groups. We have learned how to:

- Calculate the chi-square statistic, which measures the discrepancy between the observed and expected frequencies.

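- Calculate the degrees of freedom, which reflect the number of categories and constraints in the data: $$k - 1$$ for a goodness-of-fit test with $$k$$ categories, and $$(r - 1)(c - 1)$$ for a test of independence on a table with $$r$$ rows and $$c$$ columns.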

- Find the p-value, which indicates the probability of obtaining a chi-square statistic as extreme or more extreme than the one observed, assuming the null hypothesis is true.

- Interpret the results of the chi-square test, based on the p-value and the significance level.

To illustrate the application of the chi-square test, we have used two examples: one involving a survey of ice cream preferences, and another involving a genetic cross of pea plants. In both cases, we have followed these steps:

1. State the null and alternative hypotheses.

2. Define the categories and the expected frequencies.

3. Collect the data and calculate the observed frequencies.

4. Compute the chi-square statistic and the degrees of freedom.

5. Find the p-value using a chi-square table or a calculator.

6. Compare the p-value with the significance level and draw a conclusion.
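The six steps above can be sketched in a few lines of Python. This is a minimal, standard-library illustration rather than the exact calculation used earlier in the article. The counts are illustrative figures for a dihybrid pea cross with an assumed 9:3:3:1 expected ratio, and the helper names `chi2_sf` and `goodness_of_fit` are hypothetical. The p-value is obtained from a standard recurrence for the regularized upper incomplete gamma function, which holds for integer degrees of freedom.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-y)        # Q(1, y)
    else:
        a, q = 0.5, erfc(sqrt(y))  # Q(1/2, y)
    while a < df / 2.0:
        # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
        q += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, expected_props):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    total = sum(observed)
    expected = [p * total for p in expected_props]             # step 2
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # step 4
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)                          # step 5


# Step 1: H0 says the four phenotypes occur in a 9:3:3:1 ratio.
# Step 3: illustrative observed counts (assumed for this sketch).
observed = [315, 108, 101, 32]
ratio = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

stat, df, p_value = goodness_of_fit(observed, ratio)
alpha = 0.05
# Step 6: compare the p-value with the significance level.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}: {decision}")
```

With these counts the statistic is small relative to its degrees of freedom, so the data give no evidence against the assumed ratio.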

The chi-square test is a useful tool for analyzing categorical data and testing hypotheses. However, it also has some limitations and assumptions that should be considered, such as:

- The data should be in the form of frequencies, not proportions or percentages.

- The categories should be mutually exclusive and exhaustive.

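- As a rule of thumb, the expected frequencies should be at least 5 for each category; when they are smaller, an exact test such as Fisher's exact test is a safer choice.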

- The observations should be independent of each other.
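The expected-frequency condition is easy to check before running a test of independence. The sketch below computes the expected counts under independence for a contingency table of raw frequencies and lists the cells that fall below a threshold. The helper names `expected_counts` and `small_cells`, the threshold default, and the example table are all assumptions made for illustration.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def small_cells(table, minimum=5.0):
    """Return (row index, column index, expected count) for cells below minimum."""
    expected = expected_counts(table)
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < minimum
    ]


# Hypothetical 2x2 table of raw frequencies (not percentages), 20 people.
table = [[8, 2], [1, 9]]
flagged = small_cells(table)
for i, j, e in flagged:
    print(f"cell ({i}, {j}) has expected count {e:.1f}, below 5")
```

If any cells are flagged, the chi-square approximation may be unreliable for that table, and merging sparse categories or switching to an exact test is worth considering.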

By understanding the chi-square test and its assumptions, we can use it appropriately and confidently to answer various research questions.

