Key Idea: The chi-squared test of independence (χ²) tests whether two categorical variables are associated or independent. You set up a contingency table of observed frequencies, calculate the frequencies you would expect if the variables were independent, and compare the two. A small p-value (below the significance level α) is evidence that the variables are not independent: there is a statistically significant association.
Hypothesis test structure
GDC method
Example: 2×2 contingency table, Gender vs Preferred sport
Observed: Male/Football = 40, Male/Tennis = 20, Female/Football = 30, Female/Tennis = 50. Grand total = 140.
Row totals: Male = 60, Female = 80. Column totals: Football = 70, Tennis = 70.
Expected (Male, Football) = (60 × 70)/140 = 30. The other expected values are found the same way: (Male, Tennis) = 30, (Female, Football) = 40, (Female, Tennis) = 40.
GDC gives: χ² ≈ 11.67, p ≈ 0.0006 (df = 1). Since p < 0.05, reject H₀. There is evidence that gender and sport preference are associated.
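To see where the GDC output comes from, the short Python sketch below recomputes the test from the observed counts in the example. It is an illustration only, not the GDC's own routine: the function names are made up for this note, no continuity correction is applied, and the p-value shortcut using math.erfc is valid only for 1 degree of freedom (a 2×2 table).

```python
import math

# Observed counts from the example: rows = gender, columns = sport.
observed = [
    [40, 20],  # Male:   Football, Tennis
    [30, 50],  # Female: Football, Tennis
]

def expected_frequencies(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with no continuity correction."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(chi2):
    """Upper-tail p-value; this shortcut holds only for 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

expected = expected_frequencies(observed)
chi2 = chi_squared_statistic(observed)
p = p_value_df1(chi2)

print("Expected:", expected)
print("chi-squared:", round(chi2, 2))
print("p-value:", round(p, 4))
```

Each cell contributes (O − E)²/E, so the two Male cells add 100/30 each and the two Female cells add 100/40 each. In an exam the GDC does this for you; the sketch is only a way to check your understanding of the output.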
The conclusion must always be in context: name the two variables. 'Reject H₀' alone is not a full answer. The test only tells you whether there is evidence of an association; it does not say how strong the association is or in which direction it runs.
Paper 2 (GDC allowed): State both hypotheses in full before running the test. Afterwards, write down χ² and the p-value, compare the p-value with α, and state the conclusion using the variable names.
Check expected frequencies: After running the test on the GDC, view the expected matrix and verify that all values are ≥ 5. If not, state this as a limitation of the test.
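The two checks above (expected frequencies of at least 5, and a conclusion written in context) can be sketched in Python as follows. Everything here is illustrative: the helper names, the second table and the p-value passed to the conclusion function are hypothetical, and in the exam the p-value would be read from the GDC.

```python
def expected_frequencies(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def small_expected_cells(table, minimum=5):
    """List (row index, column index, expected) for cells with expected < minimum."""
    expected = expected_frequencies(table)
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < minimum
    ]

def conclusion(p_value, alpha, variable_a, variable_b):
    """Write the decision in context, naming both variables."""
    if p_value < alpha:
        return (f"p = {p_value} < {alpha}, so reject H0: there is evidence that "
                f"{variable_a} and {variable_b} are associated.")
    return (f"p = {p_value} >= {alpha}, so do not reject H0: there is insufficient "
            f"evidence that {variable_a} and {variable_b} are associated.")

# Table from the example: every expected value is 30 or 40, so the check passes.
example_table = [[40, 20], [30, 50]]
print(small_expected_cells(example_table))

# Hypothetical small sample: one expected value falls below 5.
small_table = [[3, 7], [12, 18]]
print(small_expected_cells(small_table))

# Hypothetical p-value, used only to show the wording of the conclusion.
print(conclusion(0.03, 0.05, "gender", "preferred sport"))
```

An empty list means every expected frequency is at least 5. Any cell that is listed should be mentioned as a limitation of the test when you write up your answer.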