📊 Understanding the Chi‑Squared Test: A Complete Worked Example
The Chi‑Squared (χ²) test is a simple but powerful method for analysing categorical data. It helps determine whether the differences between groups in your data reflect a real association or could simply be due to chance.
🔍 What Does the Chi‑Squared Test Do?
The Chi‑Squared test compares:
- what you observed, and
- what you would expect if there were no relationship
The test statistic is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O = observed frequency and E = expected frequency.
📁 Scenario
You want to test whether DNA (Did Not Attend) rates are related to age group in an outpatient clinic.
📋 Observed Data
| Age Group | Attended | DNA | Total |
|---|---|---|---|
| Under 40 | 40 | 20 | 60 |
| 40 and over | 30 | 10 | 40 |
| Total | 70 | 30 | 100 |
1️⃣ Hypotheses
- Null hypothesis $H_0$: DNA is independent of age group
- Alternative hypothesis $H_1$: DNA is not independent of age group
2️⃣ Observed Matrix
$$O = \begin{bmatrix} 40 & 20 \\ 30 & 10 \end{bmatrix}$$
3️⃣ Expected Frequencies
Expected values use:
$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Calculations
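$$E_{\text{Under 40, Attended}} = \frac{60 \times 70}{100} = 42$$
$$E_{\text{Under 40, DNA}} = \frac{60 \times 30}{100} = 18$$
$$E_{\text{40 and over, Attended}} = \frac{40 \times 70}{100} = 28$$
$$E_{\text{40 and over, DNA}} = \frac{40 \times 30}{100} = 12$$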
📋 Expected Frequencies Table
| Age Group | Attended (E) | DNA (E) | Total |
|---|---|---|---|
| Under 40 | 42 | 18 | 60 |
| 40 and over | 28 | 12 | 40 |
| Total | 70 | 30 | 100 |
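The expected counts can also be reproduced programmatically. The following is a minimal sketch in plain Python; the function name `expected_frequencies` and the list-of-rows layout are illustrative assumptions, not part of the original worked example.

```python
def expected_frequencies(observed):
    """Return expected counts under independence.

    `observed` is a list of rows of counts, e.g. [[40, 20], [30, 10]].
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Under 40, 40 and over. Columns: Attended, DNA.
observed = [[40, 20], [30, 10]]
expected = expected_frequencies(observed)
print(expected)  # [[42.0, 18.0], [28.0, 12.0]]
```

Note that the expected table keeps the same row and column totals as the observed table; only the cell values are redistributed.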
4️⃣ Chi‑Squared Calculation
Formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Cell contributions
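$$\frac{(40 - 42)^2}{42} = \frac{4}{42} \approx 0.0952$$
$$\frac{(20 - 18)^2}{18} = \frac{4}{18} \approx 0.2222$$
$$\frac{(30 - 28)^2}{28} = \frac{4}{28} \approx 0.1429$$
$$\frac{(10 - 12)^2}{12} = \frac{4}{12} \approx 0.3333$$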
Total Chi‑Squared
$$\chi^2 = 0.0952 + 0.2222 + 0.1429 + 0.3333 \approx 0.79$$
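As a quick check on the arithmetic, the cell contributions and their sum can be computed directly from the observed and expected tables. This is a sketch only; `chi_squared_statistic` is a hypothetical helper name chosen for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Return the per-cell contributions (O - E)^2 / E and their total."""
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    total = sum(sum(row) for row in contributions)
    return contributions, total


observed = [[40, 20], [30, 10]]
expected = [[42, 18], [28, 12]]
contributions, chi2 = chi_squared_statistic(observed, expected)

for row in contributions:
    print([round(x, 4) for x in row])
# [0.0952, 0.2222]
# [0.1429, 0.3333]
print(round(chi2, 2))  # 0.79
```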
5️⃣ Degrees of Freedom
For an $r \times c$ table:
$$df = (r - 1)(c - 1)$$
Here $r = 2$ and $c = 2$, so:
$$df = (2 - 1)(2 - 1) = 1$$
6️⃣ Decision
Critical value at $df = 1$, $\alpha = 0.05$:
$$\chi^2_{\text{critical}} = 3.84$$
$$\chi^2_{\text{calculated}} = 0.79$$
Since $0.79 < 3.84$, we fail to reject the null hypothesis.
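The same decision can be sketched in code. The example below uses only the Python standard library and adds a p-value, which the worked example does not report. It relies on the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal, so its upper-tail probability is `erfc(sqrt(x / 2))`. The helper names are hypothetical, and `p_value_df1` is only valid when `df` is 1.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability for a chi-squared statistic with df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2))


df = degrees_of_freedom(2, 2)
chi2 = 0.79            # statistic from the worked example
critical_value = 3.84  # tabulated critical value for df = 1, alpha = 0.05
alpha = 0.05

reject_null = chi2 > critical_value
p = p_value_df1(chi2)

print(df)           # 1
print(reject_null)  # False
print(p > alpha)    # True
```

Comparing the statistic with the critical value and comparing the p-value with $\alpha$ are two views of the same rule, so both lead to the same conclusion here.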
7️⃣ Interpretation
There is no statistically significant association between age group and DNA status.
Any differences could reasonably be due to chance.
🎯 Final Thoughts
This example shows the full workflow of a Chi‑Squared Test of Independence:
- Clean data table
- Observed vs expected values
- Chi‑Squared calculation
- Degrees of freedom
- Statistical decision