IB Maths AI SL Topic 4 — Hypothesis Testing Paper 2 Uniform · Binomial · Normal ~9 min read

Goodness of Fit Test

A χ² goodness of fit (GOF) test checks whether observed data fits a specific distribution — usually uniform, binomial, or normal. Same test statistic as the independence test, but instead of comparing two variables, you compare the data to a fixed model. Most of the work is computing the expected frequencies from the proposed distribution; the GDC handles everything afterwards.

📘 What you need to know

Idea: do the observed frequencies match what the proposed distribution predicts? If the gap is too big ⇒ reject H₀.
H₀: data follows the proposed distribution. H₁: data does NOT follow it.
Expected frequency = (probability under the model) × (total sample size N).
Uniform: E = N/k for every one of k outcomes.
Binomial / Normal: compute P(X = x) or P(a < X < b) on the GDC, then multiply by N.
Degrees of freedom: ν = k − 1, where k is the number of categories. Decision rule identical to the independence test.

Building the expected frequencies

The expected count for each category is what the proposed distribution would predict, scaled up to your sample size N:

• Uniform: every outcome has probability 1/k, so E = N/k for all categories.
• Binomial B(n, p): use the GDC to find P(X = 0), P(X = 1), … then multiply each by N.
• Normal N(μ, σ²): use the GDC’s normal-cdf to find P(a < X < b) for each class, then multiply by N. Watch out for unbounded end classes.

Figure (bar chart, die rolled 120 times): for each die face, a teal observed bar sits next to an orange expected bar (= 20 if the die were fair). The bars are close on average, so χ² is small (2.9) and the p-value is large (0.715), which is consistent with a uniform distribution.

The χ² goodness of fit statistic: χ²_calc = Σ (O_i − E_i)² / E_i, where O_i is the observed frequency and E_i the expected frequency of category i.

degrees of freedom: ν = k − 1 (k = number of categories)
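
To see the formula in action, here is a minimal Python sketch that applies it to the die data from WE 5 (120 rolls, expected 20 per face). Python is not part of the course and the function name chi2_statistic is made up for illustration; on the exam the GDC does this step.

```python
def chi2_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [25, 18, 22, 20, 15, 20]   # die data from WE 5 (120 rolls)
expected = [120 / 6] * 6              # fair die: 20 per face

chi2_calc = chi2_statistic(observed, expected)
print(round(chi2_calc, 2))  # 2.9
```

Only faces 1 and 5 contribute much (25/20 each); faces 4 and 6 match the expected count exactly and contribute nothing.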

🧭 Recipe — full GOF test

State H₀ / H₁: data follows / does not follow the given distribution. Name the distribution clearly (e.g. “B(4, 0.5)” or “N(150, 20²)”).
Compute the expected frequencies: probability under the model × N.
Find ν = k − 1.
GDC: enter observed and expected lists, run the χ² GOF test — returns χ²_calc and p.
Compare and conclude in context: reject H₀ if χ²_calc > critical value (or p < α).

Expected frequencies sum to N: always check. For normal-distribution problems with unbounded ends, the four (or however many) probabilities must add to 1, so the expected frequencies must add to N.

Worked examples

WE 1

State hypotheses for a uniform GOF

A player suspects a six-sided die is biased. He rolls it many times and records the score on each roll. He plans to use a χ² test to decide whether the die is fair.

Write the null and alternative hypotheses.

“Fair die” means every face is equally likely.
H₀: the score follows a uniform distribution (each face has P = 1/6).
H₁: the score does NOT follow a uniform distribution.
Note: name the distribution explicitly, in context. “Uniform” is the technical word for “all outcomes equally likely”. H₀ is always specific; H₁ is just its negation.

WE 2

Expected frequencies for a uniform distribution

A spinner has five equally-sized coloured sections. A student spins it 60 times.

(a) Find the expected frequency of each colour, assuming the spinner is fair. (b) Write down the degrees of freedom for the test.

(a) Uniform → each outcome has the same expected frequency.
E = N / k = 60 / 5 = 12
Expected = 12 per colour.
(b) ν = k − 1 = 5 − 1 = 4
ν = 4
Note: all expected frequencies should be ≥ 5 for the test to be valid in the IA. Here 12 ≥ 5, so this is fine.

WE 3

Expected frequency for a binomial GOF

It is claimed that the number of successful free throws made by basketball players in 4 attempts follows the distribution B(4, 0.5). A sample of 160 players is studied.

Find the expected number of players who make exactly 2 successful free throws.

Step 1: find P(X = 2) using B(4, 0.5).
P(X = 2) = C(4, 2) × 0.5² × 0.5² = 6 × 0.0625 = 0.375
Step 2: multiply by N.
Expected = 160 × 0.375 = 60
Expected = 60 players.
Note: use the GDC’s binomial PDF directly: binomPdf(4, 0.5, 2) = 0.375. Round only the FINAL expected frequency.
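
A full binomial GOF test needs the expected frequency of every outcome, not just X = 2. The sketch below repeats the WE 3 calculation for x = 0, 1, …, 4, assuming Python's standard math.comb for C(n, x); the helper name binomial_expected is hypothetical.

```python
from math import comb

def binomial_expected(n, p, total):
    """Expected frequency of each outcome x = 0..n under B(n, p) for a sample of size total."""
    return [total * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

expected = binomial_expected(4, 0.5, 160)
print(expected)        # [10.0, 40.0, 60.0, 40.0, 10.0]
print(sum(expected))   # 160.0
```

The middle entry is the 60 players found above, and the five values sum to N = 160 as they should.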

WE 4

Expected frequencies for a normal GOF

The masses (g) of 100 fruits are claimed to follow the distribution N(150, 20²). The data is grouped into four classes.

Find the expected frequency for each class:

mass	m < 130	130 ≤ m < 150	150 ≤ m < 170	m ≥ 170

Use normal-cdf on the GDC with μ = 150, σ = 20.
Class 1: P(m < 130) = P(Z < −1) = 0.1587 → E = 100 × 0.1587 = 15.87
Class 2: P(130 ≤ m < 150) = 0.3413 → E = 34.13
Class 3: P(150 ≤ m < 170) = 0.3413 → E = 34.13
Class 4: P(m ≥ 170) = 0.1587 → E = 15.87
Check sum = N: 15.87 + 34.13 + 34.13 + 15.87 = 100 ✓
Expected frequencies: 15.87 · 34.13 · 34.13 · 15.87
Note: end classes (< or ≥) use UNBOUNDED inputs on the GDC: −∞ and +∞ (or use very large numbers). Always check the four E’s sum to N.
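
The same class-by-class working can be sketched in Python with the standard library's statistics.NormalDist. The helper name normal_expected is an assumption for illustration. Setting the cumulative probability to 0 at the far left and 1 at the far right plays the role of the −∞ and +∞ bounds on the GDC.

```python
from statistics import NormalDist

def normal_expected(mu, sigma, boundaries, total):
    """Expected frequency per class; the first and last classes are unbounded."""
    model = NormalDist(mu, sigma)
    # Cumulative probability at each boundary, padded with 0 (for -inf) and 1 (for +inf)
    cdf_values = [0.0] + [model.cdf(b) for b in boundaries] + [1.0]
    return [total * (hi - lo) for lo, hi in zip(cdf_values, cdf_values[1:])]

expected = normal_expected(150, 20, [130, 150, 170], 100)
print([round(e, 2) for e in expected])  # [15.87, 34.13, 34.13, 15.87]
```

Because the padded list starts at 0 and ends at 1, the class probabilities are guaranteed to add to 1, so the expected frequencies add to N.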

WE 5

Full uniform GOF test using the p-value

The die from WE 1 is rolled 120 times with the following results:

face	1	2	3	4	5	6
obs.	25	18	22	20	15	20

A χ² GOF test is performed at the 5% significance level. The GDC gives χ²_calc ≈ 2.9 and p ≈ 0.715. State the conclusion.

Expected: 120/6 = 20 per face.
ν = 6 − 1 = 5
Compare p with α: p = 0.715, α = 0.05
0.715 > 0.05 → do not reject H₀
Insufficient evidence that the die is biased.
Note: small variations (25, 15) on individual faces are expected by chance. The χ² test asks whether the OVERALL pattern is too far from uniform, and here it isn’t.
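
The p-value the GDC reports is the probability that a χ² variable with ν degrees of freedom exceeds χ²_calc. For the curious, here is a sketch of that calculation using only Python's math module. It relies on the standard finite-sum formulas for the upper tail when ν is a whole number; the function name chi2_upper_tail is hypothetical, and none of this is needed for the exam.

```python
import math

def chi2_upper_tail(x, df):
    """P(X > x) for a chi-squared variable with positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^r / r! for r = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: a normal-tail part plus a finite sum of odd powers of sqrt(x)
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    tail = math.erfc(root / math.sqrt(2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

p_value = chi2_upper_tail(2.9, 5)   # chi2_calc and df from WE 5
alpha = 0.05
print(round(p_value, 3))   # 0.715
print(p_value < alpha)     # False, so H0 is not rejected
```

As a cross-check, chi2_upper_tail(7.815, 3) comes out at about 0.05, which is exactly what it means for 7.815 to be the 5% critical value when ν = 3.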

WE 6

Normal GOF test using the critical value — reject H₀

For the fruit mass example (WE 4), suppose a sample of 400 fruits gives:

mass	m < 130	130 ≤ m < 150	150 ≤ m < 170	m ≥ 170
obs.	25	130	145	100

Test whether mass follows N(150, 20²) at the 5% level. The critical value is 7.815 and the GDC gives χ²_calc ≈ 45.2.

Hypotheses:
H₀: mass follows N(150, 20²)
H₁: mass does NOT follow N(150, 20²)
Expected (for N = 400, scale WE 4 by 4): 63.48 · 136.52 · 136.52 · 63.48 (sum 400 ✓)
ν = 4 − 1 = 3
Compare: χ²_calc = 45.2 > critical value = 7.815 → reject H₀
Sufficient evidence that mass does NOT follow N(150, 20²).
Note: χ²_calc (45) is huge compared with the critical value (7.8). The observed shifts (25 vs 63.48 in the lowest class, 100 vs 63.48 in the highest class) drive the test statistic far above the threshold.
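
To see which classes are responsible for the large statistic, it can help to list each class's contribution (O − E)² / E separately. The short Python sketch below does this using the rounded expected frequencies from the working above and the critical value quoted in the question; the variable names are illustrative only.

```python
observed = [25, 130, 145, 100]
expected = [63.48, 136.52, 136.52, 63.48]   # WE 4 values scaled by 4
labels = ["m < 130", "130 ≤ m < 150", "150 ≤ m < 170", "m ≥ 170"]

# Contribution of each class to the chi-squared statistic
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_calc = sum(contributions)
critical_value = 7.815                      # 5% level, 3 degrees of freedom (given)

for label, c in zip(labels, contributions):
    print(f"{label:>14}: {c:6.2f}")
print(f"chi2_calc = {chi2_calc:.1f}")       # 45.2
print("reject H0" if chi2_calc > critical_value else "do not reject H0")
```

The two end classes contribute roughly 23 and 21, while the two middle classes contribute less than 1 each, so the mismatch is almost entirely in the tails.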

💡 Top tips

Expected frequencies sum to N: a fast sanity check before plugging into the GDC.
Uniform expected = N/k: one line, no GDC needed.
For binomial/normal: use GDC’s binomPDF / normalCDF to get each probability, then multiply by N.
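Don’t round probabilities early: carry 4–5 dp and round only at the final expected frequency.
Unbounded end classes: m < 130 means use −∞ as the lower bound (for a mass, 0 gives the same answer only when the model puts negligible probability below 0, as it does here); m ≥ 170 means use +∞ as the upper bound.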

⚠ Common mistakes

Wrong ν: for GOF, ν = k − 1, NOT (rows−1)(cols−1). That’s the independence test.
Forgetting to multiply by N: probabilities are not frequencies; expected freq = P × N.
Using midpoint of class instead of class-interval probability for the normal: use P(a < X < b) per class.
Probabilities don’t sum to 1: a sign you’ve missed an unbounded end. Re-check the class boundaries.
Conclusion without context: “accept H₀” is incomplete — quote the distribution and the variable.

Next up: The t-test. We swap categorical χ² for continuous data: compare the means of two samples (e.g. children vs adults solving a puzzle) using the pooled two-sample t-test. Same workflow — H₀/H₁, GDC, p-value — but now the test is about means rather than distributions.
