IB Maths AA HL
Topic 4 — Statistics & Probability
Paper 1 & 2
The Binomial Distribution
A binomial random variable counts the number of successes in n independent trials, each with the same probability p of success. If the scenario has fixed trials, two outcomes, independence and constant p, you’re looking at X ~ B(n, p). The mean is np and the variance is np(1 − p) — both in the booklet.
📘 What you need to know
- Notation: X ~ B(n, p) — n trials, probability p of success on each.
- Four conditions: fixed n; trials are independent; exactly two outcomes per trial; p is constant.
- Probability formula: P(X = r) = nCr · p^r · (1 − p)^(n − r), usually computed on the GDC.
- Mean: E(X) = np.
- Variance: Var(X) = np(1 − p).
- Standard deviation: σ(X) = √[np(1 − p)].
- Sampling rule: with replacement → exactly binomial; without replacement → only ≈ binomial if the population is large.
- Cannot use binomial: number of trials not fixed, probability changes (fatigue, learning), or trials aren’t independent.
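As an illustration of the probability formula, here is a minimal Python sketch. The function name `binomial_pmf` and the values n = 10, p = 0.5 are assumptions chosen for the example, not part of the syllabus; in the exam the GDC does this calculation.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p) using nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Hypothetical example: 10 trials, each with p = 0.5
n, p = 10, 0.5

# P(X = 3) = 10C3 / 2^10 = 120 / 1024
prob_exactly_3 = binomial_pmf(3, n, p)

# The probabilities for r = 0, 1, ..., n should add up to 1
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))

print(prob_exactly_3)
print(total)
```

Summing over every possible r is a quick sanity check that the formula has been typed in correctly.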
The four conditions
Before writing X ~ B(n, p), check all four boxes. If any one fails, the binomial model is wrong.
| Condition | What it means | Spot it from |
|---|---|---|
| Fixed n | The number of trials is decided in advance and finite | “flipped 20 times”, “sample of 80” |
| Independent | One trial doesn’t affect the next | with replacement, large population |
| Two outcomes | Each trial is success / not-success | “yellow / not yellow”, “defective / not” |
| Constant p | The probability of success doesn’t change | fair coin, fixed defect rate |
“Success” is just a label. A defective bulb, a missed shot, a person without a vaccine — call any of them “success” if it’s the thing you’re counting.
Mean, variance and standard deviation
Mean of a binomial
E(X) = np
Variance of a binomial
Var(X) = np(1 − p)
Standard deviation
σ(X) = √[np(1 − p)]
always the square root of variance
Why np?
Each of the n trials contributes p successes on average, so n trials × p per trial gives the long-run average count of successes.
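One way to convince yourself that np and np(1 − p) agree with the probability formula is to compute the mean and variance directly from their definitions, E(X) = Σ r·P(X = r) and Var(X) = Σ r²·P(X = r) − [E(X)]². The sketch below does this for the illustrative values n = 12 and p = 0.65; the helper name `pmf` is an assumption.

```python
from math import comb, sqrt

def pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 12, 0.65  # illustrative values

# Mean and variance straight from the definitions
mean_direct = sum(r * pmf(r, n, p) for r in range(n + 1))
var_direct = sum(r**2 * pmf(r, n, p) for r in range(n + 1)) - mean_direct**2
sd_direct = sqrt(var_direct)

# Booklet formulas for comparison
mean_formula = n * p
var_formula = n * p * (1 - p)

print(mean_direct, mean_formula)
print(var_direct, var_formula)
print(sd_direct)
```

Up to floating-point rounding, the two routes should give the same numbers, which is why the booklet formulas can be used without summing anything.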
What can — and can’t — be modelled binomially
| Scenario | Binomial? | Reason |
|---|---|---|
| Number of 6s in 25 dice rolls | YES | fixed n, two outcomes, independent, p = 1/6 |
| Times a coin is flipped until it lands on heads | NO | n is not fixed |
| Red marbles drawn from a small bag without replacement | NO | trials are not independent — p changes |
| Voters in a sample of 1000 from a country of millions | YES (approx.) | large population → independence holds well enough |
| Lengths a tired swimmer can finish under a minute | NO | p drops over time as fatigue sets in |
🧭 Recipe — set up and solve a binomial problem
- Identify the trial (the one event being repeated).
- Identify “success” (the outcome you’re counting) and find p.
- Check the four conditions: fixed n, independent, two outcomes, constant p.
- Write X ~ B(n, p) explicitly.
- Apply E(X) = np or Var(X) = np(1 − p) as needed.
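The last two steps of the recipe can be sketched in Python as follows. The function name `binomial_summary` is hypothetical, and the code can only check that n and p are valid numbers; deciding whether the trials are independent with constant p is still a judgement you make from the wording of the question.

```python
from math import sqrt

def binomial_summary(n: int, p: float) -> dict:
    """Return mean, variance and standard deviation for X ~ B(n, p)."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive whole number of trials")
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    mean = n * p
    variance = n * p * (1 - p)
    return {"mean": mean, "variance": variance, "sd": sqrt(variance)}

# Example: a sample of 80 items with a 4% success rate
summary = binomial_summary(80, 0.04)
print(summary)
```

Rejecting an invalid p early mirrors the habit of checking the conditions before doing any arithmetic.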
Worked examples
WE 1: Set up the model and find E and Var
A coin is biased so that P(heads) = 0.65. The coin is flipped 12 times. Let H be the number of heads. (a) State the distribution of H. (b) Find E(H) and Var(H).
(a) Identify n, p, and check the four conditions
n = 12 (fixed), two outcomes (heads/not heads),
independent flips, constant p = 0.65
→ H ~ B(12, 0.65)
(b) Apply E(X) = np and Var(X) = np(1−p)
E(H) = 12 × 0.65 = 7.8
Var(H) = 12 × 0.65 × 0.35 = 2.73
H ~ B(12, 0.65); E(H) = 7.8; Var(H) = 2.73
always state the distribution explicitly — earns its own mark
WE 2: Real-world setup (defective bulbs)
A factory produces light bulbs and 4% are defective. A quality inspector tests a random sample of 80 bulbs. Let D be the number of defective bulbs. (a) State the conditions for the binomial model. (b) State the distribution. (c) Find E(D) and σ(D).
(a) Conditions
• n = 80 fixed
• each bulb defective or not (two outcomes)
• random sample from a large production run → trials (approximately) independent
• defect rate constant at p = 0.04
(b) Distribution
D ~ B(80, 0.04)
(c) Mean and standard deviation
E(D) = 80 × 0.04 = 3.2
Var(D) = 80 × 0.04 × 0.96 = 3.072
σ(D) = √3.072 ≈ 1.753
D ~ B(80, 0.04); E(D) = 3.2; σ(D) ≈ 1.75
“defective” is the success label here — counting is what matters, not whether it’s “good”
WE 3: Identify which scenarios are binomial
For each random variable, state whether a binomial distribution is appropriate. Justify briefly.
(a) X = number of 6s rolled in 25 throws of a fair die.
(b) Y = number of times a coin is flipped until the first heads.
(c) Z = number of red marbles drawn when 3 are taken without replacement from a bag containing 4 red and 6 blue.
(d) W = number of customers from a random sample of 100 who prefer brand A, where 30% of all customers prefer brand A.
(a) X — YES, binomial
n = 25 fixed, 6/not-6, independent, p = 1/6 → X ~ B(25, 1/6)
(b) Y — NO
number of trials is NOT fixed (depends on first heads)
(c) Z — NO
without replacement on a small bag → trials NOT independent;
p of red changes after each draw
(d) W — YES (approx.)
assuming the customer population is large, sampling without replacement ≈ independent;
W ~ B(100, 0.30)
(a) Yes; (b) No; (c) No; (d) Yes (approx.)
small bag without replacement = NOT binomial; huge population = OK
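To see how far off the binomial model is for the small bag in part (c), the sketch below compares the exact without-replacement probabilities (counted with combinations) against B(3, 0.4), which would only apply if each marble were put back. The function names are assumptions for illustration, and this comparison goes slightly beyond what the exam asks for.

```python
from math import comb

red, blue, draws = 4, 6, 3
total = red + blue

def without_replacement(k: int) -> float:
    """Exact P(Z = k): choose k of the reds and the rest from the blues."""
    return comb(red, k) * comb(blue, draws - k) / comb(total, draws)

def binomial(k: int, n: int = draws, p: float = red / total) -> float:
    """P(X = k) if draws were WITH replacement, i.e. X ~ B(3, 0.4)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Side-by-side comparison for k = 0, 1, 2, 3 red marbles
for k in range(draws + 1):
    print(k, round(without_replacement(k), 4), round(binomial(k), 4))
```

The two columns differ noticeably for every k, which is the numerical face of "trials are not independent" on a small bag.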
WE 4: Find n from E(X) and p
A binomial random variable X has p = 0.4 and E(X) = 12. Find (a) n; (b) Var(X).
(a) Use E(X) = np
12 = n × 0.4
n = 12 / 0.4 = 30
(b) Apply Var(X) = np(1−p)
Var(X) = 30 × 0.4 × 0.6
= 7.2
n = 30; Var(X) = 7.2
when one parameter is unknown, E(X) = np gives it in one line
WE 5: Find p from n and E(Y)
A random variable Y ~ B(50, p) has E(Y) = 17.5. Find (a) p; (b) Var(Y) and σ(Y).
(a) Use E(Y) = np
17.5 = 50 × p
p = 17.5 / 50 = 0.35
(b) Apply Var(Y) = np(1−p)
Var(Y) = 50 × 0.35 × 0.65 = 11.375
σ(Y) = √11.375 ≈ 3.373
p = 0.35; Var(Y) = 11.375; σ(Y) ≈ 3.37
always check 0 ≤ p ≤ 1 — a probability outside that range means a slip somewhere
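Rearranging E(X) = np for an unknown parameter, together with the sanity checks, can be sketched as below. The function names `n_from_mean` and `p_from_mean` are hypothetical, and the tolerance of 1e-9 is an arbitrary choice to absorb floating-point rounding.

```python
def n_from_mean(mean: float, p: float) -> int:
    """Solve mean = n * p for n and check that n is a whole number."""
    n = mean / p
    if abs(n - round(n)) > 1e-9:
        raise ValueError("n must be a whole number of trials")
    return round(n)

def p_from_mean(mean: float, n: int) -> float:
    """Solve mean = n * p for p and check that it is a valid probability."""
    p = mean / n
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    return p

n = n_from_mean(12, 0.4)      # p = 0.4, E(X) = 12
p = p_from_mean(17.5, 50)     # n = 50, E(Y) = 17.5
print(n, p)
```

If either check fails, the usual cause is a slip in reading the question rather than a strange distribution.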
WE 6: Archery, set up and criticise the model
An archer hits the bullseye on 30% of her shots, independently of previous shots. She takes 20 shots. Let B be the number of bullseyes.
(a) State the distribution of B. (b) Find E(B), Var(B) and σ(B). (c) Comment on whether the binomial model is still appropriate if the archer becomes tired by shot 18.
(a) Check conditions
n = 20 fixed, hit/miss, independent, p = 0.30 constant
→ B ~ B(20, 0.30)
(b) E, Var and σ
E(B) = 20 × 0.30 = 6
Var(B) = 20 × 0.30 × 0.70 = 4.2
σ(B) = √4.2 ≈ 2.049
(c) Effect of fatigue
If she gets tired, p decreases on later shots
→ p is NOT constant → binomial model NO LONGER appropriate
B ~ B(20, 0.30); E(B) = 6, Var(B) = 4.2, σ(B) ≈ 2.05; tired → not binomial
criticism questions are easy marks if you cite the broken condition by name
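To put numbers on part (c), the sketch below assumes a purely hypothetical fatigue pattern: p stays at 0.30 for shots 1 to 17 and drops to 0.15 for shots 18 to 20. The question does not give these values. If the shots are still independent, the mean of the total is the sum of the individual probabilities and the variance is the sum of each p(1 − p), so the result can be compared with B(20, 0.30).

```python
from math import sqrt

# Hypothetical fatigue pattern (an assumption, not from the question):
# p = 0.30 for shots 1-17, then p = 0.15 for shots 18-20
shot_probs = [0.30] * 17 + [0.15] * 3

# Independent shots with different p: add the means and variances shot by shot
mean_tired = sum(shot_probs)
var_tired = sum(p * (1 - p) for p in shot_probs)
sd_tired = sqrt(var_tired)

# Binomial model with constant p = 0.30 for comparison
n, p = 20, 0.30
mean_binomial = n * p
var_binomial = n * p * (1 - p)

print(mean_tired, mean_binomial)
print(var_tired, var_binomial)
```

Under this assumed pattern the expected number of bullseyes falls below 6, so using B(20, 0.30) would overstate her score. In an exam answer, naming the broken condition (p is not constant) is what earns the mark; the numbers here are only illustration.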
💡 Top tips
- “Success” is just a label — it can be a defect, a miss, an absentee. Whatever you’re counting.
- Sampling from a large population is treated as binomial even though it’s technically without replacement.
- Always state the distribution: X ~ B(n, p) before computing — earns a setup mark.
- E(X) and Var(X) are in the booklet — look them up, don’t rely on memory.
- To find an unknown n or p, use E(X) = np as a one-line equation.
⚠ Common mistakes
- Using binomial when trials aren’t independent — small bags without replacement, family members all in one sample.
- Using binomial when p changes — fatigue, learning, weather conditions all break constancy.
- Confusing n trials with n successes — n is the number of trials, the random variable is the count of successes.
- Forgetting (1 − p) in the variance — Var(X) is np(1 − p), not just np.
- Reporting σ where Var was asked (or vice versa) — square root vs no square root.
Next: Calculating Binomial Probabilities. Once X ~ B(n, p) is set up, the GDC handles P(X = k), P(X ≤ k) and ranges in seconds — you just need to translate the wording into the right inequality.