IB Maths AA HL Topic 4 — Statistics & Probability Paper 1 & 2 ~7 min read

The Binomial Distribution

A binomial random variable counts the number of successes in n independent trials, each with the same probability p of success. If a scenario has a fixed number of trials, exactly two outcomes per trial, independent trials and constant p, the model is X ~ B(n, p). The mean is np and the variance is np(1 − p); both formulas are in the formula booklet.

📘 What you need to know

Notation: X ~ B(n, p) — n trials, probability p of success on each.
Four conditions: fixed n; trials are independent; exactly two outcomes per trial; p is constant.
Probability formula: P(X = r) = ⁿC_r · p^r · (1 − p)^{n − r} — usually computed on the GDC.
Mean: E(X) = np.
Variance: Var(X) = np(1 − p).
Standard deviation: σ(X) = √[np(1 − p)].
Sampling rule: with replacement → exactly binomial; without replacement → only ≈ binomial if the population is large.
Cannot use binomial: number of trials not fixed, probability changes (fatigue, learning), or trials aren’t independent.
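
The formulas in this list can be checked numerically. The sketch below is only an illustration: the function name binomial_pmf and the values n = 10, p = 0.5 are assumptions chosen for the example, not part of the syllabus, and in the exam you would normally use the GDC instead.

```python
from math import comb, sqrt


def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative values only (not taken from the notes).
n, p = 10, 0.5

prob_3 = binomial_pmf(3, n, p)   # P(X = 3)
mean = n * p                     # E(X) = np
variance = n * p * (1 - p)       # Var(X) = np(1 - p)
std_dev = sqrt(variance)         # sigma(X) = square root of the variance

# The probabilities over r = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))

print(f"P(X = 3) = {prob_3:.4f}")
print(f"E(X) = {mean}, Var(X) = {variance}, sd = {std_dev:.3f}")
print(f"sum of all probabilities = {total:.6f}")
```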

The four conditions

Before writing X ~ B(n, p), check all four boxes. If any one fails, the binomial model is wrong.

Condition	What it means	Spot it from
Fixed n	The number of trials is decided in advance and finite	“flipped 20 times”, “sample of 80”
Independent	One trial doesn’t affect the next	with replacement, large population
Two outcomes	Each trial is success / not-success	“yellow / not yellow”, “defective / not”
Constant p	The probability of success doesn’t change	fair coin, fixed defect rate

“Success” is just a label. A defective bulb, a missed shot, a person without a vaccine — call any of them “success” if it’s the thing you’re counting.

Mean, variance and standard deviation

Mean of a binomial: E(X) = np

Variance of a binomial: Var(X) = np(1 − p)

Standard deviation: σ(X) = √[np(1 − p)], always the square root of the variance

Why np? There are n trials with probability p of success on each, so np is the long-run average count of successes.
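
One way to convince yourself of these formulas is to compute the mean and variance directly from the probabilities and compare them with np and np(1 − p). The sketch below does this for B(12, 0.65); the helper name pmf is an assumption made for the example.

```python
from math import comb


def pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n, p = 12, 0.65
probs = [pmf(r, n, p) for r in range(n + 1)]

# Mean from the definition: sum of r * P(X = r).
mean_from_pmf = sum(r * q for r, q in enumerate(probs))

# Variance from the definition: sum of (r - mean)^2 * P(X = r).
var_from_pmf = sum((r - mean_from_pmf) ** 2 * q for r, q in enumerate(probs))

print(f"mean from probabilities     = {mean_from_pmf:.4f}, np       = {n * p:.4f}")
print(f"variance from probabilities = {var_from_pmf:.4f}, np(1 - p) = {n * p * (1 - p):.4f}")
```

The two columns agree up to floating-point rounding, which is why the shortcut formulas can be used without summing over every value of r.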

What can — and can’t — be modelled binomially

Scenario	Binomial?	Reason
Number of 6s in 25 dice rolls	YES	fixed n, two outcomes, independent, p = 1/6
Times a coin is flipped until it lands on heads	NO	n is not fixed
Red marbles drawn from a small bag without replacement	NO	trials are not independent — p changes
Voters in a sample of 1000 from a country of millions	YES (approx.)	large population → independence holds well enough
Lengths a tired swimmer can finish under a minute	NO	p drops over time as fatigue sets in
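
The difference between a small bag and a large population can be seen by comparing the exact without-replacement probabilities with the binomial ones. This is a sketch under stated assumptions: the small bag holds 4 red and 6 blue marbles, and the large population of 100 000 with 40 000 "red" is a hypothetical figure chosen to keep the same proportion.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def without_replacement_pmf(r: int, draws: int, reds: int, total: int) -> float:
    """P(exactly r reds) when `draws` items are taken without replacement."""
    return comb(reds, r) * comb(total - reds, draws - r) / comb(total, draws)


draws, p = 3, 0.4
binom = [binomial_pmf(r, draws, p) for r in range(draws + 1)]

# Small bag: 4 red, 6 blue.
small = [without_replacement_pmf(r, draws, 4, 10) for r in range(draws + 1)]

# Hypothetical large population with the same proportion of red.
large = [without_replacement_pmf(r, draws, 40_000, 100_000) for r in range(draws + 1)]

# Largest disagreement with the binomial probabilities in each case.
small_gap = max(abs(a - b) for a, b in zip(small, binom))
large_gap = max(abs(a - b) for a, b in zip(large, binom))

print(f"small bag: largest gap from binomial        = {small_gap:.4f}")
print(f"large population: largest gap from binomial = {large_gap:.6f}")
```

For the small bag the gap is several percentage points, so the binomial model gives wrong answers; for the large population the gap is negligible, which is why the approximation is accepted.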

🧭 Recipe — set up and solve a binomial problem

Identify the trial (the one event being repeated).
Identify “success” (the outcome you’re counting) and find p.
Check the four conditions: fixed n, independent, two outcomes, constant p.
Write X ~ B(n, p) explicitly.
Apply E(X) = np or Var(X) = np(1 − p) as needed.
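
The recipe can be sketched in code as a small helper that refuses impossible parameters and then reports the mean, variance and standard deviation. The class name BinomialModel is hypothetical, and the code can only check n and p; independence, two outcomes and constant p still have to be judged from the wording of the question.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class BinomialModel:
    """X ~ B(n, p). Only n and p are validated; the other conditions are up to you."""

    n: int
    p: float

    def __post_init__(self) -> None:
        if not isinstance(self.n, int) or self.n < 1:
            raise ValueError("n must be a positive whole number of trials")
        if not 0 <= self.p <= 1:
            raise ValueError("p must be a probability between 0 and 1")

    @property
    def mean(self) -> float:
        return self.n * self.p                   # E(X) = np

    @property
    def variance(self) -> float:
        return self.n * self.p * (1 - self.p)    # Var(X) = np(1 - p)

    @property
    def std_dev(self) -> float:
        return sqrt(self.variance)               # sigma(X) = sqrt(Var(X))


# Number of 6s in 25 rolls of a fair die: X ~ B(25, 1/6).
dice = BinomialModel(n=25, p=1 / 6)
print(f"E(X) = {dice.mean:.3f}, Var(X) = {dice.variance:.3f}, sd = {dice.std_dev:.3f}")
```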

Worked examples

WE 1

Set up the model and find E and Var

A coin is biased so that P(heads) = 0.65. The coin is flipped 12 times. Let H be the number of heads. (a) State the distribution of H. (b) Find E(H) and Var(H).

(a) Identify n, p, and check the four conditions
n = 12 (fixed), two outcomes (heads / not heads), independent flips, constant p = 0.65
→ H ~ B(12, 0.65)

(b) Apply E(X) = np and Var(X) = np(1 − p)
E(H) = 12 × 0.65 = 7.8
Var(H) = 12 × 0.65 × 0.35 = 2.73

Answer: H ~ B(12, 0.65); E(H) = 7.8; Var(H) = 2.73

Tip: always state the distribution explicitly; it earns its own mark.

WE 2

Real-world setup: defective bulbs

A factory produces light bulbs and 4% are defective. A quality inspector tests a random sample of 80 bulbs. Let D be the number of defective bulbs. (a) State the conditions for the binomial model. (b) State the distribution. (c) Find E(D) and σ(D).

(a) Conditions
• n = 80 fixed
• each bulb is defective or not (two outcomes)
• random sample → trials independent
• defect rate constant at p = 0.04

(b) Distribution
D ~ B(80, 0.04)

(c) Mean and standard deviation
E(D) = 80 × 0.04 = 3.2
Var(D) = 80 × 0.04 × 0.96 = 3.072
σ(D) = √3.072 ≈ 1.753

Answer: D ~ B(80, 0.04); E(D) = 3.2; σ(D) ≈ 1.75

Tip: “defective” is the success label here; what matters is what you are counting, not whether the outcome is “good”.

WE 3

Identify which scenarios are binomial

For each random variable, state whether a binomial distribution is appropriate. Justify briefly.

(a) X = number of 6s rolled in 25 throws of a fair die.
(b) Y = number of times a coin is flipped until the first heads.
(c) Z = number of red marbles drawn when 3 are taken without replacement from a bag containing 4 red and 6 blue.
(d) W = number of customers from a random sample of 100 who prefer brand A, where 30% of all customers prefer brand A.

(a) X: YES, binomial
n = 25 fixed, 6 / not-6, independent, p = 1/6 → X ~ B(25, 1/6)

(b) Y: NO
the number of trials is NOT fixed (it depends on when the first heads appears)

(c) Z: NO
without replacement from a small bag → trials NOT independent; the probability of red changes after each draw

(d) W: YES (approx.)
large population → sampling without replacement ≈ independent; W ~ B(100, 0.30)

Answer: (a) Yes; (b) No; (c) No; (d) Yes (approx.)

Tip: small bag without replacement = NOT binomial; huge population = OK.

WE 4

Find n from E(X) and p

A binomial random variable X has p = 0.4 and E(X) = 12. Find (a) n; (b) Var(X).

(a) Use E(X) = np
12 = n × 0.4
n = 12 / 0.4 = 30

(b) Apply Var(X) = np(1 − p)
Var(X) = 30 × 0.4 × 0.6 = 7.2

Answer: n = 30; Var(X) = 7.2

Tip: when one parameter is unknown, E(X) = np gives it in one line.

WE 5

Find p from n and E(Y)

A random variable Y ~ B(50, p) has E(Y) = 17.5. Find (a) p; (b) Var(Y) and σ(Y).

(a) Use E(Y) = np
17.5 = 50 × p
p = 17.5 / 50 = 0.35

(b) Apply Var(Y) = np(1 − p)
Var(Y) = 50 × 0.35 × 0.65 = 11.375
σ(Y) = √11.375 ≈ 3.373

Answer: p = 0.35; Var(Y) = 11.375; σ(Y) ≈ 3.37

Tip: always check 0 ≤ p ≤ 1; a probability outside that range means a slip somewhere.
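
The rearrangements used in WE 4 and WE 5 can be written as two short functions with the sanity checks built in. This is a minimal sketch; the function names trials_from_mean and probability_from_mean are made up for the example.

```python
from math import sqrt


def trials_from_mean(mean: float, p: float) -> int:
    """Solve E(X) = np for n; n must come out as a whole number."""
    n = mean / p
    if abs(n - round(n)) > 1e-9:
        raise ValueError("n is not a whole number: check the values of E(X) and p")
    return round(n)


def probability_from_mean(mean: float, n: int) -> float:
    """Solve E(X) = np for p; p must lie between 0 and 1."""
    p = mean / n
    if not 0 <= p <= 1:
        raise ValueError("p is outside [0, 1]: there is a slip somewhere")
    return p


# WE 4: p = 0.4 and E(X) = 12.
n = trials_from_mean(12, 0.4)
var_x = n * 0.4 * (1 - 0.4)

# WE 5: Y ~ B(50, p) with E(Y) = 17.5.
p = probability_from_mean(17.5, 50)
var_y = 50 * p * (1 - p)
sd_y = sqrt(var_y)

print(f"WE 4: n = {n}, Var(X) = {var_x:.2f}")
print(f"WE 5: p = {p:.2f}, Var(Y) = {var_y:.3f}, sd = {sd_y:.3f}")
```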

WE 6

Archery — set up + criticise the model

An archer hits the bullseye on 30% of her shots, independently of previous shots. She takes 20 shots. Let B be the number of bullseyes.

(a) State the distribution of B. (b) Find E(B), Var(B) and σ(B). (c) Comment on whether the binomial model is still appropriate if the archer becomes tired by shot 18.

(a) Check conditions
n = 20 fixed, hit / miss, independent, p = 0.30 constant
→ B ~ B(20, 0.30)

(b) E, Var and σ
E(B) = 20 × 0.30 = 6
Var(B) = 20 × 0.30 × 0.70 = 4.2
σ(B) = √4.2 ≈ 2.049

(c) Effect of fatigue
If she gets tired, p decreases on later shots → p is NOT constant → the binomial model is NO LONGER appropriate

Answer: B ~ B(20, 0.30); E(B) = 6, σ(B) ≈ 2.05; tired → not binomial

Tip: criticism questions are easy marks if you cite the broken condition by name.
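
To see what fatigue does to the numbers, suppose (purely as an illustration, since the question gives no figures) that her hit probability falls from 0.30 to 0.15 for shots 18 to 20. The expected number of bullseyes is still the sum of the per-shot probabilities, but it no longer equals np for a single p.

```python
# Hypothetical per-shot probabilities: 0.30 for shots 1 to 17,
# then 0.15 for shots 18 to 20 (the 0.15 is an assumed value, not from the question).
shot_probs = [0.30] * 17 + [0.15] * 3

# The expected count is the sum of the individual success probabilities.
expected_with_fatigue = sum(shot_probs)

# The binomial model B(20, 0.30) assumes p stays at 0.30 throughout.
expected_binomial = 20 * 0.30

print(f"expected bullseyes with fatigue    = {expected_with_fatigue:.2f}")
print(f"expected bullseyes under B(20, 0.30) = {expected_binomial:.2f}")
```

Under this assumption B(20, 0.30) overstates the expected count, which is the practical consequence of the "constant p" condition failing.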

💡 Top tips

“Success” is just a label — it can be a defect, a miss, an absentee. Whatever you’re counting.
Sampling from a large population is treated as binomial even though it’s technically without replacement.
Always state the distribution: X ~ B(n, p) before computing — earns a setup mark.
E(X) and Var(X) are in the formula booklet: look them up rather than relying on memory.
To find an unknown n or p, use E(X) = np as a one-line equation.

⚠ Common mistakes

Using binomial when trials aren’t independent — small bags without replacement, family members all in one sample.
Using binomial when p changes — fatigue, learning, weather conditions all break constancy.
Confusing n trials with n successes — n is the number of trials, the random variable is the count of successes.
Forgetting (1 − p) in the variance — Var(X) is np(1 − p), not just np.
Reporting σ where Var was asked (or vice versa) — square root vs no square root.

Next: Calculating Binomial Probabilities. Once X ~ B(n, p) is set up, the GDC handles P(X = k), P(X ≤ k) and ranges in seconds — you just need to translate the wording into the right inequality.
