IB Maths AA HL
Topic 4 — Statistics & Probability
Paper 1 & 2
Mean & Variance
For a discrete random variable X, the expected value E(X) is the long-run average — multiply each value by its probability and add. The variance Var(X) measures how spread out those values are around the mean. Both formulas are in the booklet — the work is just setting up the table, multiplying, and summing.
📘 What you need to know
- Expected value: E(X) = ∑ x · P(X = x) — the mean of X.
- E(X) need not be a possible value of X: for example, the expected number of tails in 5 flips of a fair coin is 2.5.
- E(X²) = ∑ x² · P(X = x) — square the values, NOT the probabilities.
- Variance: Var(X) = E(X²) − [E(X)]² — the “working” version of the formula.
- Standard deviation: σ(X) = √Var(X); always non-negative.
- E(X²) ≠ [E(X)]² in general — different orders of squaring give different numbers.
- Symmetric distribution: by symmetry, E(X) sits at the centre — no calculation needed.
- Game fairness: expected gain = E(prize) − cost; the game is fair iff this equals 0.
Expected value E(X)
Mean of a discrete random variable
E(X) = ∑ x · P(X = x)
Read it as: take each value, weight it by its probability, then add up the weighted values. The result is the long-run average of X — what you’d see if you ran the experiment many times.
Symmetry shortcut: if both the values AND the probabilities are symmetric about some midpoint m, then E(X) = m. Saves a row of arithmetic in exam questions.
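To make the "long-run average" reading concrete, here is a minimal Python sketch. It assumes the distribution is stored as a dict of value → probability; the names `expected_value`, `simulated_mean` and `we1` are illustrative, not part of the syllabus or formula booklet. It compares the weighted sum with the average of many simulated draws, using the table from WE 1 below.

```python
import random

def expected_value(dist):
    """Return E(X) = sum of x * P(X = x) for a dict {value: probability}."""
    return sum(x * p for x, p in dist.items())

def simulated_mean(dist, trials, seed=0):
    """Average of `trials` random draws from dist (seed fixed for repeatability)."""
    rng = random.Random(seed)
    values = list(dist.keys())
    weights = list(dist.values())
    draws = rng.choices(values, weights=weights, k=trials)
    return sum(draws) / trials

# Distribution copied from WE 1.
we1 = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.25, 5: 0.15}

print(round(expected_value(we1), 4))            # 3.15, the weighted sum
print(round(simulated_mean(we1, 100_000), 2))   # should land close to 3.15
```

The simulated figure varies slightly with the seed and the number of trials; it approaches E(X) as the number of trials grows.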
Variance Var(X) and standard deviation
Working formula for variance
Var(X) = E(X²) − [E(X)]²
Compute E(X²) by squaring each value first, then weighting by the same probabilities, then summing. Subtract the square of the mean to get Var(X). The standard deviation is just the square root.
E(X²)
∑ x² · P(X = x)
square the values, then weight
[E(X)]²
(∑ x · P(X = x))²
weight first, then square the result
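The order matters: E(X²) and [E(X)]² are equal only when X takes a single value with probability 1, in which case Var(X) = 0. Mixing them up is the single biggest mistake in this topic. If you subtract in the wrong order, [E(X)]² − E(X²), the "variance" comes out negative.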
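The difference between the two quantities can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the data is the distribution from WE 2 further down.

```python
def e_x(dist):
    """E(X): weight each value by its probability, then add."""
    return sum(x * p for x, p in dist.items())

def e_x_squared(dist):
    """E(X^2): square the VALUES, keep the probabilities unchanged."""
    return sum(x ** 2 * p for x, p in dist.items())

def variance(dist):
    """Working formula: Var(X) = E(X^2) - [E(X)]^2."""
    return e_x_squared(dist) - e_x(dist) ** 2

# Distribution copied from WE 2.
we2 = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

print(round(e_x_squared(we2), 4))   # 2.0  (square, then weight)
print(round(e_x(we2) ** 2, 4))      # 1.0  (weight, then square)
print(round(variance(we2), 4))      # 1.0

# Subtracting the wrong way round gives a negative number, a sign of the swap.
swapped = e_x(we2) ** 2 - e_x_squared(we2)
print(swapped < 0)                  # True
```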
Game fairness using E(X)
Let X be the prize random variable in a game with cost c to play. The player’s expected gain is E(X) − c. Three possibilities:
| Expected gain | Meaning | Fair? |
|---|---|---|
| E(X) − c > 0 | player expected to win money over time | not fair (favours player) |
| E(X) − c = 0 | break-even on average | fair |
| E(X) − c < 0 | player expected to lose money over time | not fair (favours operator) |
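The three cases in the table can be sketched as a small classifier. This is an illustration under assumptions: `expected_gain` and `fairness` are hypothetical helper names, and a small tolerance stands in for "exactly 0" because floating-point sums are rarely exact. The data is the spinner game from WE 3.

```python
import math

def expected_gain(prizes, cost):
    """E(prize) - cost for a dict {prize: probability}."""
    return sum(t * p for t, p in prizes.items()) - cost

def fairness(prizes, cost):
    """Classify the game using the sign of the expected gain."""
    gain = expected_gain(prizes, cost)
    if math.isclose(gain, 0.0, abs_tol=1e-9):
        return "fair"
    return "favours player" if gain > 0 else "favours operator"

# Spinner game copied from WE 3: cost $5.
spinner = {2: 0.5, 4: 0.3, 8: 0.15, 20: 0.05}

print(round(expected_gain(spinner, 5), 2))   # -0.6
print(fairness(spinner, 5))                  # favours operator
```

Note that the cost is subtracted once, from the expected prize, rather than from each prize separately. Both give the same answer, but the first is less arithmetic.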
🧭 Recipe — calculate E(X) and Var(X)
- Set up the table: values x on top, probabilities P(X = x) below.
- Add a row for x·P(x) and sum it → E(X).
- Add a row for x²·P(x) and sum it → E(X²).
- Apply Var(X) = E(X²) − [E(X)]².
- Sense-check: variance must be ≥ 0; σ = √Var.
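The recipe above can be sketched in Python as follows. This is a hedged illustration rather than a prescribed method: `mean_variance_table` is a hypothetical name, the input is assumed to be a dict of value → probability, and the data is the customer distribution from WE 6.

```python
import math

def mean_variance_table(dist):
    """Print the x*P and x^2*P rows, then return (E(X), E(X^2), Var(X), sd)."""
    # A valid distribution must have probabilities summing to 1.
    if not math.isclose(sum(dist.values()), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")

    print(f"{'x':>6} {'P':>8} {'x*P':>8} {'x^2*P':>8}")
    mean = 0.0
    mean_sq = 0.0
    for x, p in dist.items():
        print(f"{x:>6} {p:>8.4f} {x * p:>8.4f} {x * x * p:>8.4f}")
        mean += x * p          # running total for E(X)
        mean_sq += x * x * p   # running total for E(X^2)

    var = mean_sq - mean ** 2
    # Guard against tiny negative values caused by float rounding.
    sd = math.sqrt(max(var, 0.0))
    return mean, mean_sq, var, sd

# Distribution copied from WE 6.
we6 = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.20, 4: 0.15, 5: 0.05}

mean, mean_sq, var, sd = mean_variance_table(we6)
print(round(mean, 4), round(mean_sq, 4), round(var, 4), round(sd, 3))
```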
Worked examples
WE 1 Compute E(X) directly from a table
The discrete random variable X has distribution:
| x | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| P(X = x) | 0.1 | 0.2 | 0.3 | 0.25 | 0.15 |
Find the expected value of X.
Apply E(X) = ∑ x · P(x)
E(X) = 1(0.1) + 2(0.2) + 3(0.3) + 4(0.25) + 5(0.15)
= 0.1 + 0.4 + 0.9 + 1.0 + 0.75
= 3.15
E(X) = 3.15
3.15 isn’t one of the values X can take — that’s expected (pun intended)
WE 2 Compute E(Y) and Var(Y)
The discrete random variable Y has distribution:
| y | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(Y = y) | 0.4 | 0.3 | 0.2 | 0.1 |
Find (a) E(Y); (b) Var(Y).
(a) E(Y) = ∑ y · P(y)
= 0(0.4) + 1(0.3) + 2(0.2) + 3(0.1)
= 0 + 0.3 + 0.4 + 0.3 = 1.0
(b) E(Y²) = ∑ y² · P(y)
= 0(0.4) + 1(0.3) + 4(0.2) + 9(0.1)
= 0 + 0.3 + 0.8 + 0.9 = 2.0
Apply Var(Y) = E(Y²) − [E(Y)]²
Var(Y) = 2.0 − 1.0² = 1.0
E(Y) = 1.0; Var(Y) = 1.0
always compute E(Y) first, THEN square it for the variance formula
WE 3 Game fairness: spinner game
A spinner game costs $5 to play. The prize T (in dollars) has distribution:
| t | 2 | 4 | 8 | 20 |
|---|---|---|---|---|
| P(T = t) | 0.5 | 0.3 | 0.15 | 0.05 |
(a) Find E(T). (b) Determine whether the game is fair.
(a) E(T) = ∑ t · P(t)
= 2(0.5) + 4(0.3) + 8(0.15) + 20(0.05)
= 1 + 1.2 + 1.2 + 1 = 4.4
(b) Compare expected prize with cost
Expected gain = E(T) − cost = 4.4 − 5 = −0.6
Player loses $0.60 on average → NOT fair
(a) E(T) = $4.40; (b) not fair (expected loss = $0.60)
“fair” = expected gain is exactly 0; anything else favours one side
WE 4 Use symmetry to find E(M)
A discrete random variable M takes values 4, 7, 10, 13, 16 with probabilities 0.1, 0.25, 0.3, 0.25, 0.1 respectively. Find E(M) using symmetry, and verify the result.
Spot the symmetry
Values 4, 7, 10, 13, 16 are symmetric about 10 (steps of ±3, ±6)
Probabilities 0.1, 0.25, 0.3, 0.25, 0.1 are symmetric about the centre
→ E(M) = 10 (by symmetry)
Verify directly
E(M) = 4(0.1) + 7(0.25) + 10(0.3) + 13(0.25) + 16(0.1)
= 0.4 + 1.75 + 3 + 3.25 + 1.6 = 10 ✓
E(M) = 10
spot symmetry first — saves a row of arithmetic on exam day
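The symmetry check in WE 4 can be sketched in code. The helper name `symmetric_midpoint` is hypothetical; it tests whether both the values and the probabilities mirror about the midpoint of the smallest and largest value, and returns that midpoint only if they do.

```python
import math

def symmetric_midpoint(values, probs):
    """Return the midpoint m if values AND probabilities are symmetric about m, else None."""
    pairs = sorted(zip(values, probs))       # order by value
    m = (pairs[0][0] + pairs[-1][0]) / 2     # candidate centre
    n = len(pairs)
    for i in range(n):
        x_lo, p_lo = pairs[i]
        x_hi, p_hi = pairs[n - 1 - i]        # mirror partner
        # Mirror values must average to m, and carry equal probability.
        if not math.isclose(x_lo + x_hi, 2 * m) or not math.isclose(p_lo, p_hi):
            return None
    return m

# WE 4: symmetric, so E(M) can be read off as the midpoint.
print(symmetric_midpoint([4, 7, 10, 13, 16], [0.1, 0.25, 0.3, 0.25, 0.1]))   # 10.0

# WE 1: values are symmetric but the probabilities are not.
print(symmetric_midpoint([1, 2, 3, 4, 5], [0.1, 0.2, 0.3, 0.25, 0.15]))      # None
```

When the function returns None, the shortcut does not apply and E(X) has to be computed from the full sum.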
WE 5 Solve simultaneous equations from sum=1 and given E(X)
The discrete random variable X has distribution:
| x | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| P(X = x) | a | b | 0.3 | 0.2 |
Given that E(X) = 2.6, find a and b.
Equation 1: ∑P = 1
a + b + 0.3 + 0.2 = 1
a + b = 0.5 … (1)
Equation 2: E(X) = 2.6
1(a) + 2(b) + 3(0.3) + 4(0.2) = 2.6
a + 2b + 1.7 = 2.6
a + 2b = 0.9 … (2)
Subtract (1) from (2)
b = 0.4, then a = 0.5 − 0.4 = 0.1
a = 0.1, b = 0.4
two unknowns → always need TWO equations: the sum rule plus E(X)
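As a cross-check on WE 5, the two equations can be solved in code. This sketch uses Cramer's rule instead of the elimination shown in the working, and exact fractions to avoid rounding; `solve_2x2` is a hypothetical helper name.

```python
from fractions import Fraction as F

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*a + b1*b = c1 and a2*a + b2*b = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution")
    a = (c1 * b2 - c2 * b1) / det
    b = (a1 * c2 - a2 * c1) / det
    return a, b

# Known part of the WE 5 table: P(X = 3) = 0.3, P(X = 4) = 0.2.
known = {3: F("0.3"), 4: F("0.2")}

# Equation (1): a + b = 1 - (known probabilities)      -> a + b  = 0.5
c1 = 1 - sum(known.values())
# Equation (2): 1a + 2b = E(X) - (known contributions) -> a + 2b = 0.9
c2 = F("2.6") - sum(x * p for x, p in known.items())

a, b = solve_2x2(F(1), F(1), c1, F(1), F(2), c2)
print(float(a), float(b))   # 0.1 0.4
```

It is worth confirming afterwards that both answers lie between 0 and 1, since they are probabilities.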
WE 6 Full mean, variance and standard deviation
The number of customers N entering a small shop in a 10-minute period has distribution:
| n | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| P(N = n) | 0.10 | 0.20 | 0.30 | 0.20 | 0.15 | 0.05 |
Find (a) E(N); (b) E(N²); (c) Var(N); (d) the standard deviation σ(N).
(a) E(N) = ∑ n · P(n)
= 0(0.10) + 1(0.20) + 2(0.30) + 3(0.20) + 4(0.15) + 5(0.05)
= 0 + 0.20 + 0.60 + 0.60 + 0.60 + 0.25 = 2.25
(b) E(N²) = ∑ n² · P(n)
= 0 + 1(0.20) + 4(0.30) + 9(0.20) + 16(0.15) + 25(0.05)
= 0 + 0.20 + 1.20 + 1.80 + 2.40 + 1.25 = 6.85
(c) Var(N) = E(N²) − [E(N)]²
= 6.85 − 2.25² = 6.85 − 5.0625 = 1.7875
(d) σ(N) = √Var(N)
= √1.7875 ≈ 1.337
E(N) = 2.25; E(N²) = 6.85; Var(N) = 1.7875; σ(N) ≈ 1.34
tabulate n·P(n) and n²·P(n) row by row — clean working, easy marks
💡 Top tips
- Always tabulate x·P(x) and x²·P(x) as separate rows — clean, low-error working.
- Check for symmetry first: if values and probabilities are symmetric, E(X) is just the midpoint.
- Use the GDC stats mode (values as data, probabilities as frequencies) to double-check on calculator paper.
- Game fairness: net gain = prize − cost; subtract the cost ONCE, on the expected prize.
- Variance must be ≥ 0: a negative answer usually means E(X²) and [E(X)]² got swapped, or there is an arithmetic slip in one of the sums.
⚠ Common mistakes
- Confusing E(X²) with [E(X)]² — they’re almost never equal; one squares values, the other squares the mean.
- Squaring the probabilities in E(X²) — only the values get squared.
- Subtracting the wrong way round: the formula is Var = E(X²) − [E(X)]², not the reverse.
- Forgetting to subtract the cost when checking game fairness — E(prize) alone says nothing about fairness.
- Reporting variance with the wrong units in worded problems — Var has units squared (e.g. dollars²).
Next: Transformation of a Single Variable. If you know E(X) and Var(X), what are E(aX + b) and Var(aX + b)? Two short formulas turn this into one of the easiest topics on Paper 1.