IB Maths AI HL · Normal Distribution · Paper 1 & 2

The Normal Distribution

The normal distribution is the famous bell curve: a continuous probability distribution, unlike the discrete binomial. It models heaps of real-life measurements (heights, weights, times) that cluster symmetrically around a central value. It’s written X ∼ N(μ, σ²), and because it’s continuous, the probability of any single exact value is zero, so you always work with areas under the curve.

📘 What you need to know

Continuous variables & area

A continuous random variable measures something — height, weight, time — and can take any value in a range. For a continuous distribution the probability of an exact value is always zero, so we measure probability as area under the probability density curve.

Probability as area
P(a ≤ X ≤ b) = area under the curve from a to b
Total area under the curve = 1
P(X = k) = 0, so ≤ and < give the same answer

🤔 Why is P(X = k) = 0 for a continuous variable?

Probability is the area under the curve. A single point has no width, so its “area”, and therefore its probability, is zero. This is why, for the normal distribution, it never matters whether an inequality is strict (<, >) or weak (≤, ≥): P(X < k) = P(X ≤ k).
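To see the "no width, no area" idea numerically, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The distribution N(40, 10²) and the point k = 45 are illustrative assumptions, not values taken from a question; in an exam you would use your GDC instead.

```python
from statistics import NormalDist

# Illustrative distribution (an assumption): mean 40, standard deviation 10.
X = NormalDist(mu=40, sigma=10)

def prob_between(dist, a, b):
    """P(a <= X <= b), found as a difference of cumulative areas."""
    return dist.cdf(b) - dist.cdf(a)

k = 45  # the single value we are "zooming in" on
widths = [1, 0.1, 0.01, 0.001]

# Area of a strip centred on k, for narrower and narrower strips.
areas = [prob_between(X, k - w / 2, k + w / 2) for w in widths]

for w, area in zip(widths, areas):
    print(f"strip width {w}: area = {area:.6f}")
```

Each time the strip gets ten times narrower, the area gets roughly ten times smaller, heading towards zero as the strip shrinks to the single point k.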

Shape & the 68–95–99.7 rule

The normal curve is symmetric and bell-shaped, centred on μ. The standard deviation σ controls its width, and a fixed share of the data falls within each band of σ from the mean.

The 68–95–99.7 (empirical) rule
[Figure: normal curve with the horizontal axis marked from μ−3σ to μ+3σ and nested bands labelled 68%, 95% and 99.7%]
~68% within ±σ, ~95% within ±2σ, ~99.7% within ±3σ of the mean.
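The three percentages can be checked on the standard normal curve (μ = 0, σ = 1), since the share within k standard deviations is the same for every normal distribution. This is a small sketch assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within ±{k}σ: {p:.4f}")
```

The results are about 0.6827, 0.9545 and 0.9973, which is where the rounded 68%, 95% and 99.7% come from.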
How μ and σ change the curve
[Figure: two panels. Same σ, different μ: the curve slides sideways. Same μ, different σ: a large σ gives a short, wide curve.]
A small variance gives a tall, narrow curve; a large variance gives a short, wide one (same area).
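The "tall and narrow versus short and wide" trade-off can be seen by comparing the height of the curve at its peak for different σ. This is a rough sketch assuming Python's standard-library `statistics.NormalDist`; the mean of 40 and the three σ values are illustrative choices.

```python
from statistics import NormalDist

mu = 40  # illustrative mean, kept the same for every curve

# Height of the density curve at its peak (x = mu) for each sigma.
peaks = {sigma: NormalDist(mu, sigma).pdf(mu) for sigma in (5, 10, 20)}

for sigma, height in peaks.items():
    print(f"sigma = {sigma}: peak height = {height:.4f}")
```

Doubling σ halves the peak height, which is what keeps the total area equal to 1 as the curve spreads out.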

🧠 Memory aid — variance vs standard deviation

The notation N(μ, σ²) gives you the variance as the second number. If a question wants the standard deviation, square-root it. So N(40, 100) means μ = 40 and σ² = 100, giving σ = √100 = 10, not 100.
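The same trap exists in software: Python's standard-library `statistics.NormalDist` expects the standard deviation, not the variance, so the root has to be taken first. The helper name `from_ib_notation` below is hypothetical, introduced only for this sketch.

```python
import math
from statistics import NormalDist

def from_ib_notation(mean, variance):
    """Build a NormalDist from N(mean, variance) (hypothetical helper).

    NormalDist takes sigma, so the variance is square-rooted here.
    """
    return NormalDist(mu=mean, sigma=math.sqrt(variance))

S = from_ib_notation(40, 100)  # S ~ N(40, 100)

print("mean:", S.mean)
print("standard deviation:", S.stdev)
print("variance:", S.variance)
```

Passing 100 straight in as `sigma` would silently model a much wider curve, which is the software version of forgetting to root the second number.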

Modelling with the normal

Many continuous, symmetric, single-peaked variables can be modelled normally if the population is large enough. Although a normal variable can technically take any real value, values more than about 4 standard deviations from the mean have practically zero density — which is why it can model things like height that can’t truly be negative.
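To put a number on "practically zero", this short sketch finds the total probability lying more than 4 standard deviations from the mean. It assumes Python's standard-library `statistics.NormalDist` and works on the standard normal, since the answer is the same for any μ and σ.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Upper tail beyond +4 sd, doubled by symmetry to include the lower tail.
beyond_4sd = 2 * (1 - Z.cdf(4))

print(f"P(more than 4 sd from the mean) = {beyond_4sd:.6f}")
```

That is roughly 6 in 100 000, so the impossible values a normal model technically allows (such as negative heights) carry almost no probability when they sit that far from the mean.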

Can model ✓ (symmetric, 1 mode)
Heights, weights, times, running speeds: large population, bell-shaped, single peak.
Cannot model ✗ (skewed / multimodal)
Human lifespan (not symmetric); a random number generator (no single mode).
Exam habit: when a question mixes distributions, state clearly which one each variable follows (e.g. S ∼ N(40, 100)). To justify a normal model, the usual assumptions are that the variable is symmetrical and bell-shaped.

Worked examples

WE 1

Mean and standard deviation from notation

The speeds (mph) of a cheetah subspecies are modelled by S ∼ N(40, 100). Write down the mean and standard deviation.

Read off μ and σ²: μ = 40, σ² = 100.
Square-root for σ: σ = √100 = 10.
Answer: mean = 40 mph, sd = 10 mph.
Note: the second number in N(μ, σ²) is the variance, so don’t forget to root it.
WE 2

State the assumptions

State two assumptions needed to model the cheetah speeds with a normal distribution.

The distribution of speeds is:
1. symmetrical
2. bell-shaped
Answer: symmetrical AND bell-shaped.
Note: a large population with one mode also supports the model.
WE 3

Apply the 68% rule

For the cheetahs S ∼ N(40, 100), roughly what proportion run between 30 and 50 mph?

Find how many σ from the mean: 30 = 40 − 10 = μ − σ; 50 = 40 + 10 = μ + σ.
Apply the empirical rule: within μ ± σ → about 68%.
Answer: ≈ 68%.
Note: 30 to 50 is exactly one sd either side of the mean.
WE 4

Apply the 95% rule

For S ∼ N(40, 100), roughly what proportion run faster than 60 mph?

60 = 40 + 20 = μ + 2σ. About 95% lie within μ ± 2σ, so 5% lie outside.
Split the 5% by symmetry: 5% ÷ 2 = 2.5% in each tail.
Answer: ≈ 2.5%.
Note: “faster than μ + 2σ” is just the upper tail.
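The empirical-rule answers in WE 3 and WE 4 are estimates, and a quick check shows how close they are. This is a minimal sketch assuming Python's standard-library `statistics.NormalDist`; in the exam you would get the same values from your GDC.

```python
import math
from statistics import NormalDist

# S ~ N(40, 100): the second number is the variance, so sigma = sqrt(100) = 10.
S = NormalDist(mu=40, sigma=math.sqrt(100))

p_30_to_50 = S.cdf(50) - S.cdf(30)  # WE 3: within one sd of the mean
p_above_60 = 1 - S.cdf(60)          # WE 4: upper tail beyond mu + 2 sigma

print(f"P(30 < S < 50) = {p_30_to_50:.4f}")
print(f"P(S > 60)      = {p_above_60:.4f}")
```

The first value is about 0.683, matching the 68% estimate. The second is about 0.023, slightly below the 2.5% from the rule, because "95% within 2σ" is itself a rounded figure.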
WE 5

Is the normal a good model?

Explain whether human lifespan can be well modelled by a normal distribution.

Check symmetry: lifespans are NOT symmetric. Most people live to old age, with a long left tail of early deaths.
Answer: No, not symmetrical.
Note: a normal model needs a symmetric, bell-shaped, single-mode variable.

Next up — Calculations with the Normal Distribution. You’ll use your GDC’s normal CD function to find probabilities like P(a < X < b), handle one-sided tails with “very big” bounds, and run the inverse normal to go from a probability back to a value of x.
