IB Maths AA SL Topic 4 — Statistics Toolkit Paper 1 & 2 ~10 min read

Measures of Dispersion

Two data sets can have the exact same mean but look completely different — one tightly bunched, the other wildly spread. Dispersion is the second half of the picture: how spread out the data is. This note covers all four measures: range, IQR, variance, and standard deviation.

📘 What you need to know

The range = largest value − smallest value. Quick to find, but ruined by outliers.
The quartiles Q₁, Q₂, Q₃ split sorted data into four equal sections (each holding 25% of the values).
The interquartile range (IQR) = Q₃ − Q₁ — the spread of the middle 50%. Not affected by outliers.
The variance (σ²) measures the average squared distance from the mean.
The standard deviation (σ) = √variance. It’s the most important measure of spread.
For variance and standard deviation, use your GDC — don’t waste exam time calculating by hand.
The IQR formula is in the formula booklet. Variance and SD aren’t, but your GDC handles them.

Why “spread” matters

Imagine two students, both with a mean test score of 70%. Student A scored 68, 70, 72, 70, 70: really consistent. Student B scored 40, 55, 70, 85, 100: wild swings. Same mean, totally different stories.

Dispersion measures answer the question: “how spread out is the data?” A small dispersion means the data is tightly bunched around the centre. A big dispersion means the data is all over the place.

RANGE

Range = Max − Min

Difference between the biggest and smallest values. Quick but rough.

Same units as the data

INTERQUARTILE RANGE

IQR = Q₃ − Q₁

Spread of the middle 50% of the data. Ignores outliers.

Same units as the data

VARIANCE

σ²

Mean of the squared distances from the mean. Never negative (zero only when every value is identical).

Squared units (a bit awkward)

STANDARD DEVIATION

σ = √variance

Square root of the variance. The most useful measure of spread.

Same units as the data

If you’re stuck on which measure to use: standard deviation is the safest default. It uses every value, has the right units, and is what most exam questions expect.

The range — quickest but roughest

The range is the difference between the largest and smallest values. That’s it.

Range

Range = Maximum − Minimum

Example: for the data 4, 7, 2, 9, 5: max = 9, min = 2, so range = 7.

📍 Why is the range “rough”?

The range only looks at the two most extreme values and ignores everything else. If your data has a single outlier — say, one freakishly tall person in a sample of average heights — the range jumps massively, even though the rest of the data is perfectly normal.
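A quick way to see this effect is to compute the range with and without one extreme value. The sketch below uses made-up heights in cm (the numbers and the helper name `data_range` are hypothetical, chosen only for illustration):

```python
# Hypothetical heights in cm; illustrative values, not real data.
heights = [168, 170, 172, 169, 171]

def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

print(data_range(heights))           # 4
print(data_range(heights + [210]))   # 42, a single outlier dominates
```

Five of the six values in the second call are unchanged, yet the range is more than ten times larger.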

Quartiles and the IQR

Quartiles are three special values that split your sorted data into four equal-sized groups. Each group holds 25% of the data.

Q₁ (lower quartile) — splits the lowest 25% from the rest.
Q₂ (median) — splits the data in half.
Q₃ (upper quartile) — splits the highest 25% from the rest.

[Figure: sorted data on a number line from Min to Max, divided at Q₁, Q₂ (median) and Q₃ into four sections that each hold 25% of the data. The IQR spans the middle 50%, from Q₁ to Q₃.]

The IQR formula

Interquartile range

IQR = Q₃ − Q₁

The IQR measures how spread out the middle 50% of the data is. Because it ignores the lowest 25% and the highest 25%, a few extreme values barely move it, which is why it’s so useful.

🤔 Why is the IQR better than the range for skewed data?

Imagine 99 people earning around $50k per year, plus one billionaire earning $5 billion. The range would be about $5 billion — a meaningless number for describing the typical spread. But the IQR would still be a sensible figure (maybe $20k), because it ignores the extreme top and bottom 25%.

That’s why income distributions are usually summarised with quartiles or percentiles rather than the range.

📍 Always use your GDC for quartiles

Different methods of finding quartiles by hand give slightly different answers. The IB expects you to use your calculator’s “1-Var Stats” function — it’ll give you Q₁, Q₂, Q₃, the mean, and standard deviation all in one go.
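To see how much the method matters, here is a minimal Python sketch. The helper `quartiles_by_halves` is a hypothetical name for the “median of each half” convention, which reproduces the quartile values quoted in this note’s worked examples; whether your own GDC uses exactly this convention is an assumption worth checking. `statistics.quantiles` is from Python’s standard library and offers two interpolation methods, shown for comparison.

```python
from statistics import median, quantiles

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the 'median of each half' convention.

    With an odd number of values, the middle value is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

data = [43, 29, 70, 51, 64, 43]

print(quartiles_by_halves(data))                  # (43, 47.0, 64)
print(quantiles(data, n=4, method="inclusive"))   # [43.0, 47.0, 60.75]
print(quantiles(data, n=4, method="exclusive"))   # [39.5, 47.0, 65.5]
```

All three agree on the median, but Q₁ and Q₃ shift depending on the convention, so the IQR changes too. That is the reason to stick with one source (your GDC) in an exam.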

Variance and standard deviation

The range and IQR only use a few values from your data set. The standard deviation uses every value — that’s what makes it powerful.

The idea: for each data point, find how far it is from the mean (this is the “deviation”). Square each deviation, average the squares to get the variance, then take the square root to get the standard deviation. Big deviations mean the data is spread far from the centre; small deviations mean it is clustered tightly.

The formulas (don’t memorise — use your GDC)

Variance (σ²)

σ² = Σ(x_i − μ)² / n

Standard deviation (σ)

σ = √variance = √( Σ(x_i − μ)² / n )

🤔 Why square the differences?

If you just took (x_i − μ) and averaged it, the positive and negative differences would cancel out, giving you 0 every time. Squaring kills the negatives, so all the deviations contribute properly.

The downside? Squared units are weird (m² when you wanted m). That’s why we square-root at the end to get the standard deviation back into normal units.
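The cancelling argument and the formulas can be checked numerically. The sketch below is plain Python, not something you would do in an exam; it uses the data set from the worked examples, and the variable names (`mu`, `deviations`) are just illustrative choices.

```python
from math import sqrt

data = [43, 29, 70, 51, 64, 43]
n = len(data)
mu = sum(data) / n                       # mean = 50.0

# Raw deviations from the mean: positives and negatives cancel.
deviations = [x - mu for x in data]
print(sum(deviations))                   # 0.0

# Squaring first stops the cancelling; dividing by n gives the variance.
variance = sum(d ** 2 for d in deviations) / n
sigma = sqrt(variance)                   # back to the original units
print(round(variance, 1), round(sigma, 1))   # 189.3 13.8
```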

🧠 Memory trick: “Variance is squared, SD wears the clothes”

Variance has the squared units (kg², s², m²) — it’s the “naked” version. The standard deviation puts the original units back on (kg, s, m) by square-rooting. That’s why SD is the one you actually quote in answers.

How to find them on your GDC

Enter the data into a list (List 1 or L1, depending on your model).
Run “1-Var Stats” or “Statistics Calculation” on that list.
Look for σ_x (or “stdDev”) — that’s your standard deviation.
Square it to get the variance, σ²_x.
Some calculators show variance directly — if they do, just read it off.

There’s also a thing called “sample standard deviation” (s_x) which divides by n − 1 instead of n. For IB AA SL, always use σ (population standard deviation) unless the question says otherwise. They’re slightly different numbers!
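If you want to see how far apart the two versions are, Python’s standard `statistics` module has both: `pstdev` divides by n (matching σ_x) and `stdev` divides by n − 1 (matching s_x). A minimal sketch using the data set from the worked examples:

```python
from statistics import pstdev, pvariance, stdev

data = [43, 29, 70, 51, 64, 43]

sigma = pstdev(data)   # population SD, divides by n
s = stdev(data)        # sample SD, divides by n - 1

print(round(sigma, 2))             # 13.76
print(round(s, 2))                 # 15.07
print(round(pvariance(data), 1))   # 189.3, the square of the population SD
```

With only six values the gap is noticeable, so picking the wrong one on the GDC costs marks.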

Which measure should I use?

	Range	IQR	Standard deviation
Quick to calculate?	Yes	Need GDC	Need GDC
Uses every value?	No (only 2)	No (middle 50%)	Yes
Affected by outliers?	Yes (badly)	No	Yes
Same units as data?	Yes	Yes	Yes
Best for…	Quick checks	Skewed data	Symmetric data

Rule of thumb: if the data is roughly symmetric and outlier-free, use standard deviation. If it’s skewed or has outliers, use the IQR. The range is mostly a quick check.

Worked examples

WE 1

Find the range and IQR

Find the range and interquartile range for the data set below.

43, 29, 70, 51, 64, 43

Data: 43, 29, 70, 51, 64, 43

Range
Max − Min: 70 − 29 = 41
Range = 41

IQR
Find Q₁ and Q₃ using the GDC: Q₁ = 43, Q₃ = 64
IQR = Q₃ − Q₁: 64 − 43 = 21
IQR = 21

Note: always use the GDC for quartiles, because different by-hand methods give different answers.

WE 2

Find the variance and standard deviation

Find the variance and standard deviation for the data set below. Give answers to 3 s.f.

43, 29, 70, 51, 64, 43

Data: 43, 29, 70, 51, 64, 43

Use the GDC’s 1-Var Stats; never calculate by hand in an exam.
Enter the data in a list and run 1-Var Stats: σ_x = 13.759… (this is the SD)
Square it for the variance: σ_x² = 13.759…² = 189.333…
Round to 3 s.f.: Variance = 189, SD = 13.8

Note: σ has the same units as the data; σ² has squared units.

WE 3

Compare two data sets using SD

Two students each took 5 maths tests. Their scores are:

Aisha: 78, 80, 82, 79, 81 | Ben: 60, 95, 70, 100, 75

Find the mean and SD for each. Comment on consistency.

Use the GDC for both. Two data sets can share a mean, but the SD shows the real difference.

Aisha
From GDC: mean = 80, σ ≈ 1.41

Ben
From GDC: mean = 80, σ ≈ 15.2

Comment
Same mean (80) but very different spreads. Aisha’s σ is small → consistent scores. Ben’s σ is large → wildly varying scores.
Aisha is more consistent (smaller SD).

Note: two data sets can share a mean but have totally different shapes, which is why SD matters.
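If you want to double-check GDC output for a comparison like this away from the exam, a short Python sketch does the same job. It assumes population SD (`statistics.pstdev`, which divides by n); the dictionary layout and variable names are illustrative only.

```python
from statistics import mean, pstdev

scores = {
    "Aisha": [78, 80, 82, 79, 81],
    "Ben": [60, 95, 70, 100, 75],
}

# Pair each student's mean (centre) with their population SD (spread).
summary = {name: (mean(marks), pstdev(marks)) for name, marks in scores.items()}

for name, (centre, spread) in summary.items():
    print(f"{name}: mean = {centre}, SD = {spread:.3g}")

# The student with the smaller SD is the more consistent one.
most_consistent = min(summary, key=lambda name: summary[name][1])
print(most_consistent)
```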

WE 4

How an outlier affects range vs IQR

The number of books read by 7 students last summer: 3, 4, 5, 6, 7, 8, 9. An eighth student says they read 50 books.

(a) Find the range and IQR for the original 7. (b) Find the new range and IQR with all 8. (c) Comment.

The 50 is a clear outlier. We’ll see how each measure responds.

Part (a)
Original 7: 3, 4, 5, 6, 7, 8, 9
Range = 9 − 3 = 6
From GDC: Q₁ = 4, Q₃ = 8 → IQR = 4
Range = 6, IQR = 4

Part (b)
All 8: 3, 4, 5, 6, 7, 8, 9, 50
Range = 50 − 3 = 47
From GDC: Q₁ = 4.5, Q₃ = 8.5 → IQR = 4
Range = 47, IQR = 4

Part (c)
The range jumped from 6 to 47 (almost 8× bigger!). The IQR didn’t change at all.
The IQR is much more reliable when outliers exist.

Note: the range is affected by outliers; the IQR ignores the top and bottom 25%.

WE 5

Real-world interpretation of SD

The masses of 100 newborn babies have a mean of 3.4 kg and a standard deviation of 0.5 kg. The masses of 100 chocolate bars have a mean of 50 g and a standard deviation of 0.5 g.

(a) Compare the absolute spread. (b) Comment on which data is more “consistent” relative to its mean.

Both have the same SD numerically (0.5), but their scales are completely different.

Part (a)
Babies: SD = 0.5 kg = 500 g
Chocolate: SD = 0.5 g
The babies have 1000× more absolute spread.

Part (b)
Compare SD as a fraction of the mean:
Babies: 0.5 ÷ 3.4 ≈ 0.147 = 14.7%
Chocolate: 0.5 ÷ 50 = 0.01 = 1%
The chocolate bars are much more consistent (only 1% variation).

Note: always interpret SD relative to the size of the data. Context matters!
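The part (b) calculation is just SD divided by mean, with both in the same units. A minimal sketch follows; the function name `relative_spread` is a hypothetical label (this ratio is often called the coefficient of variation, a term you don’t need for AA SL).

```python
def relative_spread(sd, mean):
    """SD as a fraction of the mean; both must be in the same units."""
    return sd / mean

babies = relative_spread(0.5, 3.4)      # both in kg
chocolate = relative_spread(0.5, 50)    # both in g

print(f"{babies:.1%}")      # 14.7%
print(f"{chocolate:.1%}")   # 1.0%
```

Because the units cancel, the two percentages can be compared directly even though one data set is in kg and the other in g.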

💡 Top tips

Use your GDC. Don’t waste time computing variance or SD by hand. Enter data in a list, run 1-Var Stats, and read off everything you need.
Use σ_x not s_x. The IB AA SL syllabus uses population SD (σ), which divides by n. The “sample SD” (s_x) divides by n − 1 — close, but not the same answer.
Square SD to get variance. Variance = (SD)². Many calculators only display SD directly — just square it if asked for variance.
Match units carefully. Range, IQR, and SD all share units with the data. Variance has squared units (e.g. kg²). Don’t put kg² when the question wants kg.
Round to 3 s.f. in the final answer unless told otherwise. But keep more decimals in your working to avoid rounding errors.
For comparing data sets, always quote both a measure of centre and a measure of spread. Mean alone misses half the story.
For skewed data or data with outliers, use the IQR. For symmetric data, use the standard deviation.
“Consistent” or “reliable” in a question usually points to the smaller SD or IQR.

⚠ Common mistakes

Using s_x instead of σ_x. They’re both on your calculator — pick the right one. AA SL wants σ, not s.
Confusing variance with standard deviation. Variance is the squared version. SD is the square root. They are not the same number!
Forgetting to square root. If a question asks for “standard deviation” but you give the variance, you’ve done half the work.
Forgetting to square. If a question asks for “variance” but you give the SD, same problem.
Using the range when there’s a clear outlier. The range is destroyed by extreme values. Use the IQR instead.
Trying to find quartiles by hand. Different methods give different answers — and the IB expects GDC values. Don’t second-guess your calculator.
Quoting variance with the wrong units. If your data is in metres, variance is in m², not m.
Comparing SDs of different-scale data. An SD of 0.5 means very different things for babies’ weights vs. chocolate bars. Always think about relative spread when comparing across contexts.

You can now describe both where the centre of any data set is, and how spread out it is around that centre. The next note covers frequency tables — how to apply all of these formulas when you have lots of repeated values.
