IB Maths AA SL | Topic 4: Statistics | Toolkit | Paper 1 & 2

Measures of Dispersion

Two data sets can have exactly the same mean but look completely different, one tightly bunched and the other wildly spread. Dispersion is the second half of the picture: how spread out the data is. This note covers four measures: range, IQR, variance, and standard deviation.

What you need to know

Why “spread” matters

Imagine two students, both with a mean test score of 70. Student A scored 68, 70, 72, 70, 70: really consistent. Student B scored 30, 50, 70, 90, 110: wild swings. Same mean, totally different stories.

Dispersion measures answer the question: “how spread out is the data?” A small dispersion means the data is tightly bunched around the centre. A big dispersion means the data is all over the place.

RANGE

Range = Max − Min
Difference between the biggest and smallest values. Quick but rough.
Same units as the data

INTERQUARTILE RANGE

IQR = Q3 − Q1
Spread of the middle 50% of the data. Ignores outliers.
Same units as the data

VARIANCE

σ²
Mean of the squared distances from the mean. Never negative (zero only when every value is the same).
Squared units (a bit awkward)

STANDARD DEVIATION

σ = √variance
Square root of the variance. The most useful measure of spread.
Same units as the data

If you’re stuck on which measure to use: standard deviation is the safest default. It uses every value, has the right units, and is what most exam questions expect.

The range: quickest but roughest

The range is the difference between the largest and smallest values. That’s it.

Range
Range = Maximum − Minimum

Example: the data 4, 7, 2, 9, 5 has max = 9 and min = 2, so range = 7.

Why is the range “rough”?

The range only looks at the two most extreme values and ignores everything else. If your data has a single outlier (say, one freakishly tall person in a sample of average heights), the range jumps massively, even though the rest of the data is perfectly normal.

Quartiles and the IQR

Quartiles are three special values that split your sorted data into four equal-sized groups. Each group holds 25% of the data.

[Figure: quartiles split the sorted data into four equal pieces. A line is marked Min, Q1, Q2 (median), Q3, Max; each section holds 25% of the data, and a bracket from Q1 to Q3 marks the middle 50%, whose width is the IQR.]

The IQR formula

Interquartile range
IQR = Q3 − Q1

The IQR measures how spread out the middle 50% of the data is. Because it ignores the lowest 25% and the highest 25%, it isn’t affected by outliers, and that’s why it’s so useful.

Why is the IQR better than the range for skewed data?

Imagine 99 people earning around $50k per year, plus one billionaire earning $5 billion. The range would be about $5 billion, a meaningless number for describing the typical spread. But the IQR would still be a sensible figure (maybe $20k), because it ignores the top 25% and the bottom 25%.

That’s why income distributions are usually summarised with quartiles and percentiles rather than the range.

Always use your GDC for quartiles

Different methods of finding quartiles by hand give slightly different answers. The IB expects you to use your calculator’s “1-Var Stats” function, which gives you Q1, Q2, Q3, the mean, and the standard deviation all in one go.
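To see how much the method matters, here is a minimal Python sketch comparing three quartile conventions on the data set used in the worked examples. The helper `quartiles_median_of_halves` is a hypothetical name written for this illustration; it implements the "median of each half" rule, which happens to reproduce the quartiles quoted in this note, though it is an assumption that your particular GDC uses the same rule. The other two results come from the `method` options of Python's standard `statistics.quantiles`.

```python
from statistics import median, quantiles


def quartiles_median_of_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    With an odd number of values, the middle value is left out of both halves.
    """
    s = sorted(values)
    if len(s) < 2:
        raise ValueError("need at least two values")
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])


data = [43, 29, 70, 51, 64, 43]

q1, q3 = quartiles_median_of_halves(data)
print("median of halves:", q1, q3)

# Two interpolation-based conventions from the standard library.
for method in ("exclusive", "inclusive"):
    lib_q1, _, lib_q3 = quantiles(data, n=4, method=method)
    print(method + ":", lib_q1, lib_q3)
```

All three conventions are legitimate, yet they disagree on this small data set, which is why it is safest to quote whatever your GDC reports.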

Variance and standard deviation

The range and IQR only use a few values from your data set. The standard deviation uses every value, and that’s what makes it powerful.

The idea: for each data point, find how far it is from the mean (this is the “deviation”). Square those deviations, average them, then take the square root. Big deviations mean the data is spread far from the centre; small deviations mean it is clustered tightly.

The formulas (don’t memorise them; use your GDC)

Variance (σ²)
σ² = Σ(xᵢ − μ)² / n
Standard deviation (σ)
σ = √variance = √( Σ(xᵢ − μ)² / n )

Here μ is the mean of the data and n is the number of data values.

Why square the differences?

If you just took (xᵢ − μ) and averaged it, the positive and negative differences would cancel out, giving you 0 every time. Squaring kills the negatives, so all the deviations contribute properly.

The downside? Squared units are weird (m² when you wanted m). That’s why we square-root at the end to get the standard deviation back into normal units.
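For anyone who wants to see the formula at work outside the exam, the following Python sketch goes through it step by step on the data set from the worked examples. The variable names are illustrative only; in an exam you would still read these values off the GDC.

```python
from math import sqrt

data = [43, 29, 70, 51, 64, 43]  # data set from the worked examples

n = len(data)
mu = sum(data) / n                       # mean
deviations = [x - mu for x in data]      # (x_i - mu) for each value
squared = [d ** 2 for d in deviations]   # squaring removes the signs

variance = sum(squared) / n              # divide by n (population form)
sd = sqrt(variance)                      # back to the original units

print("sum of raw deviations:", sum(deviations))
print("variance:", round(variance, 3))
print("standard deviation:", round(sd, 3))
```

The raw deviations add up to zero, which is exactly the cancellation problem described above, while the squared deviations give a variance of about 189 and a standard deviation of about 13.8.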


Memory trick: “Variance is squared, SD wears the clothes”

Variance has the squared units (kg², s², m²), so it’s the “naked” version. The standard deviation puts the original units back on (kg, s, m) by square-rooting. That’s why SD is the one you actually quote in answers.

How to find them on your GDC

  1. Enter the data into a list (List 1 or L1, depending on your model).
  2. Run “1-Var Stats” or “Statistics Calculation” on that list.
  3. Look for σx (or “stdDev”): that’s your standard deviation.
  4. Square it to get the variance, σx².
  5. Some calculators show variance directly; if they do, just read it off.

There’s also a thing called “sample standard deviation” (sx) which divides by n − 1 instead of n. For IB AA SL, always use σ (population standard deviation) unless the question says otherwise. They’re slightly different numbers!
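The gap between the two versions can be seen with Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n − 1. This is only a sketch for comparison on the worked-example data; matching `pstdev` to σx and `stdev` to sx follows from the definitions above.

```python
from statistics import pstdev, stdev

data = [43, 29, 70, 51, 64, 43]  # data set from the worked examples

sigma = pstdev(data)  # population SD: divides by n
s = stdev(data)       # sample SD: divides by n - 1

print("population SD:", round(sigma, 2))
print("sample SD:    ", round(s, 2))
```

Because it divides by a smaller number, the sample version always comes out larger than the population version unless every value is identical, so reading the wrong line on the GDC changes the answer.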

Which measure should I use?

Criterion | Range | IQR | Standard deviation
Quick to calculate? | Yes | Need GDC | Need GDC
Uses every value? | No (only 2) | No (middle 50%) | Yes
Affected by outliers? | Yes (badly) | No | Yes
Same units as data? | Yes | Yes | Yes
Best for… | Quick checks | Skewed data | Symmetric data

Rule of thumb: if the data is roughly symmetric and outlier-free, use standard deviation. If it’s skewed or has outliers, use the IQR. The range is mostly a quick check.

Worked examples

WE 1

Find the range and IQR

Find the range and interquartile range for the data set below.

43,   29,   70,   51,   64,   43

Data: 43, 29, 70, 51, 64, 43
Range: Max − Min = 70 − 29 = 41
Range = 41
IQR: find Q1 and Q3 using the GDC: Q1 = 43, Q3 = 64
IQR = Q3 − Q1 = 64 − 43 = 21
IQR = 21
Note: always use the GDC for quartiles, because different by-hand methods give different answers!

WE 2

Find the variance and standard deviation

Find the variance and standard deviation for the data set below. Give answers to 3 s.f.

43,   29,   70,   51,   64,   43

Data: 43, 29, 70, 51, 64, 43
Use the GDC’s 1-Var Stats; never calculate by hand in an exam!
Enter the data in a list and run 1-Var Stats: σx = 13.759… (this is the SD)
Square it for the variance: σx² = 13.759…² = 189.333…
Round to 3 s.f.: Variance = 189, SD = 13.8
Note: σ has the same units as the data; σ² has squared units.

WE 3

Compare two data sets using SD

Two students each took 5 maths tests. Their scores are:

Aisha: 78, 80, 82, 79, 81    |    Ben: 60, 95, 70, 100, 75

Find the mean and SD for each. Comment on consistency.

Use the GDC for both. Two data sets can share a mean, but the SD shows the real differences.
Aisha, from GDC: mean = 80, σ ≈ 1.41
Ben, from GDC: mean = 80, σ ≈ 15.2
Comment: same mean (80) but very different spreads. Aisha’s σ is small → consistent scores. Ben’s σ is large → wildly varying scores.
Aisha is more consistent (smaller SD).
Note: two data sets can share a mean but have totally different shapes, and that’s why SD matters.
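A quick way to check figures like these away from the calculator is a short Python sketch using the standard `statistics` module. It assumes the population standard deviation (`pstdev`, dividing by n) is the one wanted, as this note recommends; the dictionary layout is just one convenient choice.

```python
from statistics import mean, pstdev

scores = {
    "Aisha": [78, 80, 82, 79, 81],
    "Ben": [60, 95, 70, 100, 75],
}

# Mean and population SD for each student.
summary = {name: (mean(vals), pstdev(vals)) for name, vals in scores.items()}

for name, (m, sd) in summary.items():
    print(f"{name}: mean = {m}, SD = {sd:.3g}")

# The smaller SD marks the more consistent student.
most_consistent = min(summary, key=lambda name: summary[name][1])
print("More consistent:", most_consistent)
```

For Ben the squared deviations sum to 1150, so the variance is 230 and the standard deviation is √230 ≈ 15.2.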

WE 4

How an outlier affects range vs IQR

The number of books read by 7 students last summer: 3, 4, 5, 6, 7, 8, 9. An eighth student says they read 50 books.

(a) Find the range and IQR for the original 7.   (b) Find the new range and IQR with all 8.   (c) Comment.

The 50 is a clear outlier. We’ll see how each measure responds.
Part (a): original 7 values: 3, 4, 5, 6, 7, 8, 9
Range = 9 − 3 = 6
From GDC: Q1 = 4, Q3 = 8 → IQR = 4
Range = 6, IQR = 4
Part (b): all 8 values: 3, 4, 5, 6, 7, 8, 9, 50
Range = 50 − 3 = 47
From GDC: Q1 = 4.5, Q3 = 8.5 → IQR = 4
Range = 47, IQR = 4
Part (c): the range jumped from 6 to 47 (almost 8× bigger!). The IQR didn’t change at all.
The IQR is much more reliable when outliers exist.
Note: the range is affected by outliers; the IQR ignores the top and bottom 25%.
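The same comparison can be sketched in Python. The functions `data_range` and `iqr` are hypothetical helpers written for this illustration, and `iqr` assumes the "median of each half" quartile rule, which matches the Q1 and Q3 values quoted in this solution but may not be the rule every calculator uses.

```python
from statistics import median


def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def iqr(values):
    """Q3 - Q1, taking Q1 and Q3 as the medians of the lower and upper halves."""
    s = sorted(values)
    half = len(s) // 2  # for an odd count, the middle value is left out
    return median(s[-half:]) - median(s[:half])


original = [3, 4, 5, 6, 7, 8, 9]
with_outlier = original + [50]

for label, books in (("original 7", original), ("all 8", with_outlier)):
    print(f"{label}: range = {data_range(books)}, IQR = {iqr(books)}")
```

Adding the single value 50 moves only the maximum, so the range changes a lot while the two half-medians barely shift.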

WE 5

Real-world interpretation of SD

The masses of 100 newborn babies have a mean of 3.4 kg and a standard deviation of 0.5 kg. The masses of 100 chocolate bars have a mean of 50 g and a standard deviation of 0.5 g.

(a) Compare the absolute spread.   (b) Comment on which data is more “consistent” relative to its mean.

Both have the same SD numerically (0.5), but their scales are completely different.
Part (a): Babies: SD = 0.5 kg = 500 g. Chocolate: SD = 0.5 g.
Babies have 1000× more absolute spread.
Part (b): compare SD as a fraction of the mean.
Babies: 0.5 ÷ 3.4 ≈ 0.147 = 14.7%
Chocolate: 0.5 ÷ 50 = 0.01 = 1%
Chocolate is much more consistent (only 1% variation).
Note: always interpret SD relative to the size of the data. Context matters!
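The "SD as a fraction of the mean" comparison from WE 5 is often called the coefficient of variation, a name this note does not use. A minimal Python sketch, with the hypothetical helper `relative_spread`, shows the arithmetic; each SD and mean must be in the same units before dividing.

```python
def relative_spread(sd, mean):
    """Standard deviation as a fraction of the mean (same units for both)."""
    return sd / mean


babies = relative_spread(0.5, 3.4)     # both in kg
chocolate = relative_spread(0.5, 50)   # both in g

print(f"Babies:    {babies:.1%}")
print(f"Chocolate: {chocolate:.1%}")
```

Because the units cancel in the division, the two percentages can be compared directly even though one data set is in kilograms and the other in grams.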

You can now describe both where the centre of any data set is, and how spread out it is around that centre. The next note covers frequency tables: how to apply all of these formulas when you have lots of repeated values.
