IB Maths AI HL Statistics Toolkit Paper 1 & 2

Measures of Central Tendency

Once you’ve collected a dataset, the first thing you’ll want is a single number that captures where the centre is — what a “typical” value looks like. There are three classic measures: the mean, the median, and the mode. All three are called averages, but they don’t always give the same answer, and each has its own strengths. The mean uses every value but gets dragged around by extreme ones; the median ignores the extremes entirely; the mode just picks the most popular value. Picking the right measure for a given dataset is half the skill — calculating it is the other half.

📘 What you need to know

The three measures at a glance

Mode (most often): the value(s) that appear most frequently. Best for qualitative data and categorical surveys.
Median (middle): the middle value when the data is sorted. Robust to outliers, because extreme values don’t move it.
Mean (sum ÷ count): uses every value. Sensitive to outliers, because every extreme value adds to the sum.
Mean (ungrouped data): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$ (in the formula booklet ✓)
[Figure: how the three measures behave for different shapes of data. Symmetric distribution: mean = median = mode. Right-skewed distribution: mode < median < mean; the tail “pulls” the mean toward it.]
For symmetric data, all three measures coincide. For skewed data they spread apart — the mean is pulled toward the tail by extreme values, while the median and mode stay closer to the bulk of the data.

Calculating each measure for ungrouped data

The mode

The value (or values) that appears most often. If every value occurs exactly once, write “no mode” — don’t write “mode = 0”. A dataset can be bimodal (two modes) or multimodal (three or more) if several values tie for the highest frequency.
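The rule above can be sketched in Python, assuming the data is a plain list of numbers. The helper name `find_modes` is hypothetical (it is not a GDC or syllabus function), and returning an empty list to stand for “no mode” is a choice made for this sketch.

```python
from collections import Counter


def find_modes(data):
    """Return a sorted list of modes; an empty list stands for "no mode"."""
    if not data:
        return []
    counts = Counter(data)            # how often each value appears
    highest = max(counts.values())    # the highest frequency in the data
    if highest == 1:
        # Every value occurs exactly once, so there is no mode.
        return []
    # Keep every value that ties for the highest frequency.
    return sorted(value for value, count in counts.items() if count == highest)


print(find_modes([42, 28, 67, 51, 64, 42]))      # one mode
print(find_modes([4, 7, 9, 7, 12, 4, 15, 10]))   # two modes (bimodal)
print(find_modes([8, 12, 3, 17, 25]))            # no mode
```

Returning a list rather than a single number keeps bimodal and multimodal data from being silently reduced to one value.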

The median

Sort the data from smallest to largest. The median is the middle value. If n is odd, it’s the ((n + 1)/2)th value. If n is even, take the midpoint of the two middle values, which are the (n/2)th and (n/2 + 1)th values.
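A minimal sketch of the odd and even cases in Python, assuming a non-empty list of numbers. The name `find_median` is hypothetical. Python counts positions from 0, so the position used in the code is one less than the position you would count by hand.

```python
def find_median(data):
    """Median of a non-empty list of numbers, found by sorting."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1)/2)th value, which is index n // 2 counting from 0.
        return ordered[middle]
    # Even n: midpoint of the (n/2)th and (n/2 + 1)th values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(find_median([8, 12, 3, 17, 25]))         # odd n
print(find_median([42, 28, 67, 51, 64, 42]))   # even n
```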

The mean

Add up every value and divide by how many there are. The GDC does this instantly in statistics mode — but by hand, use the formula above.

🧭 Recipe — finding all three for an ungrouped dataset

  1. Sort the data in order of size. Almost all of the work is easier with sorted data.
  2. Mode: count how often each value appears. The value(s) with the highest count = mode.
  3. Median: the middle value (or midpoint of two middle values for even n).
  4. Mean: sum all values, divide by n.
  5. Always check: each measure should sit within the range of the data — if not, you’ve slipped somewhere.
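
The five steps above can be sketched with Python’s standard `statistics` module. The function name `summarise` and the dictionary it returns are assumptions made for this illustration. One detail to watch: `statistics.multimode` lists every value when all of them occur once, so the sketch converts that case to an empty list to match the “no mode” convention.

```python
import statistics


def summarise(data):
    """Follow the five-step recipe for a non-empty list of numbers."""
    ordered = sorted(data)                     # Step 1: sort

    modes = statistics.multimode(ordered)      # Step 2: value(s) with the highest count
    if len(modes) == len(ordered):
        modes = []                             # every value occurs once, so "no mode"

    median = statistics.median(ordered)        # Step 3: middle value or midpoint
    mean = statistics.mean(ordered)            # Step 4: sum divided by n

    # Step 5: every measure must lie between the smallest and largest value.
    low, high = ordered[0], ordered[-1]
    for measure in [*modes, median, mean]:
        if not low <= measure <= high:
            raise ValueError("measure outside the range of the data")

    return {"modes": modes, "median": median, "mean": mean}


print(summarise([12, 15, 11, 18, 14, 15, 13, 17, 15, 10]))
```

This is a way to check hand working, not a replacement for it: in the exam the same numbers come from the GDC’s statistics mode.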

🤔 Why the mean is sensitive to outliers

The mean uses every value in the sum, so a single huge value shifts it. Take the salaries (in £000s) { 30, 32, 35, 33, 31, 34 }, which have mean 32.5. Adding a single director’s salary of 200 (that is, £200 000) jumps the mean to about 56.4, even though the typical worker still earns about 32. The median, in contrast, barely moves: it goes from 32.5 to 33, because it depends only on which values sit in the middle position, not on how large the extreme value is.
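
A short Python check of this effect, using the salary figures from the paragraph above (in £000s). The variable names are chosen for this sketch only.

```python
import statistics

staff = [30, 32, 35, 33, 31, 34]    # salaries in £000s, without the director
with_director = staff + [200]       # add the single extreme value

# How far each measure moves when the outlier is added.
mean_shift = statistics.mean(with_director) - statistics.mean(staff)
median_shift = statistics.median(with_director) - statistics.median(staff)

print(f"mean:   {statistics.mean(staff):.1f} -> {statistics.mean(with_director):.1f}")
print(f"median: {statistics.median(staff):.1f} -> {statistics.median(with_director):.1f}")
print(f"mean moved by {mean_shift:.1f}, median moved by {median_shift:.1f}")
```

The mean moves by roughly 24 (thousand pounds) while the median moves by only 0.5, which is the sense in which the median is robust to outliers.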

Worked examples

WE 1

Find the mode, median, and mean

Find the mode, median, and mean for the data set below.

42    28    67    51    64    42

Step 1: sort the data
28, 42, 42, 51, 64, 67
Step 2: mode = most common
42 appears twice; all others once
mode = 42
Step 3: median = middle value
n = 6 (even) → midpoint of 3rd and 4th = (42 + 51) / 2 = 46.5
median = 46.5
Step 4: mean = Σx / n
Σx = 42 + 28 + 67 + 51 + 64 + 42 = 294
mean = 294 / 6 = 49
Check: all three lie between 28 and 67, so the sanity check passes.

WE 2

Bimodal data with an even count

Find the mode(s), median, and mean of:

4    7    9    7    12    4    15    10

Step 1: sort the data
4, 4, 7, 7, 9, 10, 12, 15
Step 2: mode
both 4 and 7 appear twice
modes = 4 and 7 (bimodal)
Step 3: median
n = 8 (even) → midpoint of 4th and 5th = (7 + 9) / 2 = 8
median = 8
Step 4: mean
Σx = 4 + 7 + 9 + 7 + 12 + 4 + 15 + 10 = 68
mean = 68 / 8 = 8.5
Note: when several values tie for the highest frequency, list all of them as modes.

WE 3

No mode, odd count

Find the mode, median, and mean of:

8    12    3    17    25

Step 1: sort the data
3, 8, 12, 17, 25
Step 2: mode
every value occurs exactly once
no mode
Step 3: median
n = 5 (odd) → 3rd value
median = 12
Step 4: mean
Σx = 3 + 8 + 12 + 17 + 25 = 65
mean = 65 / 5 = 13
Note: write “no mode”; never write “mode = 0”.

WE 4

Real-world — quiz scores

Ten students took a quiz marked out of 20. Their scores are:

12   15   11   18   14   15   13   17   15   10

Calculate the mode, median, and mean.

Step 1: sort the data
10, 11, 12, 13, 14, 15, 15, 15, 17, 18
Step 2: mode
15 appears 3 times, the highest count
mode = 15
Step 3: median
n = 10 (even) → midpoint of 5th and 6th = (14 + 15) / 2 = 14.5
median = 14.5
Step 4: mean
Σx = 12 + 15 + 11 + 18 + 14 + 15 + 13 + 17 + 15 + 10 = 140
mean = 140 / 10 = 14
Note: a small gap between the mean (14) and the median (14.5) is normal for real data, even when it is roughly symmetric.

WE 5

Outlier effect — choose the right measure

The annual salaries (in £000s) of 7 employees at a small company are:

30,   32,   35,   33,   31,   34,   200

(a) Calculate the mean and the median.
(b) Which is a better measure of a “typical” salary, and why?

(a) Sort the data: 30, 31, 32, 33, 34, 35, 200
Mean: Σx = 30 + 31 + 32 + 33 + 34 + 35 + 200 = 395, so mean = 395 / 7 ≈ 56.4 (£000s)
Median: n = 7 (odd) → 4th value = 33 (£000s)
mean ≈ £56 400; median = £33 000
(b) Which is “typical”? £200 000 is a clear outlier, likely the director. It pulls the mean up to a value no employee actually earns.
Answer: the median (£33 000), because it isn’t distorted by the outlier.
Note: use the median when data is skewed or has obvious outliers.

WE 6

Find a missing value given the mean

A dataset of 5 numbers has a mean of 12. Four of the numbers are 8, 11, 15, and 14. Find the fifth number.

Step 1: use mean = Σx / n backwards
Σx = mean × n = 12 × 5 = 60
Step 2: sum the known values
8 + 11 + 15 + 14 = 48
Step 3: subtract from the total
fifth value = 60 − 48 = 12
fifth number = 12
Verify: (8 + 11 + 15 + 14 + 12) / 5 = 60 / 5 = 12 ✓
Note: “backwards mean” problems are common in IB, so always recover Σx first.
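
The “recover Σx first” idea can be sketched in Python as follows. The function name `missing_value` and its parameters are hypothetical, and the sketch assumes exactly one value is unknown.

```python
def missing_value(known_values, mean, n):
    """Recover the one unknown value in a dataset of size n with the given mean."""
    if len(known_values) != n - 1:
        raise ValueError("exactly one value must be missing")
    total = mean * n                    # Step 1: recover the sum from the mean
    return total - sum(known_values)    # Steps 2 and 3: subtract the known sum


fifth = missing_value([8, 11, 15, 14], mean=12, n=5)
print(fifth)

# Verify by putting the value back and recomputing the mean.
print((8 + 11 + 15 + 14 + fifth) / 5)
```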

💡 Top tips

⚠ Common mistakes

Next up: Measures of Dispersion. The centre of the data is only half the story. The next set of statistics (range, interquartile range, variance, and standard deviation) describes how spread out the data is around the centre. Two datasets can share the same mean but feel completely different depending on their spread.
