IB Maths AI HL · Statistics Toolkit · Paper 1 & 2
Measures of Central Tendency
Once you’ve collected a dataset, the first thing you’ll want is a single number that captures where the centre is — what a “typical” value looks like. There are three classic measures: the mean, the median, and the mode. All three are called averages, but they don’t always give the same answer, and each has its own strengths. The mean uses every value but gets dragged around by extreme ones; the median ignores the extremes entirely; the mode just picks the most popular value. Picking the right measure for a given dataset is half the skill — calculating it is the other half.
📘 What you need to know
The mean, median, and mode are all measures of central tendency: each describes where the centre of the data is. All three are called averages.
The mode is the value that occurs most often. A dataset can have one mode, more than one (bimodal, multimodal), or no mode at all if every value is unique.
The median is the middle value when the data is sorted. If there are two middle values, take their midpoint.
The mean is the sum of all values divided by the number of values: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The mean can also be written as μ (Greek letter “mu”) — used when describing a whole population.
Units for all three are the same as the original data.
GDC: input the data in statistics mode and the calculator returns the mean, median, sum, and other statistics in one go.
“No mode” ≠ “the mode is 0”. If every value occurs equally often, write “no mode” explicitly.
Outliers affect the mean strongly but not the median or mode — choose the median for skewed data with extreme values.
The three measures at a glance
Mode (most often): Value(s) that appear most frequently. Best for qualitative data and categorical surveys.
Median (middle): The middle value when sorted. Robust to outliers: extreme values don’t move it.
Mean (sum ÷ count): Uses every value. Sensitive to outliers because every extreme value adds to the sum.
Mean (ungrouped data): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$ (in the formula booklet ✓)
[Figure: how the three measures behave for different shapes of data]
For symmetric data with a single peak, all three measures coincide. For skewed data they spread apart: the mean is pulled toward the tail by extreme values, while the median and mode stay closer to the bulk of the data.
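To see this behaviour on actual numbers, here is a small Python sketch using the standard `statistics` module. Both datasets are made up purely for illustration and do not come from the worked examples; one is symmetric with a single peak, the other has a long right tail.

```python
import statistics

# Illustrative (invented) datasets, chosen only to show the effect of shape.
symmetric = [1, 2, 3, 3, 3, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

def centre(values):
    """Return (mode, median, mean) for a dataset with a single mode."""
    return (
        statistics.mode(values),
        statistics.median(values),
        statistics.mean(values),
    )

sym_mode, sym_median, sym_mean = centre(symmetric)
skew_mode, skew_median, skew_mean = centre(right_skewed)

print("symmetric   :", sym_mode, sym_median, sym_mean)
print("right-skewed:", skew_mode, skew_median, skew_mean)
```

In the symmetric set the three measures agree. In the right-skewed set they separate in the order mode, median, mean, with the mean furthest toward the tail.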
Calculating each measure for ungrouped data
The mode
The value (or values) that appears most often. If every value occurs exactly once, write “no mode” — don’t write “mode = 0”. A dataset can be bimodal (two modes) or multimodal (three or more) if several values tie for the highest frequency.
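The counting rule can be sketched in Python as below. The function name `find_modes` is an illustrative choice, and the sketch assumes a non-empty dataset. It returns an empty list to stand for "no mode" whenever every distinct value occurs equally often.

```python
from collections import Counter

def find_modes(values):
    """Return the sorted list of modes, or [] when there is no mode.

    Assumes `values` is non-empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    # If there are several distinct values and they all tie, report "no mode".
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    # Otherwise every value that reaches the highest count is a mode.
    return sorted(v for v, c in counts.items() if c == highest)

print(find_modes([42, 28, 67, 51, 64, 42]))      # one mode
print(find_modes([4, 7, 9, 7, 12, 4, 15, 10]))   # bimodal
print(find_modes([3, 8, 5]))                     # no mode
```

Returning a list rather than a single number keeps the bimodal and "no mode" cases explicit, so an empty result is never confused with a mode of 0.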
The median
Sort the data from smallest to largest. The median is the middle value. If n is odd, it’s the $\left(\frac{n+1}{2}\right)$-th value. If n is even, take the midpoint of the two middle values.
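The position rule translates directly into code. This is a minimal sketch, with `median_by_position` as an illustrative name and a non-empty dataset assumed. The only subtlety is that Python counts positions from 0 while the rule above counts from 1.

```python
def median_by_position(values):
    """Median found by sorting and picking the middle position(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2-th value; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even n: midpoint of the (n / 2)-th and (n / 2 + 1)-th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median_by_position([42, 28, 67, 51, 64, 42]))         # even n
print(median_by_position([30, 32, 35, 33, 31, 34, 200]))    # odd n
```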
The mean
Add up every value and divide by how many there are. The GDC does this instantly in statistics mode — but by hand, use the formula above.
🧭 Recipe — finding all three for an ungrouped dataset
Sort the data in order of size. Almost all of the work is easier with sorted data.
Mode: count how often each value appears. The value(s) with the highest count = mode.
Median: the middle value (or midpoint of two middle values for even n).
Mean: sum all values, divide by n.
Always check: each measure should sit within the range of the data — if not, you’ve slipped somewhere.
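The recipe above can be sketched with Python's standard `statistics` module, which plays a role similar to the GDC's statistics mode. The function name `summarise` and the dictionary layout are illustrative assumptions, and the dataset is assumed to be non-empty. Note that `statistics.multimode` returns every value when all values tie, so the sketch converts that case to "no mode" (an empty list).

```python
import statistics

def summarise(values):
    """Follow the recipe: sort, then find mode(s), median, and mean."""
    ordered = sorted(values)                    # Step 1: sort

    modes = statistics.multimode(ordered)       # Step 2: value(s) with highest count
    if len(modes) > 1 and len(modes) == len(set(ordered)):
        modes = []                              # every value ties: no mode

    median = statistics.median(ordered)         # Step 3: middle value or midpoint
    mean = statistics.mean(ordered)             # Step 4: sum divided by n

    # Final check: median and mean must lie within the range of the data.
    low, high = ordered[0], ordered[-1]
    assert low <= median <= high and low <= mean <= high

    return {"modes": modes, "median": median, "mean": mean}

quiz_scores = [12, 15, 11, 18, 14, 15, 13, 17, 15, 10]
print(summarise(quiz_scores))
```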
🤔 Why the mean is sensitive to outliers
The mean uses every value in the sum, so a single huge value affects it. Take salaries (in £000s) of { 30, 32, 35, 33, 31, 34 }, which have a mean of 32.5 and a median of 32.5. Adding a single director’s salary of 200 jumps the mean to about 56, even though the typical worker still earns about 32. The median, in contrast, barely moves: it shifts from 32.5 to 33, because it depends only on which value sits in the middle position, not on how large the extreme value is.
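The salary example can be checked with a few lines of Python, comparing both measures before and after the director's salary is added. The variable names are illustrative; the figures are the ones used in the text, in £000s.

```python
import statistics

salaries = [30, 32, 35, 33, 31, 34]      # in £000s
with_director = salaries + [200]         # add the single extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_director)
median_after = statistics.median(with_director)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"median: {median_before:.1f} -> {median_after:.1f}")
```

The mean moves by roughly 24 (thousand pounds), while the median moves by only 0.5.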
Worked examples
WE 1
Find the mode, median, and mean
Find the mode, median, and mean for the data set below.
42 28 67 51 64 42
Step 1: sort the data
28, 42, 42, 51, 64, 67
Step 2: mode = most common
42 appears twice; all others once
mode = 42
Step 3: median = middle value
n = 6 (even) → midpoint of 3rd and 4th
= (42 + 51) / 2 = 46.5
median = 46.5
Step 4: mean = Σx / n
Σx = 42+28+67+51+64+42 = 294
mean = 294 / 6 = 49
mean = 49
All three lie between 28 and 67, so the sanity check passes.
WE 2
Bimodal data with an even count
Find the mode(s), median, and mean of:
4 7 9 7 12 4 15 10
Step 1: sort the data
4, 4, 7, 7, 9, 10, 12, 15
Step 2: mode
both 4 and 7 appear twice
modes = 4 and 7 (bimodal)
Step 3: median
n = 8 → midpoint of 4th and 5th
= (7 + 9) / 2 = 8
median = 8
Step 4: mean
Σx = 4+7+9+7+12+4+15+10 = 68
mean = 68 / 8 = 8.5
mean = 8.5
When several values tie for highest frequency, list all of them as modes.
Ten students took a quiz marked out of 20. Their scores are:
12 15 11 18 14 15 13 17 15 10
Calculate the mode, median, and mean.
Step 1: sort the data
10, 11, 12, 13, 14, 15, 15, 15, 17, 18
Step 2: mode
15 appears 3 times, the highest count
mode = 15
Step 3: median (n = 10, even)
midpoint of 5th and 6th: (14 + 15) / 2
median = 14.5
Step 4: mean
Σx = 12+15+11+18+14+15+13+17+15+10 = 140
mean = 140 / 10 = 14
mean = 14
A slight gap between mean (14) and median (14.5) is normal for skew-free real data.
WE 5
Outlier effect — choose the right measure
The annual salaries (in £000s) of 7 employees at a small company are:
30, 32, 35, 33, 31, 34, 200
(a) Calculate the mean and the median.
(b) Which is a better measure of a “typical” salary, and why?
(a) sort the data
30, 31, 32, 33, 34, 35, 200
mean
Σx = 30+31+32+33+34+35+200 = 395
mean = 395 / 7 ≈ 56.4 (£000s)
median (n = 7, odd) → 4th value
mean ≈ £56 400; median = £33 000
(b) which is “typical”?
£200 000 is a clear outlier, likely the director
it pulls the mean up to a value no employee actually earns
(b) the median (£33 000): it isn’t distorted by the outlier
Use the median when data is skewed or has obvious outliers.
WE 6
Find a missing value given the mean
A dataset of 5 numbers has a mean of 12. Four of the numbers are 8, 11, 15, and 14. Find the fifth number.
Step 1: use mean = Σx / n backwards
Σx = mean × n = 12 × 5 = 60
Step 2: sum the known values
8 + 11 + 15 + 14 = 48
Step 3: subtract from the total
fifth value = 60 − 48 = 12
fifth number = 12
verify
(8+11+15+14+12) / 5 = 60 / 5 = 12 ✓
“Backwards mean” problems are common in IB: always recover Σx first.
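The same three steps can be written as a short Python sketch. The function name `missing_value` and its parameters are illustrative assumptions; it handles only the case where exactly one value is unknown.

```python
def missing_value(known_values, target_mean, n):
    """Recover the one unknown value from the mean and the other n - 1 values."""
    total = target_mean * n            # Step 1: Σx = mean × n
    known_sum = sum(known_values)      # Step 2: sum the known values
    return total - known_sum           # Step 3: subtract from the total

fifth = missing_value([8, 11, 15, 14], target_mean=12, n=5)
print(fifth)

# Verify by recomputing the mean with the recovered value included.
print((8 + 11 + 15 + 14 + fifth) / 5)
```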
💡 Top tips
Always sort first. Mode is easier to spot, median is the middle entry, and you’ll catch typing errors.
For mean, calculate Σx separately and divide at the end — keeps the workings tidy and easier to check.
Check your answer is within the data range: every measure of central tendency must lie between the smallest and largest values.
Choose by context: mean for symmetric data without outliers, median for skewed data, mode for qualitative or categorical data.
For “find a missing value” problems: rearrange mean = Σx/n to Σx = mean × n.
⚠ Common mistakes
Forgetting to sort before finding the median. The median is the middle of the sorted data, not the middle of the original list.
Writing “mode = 0” when every value is unique. The correct answer is no mode.
Missing a mode when more than one value ties. A dataset can be bimodal or multimodal — list all of them.
Dividing by the wrong n: n is the number of values, not the largest value.
Treating mean as “the right answer” for skewed data. With outliers, the median is usually the better summary.
Next up: Measures of Dispersion. The centre of the data is only half the story. The next set of statistics (range, interquartile range, variance, and standard deviation) describes how spread out the data is around the centre. Two datasets can share the same mean but feel completely different depending on their spread.