IB Maths AI SL · Topic 4: Statistics Toolkit · Paper 1 & 2 · 1.5 × IQR rule
Outliers
An outlier is a data value that lies far from the rest of the data. The IB uses a strict definition: a value is an outlier if it lies more than 1.5 × IQR below Q1 or more than 1.5 × IQR above Q3. Whether to keep or remove an outlier depends on whether it is a genuine extreme value or a recording error.
What you need to know
Definition: a value x is an outlier if x < Q1 − 1.5 IQR OR x > Q3 + 1.5 IQR.
Fences: the two boundary values Q1 − 1.5 IQR (lower) and Q3 + 1.5 IQR (upper). Anything beyond a fence is an outlier.
Boundary values are NOT outliers: the rule says more than 1.5 IQR, not “equal or more”. A value exactly on a fence is kept.
Remove if it is an error: a typo, double-counted entry, or wrong unit — remove and recompute.
Keep if it is genuine: a real extreme value (mansion in a row of houses, prodigy in a class) — keep but flag it.
On a box plot: outliers are drawn as separate × marks, and the whisker stops at the most extreme non-outlier.
The 1.5 × IQR rule visualised
Picture the box plot. The box covers the middle 50% of the data (Q1 to Q3). Then extend invisible “fences” 1.5 IQRs further out on each side. Anything that escapes those fences is an outlier.
The box covers Q1 to Q3 (IQR = 12). Orange fences sit 1.5 IQR beyond each quartile: 40 − 18 = 22 and 52 + 18 = 70. The whiskers end at 35 and 55 (the most extreme non-outliers), and the values 12 and 120 fall beyond the fences — both flagged as outliers (red ×).
The outlier rule
x is an outlier ⇔ x < Q1 − 1.5 IQR or x > Q3 + 1.5 IQR
where IQR = Q3 − Q1
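The rule can be sketched in a few lines of Python. This is a minimal illustration, not an exam method; the function names `fences` and `is_outlier` are hypothetical and chosen only for this example.

```python
def fences(q1, q3):
    """Return the (lower, upper) fences for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True only if x lies strictly beyond one of the fences."""
    lower, upper = fences(q1, q3)
    return x < lower or x > upper


# Quartiles taken from the box plot described above: Q1 = 40, Q3 = 52
print(fences(40, 52))          # (22.0, 70.0)
print(is_outlier(12, 40, 52))  # True
print(is_outlier(22, 40, 52))  # False: exactly on the lower fence
```

The comparisons are strict (`<` and `>`), so a value sitting exactly on a fence is not flagged.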
Recipe: identify outliers
Order the data and find Q1 and Q3. Use the GDC on Paper 2.
Compute IQR = Q3 − Q1.
Compute the two fences: lower = Q1 − 1.5 IQR · upper = Q3 + 1.5 IQR.
List any data values below the lower fence or above the upper fence.
Decide: keep (genuine extreme) or remove (clear error / typo). Justify with the context.
Strict inequality: the rule says more than 1.5 IQR. A value exactly on the fence (e.g. x = Q1 − 1.5 IQR) is NOT an outlier — it’s the borderline case.
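The recipe above can be sketched as a short Python program. It assumes quartiles are found as the medians of the lower and upper halves, leaving out the overall median when the number of values is odd, which is the method used in the worked examples; other software may use a different quartile convention. The names `quartiles` and `find_outliers` are hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of values, the overall median is
    excluded from both halves.
    """
    values = sorted(data)
    if len(values) < 2:
        raise ValueError("need at least two values to find quartiles")
    half = len(values) // 2
    lower_half = values[:half]
    upper_half = values[len(values) - half:]
    return median(lower_half), median(upper_half)


def find_outliers(data):
    """List every value strictly beyond the 1.5 x IQR fences."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return [x for x in data if x < lower or x > upper]


# 5 km running times (min) for 11 athletes
times = [35, 38, 40, 42, 44, 45, 47, 48, 50, 52, 95]
print(quartiles(times))      # (40, 50)
print(find_outliers(times))  # [95]
```

The final step of the recipe, deciding whether to keep or remove a flagged value, depends on context and is deliberately left out of the code.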
Worked examples
WE 1
Identify outliers from given quartiles
A data set has Q1 = 12 and Q3 = 24. Determine which of the following values are outliers:
3, 15, 18, 28, 50
Step 1: IQR
IQR = 24 − 12 = 12
Step 2: fences
lower = 12 − 1.5 × 12 = 12 − 18 = −6
upper = 24 + 1.5 × 12 = 24 + 18 = 42
Step 3: check each value
3 ∈ [−6, 42] → not an outlier
15, 18, 28 → all inside → not outliers
50 > 42 → outlier ✓
Only 50 is an outlier.
Don’t be fooled by 3: it looks small, but it’s still above the lower fence of −6. Compare to the FENCE, not to the median.
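As a quick check of this example, the sketch below compares each value with the two fences rather than with the median. The helper name `classify` and its labels are hypothetical, chosen only for illustration.

```python
q1, q3 = 12, 24
iqr = q3 - q1
lower = q1 - 1.5 * iqr  # -6.0
upper = q3 + 1.5 * iqr  # 42.0


def classify(x):
    """Label a value by comparing it with the fences (strict inequalities)."""
    if x < lower:
        return "low outlier"
    if x > upper:
        return "high outlier"
    return "inside"


results = {x: classify(x) for x in [3, 15, 18, 28, 50]}
for value, label in results.items():
    print(value, label)
```

Only 50 is labelled "high outlier"; the value 3 is labelled "inside" because it is above the lower fence of −6.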
WE 2
Full workflow from raw data
The 5 km running times (min) of 11 club athletes are:
35, 38, 40, 42, 44, 45, 47, 48, 50, 52, 95
Identify any outliers.
Step 1: quartiles (n = 11)
median = 6th value = 45
lower 5: 35, 38, 40, 42, 44 → Q1 = 40
upper 5: 47, 48, 50, 52, 95 → Q3 = 50
Step 2: IQR
IQR = 50 − 40 = 10
Step 3: fences
lower = 40 − 15 = 25
upper = 50 + 15 = 65
Step 4: flag values outside [25, 65]
95 > 65 → outlier
Outlier: 95 min
95 minutes for 5 km is extremely slow: likely a recording error or a walker. Check the context before deciding whether to remove it.
WE 3
Two outliers (low AND high)
The annual incomes of 10 employees (in $1000s) are:
12, 35, 40, 42, 45, 47, 50, 52, 55, 120
Identify any outliers.
Step 1: quartiles (n = 10)
median = (45 + 47)/2 = 46
lower 5: 12, 35, 40, 42, 45 → Q1 = 40
upper 5: 47, 50, 52, 55, 120 → Q3 = 52
Step 2: IQR & fences
IQR = 12
lower = 40 − 18 = 22
upper = 52 + 18 = 70
Step 3: values outside [22, 70]
12 < 22 → outlier
120 > 70 → outlier
Outliers: 12 and 120
An outlier set can be one-sided, two-sided, or empty. Always check BOTH fences: students often forget to test for low outliers.
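To make the two-sided check explicit, this sketch reports low and high outliers separately, using the quartiles already found in this example. The function name `split_outliers` is hypothetical.

```python
def split_outliers(data, q1, q3):
    """Return (low_outliers, high_outliers) under the 1.5 x IQR rule."""
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    low = [x for x in data if x < lower]    # test the lower fence
    high = [x for x in data if x > upper]   # test the upper fence
    return low, high


incomes = [12, 35, 40, 42, 45, 47, 50, 52, 55, 120]
low, high = split_outliers(incomes, q1=40, q3=52)
print(low)   # [12]
print(high)  # [120]
```

Either list can be empty, which corresponds to a one-sided or empty outlier set.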
WE 4
Keep or remove? — justify the decision
Seven recent house sale prices on a quiet street are recorded (in $000s):
240, 260, 275, 290, 305, 320, 750
(a) Show that 750 is an outlier. (b) State whether 750 should be removed, with justification.
(a) Quartiles (n = 7, odd)
median = 4th value = 290
lower 3: 240, 260, 275 → Q1 = 260
upper 3: 305, 320, 750 → Q3 = 320
IQR = 60
upper fence = 320 + 90 = 410
750 > 410 → outlier ✓
(b) Decision
750k is a plausible mansion price, not an error.
Keep 750: it is a genuine extreme value, not a typo.
“Keep / remove” decisions need a CONTEXT-based reason. “It’s plausibly real” → keep. “Clearly a typo / impossible value” → remove.
WE 5
Find the smallest integer that is an outlier
A data set has Q1 = 18 and Q3 = 26. Find the smallest integer value that would be classified as an upper outlier.
Step 1: IQR & upper fence
IQR = 26 − 18 = 8
upper fence = 26 + 1.5 × 8 = 26 + 12 = 38
Step 2: outlier means STRICTLY greater than 38
38 itself: NOT an outlier (on the boundary)
39: 39 > 38 ✓
Smallest integer outlier = 39
Trap: the value AT the fence is not an outlier. The smallest integer above 38 is 39.
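The "strictly greater than" step can be sketched with a floor: the first integer strictly above the fence is floor(fence) + 1, whether or not the fence is itself an integer. The function name below is hypothetical.

```python
import math


def smallest_integer_upper_outlier(q1, q3):
    """Smallest integer strictly greater than the upper fence."""
    upper_fence = q3 + 1.5 * (q3 - q1)
    # floor(fence) + 1 skips the fence itself when the fence is an integer
    return math.floor(upper_fence) + 1


print(smallest_integer_upper_outlier(18, 26))  # 39
```

Using a ceiling here would return 38 for this example, which is exactly the boundary trap described above.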
WE 6
Comprehensive — full Paper-2 style question
A commuter records her daily journey times (min) over 11 working days:
12, 17, 18, 19, 21, 22, 24, 25, 26, 28, 50
(a) Find Q1, Q3 and the IQR. (b) Identify any outliers. (c) Suggest whether the outlier should be removed.
(a) Quartiles (n = 11, odd)
median = 6th value = 22
lower 5: 12, 17, 18, 19, 21 → Q1 = 18
upper 5: 24, 25, 26, 28, 50 → Q3 = 26
IQR = 26 − 18 = 8
Q1 = 18 · Q3 = 26 · IQR = 8
(b) Fences
lower = 18 − 12 = 6
upper = 26 + 12 = 38
12 is not < 6 → not an outlier
50 > 38 → outlier ✓
Outlier: 50 min
(c) Decision
50 min is plausible: a bad-traffic day.
Keep 50; it is a genuine value.
Commuting times can spike for legitimate reasons (accidents, delays). Unless told it was a recording error, this outlier represents real-world variability and should be kept.
Top tips
Compute the fences ONCE; then it’s just a “less than / greater than” check for every data value.
Check BOTH ends — outliers can be low, high, or both.
Boundary values are NOT outliers: the rule uses strict inequality.
“Keep or remove” needs CONTEXT: typo/error → remove. Genuine extreme value → keep.
Box plot drawing: whisker stops at the smallest/largest non-outlier; outliers become × marks beyond.
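The box plot tip can be sketched as follows: the whiskers end at the smallest and largest values that are not outliers, and the outliers are listed separately. The function name `box_plot_parts` is hypothetical, and the quartiles are passed in rather than computed.

```python
def box_plot_parts(data, q1, q3):
    """Return (lower whisker end, upper whisker end, outliers)."""
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    # Values on a fence count as non-outliers (strict rule)
    inside = [x for x in data if lower <= x <= upper]
    outliers = [x for x in data if x < lower or x > upper]
    return min(inside), max(inside), outliers


incomes = [12, 35, 40, 42, 45, 47, 50, 52, 55, 120]
print(box_plot_parts(incomes, q1=40, q3=52))  # (35, 55, [12, 120])
```

For the income data from WE 3, the whiskers end at 35 and 55, and 12 and 120 are plotted as separate marks.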
Common mistakes
Using 1.5 × range instead of 1.5 × IQR. It’s the interquartile range, not the full range.
Forgetting the lower fence: students often compute only the upper fence and miss low outliers.
Treating fence values as outliers: a value at the fence is fine; it’s only an outlier if it goes past the fence.
Removing every outlier automatically: only remove genuine errors. A legitimate extreme value should be kept and discussed.
Drawing the box plot whisker to the outlier: the whisker must STOP at the most extreme non-outlier; the outlier is a separate × mark.
Next up: Box & Whisker Diagrams. You’ve already seen the box plot in action; now you’ll draw and read them formally — from the five-number summary (min, Q1, median, Q3, max), with outliers shown separately. Box plots are the fastest way to compare two distributions at a glance.