IB Maths AA HL | Topic 4: Statistics & Probability | Paper 1 & 2
Box & Whisker Diagrams
A box plot turns the five-number summary (min, Q1, median, Q3, max) into a single visual: a box for the middle 50% and whiskers reaching out to the extremes. Outliers get marked separately with crosses. Two box plots side-by-side show differences in centre, spread, and skew at a glance.
📘 What you need to know
Five-number summary: min, Q1, median, Q3, max — these define the box plot.
Box spans Q1 to Q3 — middle 50% of the data.
Median line drawn inside the box.
Whiskers extend to the smallest and largest non-outlier values.
Outliers shown as crosses (×) outside the whiskers; whiskers stop at the next-nearest valid value.
Symmetry check: equal distances from the median to Q1 and Q3, with equal-length whiskers, suggests a symmetric distribution.
Comparing two distributions: stack the box plots on a shared axis — easy visual contrast of centre, spread, and skew.
Use your GDC to draw box plots; the calculator marks outliers automatically.
Anatomy of a box plot
Box plot anatomy: the box covers the middle 50%, whiskers reach the extreme non-outlier values, and any outliers are marked with crosses. * “Max” here means the largest non-outlier; outliers are excluded from the whiskers.
Each section holds about 25% of the data: from min to Q1 (any low outliers included); from Q1 to the median; from the median to Q3; from Q3 to max (any high outliers included). The box’s width is the IQR.
Reading shape from a box plot
Symmetric
median centred in the box
box halves equal on either side of the median; equal-length whiskers
Skewed
median pushed toward one quartile
whisker on opposite side typically longer too
A symmetric box plot is suggestive (but not proof) of a normal distribution. Use the histogram covered later for a stronger visual test.
🧭 Recipe — draw a box plot
Sort the data and find the five-number summary: min, Q1, median, Q3, max.
Identify outliers using the 1.5 × IQR rule: any value below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR is an outlier.
Draw a clearly-labelled axis with even scale and units.
Draw the box from Q1 to Q3, with a vertical line at the median.
Extend whiskers to the smallest and largest non-outlier values.
Mark outliers with × at their actual positions.
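The recipe above can be sketched in Python as a way to check hand calculations. This is a minimal sketch, not a GDC replacement: the function name `box_plot_stats` and the dictionary keys are illustrative, and it assumes the median-of-halves quartile method used in the worked examples (the middle value is left out of both halves when n is odd). A GDC or statistics library may use a different quartile convention and give slightly different values.

```python
from statistics import median


def box_plot_stats(data):
    """Return the values needed to draw a box plot by hand.

    Assumes at least two data values and the median-of-halves
    quartile method (an assumption; other conventions exist).
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values below the median position
    upper = xs[half + n % 2:]   # skips the middle value when n is odd
    q1, med, q3 = median(lower), median(xs), median(upper)

    # 1.5 x IQR rule: values strictly beyond the bounds are outliers
    iqr = q3 - q1
    low_bound = q1 - 1.5 * iqr
    high_bound = q3 + 1.5 * iqr
    inside = [x for x in xs if low_bound <= x <= high_bound]
    outliers = [x for x in xs if x < low_bound or x > high_bound]

    return {
        "min": xs[0], "q1": q1, "median": med, "q3": q3, "max": xs[-1],
        "whisker_low": inside[0],    # smallest non-outlier
        "whisker_high": inside[-1],  # largest non-outlier
        "outliers": outliers,
    }


# Sample data with one low and one high outlier
stats = box_plot_stats([5, 22, 28, 30, 32, 35, 36, 38, 40, 42, 60])
print(stats)
```

For this sample data the whiskers end at 22 and 42, and 5 and 60 are reported as outliers to be marked with ×.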
Worked examples
WE 1
Find the five-number summary
Find the minimum, lower quartile, median, upper quartile, and maximum of the data set: 12, 18, 14, 20, 9, 22, 16, 25, 11, 19.
Step 1: Sort (n = 10)
9, 11, 12, 14, 16, 18, 19, 20, 22, 25
Step 2: Min and max
Min = 9; Max = 25
Step 3: Median (n even → average of 5th and 6th values)
(16 + 18)/2 = 17
Step 4: Quartiles
Lower half: 9, 11, 12, 14, 16 → Q₁ = 12
Upper half: 18, 19, 20, 22, 25 → Q₃ = 20
Five-number summary: 9, 12, 17, 20, 25
These five values are everything you need to draw the box plot.
WE 2
Draw a box plot for data with no outliers
For the data set 30, 35, 40, 42, 45, 47, 50, 52, 55, 60, (a) check for outliers, (b) state the five-number summary, and (c) describe the box plot you would draw.
(a) Quartiles and outlier check
Sorted (already): n = 10; median = (45 + 47)/2 = 46
Q₁ = 40, Q₃ = 52, IQR = 12
1.5 × IQR = 18
Bounds: 40 − 18 = 22; 52 + 18 = 70
All values lie in [22, 70] → no outliers
(b) Five-number summary
30, 40, 46, 52, 60
(c) Box plot
Axis from ~25 to 65 (clear scale, in original units)
Box from 40 to 52, with median line at 46
Left whisker from 30 to 40; right whisker from 52 to 60
No outliers; box [40, 52], whiskers reach 30 and 60, median at 46
The box plot is roughly symmetric: the median sits exactly at the centre of the box (6 units from each quartile), and the whiskers are close in length (10 and 8).
WE 3
Draw a box plot when outliers are present
For the data set 5, 22, 28, 30, 32, 35, 36, 38, 40, 42, 60, (a) identify any outliers, (b) state where each whisker should end.
(a) Quartiles (n = 11, median = 35, exclude it from halves)
Lower: 5, 22, 28, 30, 32 → Q₁ = 28
Upper: 36, 38, 40, 42, 60 → Q₃ = 40
IQR = 12
Bounds: 28 − 18 = 10; 40 + 18 = 58
5 < 10 → outlier; 60 > 58 → outlier
(b) Whisker ends: last non-outlier on each side
Low side: smallest non-outlier = 22
High side: largest non-outlier = 42
Outliers: 5 and 60; whiskers run from 22 to Q₁ = 28 and from Q₃ = 40 to 42; mark × at 5 and at 60
Whiskers stop at the next valid value, not the actual min/max.
WE 4
Interpret a box plot
A box plot has Min = 15, Q1 = 22, Median = 28, Q3 = 35, Max = 48 (no outliers). Find (a) the range, (b) the IQR, (c) the percentage of values below 22, (d) the percentage of values above 35, (e) comment on symmetry.
(a) Range = max − min
48 − 15 = 33
(b) IQR = Q₃ − Q₁
35 − 22 = 13
(c) Below Q₁ = 22
25% (definition of lower quartile)
(d) Above Q₃ = 35
25%
(e) Symmetry
Median − Q₁ = 6; Q₃ − Median = 7 (close)
Left whisker = 22 − 15 = 7; right whisker = 48 − 35 = 13, so the right whisker is longer
→ box approximately symmetric; the longer right whisker suggests mild positive skew
Range = 33; IQR = 13; 25% below 22 and 25% above 35; roughly symmetric with mild positive skew
“Approximately” is correct: perfect symmetry is rare.
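The arithmetic behind reading a box plot can be sketched in a few lines of Python. This is only an illustrative check; the function name `read_box_plot` and its dictionary keys are hypothetical, and the input is assumed to be a five-number summary with no outliers, so the whiskers run to the true min and max.

```python
def read_box_plot(lo, q1, med, q3, hi):
    """Summarise spread and shape from a five-number summary.

    Assumes no outliers, so the whiskers end at lo and hi.
    """
    return {
        "range": hi - lo,
        "iqr": q3 - q1,
        "box_halves": (med - q1, q3 - med),  # left half, right half
        "whiskers": (q1 - lo, hi - q3),      # left whisker, right whisker
    }


summary = read_box_plot(15, 22, 28, 35, 48)
print(summary)
# {'range': 33, 'iqr': 13, 'box_halves': (6, 7), 'whiskers': (7, 13)}
```

Comparing the two numbers in `box_halves` and the two in `whiskers` gives the evidence for a symmetry comment; deciding how close counts as "approximately symmetric" is still a judgement.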
WE 5
Compare two box plots
Two classes sat the same exam. Class A’s box plot has values Min=40, Q1=55, Med=65, Q3=72, Max=85. Class B’s has Min=30, Q1=50, Med=60, Q3=80, Max=95. Compare the two distributions in context.
Step 1: Compare medians
A: 65 vs B: 60 → A’s typical score is higher
Step 2: Compare IQRs
A: 72 − 55 = 17; B: 80 − 50 = 30 → A is more consistent
Step 3: Compare ranges
A: 85 − 40 = 45; B: 95 − 30 = 65 → B has the wider overall spread
Class A had the higher median score and was more consistent than Class B.
Always compare both centre and spread, and frame it in context (exam scores).
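A comparison like this comes down to putting the same three measures side by side for each group. The sketch below is one way to lay that out in Python; the helper name `centre_and_spread` is hypothetical, and the inputs are the five-number summaries of the two classes.

```python
def centre_and_spread(lo, q1, med, q3, hi):
    """Return the measures usually quoted when comparing box plots."""
    return {"median": med, "iqr": q3 - q1, "range": hi - lo}


class_a = centre_and_spread(40, 55, 65, 72, 85)
class_b = centre_and_spread(30, 50, 60, 80, 95)

# Print each measure for both classes on one line
for name in ("median", "iqr", "range"):
    print(f"{name}: A = {class_a[name]}, B = {class_b[name]}")
```

The numbers alone are not the answer: the written comparison still has to say what they mean for exam scores.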
WE 6
Skewness from a box plot
The shoe sizes of 12 students are: 36, 36, 37, 37, 38, 38, 39, 39, 40, 41, 42, 45. (a) Find the five-number summary. (b) Check for outliers. (c) Comment on the shape of the distribution.
(a) Sorted (n = 12)
Min = 36; Max = 45
Median = (38 + 39)/2 = 38.5
Lower half: 36, 36, 37, 37, 38, 38 → Q₁ = (37 + 37)/2 = 37
Upper half: 39, 39, 40, 41, 42, 45 → Q₃ = (40 + 41)/2 = 40.5
Five-number summary: 36, 37, 38.5, 40.5, 45
(b) IQR = 3.5; 1.5 × IQR = 5.25; bounds: 37 − 5.25 = 31.75 and 40.5 + 5.25 = 45.75
All values inside [31.75, 45.75] → no outliers
(c) Shape
Median to Q₁ = 1.5; Q₃ to median = 2.0 (close)
Left whisker = 1.0; right whisker = 4.5 (much longer)
No outliers; distribution is positively skewed (longer tail to the right)
Check both the box halves AND the whiskers when commenting on shape.
💡 Top tips
Check for outliers before drawing — they change where the whiskers end.
Draw a clear, evenly-spaced axis with the variable’s units labelled.
Use your GDC to verify a hand-drawn box plot — calculator’s box plot mode auto-marks outliers.
For comparison questions, always cover centre AND spread (median and IQR are the standard pairing).
Mention skewness when whiskers and box-halves are visibly asymmetric.
⚠ Common mistakes
Drawing whiskers all the way to the actual min/max when outliers exist — whiskers stop at the next valid value.
Putting the median at the centre of the box automatically — it goes at its actual value, not visually centred.
Forgetting to label the axis with units.
Comparing only the median when asked to compare distributions — always include spread.
Reading the box width as range instead of IQR.
Next: Cumulative Frequency Graphs. When data is grouped, you can’t see individual values — but a cumulative frequency curve lets you read off the median, quartiles, and percentages from the running totals. Plot the upper boundary of each class against its cumulative frequency, then sketch a smooth S-shaped curve.