IB Maths AA HL · Topic 4: Statistics & Probability · Paper 1 & 2

Box & Whisker Diagrams

A box plot turns the five-number summary (min, Q1, median, Q3, max) into a single visual: a box for the middle 50% and whiskers reaching out to the extremes. Outliers get marked separately with crosses. Two box plots side-by-side show differences in centre, spread, and skew at a glance.

📘 What you need to know

Anatomy of a box plot

[Figure: box plot drawn above an axis of data values, with Min, Q₁, Median, Q₃, Max* and one outlier labelled. The four sections are marked lower 25%, 25%, 25%, upper 25%.]
Box plot anatomy: the box covers the middle 50%, whiskers reach the extreme non-outlier values, and any outliers are marked with crosses. *“Max” here means the largest non-outlier; outliers are excluded from the whiskers.
Each section holds about 25% of the data: from min to Q1; from Q1 to median; from median to Q3; from Q3 to max. Any outliers still count towards the outer 25% sections, even though the whiskers stop short of them. The box’s width is the IQR.

Reading shape from a box plot

Symmetric: median centred in the box, so the left and right halves of the box are equal in width; whiskers of equal length.
Skewed: median pushed toward one quartile; the whisker on the opposite side is typically longer too.

A symmetric box plot is suggestive (but not proof) of a normal distribution. Use the histogram covered later for a stronger visual test.
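The comparison described above comes down to four lengths: the two whiskers and the two halves of the box. A minimal Python sketch, assuming the five-number summary is already known; the function names `shape_measures` and `longer_side` are made up for illustration, and the numbers are the shoe-size summary from WE 6 further down.

```python
def shape_measures(low, q1, med, q3, high):
    """Return the four lengths compared when judging symmetry.

    low and high are the whisker ends (min and max when there are no outliers).
    """
    return {
        "left_whisker": q1 - low,
        "left_box": med - q1,
        "right_box": q3 - med,
        "right_whisker": high - q3,
    }


def longer_side(left, right):
    """Say which of two lengths is longer: 'left', 'right', or 'equal'."""
    if left == right:
        return "equal"
    return "left" if left > right else "right"


# Five-number summary from WE 6 (shoe sizes): 36, 37, 38.5, 40.5, 45
m = shape_measures(36, 37, 38.5, 40.5, 45)
print(m)
print("longer box half:", longer_side(m["left_box"], m["right_box"]))
print("longer whisker:", longer_side(m["left_whisker"], m["right_whisker"]))
```

The sketch only reports which side is longer. Deciding whether a difference is large enough to call the distribution skewed is still a judgement you make in your written answer.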

🧭 Recipe — draw a box plot

  1. Sort the data and find the five-number summary: min, Q1, median, Q3, max.
  2. Identify outliers using the 1.5 × IQR rule: a value is an outlier if it lies below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR.
  3. Draw a clearly-labelled axis with even scale and units.
  4. Draw the box from Q1 to Q3, with a vertical line at the median.
  5. Extend whiskers to the smallest and largest non-outlier values.
  6. Mark outliers with × at their actual positions.
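Steps 1, 2, and 5 of the recipe can be sketched in Python as a way to check hand calculations. This is an illustration under stated assumptions, not a prescribed method: quartiles are taken as the medians of the lower and upper halves (middle value excluded when n is odd), which matches the worked examples below, and a value lying exactly on a bound is treated as not an outlier. The function names are made up for this sketch, and calculators or software may use a different quartile convention.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumption: for odd n the middle value is left out of both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def box_plot_parts(data, k=1.5):
    """Return the summary, outlier bounds, outliers, and whisker ends."""
    xs = sorted(data)
    lo, q1, med, q3, hi = five_number_summary(xs)
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    # Values exactly on a bound are kept as non-outliers (an assumption).
    inside = [x for x in xs if lower_bound <= x <= upper_bound]
    outliers = [x for x in xs if x < lower_bound or x > upper_bound]
    return {
        "summary": (lo, q1, med, q3, hi),
        "bounds": (lower_bound, upper_bound),
        "outliers": outliers,
        "whiskers": (inside[0], inside[-1]),
    }


# Data set from WE 3 below
parts = box_plot_parts([5, 22, 28, 30, 32, 35, 36, 38, 40, 42, 60])
print(parts["outliers"])   # [5, 60]
print(parts["whiskers"])   # (22, 42)
```

The whisker ends come from the values that survive the outlier check, which is why they can differ from the minimum and maximum in the five-number summary.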

Worked examples

WE 1

Find the five-number summary

Find the minimum, lower quartile, median, upper quartile, and maximum of the data set:   12, 18, 14, 20, 9, 22, 16, 25, 11, 19.

Step 1: Sort (n = 10)
9, 11, 12, 14, 16, 18, 19, 20, 22, 25

Step 2: Min and max
Min = 9; Max = 25

Step 3: Median (n even → average of the 5th and 6th values)
(16 + 18)/2 = 17

Step 4: Quartiles
Lower half: 9, 11, 12, 14, 16 → Q₁ = 12
Upper half: 18, 19, 20, 22, 25 → Q₃ = 20

Answer: Five-number summary: 9, 12, 17, 20, 25
These five values are everything you need to draw the box plot.

WE 2

Draw a box plot for data with no outliers

For the data set   30, 35, 40, 42, 45, 47, 50, 52, 55, 60,   (a) check for outliers, (b) state the five-number summary, and (c) describe the box plot you would draw.

(a) Quartiles and outlier check
Sorted (already): n = 10; median = (45 + 47)/2 = 46
Lower half: 30, 35, 40, 42, 45 → Q₁ = 40
Upper half: 47, 50, 52, 55, 60 → Q₃ = 52
IQR = 52 − 40 = 12
Bounds: 40 − 1.5 × 12 = 40 − 18 = 22; 52 + 18 = 70
All values lie in [22, 70] → no outliers

(b) Five-number summary
30, 40, 46, 52, 60

(c) Box plot
Axis from ~25 to 65 (clear scale, in original units)
Box from 40 to 52, with median line at 46
Left whisker from 30 to 40; right whisker from 52 to 60

Answer: No outliers; box [40, 52], whiskers reach 30 and 60, median at 46.
The plot is roughly symmetric: the median sits exactly in the centre of the box (6 units each side), and the whiskers are similar in length (10 and 8).

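To see roughly what the description in part (c) looks like, here is a small text-only rendering in Python. It is a rough stand-in for a properly drawn diagram, not a substitute for one: the function `ascii_box_plot` and its symbols are invented for this sketch, and it assumes there are no outliers to mark.

```python
def ascii_box_plot(lo, q1, med, q3, hi, axis_min, axis_max, width=41):
    """Return a one-line text box plot.

    '|' marks the whisker ends, '-' the whiskers, '=' the box, 'M' the median.
    lo and hi are the whisker ends (smallest and largest non-outlier values).
    """
    def col(x):
        # Map a data value to a character column on the axis.
        return round((x - axis_min) * (width - 1) / (axis_max - axis_min))

    row = [" "] * width
    for c in range(col(lo), col(hi) + 1):
        row[c] = "-"
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="
    row[col(lo)] = "|"
    row[col(hi)] = "|"
    row[col(med)] = "M"
    return "".join(row)


# WE 2: box from 40 to 52, median 46, whiskers to 30 and 60, axis 25 to 65.
line = ascii_box_plot(30, 40, 46, 52, 60, axis_min=25, axis_max=65)
print(line)
```

With the default width of 41 characters and an axis from 25 to 65, each character column stands for one unit, so the equal box halves and the slightly longer left whisker are visible directly.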
WE 3

Draw a box plot when outliers are present

For the data set   5, 22, 28, 30, 32, 35, 36, 38, 40, 42, 60,   (a) identify any outliers, (b) state where each whisker should end.

(a) Quartiles (n = 11; the median is the 6th value, 35; exclude it from both halves)
Lower: 5, 22, 28, 30, 32 → Q₁ = 28
Upper: 36, 38, 40, 42, 60 → Q₃ = 40
IQR = 40 − 28 = 12
Bounds: 28 − 18 = 10; 40 + 18 = 58
5 < 10 → outlier; 60 > 58 → outlier

(b) Whisker ends: last non-outlier on each side
Low side: smallest non-outlier = 22
High side: largest non-outlier = 42

Answer: Outliers are 5 and 60; whiskers run from 22 to Q₁ = 28 and from Q₃ = 40 to 42; mark × at 5 and at 60.
Whiskers stop at the nearest non-outlier value, not the actual min/max.

WE 4

Interpret a box plot

A box plot has Min = 15, Q1 = 22, Median = 28, Q3 = 35, Max = 48 (no outliers). Find (a) the range, (b) the IQR, (c) the percentage of values below 22, (d) the percentage of values above 35, (e) comment on symmetry.

(a) Range = max − min
48 − 15 = 33

(b) IQR = Q₃ − Q₁
35 − 22 = 13

(c) Below Q₁ = 22
25% (definition of lower quartile)

(d) Above Q₃ = 35
25% (definition of upper quartile)

(e) Symmetry
Median − Q₁ = 6; Q₃ − Median = 7 (close)
Left whisker = 22 − 15 = 7; right whisker = 48 − 35 = 13 (right longer)
→ box approximately symmetric, with a mild positive skew suggested by the longer right whisker

Answer: Range = 33; IQR = 13; 25% below 22 and 25% above 35; roughly symmetric.
“Approximately” is the right word here: perfect symmetry is rare.

WE 5

Compare two box plots

Two classes sat the same exam. Class A’s box plot has values Min=40, Q1=55, Med=65, Q3=72, Max=85. Class B’s has Min=30, Q1=50, Med=60, Q3=80, Max=95. Compare the two distributions in context.

Step 1: Compare medians
A: 65 vs B: 60 → A’s typical score is higher

Step 2: Compare IQRs
A: 72 − 55 = 17; B: 80 − 50 = 30 → A’s middle 50% is more consistent

Step 3: Compare ranges
A: 85 − 40 = 45; B: 95 − 30 = 65 → B has the wider overall spread

Answer: Class A had the higher median score and was more consistent than Class B.
Always compare both centre and spread, and frame the comparison in context (exam scores).

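The same three comparisons can be checked with a short Python sketch. The `Summary` type and the `compare` function are hypothetical names for this illustration, and the sketch assumes the two classes do not tie on median or IQR.

```python
from typing import NamedTuple


class Summary(NamedTuple):
    """Five-number summary read off a box plot."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self):
        return self.q3 - self.q1

    @property
    def value_range(self):
        return self.maximum - self.minimum


def compare(name_a, a, name_b, b):
    """Build a one-sentence comparison of centre and spread (no ties handled)."""
    higher = name_a if a.median > b.median else name_b
    tighter = name_a if a.iqr < b.iqr else name_b
    return (f"{higher} has the higher median; "
            f"{tighter} has the smaller IQR, so its scores are more consistent.")


class_a = Summary(40, 55, 65, 72, 85)
class_b = Summary(30, 50, 60, 80, 95)
print(class_a.iqr, class_b.iqr)                  # 17 30
print(class_a.value_range, class_b.value_range)  # 45 65
print(compare("Class A", class_a, "Class B", class_b))
```

The generated sentence is only a starting point. In an exam answer the comparison still needs to be tied to the context, here exam scores.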
WE 6

Skewness from a box plot

The shoe sizes of 12 students are:   36, 36, 37, 37, 38, 38, 39, 39, 40, 41, 42, 45. (a) Find the five-number summary. (b) Check for outliers. (c) Comment on the shape of the distribution.

(a) Sorted (n = 12)
Min = 36; Max = 45
Median = (38 + 39)/2 = 38.5
Lower half: 36, 36, 37, 37, 38, 38 → Q₁ = (37 + 37)/2 = 37
Upper half: 39, 39, 40, 41, 42, 45 → Q₃ = (40 + 41)/2 = 40.5
Five-number summary: 36, 37, 38.5, 40.5, 45

(b) IQR = 40.5 − 37 = 3.5; bounds: 37 − 5.25 = 31.75 and 40.5 + 5.25 = 45.75
All values inside [31.75, 45.75] → no outliers

(c) Shape
Median − Q₁ = 1.5; Q₃ − median = 2.0 (close)
Left whisker = 1.0; right whisker = 4.5 (much longer)

Answer: No outliers; the distribution is positively skewed (longer tail to the right).
Check both the box halves AND the whiskers when commenting on shape.

Next: Cumulative Frequency Graphs. When data is grouped, you can’t see individual values — but a cumulative frequency curve lets you read off the median, quartiles, and percentages from the running totals. Plot the upper boundary of each class against its cumulative frequency, then sketch a smooth S-shaped curve.
