IB Maths AA SL · Topic 4: Statistics Toolkit · Paper 1 & 2
Cumulative Frequency Graphs
A cumulative frequency graph plots the running total of your data as a smooth S-shaped curve. It lets you estimate medians, quartiles, and percentiles from grouped data, and it’s easy to read once you know which line to draw.
📘 What you need to know
Cumulative frequency = running total of frequencies up to that value.
For grouped data, plot points at the upper boundary of each class with the cumulative frequency.
Join the points with a smooth curve (not straight lines).
To estimate quartiles or percentiles: draw a horizontal line from the y-axis to the curve, then drop a vertical line to the x-axis.
Median = 50% of total. Q1 = 25%. Q3 = 75%. pth percentile = p% of total.
The graph never goes down: each cumulative frequency is at least as large as the one before it.
What is cumulative frequency?
Cumulative frequency is just a fancy word for “running total”. You go through your frequency table and add each frequency to the previous one, building up to the grand total at the end.
Example — building a cumulative frequency row
Take this grouped frequency table for the heights of 30 students:
| Height (cm) | Frequency | Cumulative frequency |
|---|---|---|
| 140 ≤ h < 150 | 4 | 4 |
| 150 ≤ h < 160 | 9 | 13 (4 + 9) |
| 160 ≤ h < 170 | 11 | 24 (13 + 11) |
| 170 ≤ h < 180 | 5 | 29 (24 + 5) |
| 180 ≤ h < 190 | 1 | 30 (29 + 1) |
The cumulative frequency at the upper boundary of each class tells you “how many students are shorter than this height”. So 24 students are below 170 cm, 29 are below 180 cm, and so on.
The last cumulative frequency must equal the total number of values (here, 30). If it doesn’t, you’ve added wrong somewhere — check your arithmetic.
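If you want to check a cumulative frequency column without adding by hand, the running total can be sketched in a few lines of Python. This is only an illustration using the heights table above; the variable names are made up for the example.

```python
from itertools import accumulate

# Frequencies from the heights table, one per class (140-150, ..., 180-190 cm).
frequencies = [4, 9, 11, 5, 1]

# accumulate() yields the running total after each class.
cumulative = list(accumulate(frequencies))
print(cumulative)

# Sanity check: the last running total must equal the total number of values.
assert cumulative[-1] == sum(frequencies) == 30
```

The final assertion is the same check described above: if the last value is not n, something was added wrongly.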
How to draw the graph
5 steps to draw a cumulative frequency graph
Build a cumulative frequency column in your table.
Plot each point at the upper boundary of the class, with the cumulative frequency on the y-axis.
Plot a starting point at the lower boundary of the first class with cumulative frequency 0 (the curve must start somewhere).
Join the points with a smooth curve (NOT straight lines!).
Label your axes — the x-axis is the variable (with units), the y-axis is “cumulative frequency”.
Cumulative frequency curve for the 30 students’ heights
🤔 Why do we plot at the upper boundary?
If 4 students are in the class 140 ≤ h < 150, we know all 4 are below 150 cm — but we don’t know exactly where. The safest place to mark “4 students so far” is at the upper boundary of 150 cm. By 160 cm, we’ve now counted 13 students; by 170 cm, 24; and so on.
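The plotting steps above can be sketched as a small Python helper that lists the points to mark on the graph. It assumes the class boundaries are given as one list (one more boundary than there are classes); the function name `cf_points` is hypothetical and only for illustration.

```python
from itertools import accumulate

# Class boundaries and frequencies from the heights table.
boundaries = [140, 150, 160, 170, 180, 190]  # one more boundary than classes
frequencies = [4, 9, 11, 5, 1]

def cf_points(boundaries, frequencies):
    """Return the (x, cumulative frequency) pairs to plot."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than classes")
    running_totals = accumulate(frequencies)
    # Start at (lower boundary of the first class, 0), then pair each
    # upper boundary with the running total reached by that boundary.
    return [(boundaries[0], 0)] + list(zip(boundaries[1:], running_totals))

points = cf_points(boundaries, frequencies)
print(points)
```

Notice that no point uses a mid-interval value: every x-coordinate is a class boundary.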
How to read information from the graph
Cumulative frequency graphs are a two-way street. You can read them in either direction depending on what the question asks.
From x → y
↑→
“How many values are below x?”
Draw a vertical line up from your x-value to the curve, then horizontally to the y-axis. Read off the cumulative frequency.
e.g. “How many students are shorter than 165 cm?”
From y → x
→↓
“What value is the median / quartile / percentile?”
Draw a horizontal line from your y-value to the curve, then vertically down to the x-axis. Read off the value.
e.g. “Find the median height” → start at y = 15 (half of 30).
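Both reading directions can be imitated in Python. The sketch below joins the plotted points of the heights example with straight lines, which is an assumption made to keep the code short; a hand-drawn smooth curve will give slightly different read-offs. The function names `cf_below` and `value_at` are hypothetical.

```python
# Plotted points of the heights curve: (boundary, cumulative frequency).
POINTS = [(140, 0), (150, 4), (160, 13), (170, 24), (180, 29), (190, 30)]

def cf_below(x, points=POINTS):
    """x -> y: estimated number of values below x."""
    if x <= points[0][0]:
        return 0.0
    if x >= points[-1][0]:
        return float(points[-1][1])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            # Straight-line estimate between two neighbouring points.
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

def value_at(cf, points=POINTS):
    """y -> x: estimated value at a given cumulative frequency."""
    if not points[0][1] <= cf <= points[-1][1]:
        raise ValueError("cumulative frequency is outside the graph")
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= cf <= y1 and y1 > y0:
            return x0 + (cf - y0) * (x1 - x0) / (y1 - y0)

print(cf_below(165))   # students shorter than 165 cm
print(value_at(15))    # median height: start at y = 15
```

Treat the printed numbers as rough checks on your graph reading, not as exact answers.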
Finding the median, quartiles, and percentiles
The position on the y-axis depends on the total number of values (n):
| What you want | Position on y-axis | For n = 30 |
|---|---|---|
| Lower quartile (Q1) | 25% of n = 0.25 × n | 7.5 |
| Median (Q2) | 50% of n = 0.5 × n | 15 |
| Upper quartile (Q3) | 75% of n = 0.75 × n | 22.5 |
| pth percentile | p% of n = (p/100) × n | e.g. 70th = 21 |
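The positions in this table all come from the same rule, (p/100) × n. A minimal Python sketch, with a hypothetical helper called `position`, makes that explicit:

```python
def position(p, n):
    """Cumulative frequency (y-axis position) for the p-th percentile of n values."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    return p * n / 100

n = 30
for label, p in [("Q1", 25), ("Median", 50), ("Q3", 75), ("70th percentile", 70)]:
    print(f"{label}: start at y = {position(p, n)}")
```

The quartiles and the median are just the 25th, 50th, and 75th percentiles, so one formula covers every row.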
🧠 Memory trick: “y-axis = how many, x-axis = the value”
The y-axis tells you how many students/items. The x-axis tells you the value. So if a question asks “what’s the median” you start on y; if it asks “how many below 165 cm”, start on x.
Finding the IQR
Interquartile range
IQR = Q3 − Q1
Just read Q1 and Q3 off the graph (using the 25% and 75% horizontal lines), then subtract.
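As a rough check on an IQR read from a graph, here is a Python sketch for the heights example from earlier. It assumes straight lines between the plotted points rather than a smooth curve, so the values will differ a little from a hand-drawn graph; `read_across` is a hypothetical name.

```python
from bisect import bisect_left

# Class boundaries and cumulative frequencies for the heights example.
BOUNDS = [140, 150, 160, 170, 180, 190]
CUM = [0, 4, 13, 24, 29, 30]

def read_across(cf):
    """Go across from y = cf to the graph, then down to the x-axis."""
    if not CUM[0] <= cf <= CUM[-1]:
        raise ValueError("cumulative frequency is outside the graph")
    i = bisect_left(CUM, cf)  # first plotted point at or above cf
    if i == 0:
        return float(BOUNDS[0])
    x0, x1 = BOUNDS[i - 1], BOUNDS[i]
    y0, y1 = CUM[i - 1], CUM[i]
    return x0 + (cf - y0) * (x1 - x0) / (y1 - y0)

n = 30
q1 = read_across(0.25 * n)  # y = 7.5
q3 = read_across(0.75 * n)  # y = 22.5
iqr = q3 - q1
print(round(q1, 1), round(q3, 1), round(iqr, 1))
```

The order matters: Q3 comes first in the subtraction, so the IQR is never negative.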
📍 How to estimate the frequency of a class
To find how many values fell in a class like 150 ≤ h < 160, find the cumulative frequency at x = 160 and at x = 150, then subtract. The difference is the frequency of that class (here 13 − 4 = 9). The same subtraction works for any range of x-values, not only a single class.
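The subtraction can be written as a tiny Python sketch using the cumulative frequencies from the heights table. The dictionary and the function name `frequency_between` are made up for this illustration, and the sketch only handles x-values that are class boundaries in the table.

```python
# Cumulative frequency at each class boundary of the heights table.
CUM_AT = {140: 0, 150: 4, 160: 13, 170: 24, 180: 29, 190: 30}

def frequency_between(lower, upper):
    """Number of values with lower <= h < upper, for boundaries in the table."""
    if lower > upper:
        raise ValueError("lower boundary must not exceed upper boundary")
    return CUM_AT[upper] - CUM_AT[lower]

print(frequency_between(150, 160))  # a single class
print(frequency_between(140, 160))  # the first two classes together
```

On a real graph you would read the two cumulative frequencies off the curve instead of looking them up.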
Worked examples
WE 1
Build a cumulative frequency column
The frequency table shows times taken (in minutes) by 50 students to finish a test. Complete the cumulative frequency column.
| Time (min) | Frequency | Cumulative frequency |
|---|---|---|
| 0 ≤ t < 10 | 3 | ? |
| 10 ≤ t < 20 | 8 | ? |
| 20 ≤ t < 30 | 15 | ? |
| 30 ≤ t < 40 | 18 | ? |
| 40 ≤ t < 50 | 6 | ? |
Add each frequency to the running total. The final cumulative frequency must equal 50.
Row 1: 3 → 3
Row 2: 3 + 8 = 11
Row 3: 11 + 15 = 26
Row 4: 26 + 18 = 44
Row 5: 44 + 6 = 50 ✓
Cumulative frequencies: 3, 11, 26, 44, 50
Always check that the last value matches the total number of data points.
WE 2
Read information from a cumulative frequency graph
The graph below shows the lengths in cm, l, of 30 puppies in a training group.
(a) Find an estimate for the median length. (b) Find Q1 and Q3, then the IQR. (c) Estimate the percentage of puppies longer than 51 cm.
n = 30 puppies
Part (a): median
Median position: 50% × 30 = 15
From y = 15, draw across to the curve, then drop down: Median ≈ 47 cm
Part (b): quartiles and IQR
Q₁ position: 25% × 30 = 7.5
From y = 7.5 → x ≈ 39.5 cm, so Q₁ ≈ 39.5 cm
Q₃ position: 75% × 30 = 22.5
From y = 22.5 → x ≈ 51.4 cm, so Q₃ ≈ 51.4 cm
IQR = Q₃ − Q₁ = 51.4 − 39.5 = 11.9, so IQR ≈ 11.9 cm
Part (c): percentage longer than 51 cm
From x = 51 cm, go up to the curve, then across: cumulative frequency at 51 cm ≈ 22.
So 22 puppies are 51 cm or shorter. Therefore:
Number longer than 51 cm = 30 − 22 = 8
Percentage = (8/30) × 100 ≈ 26.7% (3 s.f.)
“Longer than” = total − below; always subtract from n!
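The “longer than” calculation in part (c) follows a pattern worth checking numerically. Here is a minimal Python sketch; the function name `percent_above` is hypothetical, and the inputs are the values read off in this worked example.

```python
def percent_above(n, cf_at_x):
    """Percentage of the n values above x, given the cumulative frequency read at x."""
    if not 0 <= cf_at_x <= n:
        raise ValueError("cumulative frequency must lie between 0 and n")
    # The graph gives the count at or below x, so subtract it from n first.
    return (n - cf_at_x) / n * 100

# Part (c): 30 puppies, cumulative frequency of about 22 at 51 cm.
print(round(percent_above(30, 22), 1))
```

The subtraction from n happens before the division; dividing the read-off of 22 by 30 would give the percentage at or below 51 cm instead.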
WE 3
Estimate the frequency of a class from the curve
Using the puppies graph from WE 2, given the interval 40 ≤ l < 45 was used when collecting data, find the frequency of this class.
Frequency of a class = cum freq at upper boundary − cum freq at lower boundary.
Cum freq at l = 45: ≈ 16
Cum freq at l = 40: ≈ 8
Subtract: 16 − 8 = 8
Frequency of class 40 ≤ l < 45 ≈ 8
Subtract the two cumulative frequencies; that gives the count in between.
WE 4
Find a percentile
Using the puppies graph again (n = 30), find the 70th percentile of the puppy lengths.
n = 30, 70th percentile
Percentile position = (p/100) × n. Then read off the curve.
Position: (70/100) × 30 = 21
From y = 21, draw across to the curve, then drop down to the x-axis.
Read off: x ≈ 50 cm
70th percentile ≈ 50 cm
70% of the puppies are 50 cm or shorter.
WE 5
Use a cumulative frequency graph to draw a box plot
From the puppies cumulative frequency graph, the smallest value is approximately 35 cm and the largest is 60 cm. Use this to construct a box plot.
WE 2 gives Q₁, the median, and Q₃; the question gives the minimum and maximum. That is all 5 numbers needed.
5-number summary: Min ≈ 35, Q₁ ≈ 39.5, Median ≈ 47, Q₃ ≈ 51.4, Max ≈ 60
Plot the box from Q₁ to Q₃, with a line at the median and whiskers to the min and max.
Box plot drawn ✓
Cumulative frequency graphs are a fast way to get all the numbers needed for a box plot from grouped data.
💡 Top tips
Plot at the upper boundary of each class. All values in the class are below this point — that’s why the cumulative frequency belongs there.
Always include a starting point with cum freq 0 at the lower boundary of the first class. The curve has to start somewhere.
Use a smooth curve, not straight lines when joining the points.
Median = 50% of n. Q1 = 25% of n. Q3 = 75% of n. pth percentile = (p/100) × n.
To find the frequency of a class, subtract the cum freq at its lower boundary from the cum freq at its upper boundary.
“Greater than” or “more than” = subtract the cumulative frequency from n. Don’t read it directly.
Always show the lines you draw on the graph (horizontal first, then vertical) — examiners give marks for the method, not just the answer.
The total cumulative frequency at the end must equal n. If it doesn’t, recheck your additions.
⚠ Common mistakes
Plotting at the mid-interval value instead of the upper boundary. Mid-interval is for estimating the mean — for cumulative frequency graphs, use the upper boundary.
Joining points with straight lines. A cumulative frequency graph is always drawn as a smooth curve.
Using (n+1)/2 instead of 50% × n. For grouped data on a CF graph, just use percentages of n directly, so the median sits at n/2. Don’t fuss with odd/even adjustments.
Reading “more than” directly from the graph. The curve gives you “less than or equal to”. To get “more than”, do n − (cum freq).
Forgetting to start the curve at 0. Without a starting point at cum freq 0, the curve looks like it begins mid-air.
Not drawing the read-off lines. Draw a horizontal line, then a vertical one — and leave them in your answer. Examiners look for those.
Misreading the scale. Always check what each gridline is worth before you start reading off values.
Confusing percentage with cumulative frequency. The 70th percentile is at y = 0.7 × n (a number of values), not at y = 70 directly.
Cumulative frequency graphs are a Paper 2 favourite — every year there’s a 4–6 mark question that boils down to “find the median, IQR, and a percentile”. Master this method and those marks come for free.