IB Maths AI SL · Topic 4: Statistics Toolkit · Paper 1 & 2 · Ungrouped & grouped · ~8 min read
Frequency Tables
When the same value (or interval) repeats often, listing every data point is wasteful. A frequency table packs the data into one row of values and one row of counts. The mean, median, mode and standard deviation still work — you just use the frequencies as multipliers. Grouped tables use mid-interval values and every answer becomes an estimate.
📘 What you need to know
Ungrouped table: rows give value $x_i$ and frequency $f_i$. Exact values known.
Grouped table: rows give a class interval and its frequency. Exact values lost ⇒ everything is an estimate.
Mean: $\bar{x} = \frac{\sum f_i x_i}{n}$ with $n = \sum f_i$. Grouped: $x_i$ = mid-interval value.
Mode / modal class = value (or class) with the highest frequency.
Median: use cumulative frequencies to find the middle position.
GDC: 1-Var Stats with FreqList returns mean, median, quartiles and σ at once.
Ungrouped frequency tables
Each row holds one value and how often it appears. The mode is the x-value (not the frequency!) with the largest count. For the median, build a cumulative frequency row (running total) and locate the middle position from there. Worked example below.
Grouped frequency tables
When data is continuous (or has many distinct values), values are grouped into class intervals. Exact values are lost — we use the mid-interval value as the representative for each class:
mid-interval value = $\frac{\text{lower boundary} + \text{upper boundary}}{2}$.
The mean from a grouped table is therefore an estimate. Round to 3 sf and write “≈” to flag this.
Left: an ungrouped bar chart — bars sit on individual values, the tallest is the mode. Right: a grouped histogram — bars sit on class intervals, the tallest is the modal class. The orange ×’s mark the mid-interval values used to estimate the mean.
The frequency-table mean (formula booklet): $\bar{x} = \frac{\sum f_i x_i}{n}$, where $n = \sum f_i$
Grouped: use $x_i = \frac{\text{lower} + \text{upper}}{2}$ (mid-interval value)
🧭 Recipe — tackle any frequency-table question
Read the row labels: individual values (ungrouped) or class intervals (grouped)?
If grouped, write the mid-interval values alongside each class. These replace $x_i$ in every formula.
Compute $n = \sum f_i$. Needed for the mean and to locate the median.
Mean: compute $\sum f_i x_i$, then divide by $n$. For grouped, call it an estimate.
Std dev / quartiles: enter the value list and frequency list on the GDC, then run 1-Var Stats with a FreqList.
Ungrouped is EXACT, grouped is an ESTIMATE. Anything from a grouped table loses precision because individual values are unknown — report grouped answers as estimates and round (often to 3 sf).
Worked examples
WE 1
Mean from an ungrouped frequency table
A help desk records the number of calls received each day for 30 days:
calls x | 5 | 6 | 7  | 8 | 9
freq f  | 4 | 7 | 10 | 6 | 3
Find the mean number of calls per day.
Step 1: total n = Σf
n = 4 + 7 + 10 + 6 + 3 = 30
Step 2: Σfx
5×4 + 6×7 + 7×10 + 8×6 + 9×3 = 20 + 42 + 70 + 48 + 27 = 207
Step 3: mean
x̄ = 207 / 30 = 6.9
Answer: mean = 6.9 calls/day
Note: Σfx is the total of all values. Divide by n (total count) to get the mean. It is the same formula as for raw data, just compacted.
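As a quick check of the arithmetic above, here is a minimal Python sketch of the same three steps. The helper name `freq_mean` and the variable names are illustrative assumptions, not part of the formula booklet or any GDC.

```python
# Frequency table from WE 1: each value in `calls` occurs `freq` times.
calls = [5, 6, 7, 8, 9]
freq = [4, 7, 10, 6, 3]


def freq_mean(values, freqs):
    """Mean of an ungrouped frequency table: sum(f * x) / sum(f)."""
    n = sum(freqs)                                   # total number of data points
    total = sum(f * x for x, f in zip(values, freqs))  # sum of all data values
    return total / n


n = sum(freq)                                        # Step 1
sum_fx = sum(f * x for x, f in zip(calls, freq))     # Step 2
mean = freq_mean(calls, freq)                        # Step 3

print(n, sum_fx, mean)  # 30 207 6.9
```

Dividing `sum_fx` by the number of rows (5) instead of `n` would give 41.4, which is not a plausible daily average for values between 5 and 9.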
WE 2
Mode and median from a frequency table
The number of pets owned by 26 households:
pets x | 0 | 1 | 2 | 3 | 4
freq f | 5 | 8 | 6 | 5 | 2
Find the mode and the median.
Mode = value with the highest frequency
f = 8 is the largest → mode = 1
Median: n = 26 (even) → midpoint of the 13th & 14th values
Cumulative frequencies: 5, 13, 19, 24, 26
Locate the 13th and 14th values:
cum 13 reached at x = 1 → 13th value = 1
cum 19 reached at x = 2 → 14th value = 2
Median = (1 + 2) / 2 = 1.5
Answer: mode = 1 · median = 1.5
Note: cumulative frequency = running total. Use it to locate where the 13th and 14th households “sit” without writing out all 26 values.
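The cumulative-frequency lookup used above can be sketched in Python. This is only an illustration under the assumption that the values are listed in ascending order; the function names `table_mode`, `value_at` and `table_median` are hypothetical helpers, not GDC commands.

```python
from bisect import bisect_left
from itertools import accumulate

# Frequency table from WE 2 (values must be in ascending order).
pets = [0, 1, 2, 3, 4]
freq = [5, 8, 6, 5, 2]


def table_mode(values, freqs):
    """Value (not frequency) with the largest count; first one if tied."""
    return values[freqs.index(max(freqs))]


def value_at(position, values, cum_freqs):
    """Data value at a 1-based position, found via cumulative frequencies."""
    # First row whose running total reaches the requested position.
    return values[bisect_left(cum_freqs, position)]


def table_median(values, freqs):
    """Median of an ungrouped frequency table."""
    cum = list(accumulate(freqs))   # running totals: 5, 13, 19, 24, 26
    n = cum[-1]
    if n % 2 == 1:                  # odd n: single middle value
        return value_at((n + 1) // 2, values, cum)
    lower = value_at(n // 2, values, cum)       # 13th value when n = 26
    upper = value_at(n // 2 + 1, values, cum)   # 14th value when n = 26
    return (lower + upper) / 2


mode = table_mode(pets, freq)
median = table_median(pets, freq)
print(mode, median)  # 1 1.5
```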
WE 3
Standard deviation from an ungrouped table
Twenty students scored as follows on a short quiz:
WE 4
Estimated mean from a grouped table
Customers’ waiting times (min) at a café are grouped:
time t | 0≤t<2 | 2≤t<4 | 4≤t<6 | 6≤t<8
freq f | 5     | 15    | 20    | 10
Estimate the mean waiting time.
Step 1: mid-interval values
midpoints: 1, 3, 5, 7
Step 2: n = Σf
n = 5 + 15 + 20 + 10 = 50
Step 3: Σ(f × midpoint)
5×1 + 15×3 + 20×5 + 10×7 = 5 + 45 + 100 + 70 = 220
Step 4: estimated mean
x̄ ≈ 220 / 50 = 4.4
Answer: estimated mean ≈ 4.4 min
Note: the true mean needs the original 50 wait times, which we don’t have. Using midpoints assumes every value sits at the centre of its class: a fair estimate, not exact.
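A short Python sketch of the midpoint method above may help show why the answer is only an estimate. Alongside the midpoint estimate it also computes what the mean would be if every value sat at the bottom or at the top of its class; the true mean must lie between those two extremes. The variable names are illustrative assumptions.

```python
# Grouped table for the café waiting times: (lower, upper) boundaries in minutes.
classes = [(0, 2), (2, 4), (4, 6), (6, 8)]
freq = [5, 15, 20, 10]

n = sum(freq)

# Mid-interval value of each class: (lower + upper) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Estimated mean: use the midpoints in place of the unknown exact values.
est_mean = sum(f * m for m, f in zip(midpoints, freq)) / n

# Extreme cases: every value at the lower (or upper) boundary of its class.
lowest_possible = sum(f * lo for (lo, _), f in zip(classes, freq)) / n
highest_possible = sum(f * hi for (_, hi), f in zip(classes, freq)) / n

print(midpoints)                          # [1.0, 3.0, 5.0, 7.0]
print(est_mean)                           # 4.4
print(lowest_possible, highest_possible)  # 3.4 5.4
```

The midpoint estimate of 4.4 sits in the middle of the possible range from 3.4 to 5.4, which is why it is reported with “≈”.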
WE 5
Find a missing frequency given the mean
The mean of the distribution below is 2.5. Find the missing frequency k.
x | 1 | 2 | 3 | 4
f | 4 | 8 | k | 4
Step 1: Σf and Σfx in terms of k
n = 4 + 8 + k + 4 = 16 + k
Σfx = 4 + 16 + 3k + 16 = 36 + 3k
Step 2: apply mean = 2.5
(36 + 3k) / (16 + k) = 2.5
Step 3: solve
36 + 3k = 2.5(16 + k)
36 + 3k = 40 + 2.5k
0.5k = 4
k = 8
Check: n = 24, Σfx = 60, mean = 60/24 = 2.5 ✓
Answer: k = 8
Note: for a “reverse mean” with an unknown frequency, write Σfx and Σf in terms of k, then solve mean × n = Σfx. Always verify the final answer.
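The same rearrangement can be sketched in Python. Solving mean × (known n + k) = known Σfx + x·k for k gives k = (known Σfx − mean × known n) / (mean − x), where x is the value whose frequency is missing. The function name `missing_frequency` and the use of `None` to mark the unknown entry are assumptions made for this illustration; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def missing_frequency(values, freqs, target_mean):
    """Solve for the single unknown frequency (marked None) given the mean."""
    mean = Fraction(str(target_mean))       # e.g. "2.5" -> 5/2 exactly
    x_missing = values[freqs.index(None)]   # value whose frequency is unknown
    known_n = sum(f for f in freqs if f is not None)
    known_fx = sum(f * x for x, f in zip(values, freqs) if f is not None)
    # mean * (known_n + k) = known_fx + x_missing * k, rearranged for k.
    # Assumes mean != x_missing, otherwise k cannot be determined this way.
    return (known_fx - mean * known_n) / (mean - x_missing)


values = [1, 2, 3, 4]
freqs = [4, 8, None, 4]

k = missing_frequency(values, freqs, 2.5)
print(k)  # 8

# Verify, as in the worked solution: n = 24, sum of fx = 60, mean = 2.5.
full = [4, 8, int(k), 4]
check_mean = sum(f * x for x, f in zip(values, full)) / sum(full)
print(check_mean)  # 2.5
```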
WE 6
Modal class, midpoint & estimated mean
Weights (kg) of 50 dogs at a vet clinic:
weight w (kg) | 0 ≤ w < 5 | 5 ≤ w < 10 | 10 ≤ w < 15 | 15 ≤ w < 20 | 20 ≤ w < 25
freq f        | 4         | 12         | 18          | 8           | 8
(a) Write down the modal class. (b) State the mid-interval value of the modal class. (c) Estimate the mean weight.
(a) Highest frequency = 18
modal class: 10 ≤ w < 15
(b) Midpoint of 10–15
(10 + 15) / 2 = 12.5
midpoint = 12.5 kg
(c) Estimate the mean
midpoints: 2.5, 7.5, 12.5, 17.5, 22.5
Σ(f × mid) = 4×2.5 + 12×7.5 + 18×12.5 + 8×17.5 + 8×22.5 = 10 + 90 + 225 + 140 + 180 = 645
x̄ ≈ 645 / 50 = 12.9
estimated mean ≈ 12.9 kg
Note: this three-part grouped question is exam standard. Always present the modal class as an INTERVAL (10 ≤ w < 15), not just a number.
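For readers who like to check all three parts at once, here is a hedged Python sketch. It keeps the modal class as an interval rather than collapsing it to a single number, mirroring the exam advice above. The variable names and the way the interval is formatted are illustrative choices.

```python
# Grouped table for the dog weights: (lower, upper) boundaries in kg.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
freq = [4, 12, 18, 8, 8]

# (a) Modal class: the interval with the highest frequency.
modal_index = max(range(len(freq)), key=lambda i: freq[i])
modal_lo, modal_hi = classes[modal_index]
modal_class = f"{modal_lo} ≤ w < {modal_hi}"

# (b) Mid-interval value of the modal class.
modal_midpoint = (modal_lo + modal_hi) / 2

# (c) Estimated mean using every class midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
sum_f_mid = sum(f * m for m, f in zip(midpoints, freq))
est_mean = sum_f_mid / sum(freq)

print(modal_class)      # 10 ≤ w < 15
print(modal_midpoint)   # 12.5
print(sum_f_mid, est_mean)  # 645.0 12.9
```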
💡 Top tips
Write the mid-interval row first for any grouped table — saves slips later.
Compute $n = \sum f_i$ early. It’s needed for the mean, the median position, and the GDC.
“Modal class” must be the INTERVAL, e.g. 10 ≤ w < 15. Don’t write just “12.5”.
GDC FreqList: don’t enter every repeated value — use the value list with a frequency list.
Round grouped answers to 3 sf and flag them as estimates (“≈” or “estimated mean”).
⚠ Common mistakes
Dividing by the number of rows instead of by $n = \sum f_i$. Mean = sum of values / count of data points.
Forgetting midpoints for grouped data: don’t use lower or upper boundaries — use (lower + upper) / 2.
Saying the mode = highest frequency: mode is the value with the highest frequency, not the frequency itself.
Picking $s_x$ instead of $\sigma_x$ on the GDC. IB uses the population standard deviation $\sigma_x$.
Reporting a grouped mean as exact: it’s always an estimate — use “≈” or write “estimated mean”.
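To see how much the choice between the two standard deviations matters, the sketch below expands the help-desk table from WE 1 into raw data and computes both with Python’s standard `statistics` module. `pstdev` divides by n (the population formula, corresponding to σx on the GDC) and `stdev` divides by n − 1 (corresponding to sx). The variable names are illustrative, and this is a cross-check rather than a replacement for the GDC.

```python
import statistics

# Help-desk table from WE 1.
calls = [5, 6, 7, 8, 9]
freq = [4, 7, 10, 6, 3]

# Expand the table into the 30 raw data points it represents.
raw = [x for x, f in zip(calls, freq) for _ in range(f)]

sigma_x = statistics.pstdev(raw)  # population std dev (divide by n)
s_x = statistics.stdev(raw)       # sample std dev (divide by n - 1)

print(len(raw))           # 30
print(round(sigma_x, 3))  # approximately 1.165
print(round(s_x, 3))      # approximately 1.185
```

The sample value is always the larger of the two, so quoting it by mistake overstates the spread slightly.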
Next up: Linear Transformations of Data. If every value $x_i$ is replaced by $a x_i + b$ (e.g. a teacher doubles all scores then adds 10), how does the mean change? What happens to the standard deviation? Short rules, big time-saver in exams.