IB Maths AI SL · Topic 4: Statistics Toolkit · Paper 1 & 2 · Ungrouped & grouped

Frequency Tables

When the same value (or interval) repeats often, listing every data point is wasteful. A frequency table packs the data into one row of values and one row of counts. The mean, median, mode and standard deviation still work — you just use the frequencies as multipliers. Grouped tables use mid-interval values and every answer becomes an estimate.

📘 What you need to know

Ungrouped table: rows give value x_i and frequency f_i. Exact values known.
Grouped table: rows give a class interval and its frequency. Exact values lost ⇒ everything is an estimate.
Mean: x̄ = (Σ f_i x_i) / n, with n = Σ f_i. Grouped: x_i = mid-interval value.
Mode / modal class = value (or class) with the highest frequency.
Median: use cumulative frequencies to find the middle position.
GDC: 1-Var Stats with FreqList returns mean, median, quartiles and σ at once.

Ungrouped frequency tables

Each row holds one value and how often it appears. The mode is the x-value (not the frequency!) with the largest count. For the median, build a cumulative frequency row (running total) and locate the middle position from there. Worked example below.

Grouped frequency tables

When data is continuous (or has many distinct values), values are grouped into class intervals. Exact values are lost — we use the mid-interval value as the representative for each class:

mid-interval value = (lower boundary + upper boundary) / 2.

The mean from a grouped table is therefore an estimate. Round to 3 sf and write “≈” to flag this.

Figure: (left) an ungrouped bar chart, where bars sit on individual values and the tallest is the mode; (right) a grouped histogram, where bars sit on class intervals and the tallest is the modal class. The orange ×’s mark the mid-interval values used to estimate the mean.

The frequency-table mean (formula booklet): x̄ = (Σ f_i x_i) / n, where n = Σ f_i

Grouped: use x_i = (lower + upper) / 2 (mid-interval value)

🧭 Recipe — tackle any frequency-table question

Read the row labels: individual values (ungrouped) or class intervals (grouped)?
If grouped, write the mid-interval values alongside each class. These replace x_i in every formula.
Compute n = Σf_i. Needed for the mean and to locate the median.
Mean: compute Σf_i x_i, then divide by n. For grouped, call it an estimate.
Std dev / quartiles: enter the value list and frequency list on the GDC, then run 1-Var Stats with a FreqList.

Ungrouped is EXACT, grouped is an ESTIMATE. Anything from a grouped table loses precision because individual values are unknown — report grouped answers as estimates and round (often to 3 sf).
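The recipe above can be sketched in a few lines of Python. This is only an illustration, not an exam method: the function names `midpoints` and `table_mean` and both small tables are hypothetical, chosen to show that a grouped table runs through the same formula once mid-interval values stand in for x_i.

```python
def midpoints(intervals):
    """Mid-interval value (lower + upper) / 2 for each class interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]


def table_mean(xs, freqs):
    """Mean of a frequency table: sum(f_i * x_i) / n, where n = sum(f_i)."""
    n = sum(freqs)
    total = sum(f * x for x, f in zip(xs, freqs))
    return total / n


# Hypothetical ungrouped table: values 1, 2, 3 with frequencies 2, 5, 3.
exact_mean = table_mean([1, 2, 3], [2, 5, 3])

# Hypothetical grouped table: classes 0-10, 10-20, 20-30 with frequencies 3, 5, 2.
# The midpoints replace x_i, so this result is an estimate.
estimated_mean = table_mean(midpoints([(0, 10), (10, 20), (20, 30)]), [3, 5, 2])
```

Note that the division is by n = Σf_i, never by the number of rows.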

Worked examples

WE 1

Mean from an ungrouped frequency table

A help desk records the number of calls received each day for 30 days:

calls x	5	6	7	8	9
freq f	4	7	10	6	3

Find the mean number of calls per day.

Step 1: total, n = Σf
n = 4 + 7 + 10 + 6 + 3 = 30
Step 2: Σfx
5×4 + 6×7 + 7×10 + 8×6 + 9×3 = 20 + 42 + 70 + 48 + 27 = 207
Step 3: mean
x̄ = 207 / 30 = 6.9
Answer: mean = 6.9 calls/day
Note: Σfx is the total of all values. Divide by n (the total count) to get the mean. It is the same formula as for raw data, just compacted.
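To see why the frequency-table formula is just the raw-data mean in compact form, one can write out all 30 days and average them directly. The sketch below does both using the help-desk table; the variable names are illustrative only.

```python
from statistics import mean

calls = [5, 6, 7, 8, 9]
freqs = [4, 7, 10, 6, 3]

# Frequency-table route: sum(f * x) / sum(f).
n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(calls, freqs))
table_mean = sum_fx / n

# Raw-data route: repeat each value f times to rebuild the 30 daily counts,
# then take the ordinary mean.
raw_days = [x for x, f in zip(calls, freqs) for _ in range(f)]
raw_mean = mean(raw_days)
```

Both routes agree, which is the point of using frequencies as multipliers.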

WE 2

Mode and median from a frequency table

The number of pets owned by 26 households:

pets x	0	1	2	3	4
freq f	5	8	6	5	2

Find the mode and the median.

Mode = value with the highest frequency
f = 8 is the largest → mode = 1
Median: n = 26 (even) → midpoint of the 13th and 14th values
Cumulative frequencies: 5, 13, 19, 24, 26
Locate the 13th and 14th values
cumulative frequency reaches 13 at x = 1, so positions 6 to 13 hold the value 1 → 13th value = 1
cumulative frequency reaches 19 at x = 2, so positions 14 to 19 hold the value 2 → 14th value = 2
Median = (1 + 2) / 2 = 1.5
Answer: mode = 1 · median = 1.5
Note: cumulative frequency = running total. Use it to locate where the 13th and 14th households “sit” without writing out all 26 values.
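The cumulative-frequency lookup used above can be sketched as follows. This is one possible way to code it, assuming the table is already sorted by value; the helper `value_at` is a hypothetical name.

```python
from bisect import bisect_left
from itertools import accumulate

pets = [0, 1, 2, 3, 4]
freqs = [5, 8, 6, 5, 2]

# Mode: the x-value whose frequency is largest (not the frequency itself).
mode = max(zip(pets, freqs), key=lambda pair: pair[1])[0]

# Cumulative frequencies: running total of the frequency row.
cum = list(accumulate(freqs))
n = cum[-1]


def value_at(position):
    """Value at a 1-based position in the sorted data."""
    # Index of the first row whose cumulative frequency reaches the position.
    return pets[bisect_left(cum, position)]


if n % 2 == 0:
    # Even n: average the two middle values.
    median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
else:
    median = value_at((n + 1) // 2)
```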

WE 3

Standard deviation from an ungrouped table

Twenty students scored as follows on a short quiz:

score x	1	2	3	4
freq f	2	4	6	8

Find the mean, variance and standard deviation.

n = 2 + 4 + 6 + 8 = 20
Σfx = 2 + 8 + 18 + 32 = 60
Mean: x̄ = 60 / 20 = 3
Σf(x − x̄)²:
(1 − 3)² × 2 = 4 × 2 = 8
(2 − 3)² × 4 = 1 × 4 = 4
(3 − 3)² × 6 = 0
(4 − 3)² × 8 = 1 × 8 = 8
sum = 8 + 4 + 0 + 8 = 20
Variance and σ: σ² = 20 / 20 = 1, so σ = √1 = 1
Answer: mean = 3 · variance = 1 · σ = 1
Note: in the exam, enter L1 = values and L2 = frequencies, then run 1-Var Stats L1, L2. The GDC returns σ directly; the by-hand version shows what it is doing.
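The by-hand calculation above maps onto code almost line for line. The sketch below mirrors each step for the quiz table; it illustrates the population formula (divide by n) and is not a description of what any particular GDC does internally.

```python
from math import sqrt

scores = [1, 2, 3, 4]
freqs = [2, 4, 6, 8]

n = sum(freqs)
mean = sum(f * x for x, f in zip(scores, freqs)) / n

# One entry per row of the table: f * (x - mean)^2.
row_contributions = [f * (x - mean) ** 2 for x, f in zip(scores, freqs)]

# Population variance divides the total by n; sigma is its square root.
variance = sum(row_contributions) / n
sigma = sqrt(variance)
```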

WE 4

Estimate the mean from a grouped table

Customers’ waiting times (min) at a café are grouped:

time t	0≤t<2	2≤t<4	4≤t<6	6≤t<8
freq f	5	15	20	10

Estimate the mean waiting time.

Step 1: mid-interval values
midpoints: 1, 3, 5, 7
Step 2: n = Σf
n = 5 + 15 + 20 + 10 = 50
Step 3: Σf × midpoint
5×1 + 15×3 + 20×5 + 10×7 = 5 + 45 + 100 + 70 = 220
Step 4: estimated mean
x̄ ≈ 220 / 50 = 4.4
Answer: estimated mean ≈ 4.4 min
Note: the true mean needs the original 50 wait-times, which we don’t have. Using midpoints assumes every value sits at the centre of its class. It is a fair estimate, not an exact value.
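One way to see why the grouped mean is only an estimate is to ask how far the true mean could be from it. The sketch below recomputes the café mean three times: with midpoints, with every value pushed to its lower boundary, and with every value pushed to its upper boundary. The helper `weighted_mean` is a hypothetical name, and the bounds are an added illustration rather than something the exam asks for.

```python
classes = [(0, 2), (2, 4), (4, 6), (6, 8)]
freqs = [5, 15, 20, 10]
n = sum(freqs)


def weighted_mean(xs):
    """sum(f * x) / n for one representative value x per class."""
    return sum(f * x for x, f in zip(xs, freqs)) / n


mids = [(lo + hi) / 2 for lo, hi in classes]
estimate = weighted_mean(mids)

# Extreme cases: every customer at the bottom or at the top of their class.
lowest_possible = weighted_mean([lo for lo, _ in classes])
highest_possible = weighted_mean([hi for _, hi in classes])
```

The true mean lies somewhere between the two extremes, and the midpoint estimate sits in the middle of that range.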

WE 5

Find a missing frequency given the mean

The mean of the distribution below is 2.5. Find the missing frequency k.

x	1	2	3	4
f	4	8	k	4

Step 1: Σf and Σfx in terms of k
n = 4 + 8 + k + 4 = 16 + k
Σfx = 4 + 16 + 3k + 16 = 36 + 3k
Step 2: apply mean = 2.5
(36 + 3k) / (16 + k) = 2.5
Step 3: solve
36 + 3k = 2.5(16 + k)
36 + 3k = 40 + 2.5k
0.5k = 4
k = 8
Check: n = 24, Σfx = 60, mean = 60 / 24 = 2.5 ✓
Answer: k = 8
Note: this is a “reverse mean” question with an unknown frequency. Write Σfx and Σf in terms of k, then solve mean × n = Σfx. Always verify the final answer.
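The same rearrangement can be written once as a small function. This is a sketch under the assumption that exactly one frequency is unknown; `missing_frequency` is a hypothetical helper, and `Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction


def missing_frequency(known, x_missing, target_mean):
    """Solve (sum_fx + k * x_missing) / (n + k) = target_mean for k.

    `known` holds (value, frequency) pairs for the rows whose frequency is given.
    Fails with ZeroDivisionError if x_missing equals target_mean, because then
    k cannot be determined from the mean alone.
    """
    n_known = sum(f for _, f in known)
    sum_fx_known = sum(f * x for x, f in known)
    # Rearranged: k * (x_missing - target_mean) = target_mean * n_known - sum_fx_known
    return (target_mean * n_known - sum_fx_known) / (x_missing - target_mean)


known_rows = [(1, 4), (2, 8), (4, 4)]
k = missing_frequency(known_rows, 3, Fraction(5, 2))

# Verification step, as in the worked solution: recompute the mean with k in place.
check_mean = (36 + 3 * k) / (16 + k)
```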

WE 6

Modal class, midpoint & estimated mean

Weights (kg) of 50 dogs at a vet clinic:

weight w	0–5	5–10	10–15	15–20	20–25
freq f	4	12	18	8	8

(a) Write down the modal class. (b) State the mid-interval value of the modal class. (c) Estimate the mean weight.

(a) Highest frequency = 18
modal class: 10 ≤ w < 15
(b) Midpoint of 10–15
(10 + 15) / 2 = 12.5
midpoint = 12.5 kg
(c) Estimate the mean
midpoints: 2.5, 7.5, 12.5, 17.5, 22.5
Σf×mid = 4×2.5 + 12×7.5 + 18×12.5 + 8×17.5 + 8×22.5 = 10 + 90 + 225 + 140 + 180 = 645
x̄ ≈ 645 / 50 = 12.9
Answer: estimated mean ≈ 12.9 kg
Note: a three-part grouped question like this is exam standard. Always present the modal class as an INTERVAL (10 ≤ w < 15), not just a number.
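The three parts above can be checked together in a short script. It is a sketch only, assuming each class is stored as a (lower, upper) pair; the variable names are illustrative.

```python
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
freqs = [4, 12, 18, 8, 8]

# (a) Modal class: the interval with the highest frequency.
modal_class = classes[freqs.index(max(freqs))]

# (b) Mid-interval value of the modal class.
modal_midpoint = sum(modal_class) / 2

# (c) Estimated mean from the midpoints of every class.
mids = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
estimated_mean = sum(f * m for m, f in zip(mids, freqs)) / n

# Report the modal class as an interval, not as a single number.
print(f"modal class: {modal_class[0]} <= w < {modal_class[1]}")
```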

💡 Top tips

Write the mid-interval row first for any grouped table — saves slips later.
Compute n = Σf_i early. It’s needed for the mean, the median position, and the GDC.
“Modal class” must be the INTERVAL, e.g. 10 ≤ w < 15. Don’t write just “12.5”.
GDC FreqList: don’t enter every repeated value — use the value list with a frequency list.
Round grouped answers to 3 sf and flag them as estimates (“≈” or “estimated mean”).

⚠ Common mistakes

Dividing by the number of rows instead of by n = Σf_i. Mean = sum of values / count of data points.
Forgetting midpoints for grouped data: don’t use lower or upper boundaries — use (lower + upper) / 2.
Saying the mode = highest frequency: mode is the value with the highest frequency, not the frequency itself.
Picking s_x instead of σ_x on the GDC. IB uses the population std dev σ_x.
Reporting a grouped mean as exact: it’s always an estimate — use “≈” or write “estimated mean”.
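The s_x versus σ_x mistake is easy to see numerically. The sketch below uses Python's standard `statistics` module on the quiz scores from WE 3, written out as raw data: `pstdev` divides by n (the population value, matching σ_x) and `stdev` divides by n − 1 (the sample value, matching s_x).

```python
from statistics import pstdev, stdev

scores = [1, 2, 3, 4]
freqs = [2, 4, 6, 8]

# Write out the 20 individual scores from the frequency table.
raw_scores = [x for x, f in zip(scores, freqs) for _ in range(f)]

sigma_x = pstdev(raw_scores)  # population: divides by n
s_x = stdev(raw_scores)       # sample: divides by n - 1, so slightly larger
```

Here σ_x equals 1 while s_x comes out a little above 1, so choosing the wrong one changes the final answer.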

Next up: Linear Transformations of Data. If every value x_i is replaced by ax_i + b (e.g. a teacher doubles all scores then adds 10), how does the mean change? What happens to the standard deviation? Short rules, big time-saver in exams.
