IB Maths AA SL Topic 4 — Statistics Toolkit Paper 1 & 2 ~12 min read

Frequency Tables

When you have lots of repeated values — or huge data sets — listing every single value gets messy fast. Frequency tables compress all that data into a tidy summary. This note shows you how to find every statistical measure from them, whether the data is grouped or not.

📘 What you need to know

A frequency table shows how many times each value (or group of values) appears.
Ungrouped tables list each individual value — you know exactly what the data is.
Grouped tables organise data into class intervals (like 10 ≤ x < 20) — you don’t know the exact values.
For ungrouped data, you find exact statistics. For grouped data, you can only estimate them using mid-interval values.
Mean formula: x̄ = (Σ f_i x_i) ÷ n, where n = Σf_i (in your formula booklet).
For grouped data, the modal class = the class with the highest frequency. The mid-interval value is the average of the lower and upper boundaries of a class.

The two types of frequency table

Frequency tables come in two flavours. The big difference is whether each row gives you an exact value or a range — and that completely changes what you can calculate.

Ungrouped frequency table

Each row holds one specific value and how many times it occurred. You know the exact data.

e.g. “Number of pets” → 0, 1, 2, 3

Grouped frequency table

Each row holds a range of values (a “class interval”) and how many fell into that range. You don’t know the exact values.

e.g. “Height” → 150 ≤ h < 155, 155 ≤ h < 160, …

If a question gives you a frequency table, the very first thing to check is: are these single values or class intervals? That tells you whether you’ll be finding exact answers or estimates.

Ungrouped frequency tables

An ungrouped table shows you the exact data — just compressed. Imagine writing out every value individually: that’s still possible from this table, you just don’t have to.

Example layout

Number of pets (x)	0	1	2	3
Frequency (f)	11	5	8	6

This is just shorthand for: eleven 0s, five 1s, eight 2s, and six 3s. Total of 30 students.
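To see that the table really is just a compressed list, here is a minimal Python sketch that expands it back out. The names `pets_table` and `expand` are made up for illustration; nothing like this is needed in the exam.

```python
# Frequency table from the example layout: value -> frequency.
pets_table = {0: 11, 1: 5, 2: 8, 3: 6}

def expand(table):
    """Write out every value as many times as its frequency says."""
    values = []
    for value, freq in table.items():
        values.extend([value] * freq)
    return values

raw_data = expand(pets_table)
print(len(raw_data))   # total number of students: 30
print(raw_data[:12])   # the eleven 0s, followed by the first 1
```

Every statistic you could find from the long list can be found from the table, because the two hold the same information.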

How to find each statistic

📍 Mode

The mode is the value with the highest frequency — easy to spot, just look for the biggest f.

📍 Median

The median is the middle value when all the data is in order. Use a cumulative frequency (running total) to find which row it lands in.

Find the position: median position = (n + 1) ÷ 2. If n is even this gives a half-position (e.g. 15.5 when n = 30), so average the values in positions n ÷ 2 and n ÷ 2 + 1.
Build a cumulative frequency row by adding running totals.
Find which row contains your position — that’s the median value.
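The three steps above can be sketched in Python as follows. This is only an illustration of the method, assuming the values are listed in ascending order; `median_from_table` is a hypothetical helper name, not a GDC or library function.

```python
from itertools import accumulate

def median_from_table(values, freqs):
    """Median of an ungrouped frequency table (values in ascending order)."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))  # running totals

    def value_at(position):
        # First row whose cumulative frequency reaches the position.
        for value, cum in zip(values, cumulative):
            if cum >= position:
                return value

    if n % 2 == 1:
        return value_at((n + 1) // 2)
    # n even: average the two middle values.
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2

# Pets table: n = 30, so the 15th and 16th values are averaged.
print(median_from_table([0, 1, 2, 3], [11, 5, 8, 6]))  # 1.0
```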

📍 Mean

Mean for an ungrouped frequency table

x̄ = (Σ f_i x_i) ÷ n

where n = Σf_i (the total frequency)

In plain English:

Multiply each value by its frequency: f_i × x_i.
Add all those products together: that’s Σ f_i x_i (the total of every value, repeated by its frequency).
Divide by the total frequency.

🤔 Why multiply value × frequency?

If the value 2 occurs 8 times, the contribution to the total is 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 = 8 × 2 = 16. So f_i × x_i is just a shortcut for “add this value f times”. Then dividing by the total frequency gives the mean.
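As a rough check of that idea, the short Python sketch below computes the mean straight from the table. `mean_from_table` is an illustrative name chosen here, not part of any calculator or library.

```python
def mean_from_table(values, freqs):
    """Mean of a frequency table: sum of f_i * x_i divided by total frequency."""
    total = sum(f * x for x, f in zip(values, freqs))  # Σ f_i x_i
    n = sum(freqs)                                     # Σ f_i
    return total / n

# Pets table: Σfx = 39 and n = 30.
print(mean_from_table([0, 1, 2, 3], [11, 5, 8, 6]))  # 1.3
```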

📍 Standard deviation, range, IQR

Use your GDC. Enter the values in one list and the frequencies in another, run 1-Var Stats with frequency mode on. The calculator handles all the rest.
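If you are curious what the calculator is doing with the two lists, here is a minimal sketch. It assumes the population standard deviation σ_x (dividing by n), which is the value quoted in the worked examples of this note; the function names are hypothetical.

```python
from math import sqrt

def population_sd(values, freqs):
    """Population standard deviation (sigma_x) from a frequency table."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # Average squared distance from the mean, weighted by frequency.
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return sqrt(variance)

def data_range(values, freqs):
    """Largest minus smallest value that actually occurs (frequency > 0)."""
    present = [x for x, f in zip(values, freqs) if f > 0]
    return max(present) - min(present)

values, freqs = [0, 1, 2, 3], [11, 5, 8, 6]
print(round(population_sd(values, freqs), 3))  # 1.159
print(data_range(values, freqs))               # 3
```

Leaving out `freqs` here would be the same mistake as forgetting the frequency list on the GDC: each row would count only once.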

📍 Always check your answer makes sense

The mean, median, and mode should always sit inside the range of your data. If the values go from 0 to 3 but you’ve calculated a mean of 5, you’ve made an arithmetic error somewhere.

Grouped frequency tables

Grouped tables look like this:

Height, h (cm)	Frequency
150 ≤ h < 155	3
155 ≤ h < 160	5
160 ≤ h < 165	9
165 ≤ h < 170	7
170 ≤ h < 175	1

You can see that 9 students have a height between 160 and 165 — but you don’t know if they’re 161 cm or 164.9 cm. That’s the cost of grouping the data. So instead of finding exact statistics, you find estimates.

The mid-interval value (the trick that makes it all work)

Mid-interval value

Mid-interval = (lower boundary + upper boundary) ÷ 2

For each class, find the midpoint and use it as a “stand-in” for every value in that class. So for 160 ≤ h < 165, the mid-interval is (160 + 165) ÷ 2 = 162.5 cm — and we treat all 9 students as if they’re each 162.5 cm tall.
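A small Python sketch of the same calculation for every class in the heights table. The list `classes` and the helper `mid_interval` are illustrative names only.

```python
# Class boundaries (lower, upper) in cm, taken from the heights table.
classes = [(150, 155), (155, 160), (160, 165), (165, 170), (170, 175)]

def mid_interval(lower, upper):
    """Midpoint of a class: the average of its two boundaries."""
    return (lower + upper) / 2

mids = [mid_interval(lower, upper) for lower, upper in classes]
print(mids)  # [152.5, 157.5, 162.5, 167.5, 172.5]
```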

🧠 Memory trick: “Pretend they’re all in the middle”

You can’t know exactly where each value falls, so guess the safest spot — the middle of the class. Some will be a bit higher, some a bit lower, and the errors mostly cancel out.

What you can find from a grouped table

Modal class — the class with the highest frequency. Note: this is a class, not a single value (you can’t say which exact height was most common).
Estimated mean — use the formula but with mid-interval values:

Estimated mean (grouped data)

x̄ ≈ (Σ f_i x_i) ÷ n (x_i = mid-interval values)

Estimated standard deviation, variance — same idea, plug into your GDC using the mid-interval values and frequencies.
Estimated median, quartiles, IQR — best found from a cumulative frequency graph (covered in the next note).
Range — can’t be found exactly, since you don’t know the smallest and largest values.
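The first two items in this list can be sketched in Python like this, using the heights table. Treat it as an illustration under the mid-interval assumption; `modal_class` and `estimated_mean` are hypothetical helper names.

```python
# Heights table: class boundaries (lower, upper) in cm and their frequencies.
classes = [(150, 155), (155, 160), (160, 165), (165, 170), (170, 175)]
freqs = [3, 5, 9, 7, 1]

def modal_class(classes, freqs):
    """The class (not a single value) with the highest frequency."""
    return classes[freqs.index(max(freqs))]

def estimated_mean(classes, freqs):
    """Estimate the mean by using each class midpoint as a stand-in value."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * m for m, f in zip(mids, freqs)) / sum(freqs)

print(modal_class(classes, freqs))               # (160, 165)
print(round(estimated_mean(classes, freqs), 1))  # 162.1
```

The result is an estimate because the true heights inside each class are unknown.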

📍 Show that you know they’re estimates

It’s good practice to round grouped-data answers (e.g. 162.1 cm to 4 s.f.) rather than leave them as exact fractions. This signals to the marker that you understand the answer is an approximation.

In the formula booklet, the mean formula looks the same for both ungrouped and grouped data — but the meaning of x_i changes. For ungrouped, x_i is the actual value. For grouped, it’s the mid-interval. Same formula, different interpretation.

Worked examples

WE 1

Find statistics from an ungrouped frequency table

The frequency table below shows the number of pets owned by 30 students.

Number of pets	0	1	2	3
Frequency	11	5	8	6

Find: (a) the mode (b) the median (c) the mean (d) the standard deviation.

Total frequency n = 11 + 5 + 8 + 6 = 30 students.

Part (a): mode
Highest frequency = 11, which belongs to the value 0.
Mode = 0

Part (b): median
n = 30, so the median is the average of the 15th and 16th values.
Cumulative frequencies: 0 → 11, 1 → 16, 2 → 24, 3 → 30
The 15th and 16th values both land in the “1” row.
Median = 1

Part (c): mean
Use Σfx ÷ n:
Σfx = 11×0 + 5×1 + 8×2 + 6×3 = 0 + 5 + 16 + 18 = 39
Mean = 39 ÷ 30 = 1.3

Part (d): standard deviation
Enter the values in L1 and the frequencies in L2, then run 1-Var Stats.
σ_x ≈ 1.159…
SD = 1.16 (3 s.f.)

Always set the frequency list when running stats. Otherwise the calculator counts each row only once!

WE 2

Find statistics from a grouped frequency table

The table below shows the heights, in cm, of 25 students.

Height, h	Frequency
150 ≤ h < 155	3
155 ≤ h < 160	5
160 ≤ h < 165	9
165 ≤ h < 170	7
170 ≤ h < 175	1

(a) Write down the modal class. (b) Find the mid-interval value of the modal class. (c) Estimate the mean height.

Grouped data, so the answers will be estimates, not exact values.

Part (a): modal class
Highest frequency = 9, in the row 160 ≤ h < 165.
Modal class = 160 ≤ h < 165

Part (b): mid-interval value
Mid-interval = (lower + upper) ÷ 2 = (160 + 165) ÷ 2 = 162.5
Mid-interval = 162.5 cm

Part (c): estimated mean
Find each mid-interval, then Σfx.
Mid-intervals: 152.5, 157.5, 162.5, 167.5, 172.5
Σfx = 3×152.5 + 5×157.5 + 9×162.5 + 7×167.5 + 1×172.5 = 457.5 + 787.5 + 1462.5 + 1172.5 + 172.5 = 4052.5
Divide by n = 25: 4052.5 ÷ 25 = 162.1
Estimated mean ≈ 162.1 cm

Use the mid-interval values as the “x”. They are stand-ins for every value in the class.

WE 3

Find a missing frequency given the mean

The frequency table shows the number of goals scored in 20 football matches. Given that the mean is 1.5, find the value of k.

Goals	0	1	2	3
Frequency	4	k	6	3

Given: total = 20 matches, mean = 1.5. That gives two conditions on k: the frequencies must add to 20, and Σfx must equal mean × n.

Total frequency: 4 + k + 6 + 3 = 20, so k = 20 − 13 = 7.
Check using the mean: Σfx = 4×0 + 7×1 + 6×2 + 3×3 = 0 + 7 + 12 + 9 = 28, so mean = 28 ÷ 20 = 1.4 ❌ This doesn’t match 1.5.

Try the mean equation instead: Σfx = 1.5 × 20 = 30, so 0 + k + 12 + 9 = 30 and k = 9.
Check using the total: 4 + 9 + 6 + 3 = 22, not 20.

The two conditions conflict, so only one can be satisfied. The question states “20 football matches”, so the total must hold, which gives k = 7.
k = 7 (and the mean is 1.4, not exactly 1.5)

Always sanity-check both constraints. If they conflict, the question may contain a typo or expect an approximation.
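The two checks in this example can be reproduced with a short Python sketch. It assumes, as the working above does, that n = 20 is used in the mean equation; the variable names are illustrative.

```python
# Goals table with the frequency of "1 goal" unknown (k).
known = {0: 4, 2: 6, 3: 3}   # value -> known frequency
n, mean = 20, 1.5            # stated total and stated mean

# Condition 1: frequencies add up to n.
k_from_total = n - sum(known.values())

# Condition 2: sum of f*x equals mean * n; the unknown row has value x = 1.
known_fx = sum(x * f for x, f in known.items())
k_from_mean = (mean * n - known_fx) / 1

print(k_from_total)                  # 7
print(k_from_mean)                   # 9.0
print(k_from_total == k_from_mean)   # False: the two conditions conflict
```

In a well-posed question both lines would give the same k, which makes this a quick consistency check.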

WE 4

Use cumulative frequency to find the median

The table shows the number of children per family in a survey of 40 families. Find the median.

Children	0	1	2	3	4
Frequency	5	12	15	6	2

n = 40, so the median is the average of the 20th and 21st values. Build cumulative frequencies (running totals) to locate the right row.

Cumulative frequencies: 0 → 5, 1 → 17, 2 → 32, 3 → 38, 4 → 40
20th value: the first row where the cumulative frequency is ≥ 20 is “2” (at 32).
21st value: also in row “2”.
Median = average of two “2”s:
Median = 2 children

Cumulative frequency = running total. It is the perfect tool for finding the median’s row.

WE 5

Estimate the standard deviation from grouped data

Using the heights table from WE 2 (25 students), estimate the standard deviation of the heights.

Grouped data, so use mid-interval values in the GDC. The SD will be an estimate.

Enter the mid-intervals in L1: 152.5, 157.5, 162.5, 167.5, 172.5
Enter the frequencies in L2: 3, 5, 9, 7, 1
Run 1-Var Stats with frequency list = L2: σ_x ≈ 5.276…
Check by hand: with x̄ = 162.1, Σf(x − x̄)² = 696, so σ_x = √(696 ÷ 25) = √27.84 ≈ 5.276
Estimated SD ≈ 5.28 cm (3 s.f.)

Remember that this is an estimate, because we treated everyone in a class as having the mid-interval height.
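For readers who want to check the GDC output independently, here is a minimal Python sketch of the same calculation. It assumes the population standard deviation (dividing by n) and uses a hypothetical helper name, `estimated_sd`.

```python
from math import sqrt

def estimated_sd(mids, freqs):
    """Estimated population SD of grouped data, using mid-interval values."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    # Frequency-weighted average of squared distances from the mean.
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return sqrt(variance)

mids = [152.5, 157.5, 162.5, 167.5, 172.5]   # mid-interval heights in cm
freqs = [3, 5, 9, 7, 1]
sd = estimated_sd(mids, freqs)
print(f"Estimated SD: {sd:.3g} cm")  # rounded to 3 s.f.
```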

💡 Top tips

Always set the frequency list on your GDC. Otherwise the calculator treats each row as one data point — and your answers will be totally wrong.
For ungrouped data, answers are exact. For grouped data, they’re estimates — always say “estimated” in your final answer.
Build a cumulative frequency row when finding the median from any frequency table. It locates the median’s row instantly.
Mid-interval value = (lower boundary + upper boundary) ÷ 2. Always work this out for every class before estimating the mean or SD.
The modal class is a class, not a single value. Write “the modal class is 160 ≤ h < 165”, not “the mode is 162.5”.
Sanity check by total frequency. Σf_i should equal n — if not, you’ve miscounted.
Round grouped-data answers to 3 s.f. or similar — this signals to the marker that you understand the answer is an estimate.
For “find a missing frequency given the mean” questions, use the equation Σf_i x_i = mean × n.

⚠ Common mistakes

Forgetting the frequency list on the GDC. If you enter just the values without the frequency column, the calculator gives the wrong mean and SD.
Using just x instead of f × x in the mean. Don’t divide Σx by the number of rows (e.g. 4 in the pets table); divide Σfx by Σf (the total frequency).
Saying “mode = 162.5” for grouped data. You can only give a modal class, not a single mode.
Confusing exact with estimate. Grouped data → always estimates. Don’t write “the mean is exactly 162.1 cm”.
Wrong mid-interval values. For 160 ≤ h < 165, the midpoint is 162.5 — not 160 or 162 or 165. Take the average of the two boundaries.
Including the wrong row in cumulative frequency. The cumulative frequency at the end of a class is the running total up to and including that class.
Forgetting units. Heights → cm. Time → seconds. Always include them in the final answer.
Stopping at the modal class. Some questions ask for the mid-interval of the modal class as well — read the question carefully.

Frequency tables are the bridge between raw data and graphs. The next note covers linear transformations of data — what happens to the mean and SD when you scale or shift everything in your data set.

