IB Maths AA SL Topic 4: Statistics Toolkit, Paper 1 & 2

Frequency Tables

When you have lots of repeated values — or huge data sets — listing every single value gets messy fast. Frequency tables compress all that data into a tidy summary. This note shows you how to find every statistical measure from them, whether the data is grouped or not.

📘 What you need to know

The two types of frequency table

Frequency tables come in two flavours. The big difference is whether each row gives you an exact value or a range — and that completely changes what you can calculate.

Ungrouped frequency table

Each row holds one specific value and how many times it occurred. You know the exact data.
e.g. “Number of pets” → 0, 1, 2, 3

Grouped frequency table

Each row holds a range of values (a “class interval”) and how many fell into that range. You don’t know the exact values.
e.g. “Height” → 150 ≤ h < 155, 155 ≤ h < 160, …
If a question gives you a frequency table, the very first thing to check is: are these single values or class intervals? That tells you whether you’ll be finding exact answers or estimates.

Ungrouped frequency tables

An ungrouped table shows you the exact data — just compressed. Imagine writing out every value individually: that’s still possible from this table, you just don’t have to.

Example layout

Number of pets (x)    0     1     2     3
Frequency (f)         11    5     8     6

This is just shorthand for: eleven 0s, five 1s, eight 2s, and six 3s. Total of 30 students.
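To see that the table really is just compressed raw data, here is a minimal Python sketch that writes every value back out. The names `pets_table` and `raw_data` are made up for illustration; they are not part of the syllabus or of any calculator.

```python
# Ungrouped frequency table from the example: value -> frequency.
pets_table = {0: 11, 1: 5, 2: 8, 3: 6}

# Write out every value individually: eleven 0s, five 1s, eight 2s, six 3s.
raw_data = [value for value, freq in pets_table.items() for _ in range(freq)]

print(len(raw_data))  # total number of students
print(raw_data)       # the full list the table is shorthand for
```

Nothing is lost when data is stored in an ungrouped table, which is why the statistics you get from it are exact.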

How to find each statistic

📍 Mode

The mode is the value with the highest frequency — easy to spot, just look for the biggest f.
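As a quick sketch of "look for the biggest f", assuming the pets table is stored as a Python dictionary (the name `pets_table` is hypothetical):

```python
# value -> frequency
pets_table = {0: 11, 1: 5, 2: 8, 3: 6}

# Pick the value whose frequency is largest.
# If two values tie, max() returns the first one it meets,
# so a tied (bimodal) table would need an extra check.
mode = max(pets_table, key=pets_table.get)

print(mode)
```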

📍 Median

The median is the middle value when all the data is in order. Use a cumulative frequency (running total) to find which row it lands in.

  1. Find the position: median position = (n + 1) ÷ 2. If n is even this lands halfway between two positions, so average the values at positions n ÷ 2 and n ÷ 2 + 1.
  2. Build a cumulative frequency row by adding running totals.
  3. Find which row contains your position — that’s the median value.
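The three steps above can be sketched in Python. This is an illustration only, assuming the values are listed in ascending order; the function names `value_at_position` and `median_from_table` are invented for this note.

```python
from itertools import accumulate

values = [0, 1, 2, 3]   # number of pets
freqs = [11, 5, 8, 6]   # frequency of each value


def value_at_position(position, values, cumulative):
    """Return the value in the first row whose running total reaches `position` (1-based)."""
    for value, running_total in zip(values, cumulative):
        if running_total >= position:
            return value
    raise ValueError("position is larger than the total frequency")


def median_from_table(values, freqs):
    cumulative = list(accumulate(freqs))  # step 2: running totals
    n = cumulative[-1]
    if n % 2 == 1:
        # odd n: a single middle position, (n + 1) / 2
        return value_at_position((n + 1) // 2, values, cumulative)
    # even n: average the two middle values
    lower = value_at_position(n // 2, values, cumulative)
    upper = value_at_position(n // 2 + 1, values, cumulative)
    return (lower + upper) / 2


cumulative = list(accumulate(freqs))
median = median_from_table(values, freqs)
print(cumulative, median)
```

For the pets table the running totals are 11, 16, 24, 30, and the 15th and 16th positions both fall in the row for 1 pet.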

📍 Mean

Mean for an ungrouped frequency table
Mean = (Σ fi xi) ÷ n
where n = Σfi (the total frequency)

In plain English:

  1. Multiply each value by its frequency: fi × xi.
  2. Add all those products together: that’s Σ fi xi (the total of every value, repeated by its frequency).
  3. Divide by the total frequency.

🤔 Why multiply value × frequency?

If the value 2 occurs 8 times, the contribution to the total is 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 = 8 × 2 = 16. So fi × xi is just a shortcut for “add this value f times”. Then dividing by the total frequency gives the mean.
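Here is a small sketch of the same calculation for the pets table, with a check that the shortcut agrees with adding up every value one by one. The variable names are chosen for this example only.

```python
values = [0, 1, 2, 3]   # xi
freqs = [11, 5, 8, 6]   # fi

# Step 1: multiply each value by its frequency.
products = [f * x for f, x in zip(freqs, values)]

# Step 2: add the products to get the sum of f times x.
sum_fx = sum(products)

# Step 3: divide by the total frequency.
n = sum(freqs)
mean = sum_fx / n

# Same result the long way: write out every value and average the list.
raw_data = [x for x, f in zip(values, freqs) for _ in range(f)]
long_way_mean = sum(raw_data) / len(raw_data)

print(products, sum_fx, n, mean)
```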

📍 Standard deviation, range, IQR

Use your GDC. Enter the values in one list and the frequencies in another, run 1-Var Stats with frequency mode on. The calculator handles all the rest.
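If you want to see what the calculator is doing behind the scenes, this sketch computes the standard deviation and range for the pets table by hand. It assumes the population formula, which is what σx on the GDC reports. IQR is left out because quartile conventions differ between calculators and software.

```python
from math import sqrt

values = [0, 1, 2, 3]   # list 1 on the GDC
freqs = [11, 5, 8, 6]   # list 2 (the frequency list)

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, values)) / n

# Population variance: average of the squared distances from the mean,
# with each distance counted f times.
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
sigma = sqrt(variance)

# Range: largest value minus smallest value (every value here has f > 0).
data_range = max(values) - min(values)

print(round(sigma, 3), data_range)
```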

📍

Always check your answer makes sense

The mean, median, and mode should always sit inside the range of your data. If the values go from 0 to 3 but you’ve calculated a mean of 5, you’ve made an arithmetic error somewhere.

Grouped frequency tables

Grouped tables look like this:

Height, h (cm)     Frequency
150 ≤ h < 155      3
155 ≤ h < 160      5
160 ≤ h < 165      9
165 ≤ h < 170      7
170 ≤ h < 175      1

You can see that 9 students have a height between 160 and 165 — but you don’t know if they’re 161 cm or 164.9 cm. That’s the cost of grouping the data. So instead of finding exact statistics, you find estimates.

The mid-interval value (the trick that makes it all work)

Mid-interval value
Mid-interval = (lower boundary + upper boundary) ÷ 2

For each class, find the midpoint and use it as a “stand-in” for every value in that class. So for 160 ≤ h < 165, the mid-interval is (160 + 165) ÷ 2 = 162.5 cm — and we treat all 9 students as if they’re each 162.5 cm tall.

🧠

Memory trick: “Pretend they’re all in the middle”

You can’t know exactly where each value falls, so guess the safest spot — the middle of the class. Some will be a bit higher, some a bit lower, and the errors mostly cancel out.
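A minimal sketch of finding every mid-interval value for the heights table, assuming each class is stored as a (lower, upper) pair. The names `classes` and `midpoints` are illustrative.

```python
# Class boundaries for the heights table, in cm: (lower, upper).
classes = [(150, 155), (155, 160), (160, 165), (165, 170), (170, 175)]

# Mid-interval value = (lower boundary + upper boundary) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

print(midpoints)
```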

What you can find from a grouped table

Estimated mean (grouped data)
Estimated mean = (Σ fi xi) ÷ n   (xi = mid-interval values)
📍

Show that you know they’re estimates

It’s good practice to round grouped-data answers (e.g. 162.1 cm to 4 s.f.) rather than leave them as exact fractions. This signals to the marker that you understand the answer is an approximation.

In the formula booklet, the mean formula looks the same for both ungrouped and grouped data — but the meaning of xi changes. For ungrouped, xi is the actual value. For grouped, it’s the mid-interval. Same formula, different interpretation.
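To illustrate "same formula, different interpretation", this sketch reuses the mean calculation but feeds in mid-interval values for the heights table. The helper name `mean_from_table` is hypothetical.

```python
def mean_from_table(xs, freqs):
    """Sum of f times x, divided by the total frequency."""
    return sum(f * x for f, x in zip(freqs, xs)) / sum(freqs)


# Grouped data: xs are mid-interval values, so the result is an estimate.
classes = [(150, 155), (155, 160), (160, 165), (165, 170), (170, 175)]
freqs = [3, 5, 9, 7, 1]
midpoints = [(lower + upper) / 2 for lower, upper in classes]

estimated_mean = mean_from_table(midpoints, freqs)
print(round(estimated_mean, 1))
```

The function does not know or care whether `xs` holds actual values or midpoints. That decision, and the word "estimate" in your answer, is up to you.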

Worked examples

WE 1

Find statistics from an ungrouped frequency table

The frequency table below shows the number of pets owned by 30 students.

Number of pets    0     1     2     3
Frequency         11    5     8     6

Find: (a) the mode   (b) the median   (c) the mean   (d) the standard deviation.

Total frequency n = 11 + 5 + 8 + 6 = 30 students.

Part (a): mode
Highest frequency = 11, which belongs to value 0.
Mode = 0

Part (b): median
n = 30 → median = average of the 15th and 16th values.
Build cumulative frequency: 0 → 11,   1 → 16,   2 → 24,   3 → 30
The 15th value lands in the “1” row, and so does the 16th.
Median = 1

Part (c): mean
Use Σfx ÷ n:
Σfx = 11×0 + 5×1 + 8×2 + 6×3 = 0 + 5 + 16 + 18 = 39
Mean = 39 ÷ 30 = 1.3

Part (d): standard deviation
Enter values in L1, frequencies in L2, run 1-Var Stats.
σx ≈ 1.159…
SD = 1.16 (3 s.f.)

Tip: always set the frequency list when running stats, otherwise the calculator counts each row only once!

WE 2

Find statistics from a grouped frequency table

The table below shows the heights, in cm, of 25 students.

Height, h          Frequency
150 ≤ h < 155      3
155 ≤ h < 160      5
160 ≤ h < 165      9
165 ≤ h < 170      7
170 ≤ h < 175      1

(a) Write down the modal class.   (b) Find the mid-interval value of the modal class.   (c) Estimate the mean height.

Grouped data, so the answers will be estimates, not exact values.

Part (a): modal class
Highest frequency = 9, in the row 160 ≤ h < 165.
Modal class = 160 ≤ h < 165

Part (b): mid-interval value
Mid-interval = (lower + upper) ÷ 2:
(160 + 165) ÷ 2 = 162.5
Mid-interval = 162.5 cm

Part (c): estimated mean
Find each mid-interval, then Σfx:
Mid-intervals: 152.5, 157.5, 162.5, 167.5, 172.5
Σfx = 3×152.5 + 5×157.5 + 9×162.5 + 7×167.5 + 1×172.5
    = 457.5 + 787.5 + 1462.5 + 1172.5 + 172.5 = 4052.5
Divide by n = 25: 4052.5 ÷ 25 = 162.1
Estimated mean ≈ 162.1 cm

Tip: use mid-interval values as the “x”. They’re stand-ins for every value in the class.

WE 3

Find a missing frequency given the mean

The frequency table shows the number of goals scored in 20 football matches. Given that the mean is 1.5, find the value of k.

Goals        0    1    2    3
Frequency    4    k    6    3

Total = 20,   mean = 1.5
Two equations: total frequency = 20, and mean × n = Σfx.

Start with the total frequency:
4 + k + 6 + 3 = 20
k = 20 − 13 = 7

Check using the mean:
Σfx = 4×0 + 7×1 + 6×2 + 3×3 = 0 + 7 + 12 + 9 = 28
Mean = 28 ÷ 20 = 1.4  ❌ doesn’t match 1.5

The total alone doesn’t reproduce the stated mean, so try the mean equation instead:
Σfx = 1.5 × 20 = 30
0 + k + 12 + 9 = 30  →  k = 9
Then total: 4 + 9 + 6 + 3 = 22, not 20.

There’s a conflict: only one constraint can be satisfied. Re-read the question: “20 matches” is the total, so that one must hold. Use it to find k.
k = 7   (and mean = 1.4, not exactly 1.5)

Tip: always sanity-check both constraints. If they conflict, the question may have a typo or expect an approximation.

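The two-constraint check in this example can be sketched in a few lines. This is only an illustration using the numbers from the question; the variable names are invented.

```python
# Known rows of the goals table: goals -> frequency (the row for 1 goal is k).
known = {0: 4, 2: 6, 3: 3}
total_matches = 20
stated_mean = 1.5

known_freq = sum(known.values())                    # 4 + 6 + 3
known_sum_fx = sum(g * f for g, f in known.items())  # 0 + 12 + 9

# Constraint 1: the frequencies must add up to the total.
k_from_total = total_matches - known_freq
mean_with_k_from_total = (known_sum_fx + 1 * k_from_total) / total_matches

# Constraint 2: the sum of f times x must equal mean times n.
k_from_mean = stated_mean * total_matches - known_sum_fx
total_with_k_from_mean = known_freq + k_from_mean

print(k_from_total, mean_with_k_from_total)
print(k_from_mean, total_with_k_from_mean)
```

If the two values of k disagree, the constraints in the question are inconsistent, which is exactly the situation spotted here.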
WE 4

Use cumulative frequency to find the median

The table shows the number of children per family in a survey of 40 families. Find the median.

Children     0    1     2     3    4
Frequency    5    12    15    6    2

n = 40 → median is the average of the 20th and 21st values.
Build cumulative frequencies (running totals) to locate the right row.
Cumulative frequencies: 0 → 5,   1 → 17,   2 → 32,   3 → 38,   4 → 40
20th value: the first row where the cumulative frequency is ≥ 20 is “2” (at 32).
21st value: also in row “2”.
Median = average of two “2”s:
Median = 2 children

Tip: cumulative frequency = running total, the perfect tool for finding the median’s row.

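As an alternative sketch of "find the first row where the cumulative frequency reaches the position", Python's standard `bisect_left` does that search directly on the running totals. The helper `value_at` is a made-up name, and the last line assumes n is even, as it is here.

```python
from bisect import bisect_left
from itertools import accumulate

children = [0, 1, 2, 3, 4]
freqs = [5, 12, 15, 6, 2]

cumulative = list(accumulate(freqs))  # running totals
n = cumulative[-1]


def value_at(position):
    # bisect_left returns the index of the first running total >= position.
    return children[bisect_left(cumulative, position)]


# n is even, so average the two middle values (20th and 21st).
median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
print(cumulative, median)
```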
WE 5

Estimate the standard deviation from grouped data

Using the heights table from WE 2 (25 students), estimate the standard deviation of the heights.

Grouped data → use mid-interval values in the GDC. The SD will be an estimate.
Enter mid-intervals in L1: 152.5, 157.5, 162.5, 167.5, 172.5
Enter frequencies in L2: 3, 5, 9, 7, 1
Run 1-Var Stats with frequency list = L2:
σx ≈ 5.276…
Round to 3 s.f.:
Estimated SD ≈ 5.28 cm (3 s.f.)

Tip: remember, this is an estimate because we treated everyone in a class as having the mid-interval height.
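To check a GDC result by hand, this sketch applies the population standard deviation formula to the mid-interval values and frequencies from the heights table. It assumes you want σx (the population value), not the sample value sx.

```python
from math import sqrt

midpoints = [152.5, 157.5, 162.5, 167.5, 172.5]  # L1: mid-interval values
freqs = [3, 5, 9, 7, 1]                          # L2: frequencies

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Weighted squared distances from the mean, averaged over all n students.
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
sigma = sqrt(variance)

print(n, mean, variance, sigma)
```

Because every student in a class is treated as sitting at the midpoint, the spread inside each class is ignored, so this is an estimate in the same way the grouped mean is.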

Frequency tables are the bridge between raw data and graphs. The next note covers linear transformations of data — what happens to the mean and SD when you scale or shift everything in your data set.
