IB Maths AA HL | Topic 4: Statistics & Probability | Paper 1 & 2

Frequency Tables

Frequency tables compress repeated data into a tidy summary. The same statistical measures (mean, median, mode, SD) all generalise — you just weight each value by how often it occurs. For grouped data, you lose the exact values and have to use mid-interval estimates instead.

📘 What you need to know

Ungrouped frequency tables

Mean from a frequency table: $\bar{x} = \dfrac{\sum f_i x_i}{n}$, where $n = \sum f_i$

To find the median, build a cumulative frequency column. For odd n, the median is the (n + 1)/2-th value; for even n, it is the average of the n/2-th and (n/2 + 1)-th values. In each case, the value you want sits in the first row whose cumulative frequency reaches that position.
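The two rules above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names are hypothetical, and the code assumes the values are listed in ascending order. The sample data is the goals table from WE 1.

```python
from itertools import accumulate

def mean_from_table(values, freqs):
    """Mean of an ungrouped frequency table: sum(f * x) / sum(f)."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n

def value_at_position(values, freqs, k):
    """Return the k-th smallest value (1-indexed) via cumulative frequencies."""
    for x, cum in zip(values, accumulate(freqs)):
        if cum >= k:  # first row whose cumulative frequency reaches position k
            return x
    raise ValueError("position is outside the data set")

def median_from_table(values, freqs):
    """Median of an ungrouped table; assumes values are sorted ascending."""
    n = sum(freqs)
    if n % 2 == 1:
        return value_at_position(values, freqs, (n + 1) // 2)
    lower = value_at_position(values, freqs, n // 2)
    upper = value_at_position(values, freqs, n // 2 + 1)
    return (lower + upper) / 2

goals = [0, 1, 2, 3, 4]
freq = [4, 7, 6, 2, 1]
print(mean_from_table(goals, freq))    # 1.45
print(median_from_table(goals, freq))  # 1.0
```

Looking up each position separately keeps the even-n case explicit: the two middle values may fall in different rows, in which case the median is their average.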

Grouped frequency tables

Trade-off: grouping gains tidiness but loses precision. You no longer know the exact values, so all measures become estimates.
Mid-interval value: $x_i$ = (lower + upper) / 2, the stand-in for the actual values in that class.
Modal class: the class with the highest frequency; this requires equal class widths to be meaningful.

The mean is estimated by summing $f_i x_i$ using the mid-interval values and dividing by n. The median, quartiles, and IQR are read off a cumulative frequency graph (covered in a later note).
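As a rough sketch of the grouped case, the snippet below computes mid-interval values, the estimated mean, and the modal class. The function names and the (lower, upper) tuple representation of a class are assumptions made for illustration; the data is the puzzle-times table from WE 3.

```python
def midpoints(classes):
    """Mid-interval value of each (lower, upper) class."""
    return [(lo + hi) / 2 for lo, hi in classes]

def estimated_mean(classes, freqs):
    """Estimate the mean by treating every value as its class midpoint."""
    mids = midpoints(classes)
    return sum(f * x for x, f in zip(mids, freqs)) / sum(freqs)

def modal_class(classes, freqs):
    """Class with the highest frequency (assumes equal class widths)."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    return classes[i]

times = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
freq = [4, 9, 12, 7, 3]
print(midpoints(times))                      # [2.5, 7.5, 12.5, 17.5, 22.5]
print(round(estimated_mean(times, freq), 1)) # 11.9
print(modal_class(times, freq))              # (10, 15)
```

The result is only an estimate because the midpoint replaces every unknown value in its class.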

🧭 Recipe — measures from a frequency table

  1. For grouped data, compute mid-interval values $x_i$ = (lower + upper)/2. (Skip for ungrouped.)
  2. Find $n = \sum f_i$ (total frequency).
  3. Compute $\sum f_i x_i$. Divide by n for the mean.
  4. For mode/modal class: pick the row with the highest frequency.
  5. For median: build cumulative totals; locate which row contains the middle value (or class).
  6. For SD: enter values + frequencies into the GDC’s 1-Var Stats.
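For checking a calculator result by hand, here is a minimal sketch of what a frequency-weighted summary computes. It assumes the population form of the standard deviation (dividing by n), which is the form used in the worked examples in this note; on a calculator this is typically the output labelled σx. The function name `one_var_stats` is hypothetical, and the data is the phones table from WE 2.

```python
from math import sqrt

def one_var_stats(values, freqs):
    """Return n, mean and population SD for a frequency table.

    For grouped data, pass the mid-interval values as `values`.
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # Population variance: frequency-weighted mean squared deviation.
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return {"n": n, "mean": mean, "sd": sqrt(variance)}

stats = one_var_stats([0, 1, 2, 3, 4], [2, 14, 18, 4, 2])
print(stats["n"])             # 40
print(stats["mean"])          # 1.75
print(round(stats["sd"], 3))  # 0.887
```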

Worked examples

WE 1

Mode, median, mean from an ungrouped frequency table

The table shows the number of goals scored per match by a football team across 20 matches.

Goals (x)       0   1   2   3   4
Frequency (f)   4   7   6   2   1

Find (a) the mode, (b) the median, and (c) the mean.

Step 1: Total n = Σf
n = 4 + 7 + 6 + 2 + 1 = 20

(a) Mode = value with the highest frequency
7 is the highest frequency → Mode = 1 goal

(b) Median: cumulative frequencies
cum: 4, 11, 17, 19, 20
For n = 20, median = average of the 10th and 11th values
The cumulative frequency first reaches 10 and 11 at goals = 1 → both the 10th and 11th values are 1
Median = 1 goal

(c) Mean = Σfx/n
Σfx = 0(4) + 1(7) + 2(6) + 3(2) + 4(1) = 29
Mean = 29/20 = 1.45

Answer: Mode = 1; Median = 1; Mean = 1.45 goals
Note: all three are close together, so the distribution is centred near 1. The mean sits slightly above the median because of the few matches with 3 or 4 goals.
WE 2

Standard deviation from an ungrouped frequency table

The table shows the number of mobile phones owned by 40 people in a survey.

Phones      0   1    2    3   4
Frequency   2   14   18   4   2

Find the mean and standard deviation. Give your standard deviation to 3 s.f.

Step 1: n = Σf
n = 2 + 14 + 18 + 4 + 2 = 40

Step 2: Σfx
0(2) + 1(14) + 2(18) + 3(4) + 4(2) = 0 + 14 + 36 + 12 + 8 = 70
Mean = 70/40 = 1.75

Step 3: Σfx²
0²(2) + 1²(14) + 2²(18) + 3²(4) + 4²(2) = 0 + 14 + 72 + 36 + 32 = 154

Step 4: Variance σ² = Σfx²/n − μ²
σ² = 154/40 − 1.75² = 3.85 − 3.0625 = 0.7875

Step 5: SD
σ = √0.7875 ≈ 0.8874

Answer: Mean = 1.75; SD ≈ 0.887 phones
Note: a faster method is to use 1-Var Stats with two lists (values and frequencies).
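The shortcut σ² = Σfx²/n − μ² used in Step 4 follows from expanding the definition of variance for a frequency table. A short derivation, writing μ for the mean and n = Σf (the subscripts are added here for clarity):

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum f_i (x_i - \mu)^2 \\
         &= \frac{1}{n}\sum f_i x_i^2 \;-\; 2\mu \cdot \frac{1}{n}\sum f_i x_i \;+\; \mu^2 \cdot \frac{1}{n}\sum f_i \\
         &= \frac{\sum f_i x_i^2}{n} - 2\mu^2 + \mu^2 \\
         &= \frac{\sum f_i x_i^2}{n} - \mu^2
\end{aligned}
```

The third line uses the two facts that (1/n)Σfx = μ and (1/n)Σf = 1.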
WE 3

Modal class and estimated mean from a grouped table

The table shows the times in minutes (t) that 35 students took to complete a puzzle.

Time (min)   0 ≤ t < 5   5 ≤ t < 10   10 ≤ t < 15   15 ≤ t < 20   20 ≤ t < 25
Frequency    4           9            12            7             3

(a) Write down the modal class. (b) Find the mid-interval value of the modal class. (c) Find an estimate for the mean.

(a) Modal class = class with the highest frequency
Frequency 12 is the highest → Modal class: 10 ≤ t < 15

(b) Mid-interval value
(10 + 15)/2 = 12.5 min

(c) Mid-interval values: 2.5, 7.5, 12.5, 17.5, 22.5
Σfx = 4(2.5) + 9(7.5) + 12(12.5) + 7(17.5) + 3(22.5) = 10 + 67.5 + 150 + 122.5 + 67.5 = 417.5
Estimated mean = 417.5/35 ≈ 11.93

Answer: (a) 10 ≤ t < 15; (b) 12.5 min; (c) ≈ 11.9 min
Note: because this is an estimate, a rounded form is preferred over leaving fractions.
WE 4

Estimated mean and SD from a grouped frequency table

The heights, in cm, of 50 plants in a greenhouse are recorded.

Height (cm)   10 ≤ h < 20   20 ≤ h < 30   30 ≤ h < 40   40 ≤ h < 50   50 ≤ h < 60
Frequency     5             12            18            10            5

Find an estimate for the mean and the standard deviation. Give your SD to 3 s.f.

Step 1: Mid-intervals
15, 25, 35, 45, 55

Step 2: Σfx
5(15) + 12(25) + 18(35) + 10(45) + 5(55) = 75 + 300 + 630 + 450 + 275 = 1730
Estimated mean = 1730/50 = 34.6 cm

Step 3: Σfx²
5(225) + 12(625) + 18(1225) + 10(2025) + 5(3025) = 1125 + 7500 + 22050 + 20250 + 15125 = 66050

Step 4: Variance
σ² = Σfx²/n − μ² = 66050/50 − 34.6² = 1321 − 1197.16 = 123.84

Step 5: SD
σ = √123.84 ≈ 11.13

Answer: Estimated mean ≈ 34.6 cm; estimated SD ≈ 11.1 cm
Note: the word “estimated” must appear, because exact values are impossible without the raw data.
WE 5

Find a missing frequency given the mean

The number of siblings of students in a class are summarised below, where x is unknown.

Siblings    0   1   2   3   4
Frequency   7   x   9   4   2

Given that the mean number of siblings is 1.5, find the value of x.

Step 1: Express n and Σfx in terms of x
n = 7 + x + 9 + 4 + 2 = 22 + x
Σfx = 0(7) + 1(x) + 2(9) + 3(4) + 4(2) = x + 38

Step 2: Apply mean = Σfx/n
(x + 38)/(22 + x) = 1.5

Step 3: Solve
x + 38 = 1.5(22 + x) = 33 + 1.5x
38 − 33 = 1.5x − x → 5 = 0.5x → x = 10

Answer: x = 10
Verify: total = 32, Σfx = 48; mean = 48/32 = 1.5 ✓
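The same algebra can be generalised. If N and S are the total frequency and Σfx of the known rows, and v is the value whose frequency x is unknown, then mean = (S + vx)/(N + x), which rearranges to x = (mean·N − S)/(v − mean). A small sketch of this, using exact fractions to avoid rounding; the function name and the use of `None` to mark the unknown frequency are illustrative assumptions.

```python
from fractions import Fraction

def missing_frequency(values, freqs, mean):
    """Solve for the single unknown frequency (marked None) given the mean."""
    if freqs.count(None) != 1:
        raise ValueError("exactly one frequency must be None")
    m = Fraction(str(mean))            # exact value, e.g. 1.5 -> 3/2
    v = values[freqs.index(None)]      # value whose frequency is unknown
    known = [(x, f) for x, f in zip(values, freqs) if f is not None]
    total_f = sum(f for _, f in known)       # N
    total_fx = sum(x * f for x, f in known)  # S
    if v == m:
        raise ValueError("the mean equals that value, so x cannot be determined")
    return (m * total_f - total_fx) / (v - m)

x = missing_frequency([0, 1, 2, 3, 4], [7, None, 9, 4, 2], 1.5)
print(x)  # 10
```

A frequency must be a non-negative whole number, so a fractional or negative result signals an error in the table or the stated mean.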
WE 6

Modal class, median class, and estimated mean

The test scores of 40 students out of 100 are summarised below.

Score       30 ≤ s < 40   40 ≤ s < 50   50 ≤ s < 60   60 ≤ s < 70   70 ≤ s < 80   80 ≤ s < 90
Frequency   4             7             12            11            5             1

(a) Write down the modal class. (b) Find the class containing the median. (c) Estimate the mean score.

(a) Modal class
Frequency 12 is the highest → 50 ≤ s < 60

(b) Median class: use cumulative frequency
cum: 4, 11, 23, 34, 39, 40
Median position = n/2 = 20
The cumulative frequency reaches 23 at 50 ≤ s < 60 (the first to exceed 20) → median class: 50 ≤ s < 60

(c) Mid-intervals: 35, 45, 55, 65, 75, 85
Σfx = 4(35) + 7(45) + 12(55) + 11(65) + 5(75) + 1(85) = 140 + 315 + 660 + 715 + 375 + 85 = 2290
Estimated mean = 2290/40 = 57.25

Answer: (a) 50 ≤ s < 60; (b) 50 ≤ s < 60; (c) ≈ 57.3
Note: when modal class = median class, the distribution is roughly centred there.
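Part (b) can be sketched as a short routine: accumulate the frequencies and return the first class whose running total reaches n/2. The function name and the (lower, upper) tuple representation are assumptions for illustration. If a running total lands exactly on n/2, the median sits on a class boundary, and this sketch returns the lower of the two classes.

```python
from itertools import accumulate

def median_class(classes, freqs):
    """Return the first class whose cumulative frequency reaches n/2."""
    target = sum(freqs) / 2
    for cls, cum in zip(classes, accumulate(freqs)):
        if cum >= target:
            return cls
    raise ValueError("empty frequency table")

scores = [(30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
freq = [4, 7, 12, 11, 5, 1]
print(list(accumulate(freq)))      # [4, 11, 23, 34, 39, 40]
print(median_class(scores, freq))  # (50, 60)
```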

💡 Top tips

⚠ Common mistakes

Next: Linear Transformations of Data. If every value in a data set is multiplied by 2 and then increased by 5, what happens to the mean and standard deviation? Spoiler: they transform differently. The mean follows the rule literally; the SD ignores the shift but doubles. The same idea is captured by E(aX+b) and Var(aX+b) in the formula booklet.
