IB Maths AA HL | Topic 4: Statistics & Probability | Paper 1 & 2 | ~7 min read
Histograms
A frequency histogram is a bar chart for grouped continuous data — bars sit side-by-side with no gaps, heights show class frequencies, widths show class intervals. The shape tells you a lot: where the data clusters, whether it’s symmetric, skewed, or potentially normal.
📘 What you need to know
Frequency histogram — bars whose heights are class frequencies, sitting along the x-axis at class boundaries.
No gaps between bars (data is continuous; a bar starts at the previous bar’s end).
Equal class widths are assumed in the IB AA HL syllabus.
Modal class = the class with the tallest bar.
Histogram vs bar chart: bar charts have gaps and are for qualitative or discrete data; histograms have no gaps and are for continuous grouped data.
Shape: symmetric and bell-shaped suggests a normal distribution.
Skew: a long tail to the right is positive skew; long tail to the left is negative skew.
Use the GDC to plot histograms quickly: enter mid-interval values and frequencies, then plot.
Anatomy of a histogram
Frequency histogram: equal class widths, no gaps between bars. The tallest bar is the modal class.
Reading shape from a histogram
Shape | Visual | Implication
Symmetric / bell-shaped | peaks in centre, tails decay equally | could be modelled by a normal distribution
Positively skewed (right-skewed) | tall bars on the left, long tail on the right | median < mean; not normal
Negatively skewed (left-skewed) | tall bars on the right, long tail on the left | median > mean; not normal
Bimodal | two distinct peaks | likely two underlying populations mixed together
“Could be normal” requires both: (i) approximately symmetric, (ii) bell-shaped (high in middle, decaying outwards). Symmetric + flat is uniform, not normal.
🧭 Recipe — draw and interpret a histogram
Set up axes: x = data variable (continuous, with units); y = frequency.
Mark class boundaries on the x-axis using an even scale.
Draw bars for each class — no gaps, height = class frequency.
Identify the modal class (tallest bar).
Comment on shape: symmetric, skewed, bell-shaped, bimodal.
For estimates of mean: use mid-interval values × frequencies, sum, divide by total.
Worked examples
WE 1
Draw a frequency histogram
The table shows the times in minutes (t) that 50 students spent on homework one evening.
Time (min) | 0 ≤ t < 20 | 20 ≤ t < 40 | 40 ≤ t < 60 | 60 ≤ t < 80 | 80 ≤ t < 100
Frequency | 6 | 12 | 18 | 9 | 5
(a) Describe the histogram you would draw. (b) State the modal class.
(a) Histogram description
x-axis: time in minutes, scale 0 to 100
y-axis: frequency, scale 0 to 18 (or 20)
5 bars of equal width 20, heights 6, 12, 18, 9, 5
Bars touch each other (no gaps)
(b) Modal class = tallest bar
freq 18 → modal class is 40 ≤ t < 60
Answer: modal class is 40 ≤ t < 60
Note: the shape rises to a peak at 40–60 and then falls. It is roughly symmetric, with at most a slight right skew.
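As a rough check on WE 1, the sketch below prints a sideways text version of the histogram and picks out the modal class. It is only an illustration, not the GDC method: the function names `text_histogram` and `modal_classes` and the tuple layout (lower boundary, upper boundary, frequency) are assumptions made for this example.

```python
# Homework-time data from WE 1: (lower boundary, upper boundary, frequency).
classes = [(0, 20, 6), (20, 40, 12), (40, 60, 18), (60, 80, 9), (80, 100, 5)]


def text_histogram(classes):
    """Return one line per class, with a run of '#' as long as the frequency."""
    lines = []
    for lower, upper, freq in classes:
        label = f"{lower:>3} <= t < {upper:<3}"
        lines.append(f"{label} | {'#' * freq} {freq}")
    return lines


def modal_classes(classes):
    """Return every class whose frequency equals the maximum (ties allowed)."""
    top = max(freq for _, _, freq in classes)
    return [(lower, upper) for lower, upper, freq in classes if freq == top]


histogram_lines = text_histogram(classes)
print("\n".join(histogram_lines))
print("Modal class:", modal_classes(classes))
```

Because the rows are printed one directly under the next, the bars are adjacent, which mirrors the no-gaps rule. Returning a list from `modal_classes` allows for ties between bars.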
WE 2
Modal class and estimated mean
The weights of 40 apples (in grams) are summarised below.
Weight (g) | 80 ≤ w < 100 | 100 ≤ w < 120 | 120 ≤ w < 140 | 140 ≤ w < 160 | 160 ≤ w < 180
Frequency | 4 | 9 | 13 | 10 | 4
(a) State the modal class. (b) Estimate the mean weight.
(a) Modal class = tallest bar
freq 13 → 120 ≤ w < 140
(b) Mid-interval values
90, 110, 130, 150, 170
Σfx = 4(90) + 9(110) + 13(130) + 10(150) + 4(170)
= 360 + 990 + 1690 + 1500 + 680 = 5220
Estimated mean = 5220/40 = 130.5 g
Answer: modal class is 120 ≤ w < 140; estimated mean = 130.5 g
Note: the distribution is roughly symmetric, so the mean lies near the middle of the modal class.
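The mid-interval calculation in WE 2 can be checked with a few lines of Python. This is a minimal sketch; the function name `estimated_mean` and the (lower, upper, frequency) tuple layout are choices made for this example only.

```python
# Apple-weight data from WE 2: (lower boundary, upper boundary, frequency).
classes = [(80, 100, 4), (100, 120, 9), (120, 140, 13), (140, 160, 10), (160, 180, 4)]


def estimated_mean(classes):
    """Estimate the mean of grouped data using mid-interval values."""
    total = sum(freq for _, _, freq in classes)
    # Each class contributes frequency x mid-interval value to the sum.
    sum_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return sum_fx / total


mean_weight = estimated_mean(classes)
print(mean_weight)
```

The result is an estimate because every apple in a class is treated as if it weighed the mid-interval value.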
WE 3
Comment on the shape of a histogram
The daily social media usage (in hours) of 47 students is shown below.
Hours | 0 ≤ h < 2 | 2 ≤ h < 4 | 4 ≤ h < 6 | 6 ≤ h < 8 | 8 ≤ h < 10
Frequency | 18 | 14 | 9 | 4 | 2
(a) State the modal class. (b) Comment on the shape of the distribution.
(a) Modal class
freq 18 → 0 ≤ h < 2
(b) Shape
Frequencies decrease as hours increase: 18, 14, 9, 4, 2
Tallest bar on the left; long tail extends to the right
Answer: modal class is 0 ≤ h < 2; positively skewed (right-skewed) distribution
Note: positive skew → mean > median; the data is NOT a good candidate for a normal distribution.
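The claim that positive skew gives mean > median can be checked numerically on the WE 3 data. The sketch below estimates the mean from mid-interval values and the median by linear interpolation inside the median class. The interpolation assumes values are spread evenly within each class, which is an assumption of this example rather than something the notes state; the function names are also illustrative.

```python
# Social-media data from WE 3: (lower boundary, upper boundary, frequency).
classes = [(0, 2, 18), (2, 4, 14), (4, 6, 9), (6, 8, 4), (8, 10, 2)]


def estimated_mean(classes):
    """Estimate the mean of grouped data using mid-interval values."""
    total = sum(freq for _, _, freq in classes)
    return sum(freq * (lower + upper) / 2 for lower, upper, freq in classes) / total


def estimated_median(classes):
    """Estimate the median by linear interpolation within the median class.

    Assumes the values in each class are spread evenly across it.
    """
    total = sum(freq for _, _, freq in classes)
    target = total / 2
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("no data")


mean_hours = estimated_mean(classes)
median_hours = estimated_median(classes)
print(mean_hours > median_hours)
```

Here the estimated median falls in the 2 ≤ h < 4 class and the estimated mean sits above it, consistent with a long right tail pulling the mean upwards.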
WE 4
Could the data be modelled by a normal distribution?
The birth weights (in kg) of 54 newborns are recorded.
Weight (kg) | 2.0 ≤ w < 2.5 | 2.5 ≤ w < 3.0 | 3.0 ≤ w < 3.5 | 3.5 ≤ w < 4.0 | 4.0 ≤ w < 4.5
Frequency | 5 | 12 | 20 | 12 | 5
Comment on whether a normal distribution would be a suitable model for this data.
Step 1: Look at frequencies
5, 12, 20, 12, 5: symmetric around the central class 3.0–3.5
Step 2: Check shape
Peaks in middle, decays evenly to both tails → bell-shaped
Step 3: Conclusion
Answer: yes. The data is symmetric and bell-shaped, so a normal distribution is a suitable model.
Note: “yes” requires BOTH symmetry AND bell shape; either alone is not enough.
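The two checks in WE 4 can be written as a crude heuristic on the bar heights. This is a sketch only and not a formal test of normality: the function names, the `tolerance` parameter, and the "strictly rising then strictly falling" rule for a single peak are assumptions made for illustration.

```python
# Birth-weight frequencies from WE 4, in class order.
frequencies = [5, 12, 20, 12, 5]


def is_symmetric(freqs, tolerance=0):
    """True if each bar matches its mirror-image bar to within `tolerance`."""
    return all(abs(a - b) <= tolerance for a, b in zip(freqs, reversed(freqs)))


def is_single_peaked(freqs):
    """True if the heights rise strictly to one peak and then fall strictly."""
    peak = freqs.index(max(freqs))
    rising = all(a < b for a, b in zip(freqs[:peak], freqs[1:peak + 1]))
    falling = all(a > b for a, b in zip(freqs[peak:], freqs[peak + 1:]))
    return rising and falling


could_be_normal = is_symmetric(frequencies) and is_single_peaked(frequencies)
print(could_be_normal)
```

A flat histogram such as 10, 10, 10, 10, 10 passes the symmetry check but fails the single-peak check, while the skewed data 18, 14, 9, 4, 2 has a single peak but is not symmetric. Requiring both checks mirrors the "both features" rule.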
WE 5
Find a missing frequency from a histogram
A histogram has the following class frequencies, where one is unknown.
Class | 20 ≤ m < 40 | 40 ≤ m < 60 | 60 ≤ m < 80 | 80 ≤ m < 100 | 100 ≤ m < 120
Frequency | 6 | 14 | x | 8 | 3
The total frequency is 50. Find x and state the modal class.
Step 1: Sum of all frequencies = 50
6 + 14 + x + 8 + 3 = 50
31 + x = 50 → x = 19
Step 2: Identify modal class
Frequencies: 6, 14, 19, 8, 3 → highest is 19
Answer: x = 19; modal class is 60 ≤ m < 80
Note: always identify the modal class AFTER finding the missing frequency, because the unknown class might be the modal class.
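The same two steps can be sketched in Python, using `None` as a placeholder for the unknown frequency x. The variable names and the placeholder convention are assumptions for this example.

```python
# WE 5 data: class boundaries and frequencies, with None marking the unknown x.
boundaries = [(20, 40), (40, 60), (60, 80), (80, 100), (100, 120)]
known = [6, 14, None, 8, 3]
total = 50

# Step 1: the unknown frequency is whatever is left after the known ones.
x = total - sum(f for f in known if f is not None)

# Step 2: fill in x first, then look for the tallest bar.
frequencies = [x if f is None else f for f in known]
modal_class = boundaries[frequencies.index(max(frequencies))]

print(x, modal_class)
```

Filling in x before calling `max` is the code equivalent of finding the missing frequency before naming the modal class.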
WE 6
Use a histogram to estimate counts above a threshold
Using the apple-weight data from WE 2 (n = 40), find: (a) the number of apples weighing more than 140 g; (b) the percentage of apples weighing 120 g or less.
(a) Apples > 140 g: sum classes 140–160 and 160–180
10 + 4 = 14
(b) Apples ≤ 120 g: sum classes 80–100 and 100–120
4 + 9 = 13
Percentage: 13/40 × 100 = 32.5%
Answer: (a) 14 apples; (b) 32.5%
Note: when the threshold falls exactly at a class boundary, just sum the relevant whole classes.
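The counting in WE 6 amounts to summing whole classes on one side of a boundary. The sketch below does this and refuses thresholds that are not class boundaries, since those would need an estimate inside a class. The function name `count_at_or_above` is hypothetical, chosen for this example.

```python
# Apple-weight data from WE 2: (lower boundary, upper boundary, frequency).
classes = [(80, 100, 4), (100, 120, 9), (120, 140, 13), (140, 160, 10), (160, 180, 4)]


def count_at_or_above(classes, boundary):
    """Sum the frequencies of all classes that start at or above `boundary`.

    Only works when `boundary` is one of the class boundaries.
    """
    edges = {lower for lower, _, _ in classes} | {upper for _, upper, _ in classes}
    if boundary not in edges:
        raise ValueError("threshold must be a class boundary")
    return sum(freq for lower, _, freq in classes if lower >= boundary)


total = sum(freq for _, _, freq in classes)
above_140 = count_at_or_above(classes, 140)
below_120 = total - count_at_or_above(classes, 120)
percent_below_120 = below_120 / total * 100

print(above_140, below_120, percent_below_120)
```

For continuous data, whether the boundary value itself is counted as "more than" or "or less" makes no practical difference, which is why whole classes can simply be added.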
💡 Top tips
Histogram bars touch — leave NO gaps for continuous data.
Use class boundaries on the x-axis, not midpoints.
Modal class is the tallest bar — there can be more than one if multiple bars tie.
Symmetric AND bell-shaped are both required for “could be normal”.
Use the GDC’s plot to check by-hand histograms — saves time on Paper 2.
⚠ Common mistakes
Drawing gaps between bars — that’s a bar chart, not a histogram.
Using midpoints as bar positions instead of class boundaries.
Confusing modal class with mode — modal class is an interval; mode would be a single value.
Saying “data is normal” from symmetric-only or bell-only — both features are needed for the suggestion.
Sketching from raw data without grouping it first — you need a frequency table for grouped data, not a list.
Final note in this section: Interpreting Data. With all the tools assembled — averages, dispersion, box plots, cumulative graphs, histograms — the question becomes which one to use. The answer depends on whether outliers are present, whether the data is symmetric, and what claim you’re trying to support. Comparing two distributions in context is the single most-tested skill in this section.