IB Maths AA HL
Topic 4: Statistics & Probability
Paper 1 & 2
Measures of Dispersion
Two data sets can have the same mean but look completely different: one tightly clustered, one wildly scattered. Measures of dispersion (range, IQR, variance, standard deviation) capture that spread, each in a slightly different way.
What you need to know
- Range = largest − smallest. Affected by outliers.
- Quartiles divide sorted data into four equal parts: Q1 (roughly 25% of the data lies below it), Q2 (the median), Q3 (roughly 75% of the data lies below it).
- Interquartile range: IQR = Q3 − Q1 (in formula booklet). Resistant to outliers.
- Variance σ² = mean of squared deviations from the mean.
- Standard deviation σ = √(variance), the most common measure of spread.
- Units: range, IQR, SD all share the data’s units; variance has the squared units.
- Use your GDC: 1-Variable Statistics gives quartiles and σ instantly.
- By-hand methods may differ from GDC values for quartiles; examiners accept either, but match what your GDC reports.
Range and quartiles
Range
max − min
full span; very sensitive to outliers
Interquartile range
IQR = Q3 − Q1
middle 50%; ignores extremes
To find quartiles by hand: sort the data, find the median, then find the median of each half (excluding the median itself for an odd-sized data set). On a GDC, just use 1-Var Stats.
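The by-hand method just described can be sketched in Python as follows. This is a minimal illustration, not what any particular GDC does internally; the function name `quartiles` is made up for this example, and it assumes at least two data values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method (needs >= 2 values)."""
    s = sorted(values)
    half = len(s) // 2                    # for odd n this leaves the middle value out of both halves
    lower, upper = s[:half], s[-half:]    # first half and last half of the sorted data
    return median(lower), median(s), median(upper)

q1, q2, q3 = quartiles([14, 5, 9, 22, 12, 16, 3, 18, 10, 20, 7])
print("Q1 =", q1, "median =", q2, "Q3 =", q3, "IQR =", q3 - q1)
```

Other quartile conventions exist (software packages often interpolate between values), which is one reason by-hand and calculator answers can differ slightly.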
Variance and standard deviation
Variance (definition)
σ² = (Σ fᵢ(xᵢ − μ)²) / n
Variance (computational form)
σ² = (Σ fᵢxᵢ²) / n − μ²
Standard deviation
σ = √(σ²)
You don’t need to memorise these formulas: the GDC’s 1-Var Stats reports σ directly. The formulas help you understand what the GDC is doing under the hood.
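To see why the definition and the computational form of the variance agree, expand the square and use two facts: μ = (Σ fᵢxᵢ)/n and n = Σ fᵢ. The derivation below is a standard one, written out here as a supporting sketch:

```latex
\begin{align*}
\sigma^2 &= \frac{\sum f_i (x_i - \mu)^2}{n} \\
         &= \frac{\sum f_i x_i^2}{n} - 2\mu \cdot \frac{\sum f_i x_i}{n} + \mu^2 \cdot \frac{\sum f_i}{n} \\
         &= \frac{\sum f_i x_i^2}{n} - 2\mu^2 + \mu^2 \\
         &= \frac{\sum f_i x_i^2}{n} - \mu^2
\end{align*}
```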
Which measure to use
| Measure | Uses | Affected by outliers? |
|---|---|---|
| Range | quick rough idea of spread | yes, strongly |
| IQR | middle 50%; pairs with median | no (resistant) |
| Standard deviation | spread about the mean (uses every value) | yes |
| Variance | same as SD, but squared units (less interpretable directly) | yes |
Recipe: find dispersion measures
- Sort the data in ascending order.
- Range = last value − first value.
- Quartiles: split sorted data at the median; Q1 = median of lower half, Q3 = median of upper half. (For odd n, exclude the median itself from each half.)
- IQR = Q3 − Q1.
- SD and variance: use 1-Var Stats on your GDC. To check by hand: σ² = Σ(xᵢ − μ)²/n; σ = √(σ²).
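As a cross-check on the by-hand step, here is a small Python sketch that computes the range, variance and SD directly from the formulas and compares them with the standard library. The data set is the one from WE 3; the variable names are illustrative. Note that `statistics.pvariance` and `statistics.pstdev` divide by n, matching σ² and σ here, whereas `statistics.variance` and `statistics.stdev` divide by n − 1.

```python
from statistics import pstdev, pvariance

data = [7, 9, 10, 12, 13, 15]

data_range = max(data) - min(data)                                # last value minus first value
mu = sum(data) / len(data)                                        # mean
variance_by_hand = sum((x - mu) ** 2 for x in data) / len(data)   # mean of squared deviations
sd_by_hand = variance_by_hand ** 0.5                              # square root of the variance

print("range:", data_range)
print("by hand:", variance_by_hand, round(sd_by_hand, 3))
print("library:", pvariance(data), round(pstdev(data), 3))
```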
Worked examples
WE 1: Range and IQR, odd-sized data
Find the range and the interquartile range of the data set: 14, 5, 9, 22, 12, 16, 3, 18, 10, 20, 7.
Step 1: Sort (n = 11)
3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22
Step 2: Range = max − min
22 − 3 = 19
Step 3: Median = 6th value = 12
Step 4: Quartiles (exclude median for n odd)
Lower half: 3, 5, 7, 9, 10 → Q1 = 7 (3rd value)
Upper half: 14, 16, 18, 20, 22 → Q3 = 18 (3rd value)
Step 5: IQR = Q3 − Q1
18 − 7 = 11
Range = 19; IQR = 11
the middle 50% spans 11 units, a little over half of the full range of 19, so these values are spread fairly evenly rather than bunched in the centre
WE 2: Range and IQR, even-sized data
Find the range, median, and interquartile range of the data set: 10, 18, 25, 14, 30, 12, 22, 28, 16, 20.
Step 1: Sort (n = 10)
10, 12, 14, 16, 18, 20, 22, 25, 28, 30
Step 2: Range
30 − 10 = 20
Step 3: Median (n even: average of the 5th and 6th values)
Median = (18 + 20)/2 = 19
Step 4: Quartiles
Lower half: 10, 12, 14, 16, 18 → Q1 = 14 (3rd value)
Upper half: 20, 22, 25, 28, 30 → Q3 = 25 (3rd value)
Step 5: IQR
25 − 14 = 11
Range = 20; Median = 19; IQR = 11
for even n, lower half is the first 5 values, upper half is the last 5
WE 3: Variance and standard deviation by hand
Find the variance and standard deviation of the data set: 7, 9, 10, 12, 13, 15. Give your standard deviation to 3 s.f.
Step 1: Mean
μ = (7 + 9 + 10 + 12 + 13 + 15)/6 = 66/6 = 11
Step 2: Deviations from the mean
−4, −2, −1, 1, 2, 4
Step 3: Square and sum
16 + 4 + 1 + 1 + 4 + 16 = 42
Step 4: Variance = sum/n
σ² = 42/6 = 7
Step 5: SD = √(variance)
σ = √7 ≈ 2.6458
Variance = 7; SD ≈ 2.65
deviations are symmetric about the mean, which keeps the by-hand arithmetic clean
WE 4: Variance and SD using the computational formula
Use σ² = (Σx²)/n − μ² to find the variance and standard deviation of: 12, 15, 18, 20, 22, 25, 28, 30. Give your answers to 3 s.f.
Step 1: Mean
Σx = 12 + 15 + 18 + 20 + 22 + 25 + 28 + 30 = 170
μ = 170/8 = 21.25
Step 2: Σx²
144 + 225 + 324 + 400 + 484 + 625 + 784 + 900 = 3886
Step 3: Apply formula
σ² = 3886/8 − 21.25² = 485.75 − 451.5625 = 34.1875
Step 4: SD
σ = √34.1875 ≈ 5.8470
Variance ≈ 34.2; SD ≈ 5.85
computational form is faster when the mean isn’t a whole number
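The arithmetic in WE 4 can be checked exactly with Python's `fractions` module, which avoids any rounding when the mean is not a whole number. This is only a verification sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

data = [12, 15, 18, 20, 22, 25, 28, 30]
n = len(data)

mu = Fraction(sum(data), n)                                  # exact mean, 170/8
variance = Fraction(sum(x * x for x in data), n) - mu ** 2   # computational form
check = sum((x - mu) ** 2 for x in data) / n                 # definition form, also exact

print("mean:", float(mu))
print("variance:", float(variance), "SD:", round(sqrt(variance), 4))
print("forms agree:", check == variance)
```

Working in exact fractions shows that the two forms give identical values, not just values that agree after rounding.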
WE 5: Find missing values from mean and range
Five numbers in ascending order are 4, 9, x, 14, y. Their mean is 11 and range is 14. Find x and y.
Step 1: Use the range to find y
range = y − 4 = 14 → y = 18
Step 2: Use the mean to find x
(4 + 9 + x + 14 + 18)/5 = 11
45 + x = 55 → x = 10
Step 3: Verify ordering
4 < 9 < 10 < 14 < 18 ✓
x = 10, y = 18
range gives the largest value; mean closes the system
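The same two-step reasoning can be written as a short Python check. It assumes, as the question states, that the list is in ascending order so that 4 is the smallest value and y the largest; the variable names are illustrative.

```python
known = [4, 9, 14]                 # the three given values
n, mean, data_range = 5, 11, 14    # facts stated in the question

y = min(known) + data_range        # range = y - smallest value
x = n * mean - sum(known) - y      # the five values must total n * mean

values = [4, 9, x, 14, y]
assert values == sorted(values)    # the stated ascending order still holds
print("x =", x, "y =", y)
```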
WE 6: Compare two data sets with the same mean
The pulse rates of two groups of 7 athletes (in beats per minute) are recorded:
Group A: 64, 68, 70, 72, 74, 76, 80
Group B: 60, 64, 70, 72, 74, 80, 84
(a) Find the mean of each group. (b) Find the IQR of each group. (c) Find the standard deviation of each group. (d) Comment on which group is more consistent.
(a) Means
A: 504/7 = 72; B: 504/7 = 72 (identical)
(b) IQR (both groups have median 72)
A: lower half 64, 68, 70 → Q1 = 68; upper half 74, 76, 80 → Q3 = 76; IQR = 8
B: lower half 60, 64, 70 → Q1 = 64; upper half 74, 80, 84 → Q3 = 80; IQR = 16
(c) SD using GDC
A: σ ≈ 4.90 (variance = 24)
B: σ ≈ 7.78 (variance ≈ 60.6)
(d) Same mean, but A has smaller IQR and SD
Group A is more consistent: its values cluster more tightly around the mean
moral: two groups can have identical averages but very different consistency
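For readers who want to reproduce the comparison in WE 6 without a GDC, here is a hedged Python sketch. The helper `iqr` is a made-up name that uses the median-of-halves method from this page, and `pstdev` is the standard library's population standard deviation (dividing by n).

```python
from statistics import mean, median, pstdev

def iqr(values):
    """Q3 - Q1 by the median-of-halves method (needs at least two values)."""
    s = sorted(values)
    half = len(s) // 2                     # odd n: the middle value is left out of both halves
    return median(s[-half:]) - median(s[:half])

groups = {
    "A": [64, 68, 70, 72, 74, 76, 80],
    "B": [60, 64, 70, 72, 74, 80, 84],
}

# For each group: (mean, IQR, SD rounded to 2 decimal places)
summary = {name: (mean(v), iqr(v), round(pstdev(v), 2)) for name, v in groups.items()}
for name, (m, spread_iqr, sd) in summary.items():
    print(name, "mean:", m, "IQR:", spread_iqr, "SD:", sd)
```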
Top tips
- Always sort before finding range, median, or quartiles.
- Use the GDC’s 1-Var Stats for σ; it is much faster than the by-hand formula.
- Pair the right average with the right spread: mean with SD; median with IQR.
- For by-hand variance, the computational form (Σx²/n − μ²) is faster when the mean isn’t a whole number.
- Be careful with units: variance is in squared units (e.g., kg²), SD is back in the original units (kg).
Common mistakes
- Forgetting to sort before finding quartiles or the median.
- Confusing variance with standard deviation: SD is the square root of variance.
- Reporting variance with the data’s units; variance has squared units.
- Using a different quartile method by hand than the GDC reports; values may differ slightly, so in exams follow the GDC.
- Summing the raw deviations without squaring them: deviations from the mean always sum to zero, so square each one before summing.
Next: Frequency Tables. When data has lots of repeats, frequency tables compress everything into a compact form. The mean, median, mode, and SD all generalise, but you weight each value by its frequency. For grouped data, you use mid-interval values to estimate the mean.
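As a small preview of that weighting, the frequency versions of the two variance formulas from this page can be sketched in Python. The frequency table below is made up purely for illustration, and `freq_variance` is a hypothetical helper name.

```python
from math import sqrt

def freq_variance(values, freqs):
    """Return (mean, variance by definition, variance by computational form)."""
    n = sum(freqs)                                                   # n = sum of frequencies
    mu = sum(f * x for x, f in zip(values, freqs)) / n               # weighted mean
    by_definition = sum(f * (x - mu) ** 2 for x, f in zip(values, freqs)) / n
    computational = sum(f * x * x for x, f in zip(values, freqs)) / n - mu ** 2
    return mu, by_definition, computational

# Made-up table: the value 2 occurs once, 4 occurs twice, 6 occurs once.
mu, var_def, var_comp = freq_variance([2, 4, 6], [1, 2, 1])
print("mean:", mu, "variance:", var_def, var_comp, "SD:", round(sqrt(var_def), 3))
```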