IB Maths AI SL · Topic 4: Statistics Toolkit · Paper 1 & 2 · Coding data · ~6 min read
Linear Transformations of Data
If a teacher doubles every mark and adds 10, or you convert metres to centimetres, every data value gets the same linear transformation y = ax + b. You don’t need to redo all the statistics: two short rules tell you what happens to the mean and standard deviation. Memorise them and you’ll save several minutes per exam paper.
📘 What you need to know
The transformation: replace each value x_i with y_i = ax_i + b. Here a stretches/shrinks and b shifts.
Mean: ȳ = ax̄ + b. The mean is transformed the same way as every value.
Variance: σ_y² = a²σ_x². Adding b doesn’t change it.
Standard deviation: σ_y = |a| σ_x. Use absolute value so a negative a still gives a positive σ.
Why b doesn’t affect spread: adding the same number to every value shifts the whole data set sideways — the gaps between values stay identical.
HL formula booklet: E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). Same rules, just dressed up.
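If you want to convince yourself the rules hold, you can check them numerically. The sketch below is only an illustration: the marks in x and the choice a = 2, b = 10 are made up, and it uses the population formulas from Python's standard statistics module.

```python
import statistics

# Hypothetical marks, chosen only for illustration (mean 5, sd 2, variance 4).
x = [2, 4, 4, 4, 5, 5, 7, 9]
a, b = 2, 10  # "double every mark and add 10"

# Apply y = a*x + b to every value.
y = [a * xi + b for xi in x]

mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
var_x, var_y = statistics.pvariance(x), statistics.pvariance(y)
sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)

# Compare the directly computed statistics of y with the shortcut rules.
print("mean:", mean_y, "rule gives", a * mean_x + b)
print("variance:", var_y, "rule gives", a**2 * var_x)
print("std dev:", sd_y, "rule gives", abs(a) * sd_x)
```

Both columns agree: the mean follows y = ax + b, the variance is multiplied by a², and the standard deviation by |a|.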
The two rules — and the picture
Think of the data as dots on a number line. Two things can happen:
• Adding b: the entire dot pattern slides sideways by b. Every gap is preserved, so the spread is unchanged. Mean shifts, std dev unchanged.
• Multiplying by a: the pattern is stretched (or shrunk) by factor |a| away from zero, and flipped if a is negative. Gaps grow by factor |a|, so std dev scales by |a| and variance by a².
Left: adding 3 slides every dot right by 3 — mean shifts but gaps are preserved, so σ is unchanged. Right: multiplying by 2 stretches the whole pattern away from zero — gaps double, so σ doubles too.
Identify a and b from the question (the “multiply by” and “add” parts).
New mean: apply the SAME transformation to the old mean — ȳ = ax̄ + b.
New std dev: multiply the old std dev by |a|. The b is irrelevant for spread.
New variance: multiply the old variance by a² (not |a|).
Reverse question? Compare old & new std devs to find a; then use the means to find b.
Spread test: if the question only ADDS a constant, the spread doesn’t change — σ, variance, range and IQR are all identical. Only the mean / median / mode shift.
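The steps above can be sketched as one small function. This is a minimal illustration, not an official method: the name transform_stats and its argument order are invented here, and the two calls reuse the numbers from worked examples 1 and 3.

```python
def transform_stats(mean, sd, a, b):
    """Return (new_mean, new_sd, new_variance) after applying y = a*x + b."""
    new_mean = a * mean + b   # the mean gets the full transformation
    new_sd = abs(a) * sd      # b is irrelevant for spread; use |a|
    new_var = a**2 * sd**2    # variance scales by a squared
    return new_mean, new_sd, new_var


print(transform_stats(60, 8, 1, 5))    # (65, 8, 64)
print(transform_stats(40, 6, 1.5, 5))  # (65.0, 9.0, 81.0)
```

Notice that b appears in only one line of the function, which is exactly why adding a constant never changes the spread.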
Worked examples
WE 1
Add a constant — mean shifts, σ unchanged
A class of students took a maths test. The mean mark was 60 and the standard deviation was 8. The teacher then adds 5 bonus marks to every student’s score.
Find the new mean and standard deviation.
Transformation: y = x + 5 → a = 1, b = 5
New mean: ȳ = 1 × 60 + 5 = 65
New std dev (only b changed → σ unchanged): σ_y = |1| × 8 = 8
Answer: new mean = 65 · new σ = 8
Note: adding the same amount to every value shifts the WHOLE distribution sideways. Distances between values stay the same → σ unchanged.
WE 2
Unit conversion — multiply only
The heights of a group of plants have mean 2.5 m and standard deviation 0.3 m. The data is converted to centimetres.
Find the new mean and standard deviation.
1 m = 100 cm → transformation y = 100x
a = 100, b = 0
New mean: ȳ = 100 × 2.5 = 250 cm
New std dev: σ_y = |100| × 0.3 = 30 cm
Answer: mean = 250 cm · σ = 30 cm
Note: a unit conversion like this one is pure multiplication (b = 0). Both the mean and σ scale by the same factor, and the units change with them.
WE 3
Combined ax + b: rescaling scores
A practice test has mean 40 and standard deviation 6. The teacher converts each raw score x using y = 1.5x + 5.
Find the mean and standard deviation of the converted scores.
Identify a, b: a = 1.5, b = 5
New mean: ȳ = 1.5 × 40 + 5 = 60 + 5 = 65
New std dev: σ_y = |1.5| × 6 = 9
Answer: new mean = 65 · new σ = 9
Note: apply the transformation to the mean directly. For σ, only the “× 1.5” matters; the “+ 5” doesn’t change spread.
WE 4
Reverse problem — find a and b
A data set has mean 50 and standard deviation 4. After applying y = ax + b, the new mean is 110 and the new standard deviation is 12.
Find the values of a and b.
Step 1: use σ to find a
σ_y = |a| × σ_x
12 = |a| × 4 → |a| = 3
Take a = 3 (positive default)
Step 2: use the mean to find b
ȳ = a × x̄ + b
110 = 3 × 50 + b
110 = 150 + b → b = −40
Check: y = 3x − 40, mean: 3(50) − 40 = 110 ✓
Answer: a = 3 · b = −40
Note: always solve σ FIRST (it only involves a). Then plug a into the mean equation to find b. The order matters because b doesn’t appear in the σ equation.
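The same two-step order can be written as a short function. Treat this as a sketch: solve_a_b and its a_positive flag are hypothetical names introduced here. The flag makes the "positive default" explicit, because the σ equation only fixes |a|.

```python
def solve_a_b(mean_x, sd_x, mean_y, sd_y, a_positive=True):
    """Recover a and b in y = a*x + b from the old and new mean and std dev."""
    # Step 1: the std-dev equation sd_y = |a| * sd_x involves only a.
    abs_a = sd_y / sd_x
    a = abs_a if a_positive else -abs_a  # assumption: take a > 0 unless told otherwise
    # Step 2: the mean equation mean_y = a * mean_x + b then gives b.
    b = mean_y - a * mean_x
    return a, b


a, b = solve_a_b(50, 4, 110, 12)
print(a, b)  # 3.0 -40.0
```

Calling solve_a_b(50, 4, 110, 12, a_positive=False) returns a = −3 and b = 260, which also matches the given mean and standard deviation. Exam questions normally intend the positive value unless they say the order of the data is reversed.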
WE 5
Variance scales by a², not |a|
A data set has variance 9. Each value is multiplied by 4.
Find the new variance and the new standard deviation.
Transformation: y = 4x → a = 4
New variance (multiply by a², NOT |a|): σ_y² = 4² × 9 = 16 × 9 = 144
Std dev (two ways): σ_y = √144 = 12, OR σ_y = |4| × √9 = 4 × 3 = 12 ✓
Answer: new variance = 144 · new σ = 12
Note: the trap is writing “new variance = 4 × 9 = 36”. Variance scales by a² because variance is already in squared units. Std dev (linear units) scales by |a|.
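To see the a² factor on real numbers, here is a small check. The question gives only the variance, so the data set below is made up: it is simply one list whose population variance happens to be 9.

```python
import statistics

x = [2, 8, 2, 8]          # hypothetical data: mean 5, population variance 9
y = [4 * xi for xi in x]  # y = 4x

var_x = statistics.pvariance(x)
var_y = statistics.pvariance(y)
sd_y = statistics.pstdev(y)

print("old variance:", var_x)  # 9
print("new variance:", var_y)  # 144, i.e. 16 times the old one, not 4 times
print("new std dev:", sd_y)    # 12, i.e. 4 times the old std dev of 3
```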
WE 6
Negative coefficient — absolute value of a
A set of temperatures (°C) has mean 10 and standard deviation 3. The values are transformed using y = −2x + 50.
Find the new mean and standard deviation.
Identify a, b: a = −2, b = 50
New mean (signed a): ȳ = (−2)(10) + 50 = −20 + 50 = 30
New std dev (|a|, always positive): σ_y = |−2| × 3 = 6
Answer: new mean = 30 · new σ = 6
Note: for the mean keep the negative sign; for σ take the absolute value. A standard deviation can never be negative, which is why we use |a|, not a.
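A quick check with actual values shows what the negative sign does. The four temperatures below are invented so that their mean is 10 and their standard deviation is 3, matching the question.

```python
import statistics

x = [7, 13, 7, 13]              # hypothetical temperatures: mean 10, sd 3
y = [-2 * xi + 50 for xi in x]  # y = -2x + 50 gives [36, 24, 36, 24]

mean_y = statistics.fmean(y)
sd_y = statistics.pstdev(y)

# The order flips (the coldest x becomes the largest y),
# but the gaps between values are still doubled, so sd is 6, not -6.
print(mean_y, sd_y)  # 30.0 6.0
```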
💡 Top tips
Mean uses signed a; std dev uses |a|; variance uses a². Three different uses of a: memorise them.
Adding a constant changes the mean only, never the spread. Big shortcut for “+ 5 bonus marks” style questions.
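Unit conversion (lengths, masses, times) is just y = ax with b = 0. Both mean and σ scale identically. Temperature scales are the exception: °C to °F is y = 1.8x + 32, so the mean needs the + 32 but σ still scales by 1.8 only.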
Reverse problems: solve a from the std-dev equation first, then b from the mean equation.
HL formula booklet uses E(…) and Var(…) — same rules, same trick: a outside Var becomes a².
⚠ Common mistakes
Adding b to the std dev: it doesn’t affect spread. Don’t write σ_y = aσ_x + b.
Using a instead of a² for variance: variance scales by the square because it’s already in squared units.
Forgetting absolute value with a negative a: σ can never be negative. Use |a|.
Solving reverse problems in the wrong order: b only affects the mean, so you can’t get it from the std-dev equation. Solve a first using the std-dev equation, then b with the mean.
Mixing up direction: if temperatures rise by 5° (add 5), the mean rises by 5 and σ stays. Don’t change σ.
Next up: Outliers. We’ve already seen that the mean and range are sensitive to extreme values, but how do you formally identify which values count as outliers? The 1.5 × IQR rule answers that — and decides whether to remove them or not.