IB Maths AI HL Statistics Toolkit Paper 1 & 2

Linear Transformations of Data

Sometimes you’ll want to change every value in a dataset: convert kilograms to grams, shift exam scores to a friendlier scale, or convert temperatures from Celsius to Fahrenheit. Any change of the form “multiply by a constant, then add a constant” is called a linear transformation: Y = aX + b. The key question is: how do the mean, variance, and standard deviation respond? Two simple rules cover everything. Adding a constant shifts the mean but leaves the spread alone. Multiplying by a constant a multiplies the mean by a, the standard deviation by |a|, and the variance by a². Memorise the two rules and you can handle IB questions on this topic quickly.

📘 What you need to know

A linear transformation of data has the form Y = aX + b, where a and b are constants.
Mean rule: E(aX + b) = a E(X) + b. The mean is scaled by a and then shifted by b.
Variance rule: Var(aX + b) = a² Var(X). The variance is multiplied by a²; the constant b has no effect.
Standard deviation rule: σ_new = |a| × σ. SD is scaled by the absolute value of a (so the sign is dropped).
Adding a constant (just +b): the mean shifts, but the spread (variance, SD) is unchanged.
Multiplying by a constant (just ×a): the mean is multiplied by a, the SD by |a|, and the variance by a².
Both formulas E(aX+b) and Var(aX+b) are in the HL formula booklet.
Also called: “effects of constant changes”, “linear coding”.

The two formulas — in the formula booklet

Linear transformation, mean: E(aX + b) = a E(X) + b (in the formula booklet ✓)

Linear transformation, variance: Var(aX + b) = a² Var(X) (in the formula booklet ✓)

From these, the standard deviation rule follows directly: σ_new = √Var = √(a² Var(X)) = |a| × σ.
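A quick numerical check can make the three rules concrete. The sketch below uses Python’s standard statistics module on a small made-up dataset (the values and the choice a = 3, b = 4 are illustrative assumptions, not taken from the syllabus). It compares the statistics computed directly from Y = aX + b with what the rules predict.

```python
import math
import statistics

# Hypothetical dataset, chosen only for illustration.
x = [2, 4, 4, 5, 10]
a, b = 3, 4

# Apply Y = aX + b to every value.
y = [a * xi + b for xi in x]

mean_x = statistics.fmean(x)
var_x = statistics.pvariance(x)   # population variance
sd_x = statistics.pstdev(x)       # population standard deviation

# Compare the direct calculation with each rule.
mean_ok = math.isclose(statistics.fmean(y), a * mean_x + b)
var_ok = math.isclose(statistics.pvariance(y), a**2 * var_x)
sd_ok = math.isclose(statistics.pstdev(y), abs(a) * sd_x)

print(mean_ok, var_ok, sd_ok)
```

All three comparisons should come out true for any dataset and any constants a and b, up to floating-point rounding.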

How each rule splits the transformation

The “+ b” half shifts the whole distribution sideways: the mean moves, but the spread is unchanged. The “× a” half stretches the whole distribution: the mean is multiplied by a and the spread (SD) by |a|.

Why are the two rules different?

Adding +b shifts only. Every value moves by the same amount. Distances between values stay the same, so the spread is unchanged.

Multiplying ×a scales everything. Every value is stretched. Distances between values grow by a factor of |a|, so the SD grows by |a| and the variance by a².

🤔 Why does the variance multiply by a², not by a?

Variance is the average of squared distances from the mean. When you multiply each value by a, every distance from the mean is also multiplied by a — but then those distances get squared in the variance formula. So variance gets multiplied by a². Taking the square root brings the standard deviation back to a single factor of |a| (positive, because spread can’t be negative).
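The same argument can be written as a short derivation. This is a sketch using the population definition of variance, with μ = E(X) and n data values x_i (notation assumed here for illustration):

```latex
\begin{align*}
E(aX + b) &= a\mu + b \\
\operatorname{Var}(aX + b)
  &= \frac{1}{n}\sum_{i=1}^{n} \bigl[(a x_i + b) - (a\mu + b)\bigr]^2 \\
  &= \frac{1}{n}\sum_{i=1}^{n} a^2 (x_i - \mu)^2 \\
  &= a^2 \operatorname{Var}(X) \\
\sigma_{\text{new}} &= \sqrt{a^2 \operatorname{Var}(X)} = |a|\,\sigma
\end{align*}
```

The b cancels inside the bracket on the second line, which is why it never appears in the variance or the SD.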

Quick summary table

Transformation	New mean	New variance	New SD
X + b	μ + b	σ² (unchanged)	σ (unchanged)
X − b	μ − b	σ² (unchanged)	σ (unchanged)
aX	aμ	a²σ²	|a|σ
aX + b	aμ + b	a²σ²	|a|σ

🧠 Memory aid — the “b ignores spread” rule

Adding a constant b is like sliding the entire dataset along a number line — every value moves the same distance, so gaps between values don’t change. That’s why + b changes the mean but never the variance or SD. Once you see that, the rest follows.

🧭 Recipe — applying a linear transformation

Identify a and b in the transformation Y = aX + b.
New mean: multiply the original mean by a, then add b.
New variance: multiply the original variance by a². (b doesn’t appear.)
New SD: multiply the original SD by |a|.
Check the sign: if a is negative, use a with its sign for the mean, but use |a| for the SD so that the SD never comes out negative.
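The recipe above can be sketched as a small Python function. The name transform_stats and its argument order are illustrative choices, not anything defined by the IB; the example call uses a mean of 31, an SD of 5, and the transformation Y = 2X + 10.

```python
def transform_stats(mean, sd, a, b):
    """Return (new_mean, new_variance, new_sd) for Y = aX + b."""
    new_mean = a * mean + b        # mean: multiply by a, then add b
    new_variance = a**2 * sd**2    # variance: multiply by a squared, b has no effect
    new_sd = abs(a) * sd           # SD: multiply by |a|
    return new_mean, new_variance, new_sd

# Example: mean 31, SD 5, transformation Y = 2X + 10
print(transform_stats(31, 5, 2, 10))   # (72, 100, 10)
```

Note that the function takes the SD as input and squares it to get the original variance. If a question gives the variance instead, take its square root first.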

Worked examples

WE 1

Adding a constant

A dataset has mean 30 and standard deviation 6. Each value is increased by 4. Find the new mean and new standard deviation.

Identify the transformation: Y = X + 4 → a = 1, b = 4
New mean = aμ + b = 1 × 30 + 4 = 34
New SD = |a| × σ = |1| × 6 = 6 (unchanged)
Answer: new mean = 34, new SD = 6
Adding a constant shifts the mean but leaves the spread completely unchanged.

WE 2

Multiplying by a constant

A dataset has mean 20 and standard deviation 5. Each value is multiplied by 3. Find the new mean, variance, and standard deviation.

Identify: Y = 3X → a = 3, b = 0
New mean = aμ = 3 × 20 = 60
Original variance = σ² = 5² = 25
New variance = a²σ² = 3² × 25 = 9 × 25 = 225
New SD = |a| × σ = |3| × 5 = 15
Answer: new mean = 60, new variance = 225, new SD = 15
Multiplication scales both mean and spread, but variance scales by a², not by a.

WE 3

Combined transformation aX + b

A teacher marks his students’ tests. The raw mean is 31 marks and the standard deviation is 5 marks. The teacher standardises by doubling the raw score and adding 10.
(a) Calculate the mean of the standardised scores.
(b) Calculate the standard deviation of the standardised scores.

Transformation: Y = 2X + 10 → a = 2, b = 10
(a) New mean = aμ + b = 2 × 31 + 10 = 62 + 10 = 72
(b) New SD = |a| × σ = |2| × 5 = 10
Use both rules: aμ + b for the mean, |a|σ for the SD. The b doesn’t appear in the SD calculation.

WE 4

Backwards — find a and b

A dataset has mean 50 and SD 8. After a transformation of the form Y = aX + b (with a > 0), the new mean is 110 and the new SD is 16. Find a and b.

Step 1: use the SD to find a
|a| × 8 = 16, so |a| = 2, and a > 0 ⇒ a = 2
Step 2: use the mean to find b
a × 50 + b = 110
2 × 50 + b = 110
100 + b = 110
b = 10
Answer: a = 2, b = 10
Verify: mean 2(50) + 10 = 110 ✓; SD 2(8) = 16 ✓
Solve the SD equation first to find |a|, then the mean equation to find b.
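The two steps of this backwards method can be sketched as a small Python helper. The function name solve_a_b and the a_positive flag are illustrative assumptions; the flag stands in for the extra information a question gives about the sign of a, since the SD alone only determines |a|.

```python
def solve_a_b(mean, sd, new_mean, new_sd, a_positive=True):
    """Recover a and b in Y = aX + b from the old and new mean and SD."""
    if sd == 0:
        raise ValueError("the original SD must be non-zero to determine a")
    abs_a = new_sd / sd                   # Step 1: |a| * sd = new_sd
    a = abs_a if a_positive else -abs_a   # the sign must come from the question
    b = new_mean - a * mean               # Step 2: a * mean + b = new_mean
    return a, b

# Old mean 50, old SD 8, new mean 110, new SD 16, with a > 0
print(solve_a_b(50, 8, 110, 16))   # (2.0, 10.0)
```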

WE 5

Temperature conversion — Celsius to Fahrenheit

The midday temperatures (in °C) over a week have mean 20°C and standard deviation 7°C. The conversion to Fahrenheit is given by F = (9/5)C + 32.
Find the mean and standard deviation of the temperatures in °F.

Identify: F = (9/5)C + 32 → a = 9/5, b = 32
New mean = (9/5)μ + 32 = (9/5)(20) + 32 = 36 + 32 = 68
New SD = |9/5| × σ = (9/5)(7) = 63/5 = 12.6
Answer: mean = 68 °F, SD = 12.6 °F
Unit conversions of this form are linear transformations, so the rules apply directly.

WE 6

Negative multiplier

A dataset of profits (£) has mean 30 and variance 16. A new variable is defined by T = −2X + 50. Find the mean, variance, and standard deviation of T.

Identify: a = −2, b = 50
New mean = aμ + b = (−2)(30) + 50 = −60 + 50 = −10
New variance = a²σ² = (−2)² × 16 = 4 × 16 = 64
Original SD = √16 = 4
New SD = |a| × σ = |−2| × 4 = 8
Answer: new mean = −10, new variance = 64, new SD = 8
The negative sign survives in the mean but disappears in the variance (squared) and the SD (absolute value).
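To see a negative multiplier act on actual values, the sketch below applies T = −2X + 50 to a made-up two-value dataset. The values 26 and 34 are an assumption chosen only because they have mean 30 and population variance 16; the question itself gives no raw data.

```python
import statistics

# Hypothetical dataset with mean 30 and population variance 16.
x = [26, 34]
t = [-2 * xi + 50 for xi in x]   # T = -2X + 50

print(t)                          # transformed values
print(statistics.fmean(t))        # mean of T
print(statistics.pvariance(t))    # population variance of T
print(statistics.pstdev(t))       # population SD of T
```

The transformed values are −2 and −18. The larger original value becomes the smaller transformed one, so the order is reversed, yet the gap between them is a positive 16, which is twice the original gap of 8. That is why the SD uses |a| rather than a.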

💡 Top tips

Memorise the two formulas: E(aX+b) = aE(X) + b and Var(aX+b) = a²Var(X). Both are in the HL formula booklet but quick recall saves time.
+ b never affects spread. If the question only adds or subtracts a constant, SD and variance don’t change at all.
SD uses |a|, not a. Always take the absolute value, because an SD is never negative.
Variance uses a². Don’t forget the square.
For backwards problems: solve the SD equation first to find |a|, then the mean equation to find b.
The formula booklet uses E and Var notation. E(X) means the mean of X; Var(X) means the variance of X.

⚠ Common mistakes

Multiplying SD by a². Variance uses a²; SD uses |a|.
Including b in the variance formula. Adding a constant doesn’t change spread, ever. Don’t write Var(aX+b) = a²Var(X) + b.
Forgetting absolute value when a < 0. A negative a still scales the SD positively; the sign only affects the mean.
Confusing variance and SD. If the problem gives you SD, square it to get variance first if needed (or work with SD directly).
Applying the SD/variance change to the mode or median. The mode and median are measures of location, so under a linear transformation they change like the mean (multiply by a, then add b). The IQR and range are measures of spread, so they change like the SD (multiply by |a|).
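For a linear transformation, the median follows the same pattern as the mean and the IQR follows the same pattern as the SD. The sketch below checks this on a made-up dataset with a = −2 and b = 50 (both illustrative assumptions). It uses the default quartile method of Python’s statistics.quantiles, which may not match the convention on your calculator, so the quartile values themselves can differ even though the scaling behaviour is the same.

```python
import statistics

def median_and_iqr(data):
    """Return (median, IQR) using the statistics module's default quartile method."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return statistics.median(data), q3 - q1

# Hypothetical dataset, chosen only for illustration.
x = [3, 5, 8, 10, 13, 17, 21]
a, b = -2, 50
y = [a * xi + b for xi in x]

med_x, iqr_x = median_and_iqr(x)
med_y, iqr_y = median_and_iqr(y)

print(med_y == a * med_x + b)    # median behaves like the mean
print(iqr_y == abs(a) * iqr_x)   # IQR behaves like the SD
```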

Next up — Outliers. You’ve seen that outliers heavily affect the mean and SD but not the median and IQR. Now you’ll learn the formal IB rule for detecting outliers: a value is an outlier if it lies more than 1.5 × IQR below Q₁ or above Q₃. You’ll also see when outliers should — or shouldn’t — be removed from a dataset.
