IB Maths AA SL Topic 4 — Statistics Toolkit Paper 1 & 2 ~7 min read

Linear Transformations of Data

If you take a data set and add 10 to every value, what happens to the mean? What about the standard deviation? This note gives you the quick rules — so you don’t have to recalculate from scratch when the data gets shifted or scaled.

📘 What you need to know

A linear transformation means changing every value by the same rule: multiply by k, add a, or both. New value = k × old value + a.
Adding a constant a shifts the data — the mean shifts by a, but the spread (variance, SD) stays the same.
Multiplying by a constant k stretches the data — the mean is multiplied by k, the SD by |k|, and the variance by k².
If you do both (kx + a): apply the multiplication first, then the addition.
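The HL formula booklet writes this as: E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). Note the change of letters: in the booklet, a is the multiplier (our k) and b is the constant added (our a).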

What is a linear transformation?

Sometimes data is awkward to work with — values too big, too small, or in unfriendly units. A linear transformation just means applying the same rule to every value at once.

Linear transformation

new value = k × (old value) + a

Example uses:

Converting Celsius to Fahrenheit: F = 1.8C + 32 (multiply by 1.8, add 32).
Standardising test scores: new score = 2 × raw score + 10.
Switching units: kg to g (multiply by 1000), or seconds to minutes (divide by 60).
Linear coding to make calculations easier — old textbooks shift big numbers down to small ones.
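As a rough illustration of the Celsius to Fahrenheit example, the sketch below applies F = 1.8C + 32 to every value in a small list and prints the mean and SD before and after. The temperatures are made up for the demonstration, and `pstdev` (population SD from Python's standard library) is assumed to be the SD the note has in mind.

```python
from statistics import mean, pstdev

# Hypothetical temperatures in degrees Celsius (made up for illustration).
celsius = [10, 15, 20, 25, 30]

# Linear transformation with k = 1.8 and a = 32: F = 1.8C + 32.
k, a = 1.8, 32
fahrenheit = [k * c + a for c in celsius]

# The same rule is applied to every value, so the summary
# statistics can be compared before and after.
print("mean:", mean(celsius), "->", mean(fahrenheit))
print("SD:  ", pstdev(celsius), "->", pstdev(fahrenheit))
```

Running it should show the mean going through the full rule (times 1.8, plus 32) while the SD is only multiplied by 1.8, up to floating-point rounding.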

You may have heard this called “linear coding” or “effects of constant changes”. Same thing. The IB usually phrases it as a real-life situation — like a teacher standardising scores or an athlete adjusting times.

The two building blocks

Any linear transformation is made up of two simple operations:

+ a (a SHIFT)

Add (or subtract) a constant. Every value moves by the same amount, like sliding the whole data set up or down.

Mean: shifts by + a
SD: unchanged
Variance: unchanged

× k (a SCALE)

Multiply (or divide) by a constant. The data stretches or shrinks — values further apart spread out more.

Mean: multiplied by k
SD: multiplied by |k|
Variance: multiplied by k²

[Figure: How shifts and scales affect the data]

🤔 Why doesn’t shifting change the spread?

Imagine the data as a row of dots on a number line. If you slide every dot to the right by 10, the dots are still the same distances apart — only their position changed. So the mean moves, but the variance and SD (which measure spread) stay the same.

Compare that to scaling: if you double every value, the gaps between dots also double. The data stretches out, so the spread grows.
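The dots-on-a-number-line picture can be checked with a short sketch. The data values and the helper name `gaps` are invented for this illustration; the helper simply lists the distances between neighbouring values.

```python
# Made-up data set, already sorted in increasing order.
data = [2, 5, 6, 10]

shifted = [x + 10 for x in data]   # slide every dot right by 10
doubled = [2 * x for x in data]    # double every value


def gaps(values):
    """Distances between neighbouring values (assumes sorted input)."""
    return [b - a for a, b in zip(values, values[1:])]


print(gaps(data))     # [3, 1, 4]
print(gaps(shifted))  # [3, 1, 4]  same gaps, so same spread
print(gaps(doubled))  # [6, 2, 8]  every gap doubled, so spread doubles
```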

The combined rule (the big table)

If you transform data by new = k × old + a, here’s what happens to each measure:

Measure	Add a only	Multiply by k only	Both: kx + a
Mean (x̄)	x̄ + a	k x̄	k x̄ + a
Variance (σ²)	σ²	k² σ²	k² σ²
SD (σ)	σ	|k| σ	|k| σ
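The last column of the table can be written as one small function. This is only a sketch: the name `transform_summary` and the numbers in the example calls are made up for illustration, not taken from an IB resource.

```python
def transform_summary(mean, sd, k, a):
    """Return (new mean, new SD, new variance) for new = k * old + a."""
    new_mean = k * mean + a          # the mean follows the full rule
    new_sd = abs(k) * sd             # SD ignores a and uses |k|
    new_variance = k**2 * sd**2      # variance ignores a and uses k squared
    return new_mean, new_sd, new_variance


# Made-up summary statistics: mean 20, SD 3.
print(transform_summary(20, 3, k=2, a=10))    # (50, 6, 36)
print(transform_summary(20, 3, k=-2, a=10))   # (-30, 6, 36)
```

The two calls differ only in the sign of k: the mean changes, but the SD and variance come out the same.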

🧠 Memory trick: “Mean does everything, spread only stretches”

The mean reacts to both shifts and stretches — it’s a “where” measure, so it moves and scales. The SD and variance are “spread” measures — they don’t care if you shift the data, only when you stretch it.

📍 Why |k| for SD but k² for variance?

Variance is built from squared deviations, so scaling by k multiplies it by k², which is never negative even when k is. Standard deviation is the square root of variance, and √(k²) = |k|, hence the factor |k|.

Quick check: SD can never be negative! That’s why we use |k|, not k.
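To see the |k| versus k² rule on actual numbers, the sketch below scales a small made-up data set by a negative factor and compares the population variance and SD before and after. It assumes Python's `statistics.pvariance` and `statistics.pstdev` as the variance and SD.

```python
from statistics import pstdev, pvariance

data = [1, 2, 3, 4, 5]            # made-up values
k = -3                            # a negative scale factor
scaled = [k * x for x in data]    # [-3, -6, -9, -12, -15]

# Variance should be multiplied by k squared (9), SD by |k| (3).
print("variance:", pvariance(data), "->", pvariance(scaled))
print("SD:      ", pstdev(data), "->", pstdev(scaled))
print("ratio of SDs:", pstdev(scaled) / pstdev(data))
```

The ratio of the SDs should come out as 3 (up to rounding), not −3, even though every value was multiplied by −3.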

For the IB exam, the most common transformation is “scale then shift”, like 2x + 10 for standardised scores. Treat the multiplication and the addition as two separate operations, and apply the rules to each.

Worked examples

WE 1

Find the mean and SD after standardising scores

A teacher’s raw test scores have mean 31 and standard deviation 5. The teacher standardises the scores by doubling each one, then adding 10.

(a) Find the mean of the standardised scores. (b) Find the standard deviation of the standardised scores.

Given: new score = 2 × raw + 10, raw mean = 31, raw SD = 5, so k = 2 and a = 10. Apply the transformation rules to each measure.

Part (a): new mean
Multiply the mean by k, then add a: new mean = 2 × 31 + 10 = 62 + 10 = 72
New mean = 72

Part (b): new SD
SD is multiplied by |k|, NOT shifted by a: new SD = |2| × 5 = 10
New SD = 10

Note: “+ 10” doesn’t affect the SD; only the multiplication by 2 does.

WE 2

Effect of just a shift

A data set of weights has mean 65 kg and standard deviation 4 kg. Each weight is reduced by 3 kg. Find the new mean and SD.

Given: new = old − 3, mean = 65, SD = 4. This is a pure shift (k = 1, a = −3), so the SD won’t change.

New mean: 65 − 3 = 62
New SD: unchanged = 4
New mean = 62 kg, New SD = 4 kg

Note: a shift moves the data; it doesn’t change how spread out it is.

WE 3

Convert kg to g and find new mean, variance, SD

The masses of 50 apples have mean 0.18 kg and SD 0.04 kg. The masses are converted from kg to g (multiply by 1000). Find the new mean, variance, and SD.

Given: new = 1000 × old (kg → g), mean = 0.18, SD = 0.04. This is a pure scale (k = 1000, a = 0).

New mean: 1000 × 0.18 = 180
New SD: |1000| × 0.04 = 40
New variance: 1000² × (0.04)² = 1,000,000 × 0.0016 = 1600
Mean = 180 g, SD = 40 g, Variance = 1600 g²

Always check: new SD = √(new variance) → √1600 = 40 ✓

WE 4

Effect of a negative scale factor

A set of times has mean 12.5 seconds and standard deviation 2.4 seconds. Each time is transformed by the rule: new = −3 × old + 50. Find the new mean and SD.

Given: new = −3 × old + 50, mean = 12.5, SD = 2.4, so k = −3 and a = 50. Watch the sign for SD!

New mean: −3 × 12.5 + 50 = −37.5 + 50 = 12.5
New SD: use |k|, not k: new SD = |−3| × 2.4 = 3 × 2.4 = 7.2
New mean = 12.5, New SD = 7.2

Note: if you used k = −3 instead of |k|, you’d get a negative SD, which is impossible!

WE 5

Work backwards to find the original data

After applying the transformation new = 4 × old − 7, a data set has mean 25 and SD 12. Find the mean and SD of the original data.

Given: new mean = 25, new SD = 12, k = 4, a = −7. Use the rules in reverse: undo the SD scaling, then solve for the old mean.

Old SD
new SD = |k| × old SD, so: 12 = 4 × old SD → old SD = 3
Old SD = 3

Old mean
new mean = k × old mean + a: 25 = 4 × old mean − 7
32 = 4 × old mean → old mean = 8
Old mean = 8

Note: reverse the formula like solving any linear equation. Undo the add first (here, add 7 to both sides), then undo the multiply (divide by 4).
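The reverse calculation can also be sketched as a small function. The name `undo_transform` is hypothetical, and the example call reuses the figures from this worked example (new mean 25, new SD 12, k = 4, a = −7).

```python
def undo_transform(new_mean, new_sd, k, a):
    """Recover (old mean, old SD) given that new = k * old + a."""
    if k == 0:
        # Multiplying by 0 collapses every value to a, so it cannot be undone.
        raise ValueError("k must be non-zero to reverse the transformation")
    old_mean = (new_mean - a) / k    # undo the add, then undo the multiply
    old_sd = new_sd / abs(k)         # SD only needs dividing by |k|
    return old_mean, old_sd


print(undo_transform(25, 12, k=4, a=-7))   # (8.0, 3.0)
```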

💡 Top tips

Memorise the table: only the mean responds to “+ a”. Variance and SD ignore additions completely.
For variance, use k². For SD, use |k|. Squaring makes a negative k positive automatically.
If k is negative, the mean still uses k (not |k|). Only the SD uses absolute value.
SD can never be negative. If your SD comes out negative, check your absolute value: that’s the most common mistake.
For combined transformations, apply scale before shift. The order matters for the mean — but only the scale matters for SD.
Quick sanity check: if the new variance is k² σ², the new SD must be its square root → |k|σ. They’re consistent.
The HL formula booklet writes E(aX + b) and Var(aX + b). These are the same rules in algebraic notation, with a as the multiplier and b as the shift.
For unit conversions (kg→g, °C→°F, $→€), this is your tool. Identify k and a from the conversion formula.

⚠ Common mistakes

Adding a to the SD. A shift never affects the spread. Only multiplication by k changes the SD.
Multiplying SD by k² instead of |k|. The k² rule applies to variance, not SD. SD only uses |k|.
Multiplying SD by k when k is negative. The new SD would be negative — which is impossible. Always use the absolute value of k.
Applying the shift to variance. Variance is unchanged by a shift, just like SD.
Forgetting the order of operations on the mean. For new = kx + a, multiply first, then add, the same as for any single value.
Working backwards in the wrong order. When reversing the mean, undo the shift first (subtract a), then undo the scale (divide by k). The old SD only needs the new SD divided by |k|.
Confusing variance and SD units. If data is in seconds, SD is in seconds, but variance is in seconds².
Forgetting that k² is never negative. The new variance k²σ² is therefore never negative, even if k is negative. It increases when |k| > 1, stays the same when |k| = 1, and decreases when |k| < 1.

Linear transformations come up everywhere — unit conversions, standardised tests, currency exchanges. The next note covers outliers: how to spot extreme values that ruin your statistics, and what to do about them.
