You now have two correlation coefficients in your toolkit. PMCC r measures how linear the data is; Spearman r_s measures how monotonic it is. They agree most of the time, but when they DON’T agree, the disagreement itself is information. Read them side by side and the data tells you what kind of model to try next.
Most exam comparisons fall into one of three shapes. The visual below shows what each looks like, and what (r, rs) pair you’d see:
Before doing any calculation, look at the scatter (or think about the context) and decide. The flowchart in words:
The gap between the two coefficients is the diagnostic. Three rules:
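(1) r and r_s close and both strong: the relationship really is linear (or close to it). PMCC is doing its job, so trust it and fit a straight line.
(2) r_s noticeably larger: the data is monotonic but not linear, usually because it curves. Try a non-linear model (exponential, power, log), but check the scatter first, because one extreme point can open the same gap (WE 4).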
(3) r noticeably larger: rare, but suggests outliers are inflating PMCC. Sketch the scatter and decide whether to remove or report.
Both share three limitations: (i) correlation is not causation — ever; (ii) r ≈ 0 doesn’t mean “no relationship” (could be U-shaped); (iii) both depend on the data you have — predictions outside the data range are unreliable.
🧭 Recipe — comparing two coefficients
- Sketch the scatter first. If it looks like a clear straight line ⇒ PMCC story. If it curves smoothly ⇒ Spearman story. If wild — check for outliers.
- Compute both r and r_s on the GDC. Note both values to 3 sf.
- Compare: are they close? Is r_s bigger? Is r bigger? Read off the three rules above.
- Pick the one that fits: PMCC if the picture supports a line; Spearman if the picture is curved or contains outliers.
- State both values, your choice, and WHY in context: “r_s = 1 but r = 0.787 suggests the relationship is monotonic but not linear, so a non-linear model is appropriate.”
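Away from the GDC, the "compute both" step can be sketched in a few lines of Python. This is a minimal sketch rather than the calculator's own method: the helper names `pearson`, `ranks` and `spearman` are illustrative, tied values are assumed to share their average rank, and the data is borrowed from WE 3 below.

```python
from math import sqrt

def pearson(xs, ys):
    """PMCC r of paired data, from sums of squares about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 = smallest; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman r_s: the PMCC of the two sets of ranks."""
    return pearson(ranks(xs), ranks(ys))

# Data from WE 3
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 16, 64, 256, 1024]
r, r_s = pearson(x, y), spearman(x, y)
print(f"r = {r:.3f}, r_s = {r_s:.3f}")  # r = 0.787, r_s = 1.000
```

Computing r_s as the PMCC of the ranks mirrors the by-hand route: rank each list, then treat the ranks as ordinary data.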
Worked examples
WE 1: Which coefficient is more appropriate?
For each scenario, state which correlation coefficient is more appropriate and why.
(a) Estimating petrol used from kilometres driven on a motorway (steady speed).
(b) Bacterial colony count doubling each hour (curved growth).
(c) Two judges rank 10 ice-skaters 1st through 10th.
(d) Testing whether income rises with age, even if the rise is curved.
(a) constant speed ⇒ straight-line theory
PMCC (r) — expect linear relationship
(b) doubling = exponential = curved
Spearman (r_s) — monotonic but not linear
(c) data IS already ranks
Spearman (r_s) — only ranks are available
(d) test for any monotonic rise
Spearman (r_s) — doesn’t assume linear
(a) PMCC · (b) Spearman · (c) Spearman · (d) Spearman
Three giveaways for Spearman: the data is RANKED, the relationship is CURVED, or you’re worried about OUTLIERS. PMCC is for clean, linear, value-based data.
WE 2: Both coefficients agree, so the data is linear
Hours studied (x) and number of correct answers on a 20-question quiz (y) for 7 students:
x: 1, 2, 3, 4, 5, 6, 7 | y: 3, 5, 8, 10, 13, 16, 18
(a) Find r and r_s. (b) Compare and conclude.
(a) GDC
r = 0.998 (3 sf)
y is strictly increasing ⇒ ranks 1..7 both ways
r_s = 1.000
(b) compare
r ≈ r_s, both very close to 1
⇒ the relationship is LINEAR
a straight-line model is appropriate
r = 0.998, r_s = 1.000 · linear relationship confirmed
When r and r_s are both close to 1 (or both close to −1) AND close to each other, the line of best fit will do a good job. No surprises to flag.
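The GDC value r = 0.998 can be checked by hand from the sums of squares. The sketch below spells out that arithmetic in Python; the names `s_xx`, `s_yy` and `s_xy` are illustrative labels, not notation taken from this page.

```python
from math import sqrt

x = [1, 2, 3, 4, 5, 6, 7]
y = [3, 5, 8, 10, 13, 16, 18]
n = len(x)

mean_x = sum(x) / n  # 4
mean_y = sum(y) / n  # 73/7, about 10.43

# Sums of squares and products about the means
s_xx = sum((a - mean_x) ** 2 for a in x)                       # 28
s_yy = sum((b - mean_y) ** 2 for b in y)                       # about 185.714
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # 72

r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.998

# y strictly increasing with x means both rankings are 1..7, so r_s = 1
strictly_increasing = all(b1 < b2 for b1, b2 in zip(y, y[1:]))
print(strictly_increasing)  # True
```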
WE 3: When r_s >> r, the curve signal
A radioactive sample is measured at hourly intervals. Time (x, hours) and radiation level (y, counts per second) recorded:
x: 1, 2, 3, 4, 5, 6 | y: 1, 4, 16, 64, 256, 1024
(a) Find r and r_s. (b) Compare. (c) What does the disagreement tell you about the model?
(a) GDC
r = 0.787 (3 sf)
y is strictly increasing ⇒ r_s = 1.000
(b) compare
r_s = 1, r = 0.787 (big gap)
(c) interpret the gap
r_s = 1 says: perfectly monotonic
r < 1 says: NOT a straight line
⇒ relationship is curved (here, exponential)
r = 0.787, r_s = 1.000 · try a NON-LINEAR model
Classic signature of a curve. r_s ≈ 1 with r noticeably lower is a strong exam indicator that an exponential, power, or log model should be tried instead of a straight line. Confirm on the scatter that the points bend smoothly rather than jump at a single reading.
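One way to test the "exponential" reading is to recompute r after replacing y with ln y, because exponential growth becomes a straight line on a log scale. This check is not part of the worked solution above; it is a minimal sketch, and `pearson` is an illustrative helper name.

```python
from math import log, sqrt

def pearson(xs, ys):
    """PMCC r of paired data, from sums of squares about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 16, 64, 256, 1024]

r_raw = pearson(x, y)                    # PMCC on the raw values
r_log = pearson(x, [log(v) for v in y])  # PMCC of x against ln y

print(f"{r_raw:.3f}")  # 0.787
print(f"{r_log:.3f}")  # 1.000
```

Here each y value is 4 times the previous one, so ln y rises by the same amount every hour and the transformed data sits on a line.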
WE 4: The outlier signature
An electrical engineer records temperature increase (x, °C) and resistance change (y, Ω) of a heating element:
x: 10, 20, 30, 40, 50, 60 | y: 2, 4, 6, 8, 10, 200
(a) Find r and r_s. (b) Suggest what is happening in the data.
(a) GDC
r = 0.681 (3 sf)
y values strictly increasing ⇒ r_s = 1.000
(b) interpret
first 5 points: y rises by 2 each step (linear)
final point: y jumps from 10 to 200 (huge)
that final reading is an OUTLIER
r is dragged down to 0.681 by the outlier
r_s ignores magnitude — stays at 1
r = 0.681, r_s = 1.000 · the last reading is an outlier
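Always look at the data BEFORE drawing conclusions. A gap with r_s above r is also the curve signature from WE 3, so the numbers alone cannot separate a curve from a single extreme reading; the scatter can. The engineer should check: was the 60°C measurement faulty? Did something change in the equipment? r_s tells you the increasing trend is consistent; r alone might have made you reject the relationship.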
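To see how much one reading moves PMCC, r can be recomputed with and without the final point. This is a sketch for investigation only, not a recommendation to delete the reading; `pearson` is an illustrative helper name.

```python
from math import sqrt

def pearson(xs, ys):
    """PMCC r of paired data, from sums of squares about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [10, 20, 30, 40, 50, 60]
y = [2, 4, 6, 8, 10, 200]

r_all = pearson(x, y)                  # all six readings
r_first_five = pearson(x[:-1], y[:-1])  # final reading left out

print(f"{r_all:.3f}")         # 0.681
print(f"{r_first_five:.3f}")  # 1.000
```

The first five points lie exactly on a line (y rises by 2 for every 10 in x), which is why r returns to 1 once the last reading is set aside.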
WE 5: Reading four (r, r_s) pairs
For each pair below, describe what the data is most likely doing.
(a) r = 0.92, r_s = 0.90
(b) r = 0.42, r_s = 0.95
(c) r = 0.85, r_s = 0.40
(d) r = 0.08, r_s = 0.05
(a) close to each other and both strong
linear relationship
(b) r low but r_s high
monotonic but CURVED — try non-linear model
(c) r high but r_s low
suspicious — outliers likely inflating r
trust r_s, sketch the scatter
(d) both near 0
no monotonic or linear relationship
(a) linear · (b) curved · (c) outliers · (d) no linear or monotonic relationship
Memorise the three signatures: close + strong = linear; r_s >> r = curve; r >> r_s = outlier. Both near 0 = no linear or monotonic trend (a U-shape is still possible).
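The three signatures can be written as a small decision function. The cut-offs below (a gap of 0.2 counts as "noticeably larger", and 0.4 is the edge of "weak") are assumptions chosen for illustration; the page does not fix exact thresholds, and `read_pair` is a hypothetical helper.

```python
def read_pair(r, r_s, gap=0.2, weak=0.4):
    """Map an (r, r_s) pair to a likely signature.

    gap and weak are illustrative thresholds, not official exam values.
    """
    if abs(r_s) - abs(r) > gap:
        return "curved"      # monotonic but not linear
    if abs(r) - abs(r_s) > gap:
        return "outliers"    # r may be inflated; sketch the scatter
    if max(abs(r), abs(r_s)) < weak:
        return "no linear or monotonic relationship"
    return "linear"

# The four pairs from WE 5
pairs = {"a": (0.92, 0.90), "b": (0.42, 0.95), "c": (0.85, 0.40), "d": (0.08, 0.05)}
for label, (r, r_s) in pairs.items():
    print(label, read_pair(r, r_s))
# a linear
# b curved
# c outliers
# d no linear or monotonic relationship
```

A function like this only reads the numbers; the scatter still has the final say.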
WE 6: Recommend a coefficient to a researcher
For each researcher, recommend which correlation coefficient to use and briefly justify.
(a) An economist is testing whether a country’s GDP and unemployment rate are linearly related, with a view to using regression for prediction.
(b) A biologist is studying plant growth over 30 days and expects an S-shaped (logistic) curve.
(c) A market researcher asks 8 customers to rank 5 brands from favourite to least favourite.
(a) testing LINEAR theory + planning regression
use PMCC (r)
linear assumption is the whole point
(b) expects S-shaped curve
use Spearman (r_s)
monotonic but definitely not linear
(c) data IS rankings (no numerical values)
use Spearman (r_s)
PMCC needs numerical values; only ranks available
(a) PMCC · (b) Spearman · (c) Spearman
When the question hints at “linear” or “regression” or “straight line”, choose PMCC. When the question hints at “curve” or “ranking” or “monotonic”, choose Spearman. The vocabulary in the question is usually a strong clue.
💡 Top tips
- Always compute BOTH if you have time — the comparison itself is a marked answer in many questions.
- Three signatures to memorise: close + strong = linear · r_s >> r = curve · r >> r_s = outlier.
- Match strength bands: weak (|val| < 0.4) · moderate (0.4−0.7) · strong (> 0.7) — same for both coefficients.
- Language difference: PMCC supports “linear correlation”; Spearman supports “monotonic correlation”. Don’t mix.
- Sketch first: always plot the scatter on the GDC. A 5-second look saves you from misinterpreting the numbers.
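The strength bands and the "linear" versus "monotonic" wording from the tips above can be combined into one description. This is a minimal sketch: `strength` and `describe` are hypothetical helpers, treating exactly 0.7 as "moderate" is an assumption, and the value −0.55 in the example is made up for illustration.

```python
def strength(value):
    """Band for |value|: weak below 0.4, moderate 0.4 to 0.7, strong above 0.7."""
    size = abs(value)
    if size < 0.4:
        return "weak"
    if size <= 0.7:  # assumption: the boundary value 0.7 counts as moderate
        return "moderate"
    return "strong"

def describe(value, coefficient):
    """Describe a coefficient; coefficient is 'r' (PMCC) or 'r_s' (Spearman)."""
    kind = {"r": "linear", "r_s": "monotonic"}[coefficient]
    if value == 0:
        return f"no {kind} correlation"
    direction = "positive" if value > 0 else "negative"
    return f"{strength(value)} {direction} {kind} correlation"

print(describe(0.787, "r"))    # strong positive linear correlation
print(describe(-0.55, "r_s"))  # moderate negative monotonic correlation
```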
⚠ Common mistakes
- Assuming r = r_s always: r = ±1 forces r_s = ±1, but not the other way round (WE 3 has r_s = 1 with r = 0.787). In general the two can differ, and the difference is informative.
- Concluding “linear” from r_s = 1: r_s = 1 says only that the data is perfectly monotonic. Use r (and the scatter) to test linearity.
- Throwing out outliers automatically: outliers might be REAL data points telling you something important. Always investigate first.
- Forgetting that both fail for U-shapes: a parabola is a strong relationship, but neither r nor r_s will catch it; both can be near 0.
- Claiming causation from either: a strong r OR a strong rs still only tells you the two variables move together — not that one causes the other.
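The U-shape mistake is easy to demonstrate. The sketch below uses an illustrative data set that is not from this page (y = x² at x = −3, ..., 3) and computes both coefficients; the helper names are assumptions, and tied values share their average rank.

```python
from math import sqrt

def pearson(xs, ys):
    """PMCC r of paired data, from sums of squares about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 = smallest; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

# Illustrative (made-up) data: a perfect parabola, symmetric about x = 0
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson(x, y)
r_s = pearson(ranks(x), ranks(y))
print(abs(r) < 1e-12, abs(r_s) < 1e-12)  # True True: both coefficients are 0
```

y is completely determined by x here, yet both coefficients are 0, because the falling half of the curve cancels the rising half.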
Next up: Linear Regression — the chapter finale. If your data passes the PMCC test for linearity, you can fit a least-squares regression line y = ax + b using the GDC. You’ll learn how to interpret the gradient a and intercept b in context, and when predictions are reliable (interpolation) vs risky (extrapolation).