
Strategy for Modelling Functions

Every function family in the chapter has a fingerprint. Reading the wording, the scatter plot or the differences in a data table tells you which model to reach for — before you let the GDC do the regression and the prediction.

📘 What you need to know

Constant first differences in equally-spaced data ⇒ linear y = mx + c.
Constant second differences ⇒ quadratic y = ax² + bx + c.
Constant ratios (y multiplied by the same factor each step) ⇒ exponential y = k·a^x.
Oscillating between fixed max and min with a clear period ⇒ sinusoidal.
Power-law shape (a curve through the origin, or one that approaches both axes asymptotically) ⇒ direct or inverse variation.
Use the GDC to run the matching regression (LinReg, QuadReg, ExpReg, SinReg, PwrReg), then read off the coefficients and the r² goodness-of-fit value.

From context to model type

The wording usually points to the family:
“Constant rate per unit”, “fixed cost plus per-unit charge” ⇒ linear.
“Maximum height”, “profit peaks then drops” ⇒ quadratic.
“S-shaped”, “volume of a box cut from card” ⇒ cubic.
“Doubles every”, “depreciates by 8%”, “half-life”, “cools toward room temperature” ⇒ exponential (the last with a constant added).
“Tide”, “Ferris wheel”, “daylight hours”, any annual cycle ⇒ sinusoidal.
“Inversely proportional”, “force depends on the square of distance” ⇒ variation.
Stop and name the family before writing equations.

From data to model type

When you’re handed a table, three quick tests cover the bulk of cases. Take first differences Δy: if they’re (nearly) constant, the model is linear. If not, take second differences Δ²y: constant ⇒ quadratic. If neither, take ratios yₙ₊₁ / yₙ: constant ratio ⇒ exponential. A scatter plot that oscillates is sinusoidal; one that passes through the origin as a power curve is direct variation; one that hugs both axes is inverse variation. With real data the differences are rarely exactly constant: look for “approximately” constant values, and let the GDC’s r² value confirm the best fit.

The six shapes you’ll meet in AI HL. Identify the family first, then fit parameters from data or context.

Diagnostic at a glance: Δy const ⇒ linear · Δ²y const ⇒ quadratic · ratio const ⇒ exponential · periodic ⇒ sinusoidal · power-law shape ⇒ variation
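The three table tests can be sketched in a few lines of Python. This is an illustration only, not part of the syllabus: the function name `classify` and the tolerance `tol` are assumptions made for the example, and it presumes equally spaced x-values.

```python
import math

def classify(ys, tol=1e-9):
    """Guess a model family from y-values at equally spaced x-values.

    Tests first differences, then second differences, then ratios.
    `tol` is a hypothetical tolerance; loosen it for noisy real data.
    """
    if len(ys) < 4:
        raise ValueError("need at least four values to test second differences")

    def nearly_constant(values):
        first = values[0]
        return all(math.isclose(v, first, rel_tol=tol, abs_tol=tol) for v in values)

    d1 = [b - a for a, b in zip(ys, ys[1:])]      # first differences
    if nearly_constant(d1):
        return "linear"
    d2 = [b - a for a, b in zip(d1, d1[1:])]      # second differences
    if nearly_constant(d2):
        return "quadratic"
    if all(y != 0 for y in ys):
        ratios = [b / a for a, b in zip(ys, ys[1:])]
        if nearly_constant(ratios):
            return "exponential"
    return "undetermined"

print(classify([3, 7, 11, 15, 19]))                 # linear
print(classify([5, 7, 11, 17, 25]))                 # quadratic
print(classify([80, 96, 115.2, 138.24, 165.888]))   # exponential
```

The order of the tests matters: the check stops at the first constant pattern it finds, which mirrors the advice to stop once first differences turn out constant.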

Fitting, predicting, critiquing

Once the model type is set, the GDC handles the rest. Enter the data into two lists, run the matching regression (LinReg for linear, QuadReg for quadratic, ExpReg for exponential, SinReg for sinusoidal, PwrReg for power), and the coefficients drop out along with r² — the closer to 1, the better the fit. Use the model only within the data range you fitted it on; extrapolating far beyond is risky. Real systems hit carrying capacities, regulations change, oscillations damp out — a single-function model is always a simplification. Be ready to write a one-line critique alongside a numerical prediction.
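To see what “run the matching regression and compare r²” amounts to, here is a minimal Python sketch. It assumes an exponential fit can be obtained by fitting a straight line to (x, ln y), which is a common approach but not necessarily what every GDC does internally. The helper names `linear_fit` and `exponential_fit` are made up for this example, and the data are the monthly sales figures from WE 2.

```python
import math

def linear_fit(xs, ys):
    """Least-squares line y = m*x + c, returned together with r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return m, c, r2

def exponential_fit(xs, ys):
    """Fit y = k * a**x by fitting a line to (x, ln y).

    The r squared returned is measured on the log scale (an assumption
    of this sketch), so it is only a rough guide when compared with a
    linear model's r squared.
    """
    slope, intercept, r2 = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(intercept), math.exp(slope), r2

t = [0, 1, 2, 3, 4]
sales = [80, 96, 115.2, 138.24, 165.888]

m, c, r2_lin = linear_fit(t, sales)
k, a, r2_exp = exponential_fit(t, sales)
print(f"linear:      r^2 = {r2_lin:.4f}")
print(f"exponential: k = {k:.2f}, a = {a:.2f}, r^2 = {r2_exp:.4f}")
```

On this data the linear fit already has a high r², yet the exponential fit is better still, which is why the family should be chosen from the pattern in the data and not from a single r² value.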

Sense-check every prediction. Does it have the right sign? Right order of magnitude? Within a physically possible range? If a population model says 4 million whales next year, something’s wrong.

🧭 Recipe — pick a model and use it

Read the context: rate-per-unit, peak-and-fall, doubling-time, oscillation, asymptote — each cues a family.
Diagnose the data: first differences, second differences, ratios, or scatter-plot shape.
Fit: by hand for clean cases (linear from two points, quadratic from second differences), or by GDC regression for everything else.
Apply: substitute to predict, or set equal to a target and solve.
Critique: state the domain over which the model is valid; flag any extrapolation; note r² if available.

Worked examples

WE 1

Match scenario to model type

Name the most appropriate model family for each:
(a) Daily ferry passenger numbers vary between 200 in winter and 1200 in summer.
(b) £400 is invested at 5% compound interest per year.
(c) The stretch of a spring is measured against the weight hanging from it.
(d) The time to complete a job depends on the number of workers assigned.
(e) A small shop’s profit rises with price up to a peak, then falls.

Match the key words:
(a) annual cycle, fixed max/min ⇒ sinusoidal
(b) constant % per period ⇒ exponential growth
(c) Hooke’s law: stretch ∝ weight ⇒ linear (direct)
(d) more workers → less time ⇒ inverse variation
(e) one peak then decline ⇒ quadratic
Answer: sinusoidal · exponential · linear · inverse · quadratic

WE 2

Identify model by constant ratios

A company’s monthly sales (units) are: t = 0 → 80, t = 1 → 96, t = 2 → 115.2, t = 3 → 138.24, t = 4 → 165.888. (a) Identify the model. (b) Write the equation. (c) Predict S(10).

(a) Test first differences: 16, 19.2, 23.04, 27.65 (not constant).
Test ratios S(t+1) / S(t): 96/80 = 1.2, 115.2/96 = 1.2, 138.24/115.2 = 1.2, 165.888/138.24 = 1.2.
Constant ratio 1.2 ⇒ exponential.
(b) Initial value 80, base 1.2:
S(t) = 80(1.2)ᵗ
(c) S(10) = 80(1.2)¹⁰ = 80 × 6.1917
S(10) ≈ 495 units
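The ratio test and the prediction can be checked with a short Python sketch. The last part shows the “set equal to a target and solve” step from the recipe; the target of 1000 units is a hypothetical value chosen for illustration and is not part of the question.

```python
import math

sales = [80, 96, 115.2, 138.24, 165.888]   # t = 0, 1, 2, 3, 4

# Ratio test: divide each value by the one before it.
ratios = [later / earlier for earlier, later in zip(sales, sales[1:])]
print([round(r, 6) for r in ratios])       # each ratio rounds to 1.2

def S(t):
    """Model from part (b): initial value 80, growth factor 1.2."""
    return 80 * 1.2 ** t

print(round(S(10), 2))                     # 495.34

# Solve 80 * 1.2**t = target with logarithms (target is hypothetical).
target = 1000
t_hit = math.log(target / 80) / math.log(1.2)
print(round(t_hit, 2))
```

Both t = 10 and the solved time lie outside the fitted range 0 ≤ t ≤ 4, so an exam answer should flag them as extrapolations.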

WE 3

Identify model by second differences

Data: x = 0, 1, 2, 3, 4 with y = 5, 7, 11, 17, 25. (a) Determine the model type. (b) Find the equation. (c) Predict y(7).

(a) First differences: 2, 4, 6, 8 (not constant).
Second differences: 2, 2, 2 (constant).
Quadratic: y = ax² + bx + c
(b) y(0) = c = 5
y(1) = a + b + 5 = 7 ⇒ a + b = 2
y(2) = 4a + 2b + 5 = 11 ⇒ 2a + b = 3
Subtract: a = 1, b = 1
y = x² + x + 5
(c) y(7) = 49 + 7 + 5
y(7) = 61
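A quicker route to the same coefficients uses the differences directly. For data at x = 0, 1, 2, … the constant second difference equals 2a, and the first difference y(1) − y(0) equals a + b. The Python sketch below applies that shortcut and checks the result against every data point; it assumes unit spacing starting at x = 0, as in this question.

```python
ys = [5, 7, 11, 17, 25]                      # y at x = 0, 1, 2, 3, 4

d1 = [b - a for a, b in zip(ys, ys[1:])]     # first differences: 2, 4, 6, 8
d2 = [b - a for a, b in zip(d1, d1[1:])]     # second differences: 2, 2, 2

a = d2[0] / 2        # unit spacing: second difference = 2a
b = d1[0] - a        # y(1) - y(0) = a + b
c = ys[0]            # y(0) = c

def y(x):
    return a * x ** 2 + b * x + c

# The model should reproduce every value in the table.
assert all(y(x) == value for x, value in enumerate(ys))

print(a, b, c, y(7))   # 1.0 1.0 5 61.0
```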

WE 4

GDC linear regression

The number of textbooks sold (y) at a school book fair was recorded against hours open (x):
(1, 15), (2, 21), (3, 24), (4, 29), (5, 36). (a) Run a linear regression to find y = mx + c. (b) Predict y after 8 hours.

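(a) GDC: LinReg(ax + b) on the data; the GDC’s a and b are m and c here.
By hand, the same least-squares line comes from the sums:
Σx = 15, Σy = 125, Σxy = 425, Σx² = 55
m = (5·425 − 15·125) / (5·55 − 225) = 250/50 = 5
c = (125 − 5·15) / 5 = 50/5 = 10
y = 5x + 10
(b) y(8) = 5(8) + 10
y(8) = 50 textbooks
r² on the GDC will be close to (but not exactly) 1; here r² ≈ 0.984.
Note that 8 hours lies outside the data range (1 to 5 hours), so this prediction is an extrapolation.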
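The sums-based calculation can be reproduced in Python as a check on the GDC output. This sketch uses the standard least-squares formulas for the gradient and intercept and computes r² as 1 − SS_res / SS_tot; the variable names are chosen for the example.

```python
points = [(1, 15), (2, 21), (3, 24), (4, 29), (5, 36)]
n = len(points)

sum_x = sum(x for x, _ in points)         # 15
sum_y = sum(y for _, y in points)         # 125
sum_xy = sum(x * y for x, y in points)    # 425
sum_x2 = sum(x * x for x, _ in points)    # 55

m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
c = (sum_y - m * sum_x) / n

# Goodness of fit: compare residuals with the spread about the mean.
mean_y = sum_y / n
ss_res = sum((y - (m * x + c)) ** 2 for x, y in points)
ss_tot = sum((y - mean_y) ** 2 for _, y in points)
r2 = 1 - ss_res / ss_tot

print(m, c, round(r2, 4))   # 5.0 10.0 0.9843
print(m * 8 + c)            # 50.0
```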

WE 5

State a sensible domain

The value of a piece of office equipment t years after purchase is modelled by V(t) = 28000 − 2400t (dollars). (a) Find V(0) and V(5). (b) For what values of t does the model give a sensible answer? (c) When does the model fail?

(a) Substitute:
V(0) = 28000 (the purchase price)
V(5) = 28000 − 12000 = 16000
V(0) = $28,000 · V(5) = $16,000
(b) Need V ≥ 0 and t ≥ 0:
28000 − 2400t ≥ 0
t ≤ 28000/2400 ≈ 11.67
Domain: 0 ≤ t ≤ 11.67 yr
(c) Beyond t = 11.67 the model gives a negative value, which is physically meaningless. A piece of equipment can’t have negative value.
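One way to make the domain restriction concrete is to build it into the model, so that it refuses to return impossible values. The Python sketch below does this; the name `value_checked` and the choice to raise an error are illustrative assumptions, not part of the question.

```python
def V(t):
    """Linear depreciation model from the question (dollars)."""
    return 28000 - 2400 * t

t_zero = 28000 / 2400          # time at which the value reaches zero

def value_checked(t):
    """Return V(t) only inside the sensible domain 0 <= t <= t_zero."""
    if not 0 <= t <= t_zero:
        raise ValueError(f"t = {t} is outside the model's domain")
    return V(t)

print(V(0), V(5), round(t_zero, 2))   # 28000 16000 11.67
print(value_checked(11))              # 1600

try:
    value_checked(12)
except ValueError as err:
    print(err)
```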

WE 6

Critique an extrapolation

A biologist fits an exponential model P(t) = 200e^(0.08t) (where t is years since 2010) to fish-population data from 2010 to 2020. She uses it to predict the population in 2060. (a) Compute P(50). (b) Give two reasons the prediction may be unreliable.

(a) P(50) = 200e^(0.08 × 50) = 200e⁴ ≈ 200 × 54.60
P(50) ≈ 10,920 fish
(b) Reasons:
1. Extrapolation 40 years beyond the fitted data; the model is based on only 10 years of growth and cannot be trusted that far out.
2. Real populations face carrying capacity, food limits, predation and disease, so exponential growth can’t continue indefinitely; a logistic model would be more realistic.
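A short Python check of part (a) also shows the scale of the extrapolation, by comparing the 2060 prediction with the model’s value at the end of the fitted data (2020). The comparison is added for illustration and is not asked for in the question.

```python
import math

def P(t):
    """Fitted model: t is years since 2010."""
    return 200 * math.exp(0.08 * t)

last_fitted = P(10)    # 2020, the last year covered by the data
predicted = P(50)      # 2060, forty years past the data

print(round(predicted))                    # 10920
print(round(predicted / last_fitted, 1))   # growth factor since 2020
```

The prediction requires the population to grow many times over after the last observation, with no data to support that the same growth rate continues.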

💡 Top tips

Family before formula: name the model type before you start any algebra or regression.
For data tables, compute differences and ratios. The constant pattern points to the model.
For real data with noise, use the GDC’s regression options and compare r² values.
Always state the domain over which the model is sensible — don’t let it return impossible values.
Critique extrapolations: every model breaks somewhere; flag it.

⚠ Common mistakes

Forcing a linear model onto data that has constant ratios (which is exponential) or constant second differences (quadratic).
Computing ratios when first differences are constant — you found linear, stop there.
Using the shortcuts on unequally spaced data: the difference and ratio tests assume equal gaps in x; for unequal spacing use GDC regression.
Extrapolating far beyond the data without comment: you lose marks if you don’t flag the risk.
Reporting r² as a percentage: it’s a decimal between 0 and 1.

Chapter complete — you now have the full modelling toolkit (linear, quadratic, cubic, exponential, sinusoidal, variation) plus a strategy to choose between them. Next chapter: Geometry & Trigonometry.
