IB Maths AA HL
Topic 4 — Statistics & Probability
Paper 1 & 2
HL only
Probability Density Function
For a continuous random variable, probabilities are areas under a curve called the probability density function (pdf), written f(x). To find P(a ≤ X ≤ b), integrate f(x) from a to b — or for linear pdfs, use rectangle / triangle / trapezoid area formulas. P(X = k) is always 0 for any continuous variable.
📘 What you need to know
- Notation: f(x) is the probability density function (pdf).
- Two validity conditions: f(x) ≥ 0 for all x, AND ∫ f(x) dx over the whole domain = 1.
- Probability formula: P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx — given in the booklet.
- Single-value rule: P(X = k) = 0 for any continuous variable.
- Strict ≡ weak: P(a ≤ X ≤ b) = P(a < X < b) — boundary doesn’t change the area.
- Linear pdfs: use geometry — rectangle (bh), triangle (½bh), or trapezoid (½(a+b)h) — instead of integrating.
- Restricted domain: most pdfs are nonzero only on a finite interval [a, b]; outside, f(x) = 0.
- Unknown constants (k, a, etc.): solve using the validity rule or the probability formula.
What makes a function a valid pdf
A function f(x) qualifies as a probability density function if it satisfies both conditions below. Validity questions almost always test whether you remember the f(x) ≥ 0 condition — many candidates only check the integral.
Condition 1
f(x) ≥ 0 for all x
density can never be negative
Condition 2
∫ f(x) dx = 1 (over full domain)
total area under curve is 1
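The two conditions can also be spot-checked numerically. The sketch below is illustrative rather than an exam method: the helper names `midpoint_integral` and `is_valid_pdf`, the grid size, and the tolerance are all assumptions, and sampling f on a grid is only a spot check of f(x) ≥ 0, not a proof.

```python
def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def is_valid_pdf(f, lo, hi, n=10_000, tol=1e-6):
    """Check both conditions on [lo, hi], assuming f is 0 outside it."""
    width = (hi - lo) / n
    # Condition 1: sample the density on a grid (a spot check, not a proof).
    non_negative = all(f(lo + i * width) >= 0 for i in range(n + 1))
    # Condition 2: total area should be 1, up to numerical error.
    total_area = midpoint_integral(f, lo, hi, n)
    return non_negative and abs(total_area - 1) < tol


def candidate(x):
    return 0.5 * x   # hypothetical density on [0, 2]: passes both conditions


def impostor(x):
    return x - 0.5   # hypothetical: area over [0, 2] is 1, but f(0) < 0


print(is_valid_pdf(candidate, 0, 2), is_valid_pdf(impostor, 0, 2))
```

The second function is the case that catches candidates out: its integral over [0, 2] equals 1, yet it is negative near 0, so it is not a pdf.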
Computing probabilities — integration or geometry
Probability is area under the curve
P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx
For a non-linear pdf you’ll usually integrate. But if f(x) is linear (or piecewise linear), the region is a basic shape and geometric area formulas are quicker:
| Shape | Area | When it appears |
|---|---|---|
| Rectangle | base × height | uniform pdf (constant on its domain) |
| Triangle | ½ × base × height | linear pdf rising from 0 (or falling to 0) |
| Trapezoid | ½ × (sum of the two parallel sides) × width | part of a linear pdf, both ends nonzero (the parallel sides are the heights f(a) and f(b)) |
Strict vs weak doesn’t matter: because P(X = k) = 0 for a continuous variable, swapping ≤ for < doesn’t change the answer. Don’t waste time converting the inequality.
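The three shapes are really one formula: a trapezoid whose end heights are f(a) and f(b). A minimal sketch, assuming the pdf is a single straight line on [a, b]; the function name `linear_pdf_probability` and the two example densities are hypothetical.

```python
def linear_pdf_probability(f, a, b):
    """Area under a pdf that is a single straight line on [a, b].

    Trapezoid formula: average of the two end heights times the width.
    A rectangle (equal heights) and a triangle (one height zero) are special cases.
    """
    return 0.5 * (f(a) + f(b)) * (b - a)


def rising(x):
    return 2 * x   # hypothetical linear pdf on [0, 1]


def flat(x):
    return 0.5     # hypothetical uniform pdf on [0, 2]


triangle = linear_pdf_probability(rising, 0, 1)         # whole domain, f(0) = 0
trapezoid = linear_pdf_probability(rising, 0.25, 0.75)  # both ends nonzero
rectangle = linear_pdf_probability(flat, 0.5, 1.5)      # constant height
print(triangle, trapezoid, rectangle)  # 1.0 0.5 0.5
```

This only works when f has no kink between a and b; for a piecewise-linear pdf, split the interval at each kink and add the pieces.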
Finding unknown constants
Two common patterns:
- Find k in f(x) = k·g(x): use the validity rule ∫ f(x) dx = 1, then solve for k.
- Find a limit a given P(0 ≤ X ≤ a) = p: set up the integral equal to p, then solve for a.
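Both patterns can be checked numerically after solving by hand. The sketch below is one possible approach, not a required method: it uses a midpoint-rule integral and bisection, and the names `integrate`, `normalising_constant`, `solve_upper_limit` and the shape g(x) = x(1 − x) are assumptions chosen for illustration.

```python
def integrate(f, lo, hi, n=2_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def normalising_constant(g, lo, hi):
    """Pattern 1: the k for which k * g integrates to 1 over [lo, hi]."""
    return 1 / integrate(g, lo, hi)


def solve_upper_limit(f, lo, hi, p, tol=1e-9):
    """Pattern 2: find a in [lo, hi] with P(lo <= X <= a) = p by bisection.

    Works because the area from lo to a never decreases as a grows (f >= 0).
    """
    left, right = lo, hi
    while right - left > tol:
        mid = (left + right) / 2
        if integrate(f, lo, mid) < p:
            left = mid
        else:
            right = mid
    return (left + right) / 2


def g(x):
    return x * (1 - x)   # hypothetical shape on [0, 1]; its integral is 1/6


k = normalising_constant(g, 0, 1)    # close to 6


def f(x):
    return k * g(x)


a = solve_upper_limit(f, 0, 1, 0.5)  # close to 0.5, by symmetry of g
print(round(k, 4), round(a, 4))
```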
🧭 Recipe — work with a pdf
- Identify f(x) and the domain on which it’s nonzero.
- If asked to verify validity, check both conditions — f ≥ 0 AND total area = 1.
- Sketch the pdf when possible — symmetries and basic shapes save integration.
- For probabilities, set up ∫ₐᵇ f(x) dx or use a geometric area formula.
- For unknowns (k or limit a), set up an equation and solve.
Worked examples
WE 1: Verify a linear pdf and compute a probability geometrically
The continuous random variable X has pdf f(x) = 0.5x for 0 ≤ x ≤ 2 (and 0 otherwise). (a) Show f is a valid pdf. (b) Find P(0.5 ≤ X ≤ 1.5).
(a) Check both validity conditions
f(x) = 0.5x ≥ 0 on [0, 2] ✓
f(0) = 0, f(2) = 1 → triangle with base 2, height 1
Area = ½ × 2 × 1 = 1 ✓
(b) Region is a trapezoid: f(0.5) = 0.25, f(1.5) = 0.75, base = 1
P = ½(0.25 + 0.75)(1) = 0.5
Cross-check by integration
∫_{0.5}^{1.5} 0.5x dx = [0.25x²]_{0.5}^{1.5}
= 0.25(2.25) − 0.25(0.25) = 0.5625 − 0.0625 = 0.5 ✓
(a) Valid; (b) P(0.5 ≤ X ≤ 1.5) = 0.5
always state BOTH validity conditions explicitly — many candidates skip f ≥ 0
WE 2: Validate a quadratic pdf by integration
The continuous random variable Y has pdf f(y) = 3y² for 0 ≤ y ≤ 1 (and 0 otherwise). (a) Show f is a valid pdf. (b) Find P(0.2 ≤ Y ≤ 0.6).
(a) Validity
f(y) = 3y² ≥ 0 for all y (square is non-negative) ✓
∫₀¹ 3y² dy = [y³]₀¹ = 1 − 0 = 1 ✓
(b) Integrate from 0.2 to 0.6
P(0.2 ≤ Y ≤ 0.6) = [y³]_{0.2}^{0.6}
= 0.6³ − 0.2³ = 0.216 − 0.008 = 0.208
(a) Valid; (b) P = 0.208
geometric tricks don’t apply for non-linear f — go straight to integration
WE 3: Find the constant k that makes f a valid pdf
The continuous random variable X has pdf f(x) = k(4 − x²) for 0 ≤ x ≤ 2 (and 0 otherwise). Find k.
Apply ∫ f(x) dx = 1
∫₀² k(4 − x²) dx = 1
k[4x − x³/3]₀² = 1
k[8 − 8/3 − 0] = 1
k(16/3) = 1
k = 3/16
k = 3/16 (= 0.1875)
also check that k > 0, so that f(x) = k(4 − x²) ≥ 0 on [0, 2] (quick sanity step)
WE 4: Validate a cubic-style pdf and compute a range
The continuous random variable T (in hours) has pdf f(t) = (3/64)t²(4 − t) for 0 ≤ t ≤ 4 (and 0 otherwise). (a) Verify f is a valid pdf. (b) Find P(1 ≤ T ≤ 3).
(a) Validity
On [0, 4]: t² ≥ 0 and (4 − t) ≥ 0 → f(t) ≥ 0 ✓
∫₀⁴ (3/64)(4t² − t³) dt = (3/64)[4t³/3 − t⁴/4]₀⁴
= (3/64)[256/3 − 64] = (3/64)(64/3) = 1 ✓
(b) Integrate from 1 to 3
∫₁³ (3/64)(4t² − t³) dt = (3/64)[4t³/3 − t⁴/4]₁³
At t=3: 4(27)/3 − 81/4 = 36 − 81/4 = 63/4
At t=1: 4/3 − 1/4 = 13/12
Difference: 63/4 − 13/12 = 189/12 − 13/12 = 176/12 = 44/3
P = (3/64)(44/3) = 44/64 = 11/16
(a) Valid; (b) P(1 ≤ T ≤ 3) = 11/16 = 0.6875
keep the constant 3/64 outside the integral — multiply ONCE at the end
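Fraction arithmetic like this is easy to slip on, so an exact check can be reassuring. The sketch below integrates the polynomial (3/64)(4t² − t³) term by term with Python's `fractions.Fraction`; the helper name `integrate_polynomial` and the coefficient-list layout are assumptions made for this illustration.

```python
from fractions import Fraction


def integrate_polynomial(coefficients, lo, hi):
    """Exact integral of sum(c_n * t**n) over [lo, hi].

    coefficients[n] is the coefficient of t**n.
    """
    def antiderivative(t):
        t = Fraction(t)
        return sum(Fraction(c) * t ** (n + 1) / (n + 1)
                   for n, c in enumerate(coefficients))
    return antiderivative(hi) - antiderivative(lo)


# f(t) = (3/64)(4t^2 - t^3): coefficients of t^0, t^1, t^2, t^3
k = Fraction(3, 64)
pdf_coefficients = [0, 0, 4 * k, -k]

total_area = integrate_polynomial(pdf_coefficients, 0, 4)
probability = integrate_polynomial(pdf_coefficients, 1, 3)
print(total_area, probability)  # 1 11/16
```

Because every step uses exact rationals, the output matches the hand-worked values with no rounding error.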
WE 5: Uniform pdf, pure rectangle areas
The continuous random variable X has pdf f(x) = 0.25 for 1 ≤ x ≤ 5 (and 0 otherwise). (a) Verify f is a valid pdf. (b) Find P(2 ≤ X ≤ 4). (c) Find P(X > 3.5).
(a) Validity
f(x) = 0.25 ≥ 0 ✓
Rectangle area = base × height = 4 × 0.25 = 1 ✓
(b) Region is a rectangle: width 2, height 0.25
P(2 ≤ X ≤ 4) = 2 × 0.25 = 0.5
(c) P(X > 3.5) = P(3.5 ≤ X ≤ 5)
Width = 1.5, height = 0.25
P = 1.5 × 0.25 = 0.375
(a) Valid; (b) 0.5; (c) 0.375
uniform pdf: never integrate — multiply width by height
WE 6: Find an unknown limit given a probability
The continuous random variable X has pdf f(x) = (1/8)x for 0 ≤ x ≤ 4 (and 0 otherwise). Find a ∈ [0, 4] such that P(0 ≤ X ≤ a) = 0.36.
Quick validity check
Triangle area = ½ × 4 × 0.5 = 1 ✓; f(x) ≥ 0 on [0, 4] ✓
Set up the integral
P(0 ≤ X ≤ a) = ∫₀^a (1/8)x dx = [x²/16]₀^a
= a²/16
Equate to given probability
a²/16 = 0.36
a² = 5.76
a = ±2.4 → take positive root (a ∈ [0, 4])
a = 2.4
always reject the negative root if it falls outside the pdf’s domain
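The root-rejection step can be made explicit in a short sketch. It assumes the equation a²/16 = p from this example with 0 < p ≤ 1; the function name `limits_for_probability` is hypothetical.

```python
import math


def limits_for_probability(p):
    """Solve a**2 / 16 = p and keep only the roots inside the domain [0, 4]."""
    root = math.sqrt(16 * p)
    candidates = [-root, root]            # both algebraic solutions
    return [a for a in candidates if 0 <= a <= 4]


solutions = limits_for_probability(0.36)
print(solutions)
```

Only the positive root survives the domain filter, which mirrors the final line of the working.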
💡 Top tips
- Sketch the pdf — symmetries and basic shapes (triangle, rectangle, trapezoid) often beat integration.
- For linear pdfs, use geometric area formulas; for curved pdfs, integrate.
- Strict ≡ weak for continuous variables — don’t waste time converting inequalities.
- Always verify both conditions when asked to show f is a valid pdf — f ≥ 0 AND total area = 1.
- Keep constants outside the integral (like 3/64) — multiply at the very end.
⚠ Common mistakes
- Forgetting f(x) ≥ 0 — only verifying the integral = 1 is half a check.
- Assuming P(X = k) ≠ 0 for continuous variables — it’s exactly 0.
- Integrating outside the domain — if f(x) = 0 outside [a, b], your limits should respect that.
- Confusing pdf with cdf: f is the height; ∫ f gives the cumulative area.
- Taking the wrong root when solving for an unknown limit — must lie within the pdf’s domain.
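The "integrating outside the domain" mistake is easy to demonstrate. In the sketch below, the limits are clipped to the domain before the antiderivative is evaluated; the helper names `clip_to_domain` and `probability` are assumptions, and F(x) = 0.25x² is the antiderivative of the linear pdf f(x) = 0.5x on [0, 2].

```python
def clip_to_domain(a, b, lo, hi):
    """Intersect the requested interval [a, b] with the pdf's domain [lo, hi].

    Returns None when the two intervals do not overlap (probability 0).
    """
    left, right = max(a, lo), min(b, hi)
    return (left, right) if left < right else None


def probability(antiderivative, a, b, lo, hi):
    """P(a <= X <= b) for a pdf that is zero outside [lo, hi]."""
    limits = clip_to_domain(a, b, lo, hi)
    if limits is None:
        return 0.0
    left, right = limits
    return antiderivative(right) - antiderivative(left)


def F(x):
    return 0.25 * x * x   # antiderivative of f(x) = 0.5x, valid on [0, 2] only


naive = F(3) - F(1)                     # 2.0: ignores the domain, exceeds 1
correct = probability(F, 1, 3, 0, 2)    # 0.75: upper limit clipped to 2
print(naive, correct)
```

A result above 1 is an immediate sign that the limits ran past the domain.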
Next: Median & Mode of a CRV. The median m splits the area in half: ∫_{−∞}^{m} f(x) dx = ½. The mode is the value of x that maximises f: differentiate f and solve f′(x) = 0, then compare with the endpoints of the domain, since the maximum can occur there.