IB Maths AA SL · Topic 4: Probability Distributions · Paper 1 & 2 · ~10 min read
Discrete Probability Distributions
When you flip a coin, roll a dice, or count something random, the result is a number with a probability attached. A discrete probability distribution is just a list of every possible value and its probability — and it lets you answer any “what’s the chance of…” question.
📘 What you need to know
A discrete random variable (DRV) takes specific values from a set, usually whole numbers from counting something.
A probability distribution shows every value and its probability — usually in a table.
The probabilities must always add up to 1: ∑ P(X = x) = 1.
Capital letters (X, Y) = the random variable. Lower-case letters (x, y) = a particular value.
For a discrete uniform distribution, every value is equally likely: with n possible values, each has probability 1/n.
To find P(X ≤ k), add up the probabilities of all values ≤ k.
A random variable is just a quantity whose value depends on a random event. You don’t know what value it will take until the event happens — that’s the “random” bit.
The word “discrete” means the variable can only take specific separated values — usually whole numbers. You count it, you don’t measure it.
Examples of discrete random variables
The number of times a coin lands on heads in 20 flips → values {0, 1, 2, …, 20}.
The number on a single dice roll → values {1, 2, 3, 4, 5, 6}.
The number of emails a manager gets in an hour → values {0, 1, 2, 3, …}.
The number of times a dice is rolled until landing on 6 → values {1, 2, 3, …}.
Notice that some DRVs have a finite set of values (like dice rolls 1 to 6) and some have infinitely many possible values (like “number of emails”). Both still count as discrete because the values are separated, not continuous.
The notation
Capital letters like X, Y represent the random variable itself (the “thing” we’re measuring).
Lower-case letters like x, y represent specific values it can take.
P(X = x) means “the probability that X takes the value x”.
What is a probability distribution?
A probability distribution describes every value the variable can take, along with the probability of each. It’s usually given in one of three ways:
Method 1 — A table
The most common form. Each column shows a value and its probability:
x        | 1   | 2   | 3   | 4
P(X = x) | 0.1 | 0.3 | 0.4 | 0.2
(Check: 0.1 + 0.3 + 0.4 + 0.2 = 1 ✓)
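The check in brackets can also be automated. The sketch below stores the table above as a Python dictionary and tests the two conditions a valid distribution needs: every probability lies between 0 and 1, and the total is 1. The name `is_valid_distribution` and the tolerance value are illustrative choices, not something from the syllabus.

```python
import math

# The table above: each value of X mapped to its probability.
distribution = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}


def is_valid_distribution(dist, tolerance=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    probabilities = list(dist.values())
    in_range = all(0 <= p <= 1 for p in probabilities)
    # Decimal floats rarely add to exactly 1, so compare within a tolerance.
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tolerance)
    return in_range and sums_to_one


print(is_valid_distribution(distribution))
```

The tolerance is there because decimals such as 0.1 are not stored exactly on a computer, so a direct `== 1` comparison can fail even when the table is correct.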
Method 2 — A function (pdf)
Sometimes the distribution is given as a rule, like:
P(X = x) = kx² for x = −3, −1, 2, 4
To use it, plug each value of x in to find the probabilities, then build a table.
Method 3 — A vertical line graph
Values along the x-axis, probability on the y-axis. Each value gets a single vertical line whose height is its probability.
[Figure: a simple discrete probability distribution shown as a vertical line graph]
The most important rule
All probabilities must add to 1
∑ P(X = x) = 1
This makes sense — one of the listed values must happen. So the total probability is 100% = 1.
📍
Use this rule to find unknown probabilities
If a question gives you a distribution with an unknown k, just write all probabilities in terms of k, set the sum equal to 1, and solve. This is one of the most common exam patterns!
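As a rough illustration of this pattern, the sketch below applies it to the rule from Method 2, P(X = x) = kx² for x = −3, −1, 2, 4. It uses exact fractions so the answer comes out as a fraction rather than a rounded decimal. The helper name `find_k` and its `shape` parameter are made up for this example.

```python
from fractions import Fraction


def find_k(values, shape):
    """Solve k * (sum of shape(x) over all values) = 1 for k, exactly."""
    total = sum(Fraction(shape(x)) for x in values)
    return 1 / total


# Rule from Method 2: P(X = x) = k * x**2 for x = -3, -1, 2, 4.
values = [-3, -1, 2, 4]
k = find_k(values, lambda x: x ** 2)

# Once k is known, the full table follows by substituting each x.
table = {x: k * x ** 2 for x in values}

print(k)
print(table)
```

Here `shape` is just the part of the rule without k (x² in this case), so the same helper would work for any rule of the form k × something.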
The discrete uniform distribution
A discrete uniform distribution is the simplest type — every possible value has the same probability.
Discrete uniform distribution
If X takes n equally likely values, then P(X = x) = 1/n for each of those values.
Examples:
Rolling a fair dice → 6 values, each with probability 1/6.
Picking a random card from a deck of 52 → each card has probability 1/52.
Spinning a fair 8-sided spinner → each face has probability 1/8.
🧠
Memory trick: “Uniform = same shirt, same probability”
“Uniform” means everything looks the same, like a school uniform. Every value gets the same probability. If there are n values, each gets 1/n.
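For readers who like to see the rule in code, here is a minimal sketch that builds a discrete uniform distribution by giving each of n values the probability 1/n. The function name `uniform_distribution` is an assumption made for illustration.

```python
from fractions import Fraction


def uniform_distribution(values):
    """Give each of the n listed values the same probability, 1/n."""
    values = list(values)
    n = len(values)
    return {x: Fraction(1, n) for x in values}


# A fair six-sided dice: values 1 to 6.
die = uniform_distribution(range(1, 7))

print(die[4])             # probability of any single face
print(sum(die.values()))  # total probability
```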
Finding probabilities from a distribution
Single value: P(X = k)
If k is in the table, just read it off.
If k is not a possible value (like P(X = 7) when the values are 1–6), the probability is 0.
Cumulative: P(X ≤ k)
Add up the probabilities of every value that is less than or equal to k:
Cumulative probability
P(X ≤ k) = ∑ P(X = x_i) for all x_i ≤ k
The complement shortcut
For “more than” or “at least” questions, it’s often easier to use the complement:
Complement formulas
P(X > k) = 1 − P(X ≤ k)
P(X ≥ k) = 1 − P(X < k)
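The cumulative rule and the two complement formulas can be sketched as three small functions. The fair dice used as the distribution and the function names are illustrative assumptions; any dictionary of values and probabilities would work the same way.

```python
from fractions import Fraction

# Illustrative distribution: a fair six-sided dice.
die = {x: Fraction(1, 6) for x in range(1, 7)}


def p_at_most(dist, k):
    """P(X <= k): add the probabilities of every possible value <= k."""
    return sum((p for x, p in dist.items() if x <= k), Fraction(0))


def p_more_than(dist, k):
    """P(X > k) via the complement: 1 - P(X <= k)."""
    return 1 - p_at_most(dist, k)


def p_at_least(dist, k):
    """P(X >= k) via the complement: 1 - P(X < k)."""
    p_less_than = sum((p for x, p in dist.items() if x < k), Fraction(0))
    return 1 - p_less_than


print(p_at_most(die, 2))
print(p_more_than(die, 2))
print(p_at_least(die, 3))
```

Note that `p_at_most` only ever adds values that are actually in the distribution, so asking for P(X ≤ 0) on a dice gives 0 rather than an error.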
Translating words into inequalities
Exam questions use everyday language — but the maths uses inequality signs. Match them carefully:
P(X ≤ k)
“at most”, “no greater than”, “no more than”
Includes k itself.
P(X < k)
“fewer than”, “less than”, “under”
Does NOT include k.
P(X ≥ k)
“at least”, “no fewer than”, “no less than”
Includes k itself.
P(X > k)
“more than”, “greater than”, “above”
Does NOT include k.
🤔 The single biggest trap
“At least 3” means 3 or more — so X ≥ 3 (includes 3). “More than 3” means strictly bigger than 3 — so X > 3 (does NOT include 3). One little word changes the answer!
If you’re ever unsure, list the values explicitly. “At least 3” for a dice means {3, 4, 5, 6}. “More than 3” means {4, 5, 6}. Listing kills any ambiguity.
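The "list the values" habit is easy to mimic in code. The sketch below maps four of the phrases above to their comparison and lists which dice values satisfy each one. The `PHRASES` dictionary and the function name `matching_values` are hypothetical names chosen for this example.

```python
import operator

# Each phrase mapped to the comparison it stands for.
PHRASES = {
    "at most": operator.le,     # X <= k
    "fewer than": operator.lt,  # X < k
    "at least": operator.ge,    # X >= k
    "more than": operator.gt,   # X > k
}


def matching_values(possible_values, phrase, k):
    """List the possible values of X that satisfy e.g. 'at least 3'."""
    compare = PHRASES[phrase]
    return [x for x in possible_values if compare(x, k)]


die_faces = range(1, 7)
print(matching_values(die_faces, "at least", 3))   # [3, 4, 5, 6]
print(matching_values(die_faces, "more than", 3))  # [4, 5, 6]
```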
Worked examples
WE 1
Find k and a cumulative probability
The probability distribution of X is given by:
P(X = x) = kx² for x = −3, −1, 2, 4 (0 otherwise)
(a) Show that k = 1/30. (b) Calculate P(X ≤ 3).
First build a table by plugging each x into kx². Then use ∑P = 1.
Part (a): find k
Substitute each x:
x = −3: k(−3)² = 9k
x = −1: k(−1)² = k
x = 2: k(2)² = 4k
x = 4: k(4)² = 16k
Sum to 1:
9k + k + 4k + 16k = 30k = 1
k = 1/30 ✓
Part (b): P(X ≤ 3)
Possible values ≤ 3: −3, −1, 2.
Their probabilities:
P(X = −3) = 9/30 = 3/10
P(X = −1) = 1/30
P(X = 2) = 4/30 = 2/15
Add:
9/30 + 1/30 + 4/30 = 14/30 = 7/15
Answer: P(X ≤ 3) = 7/15
Always build the table first when given a function: it is much easier to spot which values to add.
WE 2
Find an unknown probability using ∑P = 1
The discrete random variable X has the distribution shown:
x        | 1   | 2   | 3 | 4
P(X = x) | 0.2 | 0.3 | p | 0.15
Find the value of p.
All probabilities must sum to 1.
Set up:
0.2 + 0.3 + p + 0.15 = 1
0.65 + p = 1
p = 0.35
Answer: p = 0.35
Classic exam trick: the sum is 1, so solve for the missing value.
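The same calculation can be checked in a few lines. This is only a sketch: it uses Python's `Fraction` type so the decimals are handled exactly, and the variable names are illustrative.

```python
from fractions import Fraction

# Known probabilities from the WE 2 table; the entry for x = 3 is the unknown p.
known = {1: Fraction("0.2"), 2: Fraction("0.3"), 4: Fraction("0.15")}

# The probabilities sum to 1, so p is whatever is left over.
p = 1 - sum(known.values())

print(p, float(p))
```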
WE 3
Calculate inequality probabilities
Using the distribution from WE 2 (with p = 0.35), find:
(a) P(X ≤ 3) (b) P(X > 2) (c) P(X ≥ 2)
Match the inequality to the right values, then add their probabilities.
Part (a): at most 3
Values: 1, 2, 3 (includes 3).
0.2 + 0.3 + 0.35 = 0.85
Answer: P(X ≤ 3) = 0.85
Part (b): more than 2
Values: 3, 4 (NOT 2).
0.35 + 0.15 = 0.5
Answer: P(X > 2) = 0.5
Part (c): at least 2
Values: 2, 3, 4 (includes 2).
0.3 + 0.35 + 0.15 = 0.8
Or use the complement: 1 − P(X < 2) = 1 − 0.2 = 0.8 ✓
Answer: P(X ≥ 2) = 0.8
“At least” includes the value, “more than” doesn’t!
WE 4
Find k from a linear function
The probability distribution of Y is given by P(Y = y) = k(y + 1) for y = 0, 1, 2, 3. Find k.
Build the table, then use ∑P = 1.
Plug each y:
y = 0: k(0 + 1) = k
y = 1: k(1 + 1) = 2k
y = 2: k(2 + 1) = 3k
y = 3: k(3 + 1) = 4k
Sum = 1:
k + 2k + 3k + 4k = 10k = 1
Answer: k = 1/10 = 0.1
Building the table first makes the algebra easy.
WE 5
A discrete uniform distribution
A fair 8-sided spinner is numbered 1 to 8. Let X be the number it lands on.
(a) Write P(X = x) for any value. (b) Find P(X ≥ 6). (c) Find P(X is odd).
Discrete uniform → each of the 8 values has probability 1/8.
Part (a)
P(X = x) = 1/8 for x = 1, 2, …, 8
Part (b): at least 6
Values: 6, 7, 8 (3 values).
3 × 1/8 = 3/8
Answer: P(X ≥ 6) = 3/8
Part (c): odd values
Odd values: 1, 3, 5, 7 (4 values).
4 × 1/8 = 1/2
Answer: P(X is odd) = 1/2
For uniform distributions, just count the favourable values and divide by the total.
💡 Top tips
Always start with a table. Even if the distribution is given as a function, write out a table first — it makes finding probabilities easier.
Check probabilities sum to 1. Always. If they don’t, something’s wrong.
Use ∑ P(X = x) = 1 to find any unknown value (k, p, etc).
Read inequality words carefully. “At least” (≥) includes the value; “more than” (>) does not.
Use the complement for “greater than” or “at least”: P(X > k) = 1 − P(X ≤ k).
If a value isn’t in the distribution, P(X = that) = 0.
For uniform distributions, just count favourable values and divide by total.
Always verify with a sanity check: do all your probabilities add to 1?
⚠ Common mistakes
Forgetting to check ∑P = 1. Always do this — it catches arithmetic errors.
Confusing “at least” with “more than”. ≥ includes the value; > does not. One word can change the answer.
Treating non-listed values as having a probability. If x = 7 isn’t in the distribution, P(X = 7) = 0.
Mixing up upper and lower case. X is the random variable; x is a specific value. They’re related but not the same.
Skipping the table when given a function. Plug values in first, then calculate.
Adding extra values when computing P(X ≤ k). Only include values that are actually possible AND ≤ k.
Forgetting to include k in P(X ≤ k). The ≤ sign means it’s included.
Using a continuous-style formula on discrete data. Always sum, never integrate (that’s HL territory).
Now you can find any probability from a discrete distribution. The next note covers expected values — the long-run “average” outcome you’d expect from a random variable. It’s how you decide whether a game is fair, predict winnings, and more.