IB Maths AA SL Topic 4: Statistics Toolkit, Paper 1 & 2, ~9 min read

Measures of Central Tendency

Mean, median, and mode are three different ways of asking the same question: “where’s the middle of this data?” They each answer it slightly differently, and knowing which one to pick is half the skill.

📘 What you need to know

What are they all for?

Imagine someone hands you a list of test scores, salaries, or shoe sizes, and asks: “what’s a typical value?” That single question has three different answers depending on which kind of “typical” you mean.

MEAN

x̄ = Σx ÷ n
The “fair share”. Add everything up and divide by how many.

MEDIAN

middle of sorted list
The “middle one”. Sort the data, then pick the value in the middle.

MODE

most frequent
The “popular kid”. Whichever value shows up most often.
In statistics class people just say “the average” and mean the mean. In real exams, the IB always asks for one measure specifically, never just “the average”. Read the question carefully.

Mode: the most popular value

The mode is the value that occurs most often in the data set. To find it, just count which value appears the most times.

Mode
Mode = the value with the highest frequency
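
If you like checking your counting with a bit of code, the idea can be sketched in Python. This is only an illustration and not something the exam requires; the function name `find_modes` is made up for this example, and it assumes the list is not empty.

```python
from collections import Counter

def find_modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)          # tally how often each value appears
    highest = max(counts.values())    # the largest frequency in the tally
    return [v for v, c in counts.items() if c == highest]

print(find_modes([43, 29, 70, 51, 64, 43]))  # [43]
```

The function returns a list because two or more values can tie for the highest frequency.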

Special cases: when the mode gets weird

The mode is the only “average” that works for words

If your data is qualitative (like favourite colours, blood types, or eye colours), you can’t add the values up or sort them by size. The mode is the only measure that makes sense.

Median: the middle value

The median is the value sitting right in the middle of the data after you’ve sorted it from smallest to largest. It’s the “halfway point”: half the values are below it, half are above.

How to find the median

  1. Sort the data in order from smallest to largest.
  2. Count how many values you have and call this n.
  3. If n is odd, the median is the single middle value.
  4. If n is even, the median is the average of the two middle values.
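
The four steps can be sketched in Python as a rough check on hand calculations. The function name and the sample lists are assumptions chosen for illustration; Python’s standard library also offers `statistics.median`, which does the same job.

```python
def median(values):
    """Median by the four steps: sort, count, then pick or average the middle."""
    ordered = sorted(values)    # step 1: sort from smallest to largest
    n = len(ordered)            # step 2: count the values
    mid = n // 2
    if n % 2 == 1:              # step 3: odd n, take the single middle value
        return ordered[mid]
    # step 4: even n, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([12, 3, 22, 7, 18, 9, 15]))      # 12
print(median([12, 3, 22, 7, 18, 9, 15, 25]))  # 13.5
```

Notice that the lists are deliberately unsorted: the function sorts a copy first, so the order of the input does not matter.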

Visual example โ€” odd n

3 · 7 · 9 · 12 · 15 · 18 · 22

7 values → the middle one is the 4th. Median = 12.

Visual example: even n

3 · 7 · 9 · 12 · 15 · 18 · 22 · 25

8 values → average of the 4th and 5th. Median = (12 + 15) ÷ 2 = 13.5.

🤔 The (n + 1)/2 trick: finding the position

The median sits at position (n + 1) ÷ 2 in your sorted list.

Example: n = 7 → position (7 + 1)/2 = 4, so the 4th value is the median.

Example: n = 8 → position (8 + 1)/2 = 4.5, meaning “halfway between the 4th and 5th values”, so average them.
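
Here is a small Python sketch of the position trick. It only reports which position (or pair of positions) in the sorted list to look at; the function name `median_positions` is hypothetical, and n is assumed to be a positive whole number.

```python
def median_positions(n):
    """Return the 1-based position(s) in the sorted list that give the median."""
    position = (n + 1) / 2
    if position.is_integer():    # odd n: a single middle position
        return [int(position)]
    lower = int(position)        # even n: the position ends in .5,
    return [lower, lower + 1]    # so use the values on either side

print(median_positions(7))  # [4]
print(median_positions(8))  # [4, 5]
```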

The single biggest mistake students make with the median? Forgetting to sort the data first. The median is the middle of the sorted list, not the middle of the original list.

Mean: the fair share

The mean is what most people call “the average”. It’s the total of all the values, divided by how many values you have.

Mean
x̄ = (1/n) Σ xᵢ = (sum of all values) ÷ (number of values)

The Σ (capital sigma) symbol just means “add them all up”. So Σ xᵢ is shorthand for x₁ + x₂ + x₃ + … + xₙ.
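
In code, Σ is just a sum. The following Python sketch mirrors the formula line by line; the function name is an assumption for this example, and the list must contain at least one value.

```python
def mean(values):
    """Mean = (sum of all values) / (number of values)."""
    total = sum(values)    # the sigma part: add them all up
    n = len(values)        # n: how many values there are
    return total / n

print(mean([43, 29, 70, 51, 64, 43]))  # 50.0
```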

Symbols you’ll see

Memory trick: “Mean = Mountain divided by Mob”

Pile up all your numbers (the mountain), then divide by how many people are in the mob. Total ÷ count = mean.

Which one should I use?

Each measure has its own strengths and weaknesses. Here’s a quick comparison so you can pick the right tool:

|                        | Mean        | Median     | Mode      |
| Uses every value?      | Yes         | No         | No        |
| Affected by outliers?  | Yes (badly) | Hardly     | No        |
| Works on words/labels? | No          | No         | Yes       |
| Always exists?         | Yes         | Yes        | Sometimes |
| Easy to calculate?     | Yes         | Sort first | Yes       |

Outliers wreck the mean, but barely touch the median

Imagine a small company where most workers earn $40k a year, but the CEO earns $2 million. The mean salary will look huge, but the median will sit comfortably at $40k. That’s why news articles about “average house prices” often quote the median: it gives a fairer picture of the typical case.
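
To put numbers on that story, here is a Python sketch. The head count is an assumption made purely for illustration (nine workers on $40k plus one CEO on $2 million); salaries are in thousands of dollars.

```python
from statistics import median

# Assumed for illustration: nine workers on 40 (thousand) and one CEO on 2000.
salaries = [40] * 9 + [2000]

mean_salary = sum(salaries) / len(salaries)   # total divided by count
median_salary = median(salaries)              # middle of the sorted list

print(mean_salary)    # 236.0
print(median_salary)  # 40.0
```

With these made-up numbers the mean is $236k, a salary nobody at the company actually earns, while the median stays at $40k.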

Worked examples

WE 1

Find the mode, median and mean

Find the mode, median and mean for the data set below.

43,   29,   70,   51,   64,   43

Data: 43, 29, 70, 51, 64, 43

Mode
Most common value: 43 appears twice, the others once.
Mode = 43

Median
Sort in order: 29, 43, 43, 51, 64, 70
n = 6 (even) → average of the 3rd and 4th values: (43 + 51) ÷ 2 = 47
Median = 47

Mean
Add up all values: Σx = 43 + 29 + 70 + 51 + 64 + 43 = 300
Divide by n = 6: 300 ÷ 6 = 50
Mean = 50

Note: always sort the data before finding the median!

WE 2

All three averages โ€” odd number of values

The number of goals scored by a football team in their last 7 matches is given below. Find the mode, median and mean.

2,   0,   1,   3,   2,   4,   2

Data: 2, 0, 1, 3, 2, 4, 2

Mode
2 appears 3 times, more often than any other value.
Mode = 2

Median
Sort: 0, 1, 2, 2, 2, 3, 4
n = 7 (odd) → middle (4th) value.
Median = 2

Mean
Σx = 0 + 1 + 2 + 2 + 2 + 3 + 4 = 14
14 ÷ 7 = 2
Mean = 2

Note: all three are equal here. That fits with this data being symmetrical about 2.

WE 3

The effect of an outlier on the mean and median

The salaries (in thousands of dollars) of 5 employees at a small company are: 35, 38, 40, 42, 45. The owner, who earns $250,000, is added to the data set.

(a) Find the mean and median before adding the owner.   (b) Find the mean and median after.   (c) Comment.

Salaries are in $1000s. The owner adds an extreme value (an outlier).

Part (a): before
Sort: 35, 38, 40, 42, 45
n = 5 → middle (3rd) value: Median = 40
Mean: (35 + 38 + 40 + 42 + 45) ÷ 5 = 200 ÷ 5 = 40
Median = $40k,   Mean = $40k

Part (b): after
Sort: 35, 38, 40, 42, 45, 250
n = 6 → average of the 3rd and 4th values: (40 + 42) ÷ 2 = 41
Mean: (200 + 250) ÷ 6 = 450 ÷ 6 = 75
Median = $41k,   Mean = $75k

Part (c)
The median moved by only $1k: it barely budged. The mean jumped from $40k to $75k: it nearly doubled!
The mean is heavily affected by outliers; the median is not.

Note: this is exactly why news articles use median house prices, not mean.

WE 4

Find a missing value given the mean

The mean of the five values 8, 12, x, 15, 20 is 14. Find the value of x.

Mean = 14,   n = 5

Work backwards from the mean formula: total = mean × n.
Use mean × n = total: Σx = 14 × 5 = 70
Sum the known values: 8 + 12 + 15 + 20 = 55
Find x: x = 70 − 55 = 15
x = 15

Note: “reverse the mean formula” is a classic IB question type.
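
The same “work backwards” idea can be sketched in Python. The function name `missing_value` and its parameter names are made up for this example; it assumes exactly one value is unknown.

```python
def missing_value(known_values, target_mean, n):
    """Find the one unknown value so that n values have the given mean."""
    required_total = target_mean * n             # total = mean × n
    return required_total - sum(known_values)    # what is left over must be x

x = missing_value([8, 12, 15, 20], target_mean=14, n=5)
print(x)  # 15
```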
WE 5

Find the modal category for qualitative data

20 students were asked their favourite ice cream flavour. The results were:

Vanilla: 5,   Chocolate: 8,   Strawberry: 4,   Mint: 3

(a) State the modal flavour.   (b) Explain why the mean and median can’t be calculated.

Flavours are words, so this is qualitative data. Only the mode works here.

Part (a)
Highest frequency: Chocolate with 8 students.
Modal flavour = Chocolate

Part (b)
You can’t add flavour names or put them in order of size: “vanilla + chocolate” makes no mathematical sense.
Mean and median only work with numbers.

Note: the mode is the only “average” for qualitative data.
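
For a frequency table like this one, picking the modal category is a one-liner in Python. This is only a sketch; the dictionary name `favourites` is an assumption for this example, and if two flavours tied, `max` would report only the first one it met.

```python
favourites = {"Vanilla": 5, "Chocolate": 8, "Strawberry": 4, "Mint": 3}

# The modal category is the label with the largest frequency.
modal_flavour = max(favourites, key=favourites.get)
print(modal_flavour)  # Chocolate
```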

💡 Top tips

⚠ Common mistakes

Now you can describe where the centre of any data set is. The next note covers measures of dispersion: how spread out the data is around that centre. The two together give you the full picture.
