Choosing appropriate measures of center based on data distribution
The mean and median are both measures of center, but they tell different stories about data. Choosing the wrong one can be misleading -- and in real life, this matters more than you might think.
The mean (arithmetic average) is the sum of all values divided by the number of values: mean = (sum of all values) / (number of values).
The mean uses every data point in its calculation. This is a strength when data is symmetric, but a weakness when extreme values (outliers) are present.
The median is the middle value when data is arranged in order. For an even number of values, it is the average of the two middle values.
The median is resistant to outliers -- extreme values barely affect it.
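The two definitions above can be written out as a short Python sketch. The function names and the sample numbers are made up for illustration; the point is only to show each step, including the even-count rule for the median.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data.

    With an even number of values, return the average of the two middle values.
    """
    ordered = sorted(values)          # the data must be in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two


# Hypothetical sample data, chosen only to exercise both cases.
odd_count = [4, 8, 6, 2, 10]
even_count = [3, 5, 7, 9]

print(mean(odd_count), median(odd_count))
print(mean(even_count), median(even_count))
```

Notice that the mean touches every value through the sum, while the median only looks at the position of values after sorting.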
Five friends earn these hourly wages: $12, $13, $14, $15, $16.
Mean: (12 + 13 + 14 + 15 + 16) / 5 = 70 / 5 = $14
Median: The middle value is $14.
Both agree. Now suppose one friend gets a raise to $50:
Data: $12, $13, $14, $15, $50.
Mean: (12 + 13 + 14 + 15 + 50) / 5 = 104 / 5 = $20.80
Median: The middle value is still $14.
The mean jumped by $6.80 because of one outlier. The median did not move at all. If you want to describe what a "typical" friend earns, the median is more honest here.
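To check the wage example by computer, here is a small sketch using Python's built-in statistics module. The variable names are just labels chosen for this illustration.

```python
from statistics import mean, median

before = [12, 13, 14, 15, 16]   # hourly wages before the raise
after = [12, 13, 14, 15, 50]    # one friend now earns $50

# How far did each measure of center move?
mean_shift = round(mean(after) - mean(before), 2)
median_shift = median(after) - median(before)

print(f"mean shift: ${mean_shift:.2f}")
print(f"median shift: ${median_shift:.2f}")
```

The mean shift comes out to 6.8 and the median shift to 0, matching the hand calculation.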
When data is skewed (pulled in one direction), the mean gets dragged toward the tail while the median stays near the bulk of the data.
A neighborhood has these home prices (in thousands): $180, $195, $200, $210, $220, $230, $1,500.
Mean: (180 + 195 + 200 + 210 + 220 + 230 + 1500) / 7 = 2735 / 7 ≈ 390.714 thousand, or about $390,714
Median: The middle (4th of 7) value in order is 210 thousand, or $210,000
The mean suggests homes cost nearly $400K, but 6 out of 7 homes cost $230K or less. The one mansion skews the mean. News reports usually quote the median home price for exactly this reason.
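One way to see how poorly the mean describes this neighborhood is to count how many homes fall on each side of it. The sketch below does that with the same seven prices; the variable names are chosen for this illustration.

```python
from statistics import mean, median

prices = [180, 195, 200, 210, 220, 230, 1500]   # in thousands of dollars

mean_price = mean(prices)       # about 390.7
median_price = median(prices)   # 210

# Count homes on each side of the two measures of center.
below_mean = sum(1 for p in prices if p < mean_price)
below_median = sum(1 for p in prices if p < median_price)
above_median = sum(1 for p in prices if p > median_price)

print(f"{below_mean} of {len(prices)} homes cost less than the mean")
print(f"{below_median} below and {above_median} above the median")
```

Six of the seven homes sit below the mean, while the median splits the neighborhood evenly, three below and three above.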
A teacher wants to calculate final grades from five equally weighted tests. Scores: 78, 82, 85, 90, 80.
Here the mean is appropriate: (78 + 82 + 85 + 90 + 80) / 5 = 83. Every test should contribute equally to the final grade -- no score is an "outlier" to ignore.
Saying "average income in the U.S. is $X" using the mean is misleading. A small number of very high earners pull the mean up, so the mean U.S. income is significantly higher than the median. The median income is a much better representation of what a typical person earns.
Ask: "Could there be extreme values or is the data lopsided?" If yes, use the median. If the data is roughly symmetric with no wild outliers, the mean works well. When in doubt, report both.
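The question above can be turned into a rough computer check: if the mean and median are far apart, something is probably pulling the mean. The sketch below is only one possible rule of thumb, not a standard test. The function name suggest_center and the 10% cutoff are assumptions made up for this example, and it assumes positive data such as wages, prices, or scores.

```python
from statistics import mean, median


def suggest_center(values, threshold=0.10):
    """Report both measures and a rough suggestion.

    The 10% threshold is a hypothetical cutoff for this sketch,
    not an established rule. Assumes the median is positive.
    """
    m = mean(values)
    med = median(values)
    if med <= 0:
        raise ValueError("this sketch assumes positive data")
    gap = abs(m - med) / med          # gap as a fraction of the median
    suggestion = "median" if gap > threshold else "mean"
    return {"mean": m, "median": med, "suggestion": suggestion}


test_scores = [78, 82, 85, 90, 80]
home_prices = [180, 195, 200, 210, 220, 230, 1500]   # in thousands

print(suggest_center(test_scores))
print(suggest_center(home_prices))
```

For the test scores the two measures are within about 1% of each other, so the sketch suggests the mean. For the home prices the mean is far above the median, so it suggests the median. Either way the function returns both numbers, in the spirit of "when in doubt, report both."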
1. Find the mean and median: 3, 5, 7, 9, 11.
Mean = (3 + 5 + 7 + 9 + 11) / 5 = 35 / 5 = 7. Median = 7 (middle value). They are equal because the data is symmetric.
2. Find the mean and median: 3, 5, 7, 9, 100.
Mean = (3 + 5 + 7 + 9 + 100) / 5 = 124 / 5 = 24.8. Median = 7. The outlier (100) drags the mean far above the typical values.
3. A company reports its employees' "average salary" as $95,000. But most employees earn between $40,000 and $60,000, while the CEO earns $5,000,000. Which measure did the company likely use, and which would be more informative?
The company likely used the mean, which is inflated by the CEO's salary. The median would be more informative since it better represents what a typical employee earns.
4. A student scores 88, 91, 85, 90, 87 on five quizzes. Should the teacher use mean or median to compute the grade? Why?
Use the mean. The scores are close together with no outliers, and every quiz should count equally toward the grade. Mean = (88 + 91 + 85 + 90 + 87) / 5 = 441 / 5 = 88.2.
5. Data set A: 10, 20, 30, 40, 50. Data set B: 10, 20, 30, 40, 500. For each, state mean and median, and which measure of center you would report.
Set A: Mean = 30, Median = 30. Either works (symmetric data). Set B: Mean = 120, Median = 30. Report the median -- the value 500 is an outlier that makes the mean unrepresentative.