The median is the middle point of an ordered data set. For example, the set (2,4,7,9,10) has a median of 7. Grouped data is clumped into categories, with the exact detail of each data point lost. Therefore, the exact median can't be known from grouped data alone. However, if you know the number of data points in each interval, you can tell which is the "centre interval," that is, which one contains the data point that is the median. You can further refine the estimate of the median point by using a formula, based on the assumption that the middle interval's data points are evenly distributed.
Group your data points into intervals, if they aren't already. Determine which interval must contain the median data point.
For instructional purposes, consider the data set (1,2,4,5,6,7,7,7,9). The median here is 6. You might group the set into intervals of width 4, for instance. Its frequency distribution would then be:
1-4: 3
5-8: 5
9-12: 1
In the ungrouped data, the median is clearly in the 5-8 interval. You can tell that even without seeing the original data set: the median is the fifth of the nine points, the first interval holds only three points, and the first two intervals together hold eight.
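The search for the middle interval can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `find_median_interval` and the variable names are assumptions made for this example, and it simply walks the running total of the counts until that total reaches half the number of data points.

```python
from itertools import accumulate


def find_median_interval(frequencies):
    """Return the index of the interval that contains the median data point.

    Hypothetical helper for illustration; `frequencies` holds the count of
    data points in each interval, ordered from lowest to highest.
    """
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must contain at least one data point")
    half = total / 2
    # accumulate() yields the running total of points up to each interval.
    for index, cumulative in enumerate(accumulate(frequencies)):
        if cumulative >= half:
            return index


# Frequency distribution from the example: 1-4: 3, 5-8: 5, 9-12: 1
intervals = ["1-4", "5-8", "9-12"]
frequencies = [3, 5, 1]
print(intervals[find_median_interval(frequencies)])  # 5-8
```

Half of the nine points is 4.5. The running totals are 3, 8 and 9, and the first one to reach 4.5 belongs to the 5-8 interval.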
Subtract the number of data points below the middle interval from half the total number of data points.
For the example above, this equals 9/2 - 3 = 1.5. This estimates how many data points into the middle interval the median should lie.
Divide by the number of points in the middle interval.
Continuing with the example, 1.5 / 5 = 0.3. This gives a proportion for how far into the middle interval the median is.
Multiply by the width of the middle interval.
Continuing with the example, 0.3 x 4 = 1.2. This converts the proportion of the interval into an actual increment in the data's units.
Add the above result to the boundary value between the middle interval and the interval below.
The cut-off between the middle interval (5-8) and the interval below (1-4) is 4.5, midway between 4 and 5. This gives you 4.5 + 1.2 = 5.7, which rounds to 6, the true median of the original data set.
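The steps above can be collected into one small Python function. This is a minimal sketch, not a definitive implementation: the function name `grouped_median` and its parameter names are assumptions chosen for this example, and it relies on the same assumption as the steps, namely that the data points are spread evenly across the middle interval.

```python
def grouped_median(lower_boundary, width, freq_in_interval, cum_below, total):
    """Estimate the median from grouped data by linear interpolation.

    All names here are hypothetical, chosen for this illustration:
      lower_boundary   - cut-off between the middle interval and the one below
      width            - width of the middle interval
      freq_in_interval - number of data points in the middle interval
      cum_below        - number of data points below the middle interval
      total            - total number of data points
    """
    if freq_in_interval <= 0:
        raise ValueError("the middle interval must contain data points")
    # How many data points into the middle interval the median lies.
    points_into_interval = total / 2 - cum_below
    # The same distance as a proportion of the interval's data points.
    proportion = points_into_interval / freq_in_interval
    # Convert the proportion into an increment in the data's units.
    increment = proportion * width
    return lower_boundary + increment


# Worked example: intervals 1-4, 5-8, 9-12 with counts 3, 5, 1.
estimate = grouped_median(lower_boundary=4.5, width=4,
                          freq_in_interval=5, cum_below=3, total=9)
print(round(estimate, 1))  # 5.7
```

The printed value matches the hand calculation of 4.5 + 1.2 = 5.7.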
Tips and warnings
- Effectively, the above calculation is the same as using the formula L + ((n/2 - c) / f) x w, where L is the boundary value between the middle interval and the next-lower interval, n is the total number of data points, c is the cumulative number of points below the middle interval, f is the number of data points in the middle interval, and w is its width.