In mathematical terms, a "mean" is an average: a single number chosen to represent a whole data set. For instance, a meteorologist could tell you that the mean temperature for January 22 in Chicago is 25 degrees F based on past data. This number cannot predict the exact temperature for next January 22 in Chicago, but it tells you enough to know that you should pack a jacket if you are going to Chicago on that date. Two commonly used means are the arithmetic mean and the geometric mean. Knowing which one to use for your data requires understanding how they differ.
Formulas for Calculation
The most obvious difference between the arithmetic mean and the geometric mean for a data set is how they are calculated. The arithmetic mean is calculated by adding up all the numbers in a data set and dividing the result by the total number of data points.
Example: Arithmetic mean of 11, 13, 17 and 1,000 = (11 + 13 + 17 + 1,000) / 4 = 260.25
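The geometric mean of a data set is calculated by multiplying the numbers in the data set and taking the nth root of the result, where "n" is the total number of data points in the set. Because it relies on a product and a root, the geometric mean is only meaningful when every number in the set is positive.
Example: Geometric mean of 11, 13, 17 and 1,000 = 4th root of (11 x 13 x 17 x 1,000) = 4th root of 2,431,000 = approximately 39.5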
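The two calculations above can be reproduced in a few lines of Python. This is only a minimal sketch: the function names arithmetic_mean and geometric_mean are illustrative choices, not part of the original article, and the positive-values check is an added assumption about how invalid input should be handled.

```python
import math

def arithmetic_mean(values):
    # Add up all the numbers, then divide by how many there are.
    return sum(values) / len(values)

def geometric_mean(values):
    # Multiply all the numbers, then take the nth root of the product.
    # Assumption: reject zero or negative values, where the result is not meaningful.
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.prod(values) ** (1 / len(values))

data = [11, 13, 17, 1000]
print(arithmetic_mean(data))           # 260.25
print(round(geometric_mean(data), 1))  # 39.5
```

For long data sets the product can become very large, so library routines usually average the logarithms instead of multiplying directly; the direct form is used here because it mirrors the formula in the text.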
The Effect of Outliers
When you compare the results of the arithmetic mean and geometric mean calculations, you notice that the effect of outliers is greatly dampened in the geometric mean. In the data set of 11, 13, 17 and 1,000, the number 1,000 is called an "outlier" because its value is much higher than all the others. When the arithmetic mean is calculated, the result is 260.25. No number in the data set is even close to 260.25, so the arithmetic mean is not representative in this case: the single outlier dominates the result. The geometric mean, at about 39.5, is still pulled above the three smaller values, but it sits far closer to where most of the data lies.
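One way to see the dampening is to compute each mean with and without the outlier and compare how far it moves. The sketch below uses fmean and geometric_mean from Python's standard statistics module (available in Python 3.8 and later); the variable names are illustrative only.

```python
from statistics import fmean, geometric_mean

cluster = [11, 13, 17]           # the three "typical" values
with_outlier = cluster + [1000]  # same data plus the outlier

am_before, am_after = fmean(cluster), fmean(with_outlier)
gm_before, gm_after = geometric_mean(cluster), geometric_mean(with_outlier)

# How many times larger each mean becomes once the outlier is included.
am_shift = am_after / am_before
gm_shift = gm_after / gm_before

print(f"arithmetic mean: {am_before:.2f} -> {am_after:.2f} ({am_shift:.1f}x)")
print(f"geometric mean:  {gm_before:.2f} -> {gm_after:.2f} ({gm_shift:.1f}x)")
```

Adding 1,000 multiplies the arithmetic mean roughly 19-fold, while the geometric mean grows only about 3-fold.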
Statisticians use arithmetic means to represent data with no significant outliers. This type of mean is good for representing average temperatures, because all the temperatures for January 22 in Chicago will be between -50 and 50 degrees F. A temperature of 10,000 degrees F is just not going to happen. Things like batting averages and average race car speeds are also represented well using arithmetic means.
Geometric means are used in cases where the data points differ by orders of magnitude (multiples of 10) and are more naturally compared on a logarithmic scale. Biologists use geometric means to describe the sizes of bacterial populations, which can be 20 organisms one day and 20,000 the next. Economists can use geometric means to describe income distributions. You and most of your neighbors might make around $65,000 per year, but what if the guy up on the hill makes $65 million per year? The arithmetic mean of the income in your neighborhood would be misleading here, so a geometric mean would be more suitable.
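The link to logarithms can be made concrete: the geometric mean is the same as averaging the logarithms of the values and then converting back. A minimal sketch using the bacterial counts from the text (the variable names are illustrative):

```python
import math

counts = [20, 20_000]  # organisms on day one and day two, from the example above

# Arithmetic mean: dominated by the larger count.
arithmetic = sum(counts) / len(counts)

# Geometric mean: average the base-10 logarithms, then undo the logarithm.
mean_log10 = sum(math.log10(c) for c in counts) / len(counts)
geometric = 10 ** mean_log10

print(arithmetic)           # 10010.0
print(round(geometric, 1))  # 632.5
```

On a logarithmic scale, about 632 lies exactly halfway between 20 and 20,000, which is why the geometric mean is the natural "middle" for data that grows by multiples.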