When scientists, economists or statisticians make predictions based on theory and then gather real data, they need a way to measure the variation between predicted and measured values. They usually rely on the mean square error (MSE), which is the sum of the squared variations of the individual data points, divided by the number of data points minus 2. The "minus 2" divisor is the convention for a fitted straight line, which uses up two degrees of freedom; in other settings the sum is often divided simply by the number of data points. When the data is displayed on a graph, you determine the MSE from the variations along the vertical axis. On an x-y graph, that would be the y-values.
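The definition above can be written compactly as a formula. The symbols are assumptions introduced here for illustration and do not appear in the original text: n is the number of data points, y_i is the observed value and \hat{y}_i is the predicted value for point i.

```latex
\mathrm{MSE} = \frac{1}{n - 2} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```

Because each difference is squared, it makes no difference whether you subtract the observed value from the predicted one or the other way around.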
Why Square the Variations?
Squaring the variation between predicted and observed values has two desirable effects. The first is to ensure that all values are positive. If some variations were negative, they would partly cancel the positive ones, and the sum could be unrealistically small and a poor representation of the actual variation between predicted and observed values. The second advantage of squaring is to give more weight to larger differences, which ensures that a large value for MSE signifies large data variations.
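A small sketch may make both effects concrete. The variation values below are hypothetical, chosen only for illustration, and the variable names are not from any particular library.

```python
# Hypothetical variations (predicted minus observed), chosen for illustration only.
variations = [2.0, -2.0, 0.5, -0.5]

# Summing the raw variations lets positives and negatives cancel out.
signed_sum = sum(variations)

# Squaring first makes every term positive, so nothing cancels.
squared = [v ** 2 for v in variations]
squared_sum = sum(squared)

print(signed_sum)   # 0.0, which wrongly suggests perfect agreement
print(squared)      # [4.0, 4.0, 0.25, 0.25]
print(squared_sum)  # 8.5
```

Note that a variation four times as large (2.0 versus 0.5) contributes sixteen times as much to the sum (4.0 versus 0.25), which is the extra weight given to larger differences.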
Sample Calculation: Stock Algorithm
Suppose you have an algorithm that predicts the price of a particular stock on a daily basis. On Monday, it predicts the stock price to be $5.50, on Tuesday $6.00, Wednesday $6.00, Thursday $7.50 and Friday $8.00. Considering Monday as Day 1, you have a set of data points that looks like this: (1, 5.50), (2, 6.00), (3, 6.00), (4, 7.50) and (5, 8.00). The actual prices are as follows: Monday $4.75 (1, 4.75); Tuesday $5.35 (2, 5.35); Wednesday $6.25 (3, 6.25); Thursday $7.25 (4, 7.25); and Friday $8.50 (5, 8.50).
The variations between the y-values of these points (predicted minus actual) are 0.75, 0.65, -0.25, 0.25 and -0.50 respectively, where the negative sign indicates a predicted value smaller than the observed one. To calculate MSE, you first square each variation, which eliminates the minus signs and yields 0.5625, 0.4225, 0.0625, 0.0625 and 0.25. Summing these values gives 1.36. Dividing by the number of measurements minus 2, which is 3, yields an MSE of about 0.45 (more precisely, 0.4533).
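The same arithmetic can be sketched in a few lines of Python. The function name mean_square_error and its params argument are hypothetical, introduced here only to mirror the "minus 2" divisor used in this article.

```python
# Predicted and actual prices for Monday through Friday, from the example above.
predicted = [5.50, 6.00, 6.00, 7.50, 8.00]
actual = [4.75, 5.35, 6.25, 7.25, 8.50]


def mean_square_error(predicted, actual, params=2):
    """Sum of squared variations divided by (number of points - params).

    params=2 reproduces the "minus 2" divisor used in this article;
    pass params=0 to divide by the plain number of points instead.
    """
    if len(predicted) != len(actual):
        raise ValueError("both series must have the same length")
    divisor = len(predicted) - params
    if divisor <= 0:
        raise ValueError("not enough data points for this divisor")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared) / divisor


mse = mean_square_error(predicted, actual)
print(f"MSE = {mse:.4f}")  # MSE = 0.4533
```

With params=0 the same data give 1.36 / 5 = 0.272, which shows how much the choice of divisor matters for a small sample.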
MSE and RMSE
Smaller values for MSE indicate closer agreement between predicted and observed results, and an MSE of 0.0 indicates perfect agreement. It's important to remember, however, that the variation values are squared, so the MSE is expressed in squared units (here, dollars squared). When an error measurement is required that is in the same units as the data points, statisticians use the root mean square error (RMSE), which they obtain by taking the square root of the mean square error. For the example above, the square root of the unrounded MSE (1.36 divided by 3) gives an RMSE of about 0.673, or roughly 67 cents.
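As a quick check, the RMSE for the stock example can be computed from the squared variations listed earlier. This is a minimal sketch, and the variable names are illustrative. Using the unrounded MSE gives roughly 0.673, whereas rounding the MSE to 0.45 first gives 0.671; either way the result is about 67 cents.

```python
import math

# Squared variations from the stock example above.
squared_variations = [0.5625, 0.4225, 0.0625, 0.0625, 0.25]

# MSE with the "minus 2" divisor used in this article.
mse = sum(squared_variations) / (len(squared_variations) - 2)

# RMSE is the square root of the MSE, so it is back in dollars.
rmse = math.sqrt(mse)
print(f"RMSE = {rmse:.3f}")  # RMSE = 0.673
```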