How to Calculate Correlation

Correlation coefficients indicate increasingly strong linear relationships as they approach 1 or -1; the closer they come to 0, the weaker the linear relationship between the points.
alphaspirit/iStock/Getty Images

The correlation coefficient, or r, always falls between -1 and 1 and measures the strength of the linear relationship between two paired sets of data points, such as x and y. You can calculate it by dividing the sample corrected sum of products for x and y, or Sxy, by the square root of the product of the sample corrected sums of squares for x and for y, or Sxx and Syy. In equation form, this means: r = Sxy / √(Sxx * Syy).
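Written out in full, the quantities described above look as follows. This is a notational sketch only: the index i and the count n (the number of paired data points) are symbols introduced here for illustration and do not appear in the original text.

```latex
S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\qquad
S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}
\qquad
S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}

r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}
```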

Calculating the Sample Corrected Sum

You derive each sample corrected sum, or S, by squaring the sum of your data points, dividing by the total number of data points, and then subtracting this value from the sum of the squared data points. For example, given a set of x data points: 3, 5, 7 and 9, you would calculate the Sxx value by first squaring each point and then adding those squares together, which results in 164. Then subtract from this value the squared sum of these data points divided by the number of data points, or (24 * 24)/4, which equals 144. This results in Sxx = 20. Given a set of y data points: 2, 4, 6 and 10, you would proceed the same way to calculate Syy = 156 - [(22 * 22)/4], which equals 35. For Sxy, use the products of each x and y pair in place of the squares: the sum of those products is 158, and you subtract the product of the two sums divided by the number of data points, so Sxy = 158 - [(24 * 22)/4], which equals 26.
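The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not a library routine: the helper name corrected_sum is hypothetical, and it treats Sxx and Syy as special cases of Sxy in which both arguments are the same list.

```python
def corrected_sum(a, b):
    """Sample corrected sum: sum(a_i * b_i) - (sum(a) * sum(b)) / n.

    Hypothetical helper for illustration; passing the same list twice
    gives the corrected sum of squares (Sxx or Syy).
    """
    if len(a) != len(b):
        raise ValueError("both sequences must have the same length")
    n = len(a)
    return sum(ai * bi for ai, bi in zip(a, b)) - sum(a) * sum(b) / n


x = [3, 5, 7, 9]
y = [2, 4, 6, 10]

s_xx = corrected_sum(x, x)  # 164 - (24 * 24) / 4
s_yy = corrected_sum(y, y)  # 156 - (22 * 22) / 4
s_xy = corrected_sum(x, y)  # 158 - (24 * 22) / 4

print(s_xx, s_yy, s_xy)  # 20.0 35.0 26.0
```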

Final Correlation Coefficient Calculation

You can then plug the established values for Sxx, Syy and Sxy into the equation r = Sxy / √(Sxx * Syy). Using the values above, this results in 26/[√(20 * 35)], which equals approximately 0.983. Since this value is very close to 1, it suggests a strong positive linear relationship between these two data sets.
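The whole calculation can be sketched end to end in Python. The function name correlation and its input checks are assumptions made for this illustration; the formula itself is the one given in the article.

```python
import math


def correlation(x, y):
    """Correlation coefficient r = Sxy / sqrt(Sxx * Syy).

    Hypothetical helper for illustration; x and y are paired lists of numbers.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    # Sample corrected sums, computed as in the worked example
    s_xx = sum(v * v for v in x) - sum_x * sum_x / n
    s_yy = sum(v * v for v in y) - sum_y * sum_y / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    if s_xx == 0 or s_yy == 0:
        # All values identical in one set: r is undefined
        raise ValueError("r is undefined when either data set is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation([3, 5, 7, 9], [2, 4, 6, 10])
print(round(r, 3))  # 0.983
```

On Python 3.10 or later, the standard library's statistics.correlation function computes the same coefficient and can serve as a cross-check.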


About the Author

Teresa J. Siskin has been a researcher, writer and editor since 2009. She holds a doctorate in art history.
