Correlations are important in many areas of science. Although correlation doesn’t equal causation, it’s often the first step to understanding the true relationship between two variables and can give a valuable hint that there is a causal relationship somewhere.
Learning to calculate a correlation is crucial, and you can easily find the “r value” in Excel either by using the built-in functions or by working through the calculation in pieces with the more basic functions of the program. The simplest way is to use a built-in function, but understanding the calculation is helpful if you ever need to use a different program to find it.
What Is Pearson’s Correlation Coefficient?
Pearson’s correlation coefficient is a simple measure of the degree of linear correlation between two variables, returning a value (called r) ranging from −1 to 1. A perfect positive correlation (r = 1) means that an increase in one variable by a certain amount is always matched by a proportionally sized increase in the other, so the data points fall exactly on an upward-sloping straight line.
A perfect negative correlation (r = −1) is the same, except that an increase in one variable is matched by a proportionally sized decrease in the other. Finally, no correlation (r = 0) means there is no linear relationship between the two variables.
In practice, you’ll almost never see a perfect correlation. When you find the Pearson r in Excel, the result will usually be a decimal value between −1 and 1, where the sign tells you the direction of the relationship and the magnitude tells you the strength of the correlation between your variables.
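To make the meaning of r concrete, here is a minimal Python sketch that computes it from its definition (in terms of deviations from the mean) for three small, made-up data sets. The function name pearson_r and all of the numbers are illustrative assumptions, not part of Excel.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations (numerator of r).
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / sqrt(dev_x * dev_y)


x = [1, 2, 3, 4, 5]

# y rises by 2 for every step of 1 in x: perfect positive correlation.
perfect_positive = pearson_r(x, [2 * v + 1 for v in x])

# y falls by 3 for every step of 1 in x: perfect negative correlation.
perfect_negative = pearson_r(x, [10 - 3 * v for v in x])

# y = x squared on values symmetric around zero: no linear trend.
x_sym = [-2, -1, 0, 1, 2]
no_linear_trend = pearson_r(x_sym, [v * v for v in x_sym])

print(perfect_positive, perfect_negative, no_linear_trend)
```

The third case is worth noting: y depends entirely on x, yet r comes out as zero, because Pearson’s r only measures how well a straight line describes the data.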
Pearson Correlation in Excel
The easiest method for finding the Pearson correlation in Excel is using the built-in “Pearson” function or (equivalently) the “Correl” function. The function has a simple syntax: PEARSON(array 1, array 2).
In short, you just need two arrays of values that are equal in length (i.e., columns of results, for example, age and blood pressure, arranged so there is a row for each individual patient). Type “=PEARSON(” into an empty cell, followed by the range of values for the first array, a comma, then the range of values for the second. Then close the brackets and hit “Enter,” and it will return the r value. For example, if your ages were in cells A2 to A11 and your blood pressures in B2 to B11, the formula would be =PEARSON(A2:A11, B2:B11).
As always, you can select each range by highlighting the cells with your mouse or by navigating to them with the arrow keys on your keyboard.
You can also use the “Correl” function, which performs the same calculation as “Pearson” and, in Excel 2003 and later, returns exactly the same result. If you have an older version of Excel, however, you should use “Correl,” because there can be rounding errors with “Pearson.”
Finding Pearson’s r “By Hand”
You can also calculate the r value in Excel using the more traditional method, with the program handling the arithmetic. First, put the values for your variables (which can be referred to as x and y for clarity) in two columns, then create three more columns: xy, x² and y². In the xy column, multiply each x value by the y value in the same row (using cell references in the formula so you can drag it down for the rest of the column). In the x² column, square each x value, and in the y² column, square each y value.
Create a “sum” row underneath your data, and take the sum of all the values for each column. You can then use the formula to calculate your r value:
r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²] × [nΣy² − (Σy)²])
Here, n is the number of pairs of values you have. You can follow this through in pieces: Take the number of pairs of values, multiply it by the sum of your xy column, and then subtract the product of the sums of the x and y values.
Then, multiply the sum of your x² column by n and subtract the square of the sum of your x column. Do the same thing for y, multiply the two results together, and take the square root of the product. Finally, divide the first result by the second to get your r value.
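The steps above can be followed outside a spreadsheet as well. This is a minimal Python sketch of the same column-and-sum approach, using a small made-up data set; the variable names are assumptions chosen to mirror the column headings.

```python
from math import sqrt

# Made-up illustrative data: one (x, y) pair per row, as in the spreadsheet.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)  # number of pairs of values

# The three helper columns: xy, x squared, and y squared.
xy = [a * b for a, b in zip(x, y)]
x_sq = [a * a for a in x]
y_sq = [b * b for b in y]

# The "sum" row: one total per column.
sum_x, sum_y = sum(x), sum(y)
sum_xy, sum_x_sq, sum_y_sq = sum(xy), sum(x_sq), sum(y_sq)

# First result: n times the xy total, minus the product of the x and y totals.
numerator = n * sum_xy - sum_x * sum_y

# Second result: square root of the product of the two bracketed terms.
denominator = sqrt((n * sum_x_sq - sum_x ** 2) * (n * sum_y_sq - sum_y ** 2))

r = numerator / denominator
print(round(r, 4))  # 0.7746
```

Here the totals are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55 and Σy² = 86, so the first result is 30 and the second is the square root of 1,500.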