How to Find the Correlation Coefficient R for a Scatter Plot


Finding the strength of the association between two variables is an important skill for scientists of all types. If two variables are correlated with each other, it shows that there is a link between them. A positive correlation means that when one variable increases, the other one does too, and a negative correlation means that when one variable increases, the other one decreases. Correlations don’t prove causation, although it is possible that further tests will prove a causal relationship between the variables. The correlation coefficient R shows the strength of the relationship between the two variables, and whether it’s a positive or a negative correlation.

TL;DR (Too Long; Didn't Read)

Call one variable x and one variable y. Calculate the value of R using the formula:

R = [n(Σxy) − (Σx)(Σy)] ÷ √{[nΣx² − (Σx)²] [nΣy² − (Σy)²]}

Where n is your sample size.
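If you would rather let a computer do the arithmetic, the formula translates almost directly into code. The following is a minimal Python sketch, not a definitive implementation. The function name correlation_r and the check for a zero denominator are assumptions added for illustration; they are not part of the formula itself.

```python
import math

def correlation_r(xs, ys):
    """Return R for paired values xs and ys using the sum-based formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # sample size
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx²
    sum_y2 = sum(y * y for y in ys)               # Σy²

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    # Assumption: treat a zero denominator (no variation in x or in y)
    # as an error, because R is undefined in that case.
    if denominator == 0:
        raise ValueError("R is undefined when a variable has no variation")
    return numerator / denominator

# Made-up data in which y is always exactly twice x.
print(correlation_r([1, 2, 3], [2, 4, 6]))  # 1.0
```

The made-up data at the end lies on a perfect straight line, which is why the result is exactly 1.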

    Make a table of your data. This should include one column for the participant number, one column for the first variable (labeled x) and one column for the second variable (labeled y). For example, if you’re looking to see whether there is a correlation between height and shoe size, one column would identify each person you measure, one column would show each person’s height and another would show their shoe size. Make three additional columns, one for xy, one for x² and one for y².

    Use your data to fill out the three additional columns. For example, imagine your first person measures 75 inches tall and has size 12 feet. The x (height) column would show 75, and the y (shoe size) column would show 12. You need to find xy, x² and y². So using this example:

    xy = 75 × 12 = 900

    x² = 75² = 5,625

    y² = 12² = 144

    Complete these calculations for every person for whom you have data.

    Create a new row at the bottom of your table for the sums of each column. Add together all of the x values, all of the y values, all of the xy values, all of the x² values and all of the y² values, and then put the results at the bottom of the corresponding column in your new row. You can label your new row “sum” or use a sigma (Σ) symbol.

    You find R from your data using the formula:

    R = [n(Σxy) − (Σx)(Σy)] ÷ √{[nΣx² − (Σx)²] [nΣy² − (Σy)²]}

    This looks a bit daunting, so you can split it into two parts, which we’ll call s and t.

    s = n(Σxy) − (Σx)(Σy)

    t = √{[nΣx² − (Σx)²] [nΣy² − (Σy)²]}

    In these equations, n is the number of participants you have (your sample size). The rest of the parts of the equation are the sums you calculated in the last step. So for s, multiply the size of your sample by the sum of the xy column, and then subtract the sum of the x column multiplied by the sum of the y column from this.

    For t, there are four main steps. First, calculate n multiplied by the sum of your x² column, and then subtract the sum of your x column squared (multiplied by itself) from this value. Second, do exactly the same thing but with the sum of the y² column and the sum of the y column squared in place of the x parts (i.e., n × Σy² − [Σy × Σy]). Third, multiply these two results (for the xs and ys) together. Fourth, take the square root of this answer.

    If you’ve worked in parts, you can calculate R simply as R = s ÷ t. You will get an answer between −1 and 1. A positive answer shows a positive correlation, with anything above 0.7 generally considered a strong relationship. A negative answer shows a negative correlation, with anything below −0.7 (that is, between −0.7 and −1) considered a strong negative relationship. Similarly, values around ±0.5 are considered a moderate relationship and values around ±0.3 a weak relationship. Anything close to 0 shows a lack of correlation.
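The steps above can also be followed in code, one table column at a time. The Python sketch below is illustrative only. The data is hypothetical apart from the first pair (75 inches, size 12), which comes from the example in the steps. The describe function and its exact cut-offs are an assumption about how to turn the rough 0.7, 0.5 and 0.3 guidelines into fixed boundaries.

```python
import math

# Hypothetical data: only the first pair (75, 12) comes from the article.
heights = [75, 70, 66, 72, 68]    # x, in inches
shoe_sizes = [12, 10, 8, 11, 9]   # y

# Build one table row per person: x, y, xy, x squared, y squared.
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(heights, shoe_sizes)]

# Add up each column to get the "sum" row.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(col) for col in zip(*rows))
n = len(rows)  # sample size

# Work in two parts, s and t, then divide.
s = n * sum_xy - sum_x * sum_y
t = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = s / t

def describe(r):
    """Label R using the rough guidelines (the cut-offs are an assumption)."""
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    elif size >= 0.3:
        strength = "weak"
    else:
        return "little or no correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

print(rows[0])               # (75, 12, 900, 5625, 144)
print(s)                     # 110
print(round(r, 3), describe(r))
```

The first row printed matches the worked example (xy = 900, x² = 5,625, y² = 144), which is a quick way to check that the table has been built correctly before trusting the final value of R.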
