How to Calculate a Correlation Matrix


The correlation coefficient (r) measures the strength and direction of the linear relationship between two variables. For example, leg length and torso length are highly correlated; height and weight are less highly correlated; and height and name length (in letters) are uncorrelated.

The value of r always lies between -1 and 1:

    • A perfect positive correlation: r = 1. (When one goes up, the other goes up.)
    • A perfect negative correlation: r = -1. (When one goes up, the other goes down.)
    • No correlation: r = 0. (There is no linear relationship.)
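For reference, the most common version of r, Pearson's correlation, can be written as below for n paired observations. The symbols x_i, y_i and the sample means x-bar and y-bar are notation introduced here for illustration, not taken from the steps that follow.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```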

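A correlation matrix is a square table that holds the correlation between every pair of variables in a data set. The entry in row i and column j is the correlation between variable i and variable j, so the matrix is symmetric and every entry on its diagonal is 1.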

Computing a Correlation Matrix with R

    Get the data. If your data is in Excel, the easiest method is to save it as a .csv file. (In Excel 7, click "File," then "Save as," then "Other formats." Then, in "Save as type," scroll down to CSV (comma separated values).) Each row should have data on one subject, and each column should be one variable.

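    Read the data into R using read.csv(). For instance, if your data is in "c:\mydisk\mydir\data.csv", enter mydata <- read.csv("c:/mydisk/mydir/data.csv"). Note the forward slashes: R treats a backslash inside a quoted string as an escape character, so write the path with forward slashes or doubled backslashes.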

    Calculate the correlation matrix using cor(). For example: cor(mydata). Or, you can store the correlation matrix as an object for later use, using: cormat <- cor(mydata). Every column of mydata must be numeric, so drop or convert any text columns first.
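The three R steps above can be sketched end to end as follows. This is a minimal illustration, not a definitive workflow: the data values, the column names (height, weight, age) and the temporary file are all made up here so that the example runs on its own, with the temporary CSV standing in for a file exported from Excel.

```r
# Hypothetical example data: one row per subject, one column per variable.
example_data <- data.frame(
  height = c(150, 160, 165, 172, 180, 185),
  weight = c(52, 58, 63, 70, 77, 85),
  age    = c(34, 25, 41, 29, 52, 38)
)

# Write the data to a temporary CSV file (a stand-in for an Excel export).
csv_path <- tempfile(fileext = ".csv")
write.csv(example_data, csv_path, row.names = FALSE)

# Read the CSV into R, then compute and store the correlation matrix.
mydata <- read.csv(csv_path)
cormat <- cor(mydata)

# Print the matrix rounded to two decimals for easier reading.
print(round(cormat, 2))
```

The result has one row and one column per variable, so a single pair can be looked up by name, for example cormat["height", "weight"].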

Computing a Correlation Matrix with SAS

    Get the data. SAS can read data in many formats. If you store your data in Excel, have one subject on each row and one variable in each column.

    Read the data into SAS. You can use the IMPORT wizard to get your data. Click on "File," then "Import data," then choose a data type using the drop-down menu. Click "Next" and navigate to your data, then click "Finish."

    Calculate the correlation matrix. If your data is saved in SAS as mydata, with variables VAR1, VAR2 and VAR3, then type: PROC CORR data = mydata; VAR var1 var2 var3; RUN;


    • In both SAS and R, there are options for different types of correlations (e.g., Pearson's, Spearman's). Remember that Pearson's correlation, the default in both, only measures linear relationships. If the relationship between two variables is not linear, Pearson's correlation is not a good choice. To get more help with R, start R, then type ?cor.


    • If the second reference below (R Help) does not work, then start R and type ?cor.
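The tip about correlation types can be illustrated in R with the method argument of cor(). The sketch below uses made-up data (the vectors x, y and z are hypothetical) to show how a relationship that is exact but not linear can be missed by Pearson's correlation, and how Spearman's rank-based correlation responds to a curved but steadily increasing trend.

```r
# Hypothetical data: y is fully determined by x, but the relationship is a U-shaped curve.
x <- -3:3
y <- x^2

# Pearson's correlation (the default method) measures only linear association.
pearson_xy <- cor(x, y, method = "pearson")

# z always increases with x, but along a curve rather than a straight line.
z <- exp(x)
pearson_xz  <- cor(x, z, method = "pearson")

# Spearman's correlation is computed on ranks, so it reflects any monotonic trend.
spearman_xz <- cor(x, z, method = "spearman")

print(c(pearson_xy = pearson_xy, pearson_xz = pearson_xz, spearman_xz = spearman_xz))
```

Here Pearson's r for x and y is 0 (up to floating-point rounding) even though y depends entirely on x, while Spearman's correlation for x and z is 1 because z never decreases as x increases. Passing method = "spearman" to cor(mydata) gives the rank-based version of the whole matrix.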