Correlation Using JASP

What is Correlation and when to use it

Correlation is a measure of the strength of the relationship between two variables.

It is measured using a number called the correlation coefficient which lies between -1 and +1.

  • If the correlation is positive, this means that one variable increases as the other increases.
  • If the correlation is negative, one variable decreases as the other variable increases.

Values closer to +1 or -1 indicate a stronger relationship. Values nearer to zero indicate a weaker relationship, or even no relationship.
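The sign and size of the coefficient can be illustrated with a short calculation. The sketch below computes Pearson's correlation coefficient directly from its usual definition; the function name `pearson_r` and the small data lists are made up for illustration and are not part of the calcium example.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, not the calcium data
x = [1, 2, 3, 4, 5]
increasing = [2, 4, 5, 4, 5]   # tends to rise as x rises
decreasing = [5, 4, 5, 4, 2]   # tends to fall as x rises

print(pearson_r(x, increasing))  # positive coefficient
print(pearson_r(x, decreasing))  # negative coefficient
```

Both coefficients have the same size here (about 0.77) but opposite signs, because the second list is simply the first one reversed.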

The most commonly used measures are Pearson’s correlation coefficient and Spearman’s correlation coefficient.

For help on “What test do I need” go to the sigma website statistics resources page.

Pearson’s correlation measures the strength of the linear (i.e. straight-line) relationship, whereas Spearman’s correlation simply measures the strength of a general monotonic relationship which can be non-linear (monotonic means always increasing or always decreasing).
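The difference between the two measures can be seen on data that always increase but not in a straight line. The sketch below is a minimal illustration, assuming Spearman's coefficient is computed as Pearson's coefficient applied to the ranks of the data (tied values share their average rank); the helper names and the cubic example data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: y always increases with x, but along a curve (y = x cubed)
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(pearson_r(x, y))     # below 1: the points do not lie on a straight line
print(spearman_rho(x, y))  # 1: the relationship is perfectly monotonic
```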

  • The variables used to calculate a correlation coefficient need to be scale or ordinal. Correlation is not appropriate if any of the variables are nominal.

  • Pearson’s correlation is appropriate if both variables are scale.

  • Spearman’s correlation can be used for any combination of scale and ordinal variables (i.e. both can be scale or both ordinal or one of each).

If one or both variables are scale, you should also obtain a scatter plot to visualize the relationship between them. We can also conduct a test on the correlation coefficient – see later.

Example

A student wanted to explore the relationship between knowledge about calcium and calcium intake amongst Sports Science students. The data shown below can be downloaded in a CSV file called calcium.csv. Knowledge scores about calcium are in the column called Knowledge and calcium intake (in mg) is in the column called Calcium. A sample of 20 participants was taken.
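For readers who want to check the data outside JASP, the two columns could be read with a few lines of Python. This is only a sketch: it assumes calcium.csv is comma-separated with a header row containing exactly the column names Knowledge and Calcium, and the function name `load_columns` is hypothetical. The two demonstration rows are placeholders, not the study data.

```python
import csv
import io

def load_columns(stream, x_name="Knowledge", y_name="Calcium"):
    """Read two numeric columns from a CSV text stream that has a header row."""
    x, y = [], []
    for row in csv.DictReader(stream):
        x.append(float(row[x_name]))
        y.append(float(row[y_name]))
    return x, y

# Placeholder rows for demonstration only; these are NOT the study data
demo = io.StringIO("Knowledge,Calcium\n10,450\n25,700\n")
knowledge, calcium = load_columns(demo)
print(knowledge, calcium)

# With the real file, something like:
# with open("calcium.csv", newline="") as f:
#     knowledge, calcium = load_columns(f)
```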

Using JASP

Since both variables are scale, we will use Pearson’s correlation, but first we should examine the relationship using a scatter plot. To obtain a scatter plot in JASP, from the Descriptives menu open the Descriptive Statistics tab. Move ‘Knowledge’ and ‘Calcium’ into the Variables box.

Next open the Customizable Plots tab under Statistics and Basic plots and tick Scatter Plots. For Graph above scatter plot and Graph right of scatter plot, tick None. Optionally, Add regression line can be ticked for visualising linearity.

The scatter plot should then appear in the Results panel, on the right-hand side of the JASP window.

In the plot, the points follow an increasing pattern and seem to be reasonably close to an underlying straight line. This suggests that there is a strong relationship between the two variables and that the relationship is reasonably linear.

A regression line can be added to visually see the pattern in the data (see above for this option).
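A regression line of this kind is usually the ordinary least-squares line through the points. The sketch below shows that calculation under this assumption (the source does not state how JASP fits its line); the function name and the example values are made up for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points lying exactly on y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
slope, intercept = least_squares_line(x, y)
print(slope, intercept)
```

The closer the points in a scatter plot sit to such a line, the nearer Pearson's correlation is to +1 or -1.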

To obtain the correlation coefficient using JASP, from the Regression menu select Correlation under Classical.

Move ‘Knowledge’ and ‘Calcium’ into the Variables box.

Then tick the Pearson’s r option in the Sample Correlation Coefficients box. Note that we could instead have chosen Spearman’s rho if we had ordinal data.

Examining the Output

The JASP output should now include this table, titled Pearson’s Correlations:

The Pearson correlation coefficient is 0.882. We could report this as r=0.88, as two decimal places are sufficient. This is indicative of a strong relationship, as we saw earlier in the scatter plot. It is also positive, which indicates that calcium intake increases as knowledge of calcium increases and vice versa. A commonly used interpretation is based on benchmarks suggested by Cohen (1992). Here correlation strengths are classified as in the table below. Note that our value of 0.88 falls in the strong category of 0.5 to 0.9.

Correlation Coefficient Value      Interpretation
-0.3 to +0.3                       Weak
-0.5 to -0.3 or 0.3 to 0.5         Moderate
-0.9 to -0.5 or 0.5 to 0.9         Strong
-1.0 to -0.9 or 0.9 to 1.0         Very Strong

Extracted from Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.
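The bands in the table can be written as a simple lookup. This is a sketch only: the table's ranges share their end points, so the code assumes that a value exactly on a boundary (for example 0.5) is placed in the stronger category, and the function name `interpret_strength` is hypothetical.

```python
def interpret_strength(r):
    """Label a correlation coefficient using the bands in the table above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # the sign gives the direction, not the strength
    # Assumption: a value exactly on a boundary goes to the stronger band
    if size >= 0.9:
        return "Very Strong"
    if size >= 0.5:
        return "Strong"
    if size >= 0.3:
        return "Moderate"
    return "Weak"

print(interpret_strength(0.88))  # the coefficient from the example
```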

The Pearson’s Correlations table in the earlier JASP output also includes the p-value, which is less than 0.001. This p-value is used to explore the research question:

Is there a true relationship between the intake of calcium and knowledge about calcium?

This can be tested formally using the hypotheses:

H0: There is no correlation between calcium intake and knowledge about calcium (equivalent to saying the correlation in the population is 0)

H1: There is some correlation between calcium intake and knowledge about calcium (equivalent to saying the correlation in the population is not 0)

Since our p-value is reported as less than 0.001, it is below the usual significance level of 0.05 used to test such hypotheses, so we can reject H0 and conclude there is evidence of a true correlation in the wider population of Sports Science students. The point here is that whilst our correlation coefficient of 0.88 indicates a strong relationship, this describes only our sample of 20 participants. Can we use this as evidence to infer that a relationship truly exists between knowledge and intake of calcium amongst ALL Sports Science students (not just in our sample)? In our case, the test above says yes, we can, as the p-value is less than 0.05.
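The size of the p-value can be checked by hand with the standard t-test for a Pearson correlation, which uses t = r × sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom. The sketch below assumes this is the test behind the reported p-value (the source does not say how JASP computes it) and compares the statistic with two-sided critical values taken from standard t tables for 18 degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.882, 20   # coefficient and sample size from the example
t = correlation_t_statistic(r, n)
df = n - 2

# Two-sided critical values of the t distribution with 18 degrees of freedom,
# taken from standard tables (not computed here)
CRITICAL_05 = 2.101    # significance level 0.05
CRITICAL_001 = 3.922   # significance level 0.001

print(round(t, 2), df)
print(abs(t) > CRITICAL_05, abs(t) > CRITICAL_001)
```

The statistic is about 7.94, which is well beyond both critical values and so is consistent with a p-value below 0.001.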

Reporting Results

We could report the results as:

“Amongst Sports Science students there is evidence that knowledge about calcium is related to calcium intake (p<0.001). Greater knowledge about calcium is associated with increased calcium intake and the correlation coefficient indicated a strong linear relationship (r=0.88).”

Note how we avoid suggesting that greater knowledge CAUSES increased calcium intake as correlation cannot be used to infer a cause-and-effect relationship.

For more resources, see sigma.coventry.ac.uk. Adapted from material developed by Coventry University. Creative Commons License.