How to find the correlation coefficient R for a scatter plot

Written by Jayne Thompson

Regression analysis is a branch of mathematics used for analysing numerical data that consists of an independent variable (the variable that we deliberately fix) and a dependent variable (the variable that we measure in response). The purpose of the analysis is to find a relationship between the two variables. The data is plotted on a scatter graph, with the independent variable on the x axis and the dependent variable on the y axis. When a straight line, or regression line, is drawn through the centre of the plotted data points, the graph gives a good visual picture of the relationship between the two variables.

  1. Construct a scatter graph from your data. Use this to see whether there is likely to be a linear relationship between the two variables, that is, whether you expect R to be close to +1, -1 or 0.

  2. Prepare a table of five columns. In the first column, list your x data. In the second, list your y data. In the third, calculate the value of x-squared. In the fourth, calculate the value of y-squared. In the final column, calculate x multiplied by y (xy).

  3. Calculate the average value for each of your five columns. For example, if your x data consists of the values 0, 2.2 and 4.6, your average x value will be (0 + 2.2 + 4.6) / 3 = 2.27 (to two decimal places).

  4. Calculate Sxy as follows: total number of data points multiplied by (average of xy - (average x multiplied by average y)). Calculate Sxx as follows: total number of data points multiplied by (average of x-squared - (average x) squared). Calculate Syy in the same way using the y data: total number of data points multiplied by (average of y-squared - (average y) squared).

  5. Calculate R. This is Sxy divided by the square root of (Sxx multiplied by Syy).
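
The table method can be sketched in a few lines of Python. This is only an illustration, not part of the original method: the function name correlation_r and the y values are made up for the example, and it assumes the standard definitions Sxy = n × (average of xy - average x × average y), with Sxx and Syy formed the same way, and R = Sxy / square root of (Sxx × Syy).

```python
from math import sqrt


def correlation_r(xs, ys):
    """Return the correlation coefficient R using the five-column table method.

    correlation_r is a hypothetical helper name chosen for this sketch.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    n = len(xs)

    # Step 3: the average of each of the five columns (x, y, x^2, y^2, xy).
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xx = sum(x * x for x in xs) / n
    mean_yy = sum(y * y for y in ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n

    # Step 4: Sxy, Sxx and Syy built from the column averages.
    s_xy = n * (mean_xy - mean_x * mean_y)
    s_xx = n * (mean_xx - mean_x ** 2)
    s_yy = n * (mean_yy - mean_y ** 2)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("R is undefined when all x or all y values are equal")

    # Step 5: R = Sxy / sqrt(Sxx * Syy).
    return s_xy / sqrt(s_xx * s_yy)


# The x values come from the example in step 3; the y values are invented.
xs = [0, 2.2, 4.6]
ys = [1.0, 2.1, 2.9]
r = correlation_r(xs, ys)
print(round(r, 3))
```

Because the invented y values rise steadily with x, the printed value should be positive and close to +1, which is what a scatter graph of the same points would suggest.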

Tips and warnings

  • Sense-check your answer by using the scatter graph. You should be able to see at a glance whether R should have a value close to -1, +1, or 0.
  • If the data points cluster around a straight line, there is a strong linear relationship between the x and y data. If the points are randomly scattered across the graph, there is no linear relationship between the variables, or zero correlation.
  • The correlation coefficient, or R, measures how closely the data points follow a straight line. R always lies on a scale of -1 to +1. +1 and -1 each represent a perfect linear correlation, with every point lying exactly on the line. If the R value is positive, the regression line rises from left to right on the graph. If it is negative, the line falls. An R value of zero indicates no linear correlation; the variables may still be related in a non-linear way.
  • If your value for R falls outside the -1 to +1 scale, you have made a mistake.
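
The checks in these tips can also be automated. The sketch below is an assumption-laden illustration rather than part of the original method: sense_check is a hypothetical name, the data is invented, and the slope uses the standard least-squares result (slope = Sxy / Sxx). It confirms that R lies between -1 and +1 and that its sign matches the direction of the regression line.

```python
from math import sqrt


def sense_check(xs, ys, tol=1e-9):
    """Return (r, slope, ok) where ok is True if R passes both sanity checks.

    sense_check is a hypothetical helper; it assumes the x values are not all
    equal and the y values are not all equal, otherwise R is undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)

    r = s_xy / sqrt(s_xx * s_yy)
    slope = s_xy / s_xx  # least-squares slope of the regression line of y on x

    # Check 1: R must lie on the -1 to +1 scale (tol allows for rounding error).
    in_range = -1 - tol <= r <= 1 + tol
    # Check 2: a positive R goes with a rising line, a negative R with a falling one.
    same_direction = (r > 0) == (slope > 0)
    return r, slope, in_range and same_direction


# Invented data in which y falls as x rises.
r, slope, ok = sense_check([1, 2, 3, 4], [9.0, 7.2, 4.9, 3.1])
print(r < 0, slope < 0, ok)
```

If a hand calculation gives a value of R that fails either check, the most likely culprit is an arithmetic slip in one of the column averages.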
