The Correlation Coefficient (r)

The sample correlation coefficient (r) is a measure of the closeness of association of the points in a scatter plot to a linear regression line based on those points, as in the example above for accumulated savings over time. Possible values of the correlation coefficient range from -1 to +1, with -1 indicating a perfectly linear negative, i.e., inverse, correlation (sloping downward) and +1 indicating a perfectly linear positive correlation (sloping upward).
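The two ends of this range can be checked with a small sketch in R. The data here are made-up values, not taken from any study: one variable lies exactly on an upward-sloping line and the other exactly on a downward-sloping line.

```r
# Hypothetical toy data, invented purely for illustration
x <- c(1, 2, 3, 4, 5)

y_up   <- 2 * x + 1    # points fall exactly on an upward-sloping line
y_down <- 10 - 3 * x   # points fall exactly on a downward-sloping line

r_up   <- cor(x, y_up)    # expected to equal +1, up to rounding
r_down <- cor(x, y_down)  # expected to equal -1, up to rounding

print(r_up)
print(r_down)
```

Note that the steepness of the line does not matter here; only how tightly the points follow a line determines r.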

A correlation coefficient close to 0 suggests little, if any, correlation. The scatter plot suggests that measures of IQ do not change with increasing age, i.e., there is no evidence that IQ is associated with age.

[Figure: scatter plot of IQ versus age]

Calculation of the Correlation Coefficient

The equations below show the calculations used to compute "r". However, you do not need to remember these equations. We will use R to do these calculations for us. Nevertheless, the equations give a sense of how "r" is computed.

$$ r = \frac{\mathrm{Cov}(X,Y)}{\sqrt{s_x^2 \, s_y^2}} $$

where Cov(X,Y) is the covariance, i.e., how far each observed (X,Y) pair is from the mean of X and the mean of Y, simultaneously, and $s_x^2$ and $s_y^2$ are the sample variances for X and Y.

Cov(X,Y) is computed as:

$$ \mathrm{Cov}(X,Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n-1} $$

You don't have to memorize these equations or use them for hand calculations. Instead, we will use R to calculate correlation coefficients. For example, we could use the following command to compute the correlation coefficient for AGE and TOTCHOL in a subset of the Framingham Heart Study:

> cor(AGE,TOTCHOL)
[1] 0.2917043
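To see how the hand calculation described above relates to cor(), here is a minimal sketch that divides the covariance by the square root of the product of the two sample variances. The values of x and y are invented for illustration and are not Framingham data. The n - 1 denominator is the usual sample covariance, which is also what R's cov() and var() use.

```r
# Hypothetical values, invented for illustration (NOT Framingham data)
x <- c(34, 41, 47, 52, 58, 63, 69)         # e.g., ages in years
y <- c(182, 195, 210, 204, 231, 226, 240)  # e.g., cholesterol values

n <- length(x)

# Sample covariance: summed cross-products of deviations from the means,
# divided by n - 1
cov_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Correlation: covariance divided by the square root of the product
# of the two sample variances
r_manual <- cov_xy / sqrt(var(x) * var(y))

# Built-in function for comparison
r_builtin <- cor(x, y)

print(c(manual = r_manual, builtin = r_builtin))
```

The two printed values should agree, which is why the equations never need to be worked by hand.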

Describing Correlation Coefficients

The table below provides some guidelines for how to describe the strength of correlation coefficients, but these are just guidelines for description. Also, keep in mind that even weak correlations can be statistically significant, as you will learn shortly.

Correlation Coefficient (r)    Description (rough guideline)
+1.0                           Perfect positive association
+0.8 to +1.0                   Very strong + association
+0.6 to +0.8                   Strong + association
+0.4 to +0.6                   Moderate + association
+0.2 to +0.4                   Weak + association
 0.0 to +0.2                   Very weak + or no association
 0.0 to -0.2                   Very weak - or no association
-0.2 to -0.4                   Weak - association
-0.4 to -0.6                   Moderate - association
-0.6 to -0.8                   Strong - association
-0.8 to -1.0                   Very strong - association
-1.0                           Perfect negative association
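The guideline table can be turned into a small helper function. The sketch below is only one possible encoding: describe_r is a hypothetical name, and because the table's ranges share their endpoints, the choice to assign a boundary value such as 0.8 to the stronger category is an assumption made here, not something the table specifies.

```r
# Hypothetical helper: maps a single correlation coefficient to the
# rough descriptive labels in the guideline table.
# Assumption: a value on a shared boundary (e.g., 0.8) gets the stronger label.
describe_r <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, !is.na(r), abs(r) <= 1)

  size <- abs(r)

  # Values below 0.2 in absolute size are described without a direction
  if (size < 0.2) {
    return("very weak or no association")
  }

  strength <- if (size == 1) {
    "perfect"
  } else if (size >= 0.8) {
    "very strong"
  } else if (size >= 0.6) {
    "strong"
  } else if (size >= 0.4) {
    "moderate"
  } else {
    "weak"
  }

  direction <- if (r > 0) "positive" else "negative"
  paste(strength, direction, "association")
}

print(describe_r(0.2917043))  # the AGE and TOTCHOL correlation reported earlier
print(describe_r(-0.85))
```

These labels describe strength only; they say nothing about statistical significance.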

The four images below give an idea of how some correlation coefficients might look on a scatter plot.

[Figure: four scatter plots illustrating different correlation coefficients]

The scatter plot below illustrates the relationship between systolic blood pressure and age in a large number of subjects. It suggests a weak (r=0.36), but statistically significant (p
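A weak correlation can still be statistically significant when the sample is large. The sketch below illustrates this with simulated data, not the blood pressure data described above; the sample size, the seed, and the slope of 0.2 are arbitrary assumptions. It uses cor.test(), which reports both the estimated r and a p-value.

```r
# Simulated data for illustration only (NOT the blood pressure data)
set.seed(1)              # arbitrary seed so the example is reproducible
n <- 2000                # assumed large sample size

x <- rnorm(n)            # simulated predictor
y <- 0.2 * x + rnorm(n)  # weak linear dependence on x plus a lot of noise

result <- cor.test(x, y)  # Pearson correlation test by default

r_hat <- unname(result$estimate)  # estimated correlation coefficient
p_val <- result$p.value           # p-value for the null hypothesis r = 0

print(r_hat)
print(p_val)
```

With this many observations, the estimated correlation should fall in the "weak" range of the guideline table while the p-value is very small.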

Beware of Non-Linear Relationships

Many relationships between measurement variables are reasonably linear, but others are not. For example, the image below indicates that the risk of death is not linearly related to body mass index. Instead, this type of relationship is often described as "U-shaped" or "J-shaped," because the value of the Y-variable initially decreases with increases in X, but with further increases in X, the Y-variable increases substantially. The relationship between alcohol consumption and mortality is also "J-shaped."

[Figure: risk of death by body mass index]

Source: Calle EE, et al.: N Engl J Med 1999; 341:1097-1105
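The reason this matters for r can be sketched with made-up numbers: a perfectly U-shaped relationship can produce a correlation coefficient near zero even though Y depends entirely on X. The values below are hypothetical and are not the body mass index data from the cited study.

```r
# Hypothetical U-shaped data, invented for illustration
x <- seq(-5, 5, by = 0.5)  # X values placed symmetrically around 0
y <- x^2                   # Y falls and then rises as X increases

r_u <- cor(x, y)           # linear correlation of a purely curved relationship

print(r_u)                 # essentially zero, despite the exact dependence
```

Running plot(x, y) on these values shows the curve clearly, which is why a scatter plot should accompany any correlation coefficient.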

A simple way to evaluate whether a relationship is reasonably linear is to examine a scatter plot. To illustrate, look at the scatter plot below of height (in inches) and body weight (in pounds) using data from the Weymouth Health Survey in 2004. R was used to create the scatter plot and compute the correlation coefficient.

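# wey: data frame holding the Weymouth Health Survey data (loading step not shown)
> attach(wey)
> plot(hgt_inch,weight)
> cor(hgt_inch,weight)
[1] 0.5653241

[Figure: scatter plot of height (inches) versus weight (pounds)]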

There is quite a lot of scatter, and the large number of data points makes it difficult to fully evaluate the correlation, but the trend is reasonably linear. The correlation coefficient is +0.57.

Beware of Outliers

Note also in the plot above that there are two individuals with apparent heights of 88 and 99 inches. A height of 88 inches (7 feet 4 inches) is plausible, but unlikely, and a height of 99 inches is certainly a coding error. Obvious coding errors should be excluded from the analysis, since they can have an inordinate effect on the results. It's always a good idea to look at the raw data in order to identify any gross errors in coding.
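The effect of a single coding error can be sketched with a small made-up data set. The heights and weights below are invented, not Weymouth survey values, and the cutoff of 90 inches is an arbitrary assumption chosen only to exclude the implausible 99-inch value.

```r
# Hypothetical heights (inches) and weights (pounds), invented for illustration.
# The last height (99) mimics an obvious coding error.
hgt_inch <- c(60, 62, 64, 65, 66, 67, 68, 70, 72, 99)
weight   <- c(115, 120, 135, 140, 150, 155, 165, 180, 195, 130)

# Inspect the raw values first to spot gross errors
print(summary(hgt_inch))

# Correlation using every record, including the suspect one
r_all <- cor(hgt_inch, weight)

# Correlation after excluding the implausible height
# (the 90-inch cutoff is an assumption made for this example)
keep <- hgt_inch < 90
r_clean <- cor(hgt_inch[keep], weight[keep])

print(c(all_records = r_all, error_excluded = r_clean))
```

In this toy example one bad record is enough to pull a strong correlation down to a very weak one, which is the kind of inordinate effect described above.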