Understanding Correlation: Factors That Affect the Size of r

Laura D. Goodwin & Nancy L. Leech
University of Colorado at Denver and Health Sciences Center

Published online: 07 Aug 2010.
To cite this article: Laura D. Goodwin & Nancy L. Leech (2006) Understanding
Correlation: Factors That Affect the Size of r, The Journal of Experimental Education,
74:3, 249-266, DOI: 10.3200/JEXE.74.3.249-266
Measurement, Statistics, and Research Design
The Journal of Experimental Education, 2006, 74(3), 251–266
Copyright © 2006 Heldref Publications
Understanding Correlation:
Factors That Affect the Size of r
LAURA D. GOODWIN
NANCY L. LEECH
University of Colorado at Denver and Health Sciences Center
ABSTRACT. The authors describe and illustrate 6 factors that affect the size of a
Pearson correlation: (a) the amount of variability in the data, (b) differences in the
shapes of the 2 distributions, (c) lack of linearity, (d) the presence of 1 or more “out-
liers,” (e) characteristics of the sample, and (f) measurement error. Also discussed
are ways to determine whether these factors are likely affecting the correlation, as
well as ways to estimate the size of the influence or reduce the influence of each.
Key words: correlation, errors, interpretation, Pearson product–moment correlation
can find it difficult to “diagnose” a low correlation—or just to fully interpret re-
sults of simple or multivariate statistical analyses in the correlation “family.”
The purpose of this article is to describe and illustrate six factors that affect the
size of correlations, including (a) the amount of variability in either variable, X or
Y; (b) differences in the shapes of the two distributions, X or Y; (c) lack of linear-
ity in the relationship between X and Y; (d) the presence of one or more “outliers”
in the dataset; (e) characteristics of the sample used for the calculation of the cor-
relation; and (f) measurement error. Where possible, we illustrate the effects of
these characteristics on the size of a correlation with a hypothetical data example.
statistics courses. Of the six characteristics that affect the value of r, lack of lin-
earity was covered most often (in 22, or 73%, of the textbooks). Next in order of
frequency of coverage was the lack of variability in X or Y (in 17, or 57%, of the
textbooks), followed by the presence of outliers (in 11, or 37%, of the textbooks).
The effect of measurement error on r was covered in only 4 (13%) of the books,
and the effect that dissimilar shapes of distributions for X and Y have on the max-
imum size of r was covered in just 2 (7%) of the books reviewed. Characteristics
of samples often overlap with other factors that affect the size of r, such as vari-
ability or presence of outliers. Thus, it was difficult to “rate” the textbooks on this
dimension; however, very few of the books included descriptions or examples that
went beyond the influence of variability on r. In terms of the other five dimen-
sions, we found only 1 textbook that directly addressed all of them.
1. A list of the textbooks reviewed is available from the first author.
2. To keep this example simple, our hypothetical data consist of one-item measures. In “real life,” measuring abstract constructs such as “anxiety” or “interest” with just one-item measures is, of course, inappropriate; the resulting sets of scores would most likely be quite unreliable.
TABLE 1. Hypothetical Data for X and the Four Y Distributions

Participant    X    Y1    Y2    Y3    Y4
     1         1     2     2     2     1
     2         2     2     2     1     2
     3         2     1     1     2     2
     4         2     2     2     1     3
     5         2     3     3     2     3
     6         3     2     2     2     4
     7         3     3     3     1     4
     8         3     3     3     1     5
     9         3     3     3     1     5
    10         3     3     3     1     5
    11         3     3     3     1     5
    12         3     3     3     1     6
    13         3     4     4     1     6
    14         3     4     4     1     6
    15         3     4     4     1     6
    16         4     3     3     1     4
    17         4     3     3     1     4
    18         4     3     3     1     5
    19         4     4     4     1     5
    20         4     4     4     2     5
    21         4     4     4     3     5
    22         4     4     4     3     6
    23         4     4     4     4     6
    24         4     4     4     4     6
    25         4     4     4     3     6
    26         5     5     —     3     3
    27         5     5     —     4     3
    28         5     6     —     5     2
    29         5     5     —     5     2
    30         6     5     —     6     1

Note. Y2 is blank for Participants 26–30; these five cases were removed to illustrate restriction of range (see text).
TABLE 2. Descriptive Statistics for X, Y1, Y2, Y3, and Y4 [statistic values not recoverable in this copy]
Table 2 reports measures of central tendency (mean, median, mode), standard deviations, and values
for the skewness and kurtosis of each distribution. Note that the distributions of
both X and Y1 were intentionally constructed to have identical values of these sta-
tistics and to be symmetrical in shape. The correlation between these two vari-
ables serves as the “original” correlation in this article—allowing for subsequent
comparisons by varying the values of the Y variable (i.e., Y2, Y3, and Y4). The cor-
relation between X and Y1 is .83; a scattergram illustrating this relationship is
shown in Figure 1.
Amount of Variability in X or Y
It is well known among statisticians that, other things being equal, the value of
r will be greater if there is more variability among the observations than if there
is less variability. However, many researchers are unaware of this fact (Glass &
Hopkins, 1996), and it is common for students in basic statistics courses (and
even some students in intermediate- and advanced-level courses) to have diffi-
culty comprehending this concept. Examples of this characteristic of r can be
found quite easily, and it is often termed range restriction, restriction of range,
or truncated range by authors of statistics and measurement textbooks (e.g.,
Abrami, Cholmsky, & Gordon, 2001; Aron & Aron, 2003; Crocker & Algina,
1986; Glenberg, 1996; Harris, 1998; Hopkins, 1998; Spatz, 2001; Vaughan,
1998). In predictive validity studies, the phenomenon occurs when a test is used
for selection purposes; subsequently, the scores obtained with the test are corre-
lated with an outcome variable that is only available for those individuals who
were selected for the educational program or job. For example, the correlation
between SAT scores and undergraduate grade point average (GPA) at some se-
lective universities is only about .20 (Vaughan). This does not necessarily mean
that there is little relationship between SAT scores and college achievement,
however. The range of SAT scores is small at selective colleges and universities
that use SAT scores as a criterion for admission. Furthermore, GPAs can be re-
stricted in elite colleges. Other things being equal, the correlation between SAT
scores and GPAs would be greater if there were a greater range of scores on the
SAT and a greater range of GPAs. Other examples of range restriction can be at-
tributed to the sampling methods used. For example, if individuals are chosen to
participate in a study based on a narrow range of scores on a variable, correla-
tions between that variable and any other variables will be low. The “ultimate”
situation in which low variability influences a correlation occurs when there is no
variability on either X or Y. In that case, the correlation between the variable with
no variability and any other variable is not even defined (Hays, 1994).
To illustrate the relationship between variability and the size of a correlation,
we reduced the amount of variability in both X and Y by removing the five high-
est scoring cases from the Y variable. This can be seen in the second distribution
of Y (Y2) in Table 1, in which all remaining participants’ scores range from 1 to
4 (rather than 1–6 in the original Y1 distribution; by removing 5 cases from Y,
those same cases are also removed from X when the correlation is calculated).
With those 5 cases removed, the value of the correlation shrinks from .83 to .71.
Although the nature of the relationships among the remaining 25 cases is essen-
tially the same as it was when the original correlation was calculated, the size of
the correlation now is smaller due to the shrinkage in the variability in the data.
Therefore, it appears that the relationship is smaller, too. This can also be seen in
Figure 1, in which the removed cases are surrounded by a dotted box. Without
those cases in the scattergram, the relationship is seen as less strong (more of a
“circle” in shape).
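The shrinkage can be reproduced directly from the Table 1 data. The sketch below (Python; the two lists are transcribed from the X and Y1 columns, and the helper function `pearson_r` is ours) computes the correlation before and after the five highest-scoring cases are dropped:

```python
import math

# X and Y1 transcribed from Table 1 (N = 30, participants in order).
X  = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
      4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6]
Y1 = [2, 2, 1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4,
      3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 6, 5, 5]

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp  = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

r_full = pearson_r(X, Y1)                   # the "original" correlation, .83
r_restricted = pearson_r(X[:25], Y1[:25])   # participants 26-30 removed, .71
```

Dropping the five cases leaves the pattern among the remaining points essentially intact, yet r falls from .83 to .71, matching the values reported above.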
In trying to determine why a correlation might be lower than it was expected
to be (or, perhaps, lower than other researchers have reported), examining the
amount of variability in the data can be very helpful. This can be done visually
using the equation is not to actually estimate the size of the unrestricted correla-
tion but, rather, to illuminate “the consequence of restricted or exaggerated vari-
ability on the value of r so that it can be interpreted properly” (p. 122).
The correlation can achieve its maximum value of 1.0 (positive or negative)
only if the shapes of the distributions of X and Y are the same (Glass & Hopkins,
1996; Hays, 1994; Nunnally & Bernstein, 1994). Carroll (1961) showed that the
maximum value of r, when the distributions of X and Y do not have the same
shape, depends on the extent of dissimilarity (or lack of similarity in skewness
and kurtosis): the more dissimilar the shapes, the lower the maximum value of
the correlation. Nunnally and Bernstein also noted that the effect on the size of r
depends on how different the shapes of the distributions are, as well as how high
the correlation would be if the distributions had identical shapes. In terms of the
latter, the effect is greater if the correlation between the same-shaped distribu-
tions is greater (other things being equal). For example, if the correlation were
.90 between same-shaped distributions, changes in the shape of one of the distri-
butions could reduce the size of the correlation to .80 or .70. On the other hand,
if the correlation were .30 between same-shaped distributions, even dramatic
changes in the shape of one of the distributions will have relatively little effect
on the size of r (assuming that N is fairly large—approximately 30 or more sub-
jects—so that there is some stability in the data). Nunnally and Bernstein also
discussed the situation wherein one variable is dichotomous and the other is nor-
mally distributed. They showed that the maximum value of the correlation is
about .80, which can occur only if the p value (difficulty index) of the dichoto-
mous variable is .50; as the p value deviates from .50 (in either direction), the
ceiling on the correlation becomes lower than .80.
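That ceiling is easy to verify by simulation. In the sketch below (Python with NumPy; the sample size, seed, and function name are our own choices), a normally distributed variable is dichotomized at different cut points and the 0/1 indicator is correlated with the variable itself, which is the most favorable case possible for the dichotomy:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(100_000)   # the normally distributed variable

def r_after_split(y, quantile):
    # Dichotomize y at the given quantile and correlate the resulting
    # 0/1 indicator with y itself (the best case for the dichotomy).
    d = (y > np.quantile(y, quantile)).astype(float)
    return np.corrcoef(d, y)[0, 1]

r_median = r_after_split(y, 0.50)   # p = .50: near the .80 ceiling
r_skewed = r_after_split(y, 0.90)   # p = .10: ceiling drops to about .59
```

Even though the split at p = .50 is "perfect" in the sense that the indicator preserves all the ordering it can, r cannot exceed roughly .80; moving the cut point away from the median lowers the ceiling further.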
To illustrate this characteristic of the correlation, we retained the original X
variable’s distribution (which is symmetrical) but altered the distribution of the Y
variable. As compared with the original distribution for Y (Y1), the distribution of
Y3 is skewed positively. (See the value of the skewness statistic in Table 2.) The
Lack of Linearity
The correlation measures the extent and direction of the linear relationship be-
tween X and Y. If the actual relationship between X and Y is not linear—rather,
if it is a curvilinear or nonlinear relationship—the value of r will be very low and
might even be zero. Although the relationships between most variables examined
in educational and behavioral research studies are linear, there are interesting ex-
amples of nonlinear relationships among adults between age and psychomotor
skills that require coordination (Glass & Hopkins, 1996). Also, some researchers
studying the relationships between anxiety and test performance have reported
curvilinear relationships (Hopkins, 1998). Abrami et al. (2001) described the
anxiety–test performance relationship: “One of the most famous examples of a
curvilinear relationship in the social sciences is the inverted U-shaped relation-
ship between personal anxiety and test performance. It is now well known that
‘moderate’ levels of anxiety optimize test performance” (p. 434).
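The inverted-U case can be demonstrated numerically. In the toy sketch below (Python with NumPy; the variable names and functional form are invented purely for illustration), performance is a perfect deterministic function of anxiety, yet the Pearson r is essentially zero:

```python
import numpy as np

anxiety = np.linspace(-3, 3, 61)     # centered "anxiety" scores
performance = 10 - anxiety ** 2      # a perfect inverted-U relationship

# r is essentially 0: the linear index completely misses a
# relationship that is, in fact, perfectly systematic.
r = np.corrcoef(anxiety, performance)[0, 1]
```

This is why a near-zero r should never be read as "no relationship" until the scattergram has been examined.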
The best way to detect a curvilinear relationship between two variables is to
examine the scattergram. If a curvilinear relationship exists between X and Y, the
[Figure 3. Scattergram of Y4 versus X.]

Presence of Outliers
As with other characteristics of a dataset that can affect the size of r, an outlier’s effect will be
greater in a small dataset than in a larger one. The presence of an outlier in a
dataset can result in an increase or decrease in the size of the correlation, de-
pending on the location of the outlier (Glass & Hopkins, 1996; Lockhart, 1998).
To illustrate the effect of an outlier on the correlation between X and Y, we
added a 31st case to the original X and Y distributions (i.e., the X and Y1 distri-
butions in Table 1). We assigned a value of 9 on X and 10 on Y for this addition-
al case. The new scattergram is shown in Figure 4, and the value of r is .91. (Re-
call that the original correlation was .83; adding the outlier case “stretched” the
scattergram and increased the calculated value of r.)
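The effect of this single added case can be checked directly (Python with NumPy; the lists are transcribed from the X and Y1 columns of Table 1):

```python
import numpy as np

# X and Y1 transcribed from Table 1 (N = 30, participants in order).
X  = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
      4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6]
Y1 = [2, 2, 1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4,
      3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 6, 5, 5]

r_before = np.corrcoef(X, Y1)[0, 1]                # .83 without the outlier
r_after = np.corrcoef(X + [9], Y1 + [10])[0, 1]    # case 31 added: .91
```

A single extreme but "on-trend" case inflates r from .83 to .91; an equally extreme off-trend case would deflate it instead.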
As is the case with nonlinear relationships, one simple way to detect the pres-
ence of one or more outliers is to examine the scattergram; statistical outlier
analysis (e.g., Tukey, 1977) can also be useful above and beyond analysis by vi-
sual inspection. If an outlier is present, the researcher should first check for data
collection or data entry errors. If there were no errors of this type and there is no
obvious explanation for the outlier—the outlier cannot be explained by a third
variable affecting the person’s score—the outlier should not be removed. If there
[Figure 4. Scattergram of Y1 versus X with the outlier case (X = 9, Y = 10) added.]
is a good reason for a participant responding or behaving differently than the rest
of the participants, the researcher can consider eliminating that case from the
analysis; however, the case should not be removed only because it does not fit
with the researcher’s hypotheses (Field, 2000). Sometimes the researcher has to
live with an outlier (because he or she cannot find an explanation for the odd re-
sponse or behavior). Also, as Cohen (2001) noted, the outlier might represent an
unlikely event that is not likely to happen again—hence, the importance of repli-
cation of the study.
Measurement Error
Reliability can be thought of as the correlation of a test with itself (Lockhart, 1998); if a test does not correlate with itself, it cannot correlate with another variable. Consequently, the reported value of r may
“substantially underestimate the true correlation between the underlying vari-
ables that these imperfect measures are meant to reveal” (Aron & Aron, 1994, p.
90). Thus, the reliability of a measure places an upper bound on how high the
correlation can be between the measured variable and any other variable; the re-
liability index, which is the square root of the reliability coefficient, indicates the
maximum size of the correlation (Hopkins).
The reduction in the size of a correlation due to measurement error is called
attenuation, and there is a correction for attenuation that allows one to estimate
what the correlation between the two variables would be if all measurement error
were removed from both measures (Hopkins, 1998; Muchinsky, 1996; Nunnally
& Bernstein, 1994). To use this equation, the researcher needs to know the relia-
bility of each measure:
r* = rxy / √(rxx · ryy),
where r* is the estimated correlation; rxy is the calculated correlation between the
two variables; and rxx and ryy are the reliability coefficients of the measures of X
and Y, respectively. This equation really results in an estimate rather than a cor-
rection—that is, the estimate of the correlation between two variables if both
measures were perfectly reliable. Nunnally and Bernstein advised caution in the
use of the formula, especially because it can be used to fool one into believing
that a higher correlation has been found than what actually occurred—and, with
very small samples, the corrected correlation can even surpass 1.0! They also
noted some appropriate uses:
However, there are some appropriate uses of the correction for attenuation given
good reliability estimates. One such use is in personality research to estimate the
correlation between two traits from imperfect indicators of these traits. Determining
the correlation between traits is typically essential in this area of research, but if the
relevant measures are only modestly reliable, the observed correlation will underes-
timate the correlations among the traits. (p. 257)
As an example of the use of the correction for attenuation formula, assume that the correlation between measures of two traits is .65 and that the reliabilities of the measures are .70 and .80. The estimated correlation between two perfectly reliable measures is then .65/√(.70 × .80) = .65/√.56 ≈ .87.
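The arithmetic of that example, as a brief sketch (Python; the function name is ours, not from the sources cited). The quotient .65/√.56 comes to .87 when rounded to two decimals, and the second call illustrates the caution noted above, that the "corrected" value can exceed 1.0:

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation if both measures were perfectly reliable."""
    return r_xy / math.sqrt(r_xx * r_yy)

r_star = correct_for_attenuation(0.65, 0.70, 0.80)     # .65 / sqrt(.56) ≈ .87
r_too_big = correct_for_attenuation(0.90, 0.60, 0.70)  # exceeds 1.0!
```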
In this article, we have discussed and illustrated six factors that affect the size
of a correlation coefficient. When confronted with a very low or zero correlation,
researchers should ask questions about these factors. Is there a lack of variability in X or Y?
result in a powerful illustration of this fact. If two raters tend to agree on the rel-
ative placement of participants’ scores but differ dramatically in the levels of the
scores assigned, the correlation will be very high and positive but the two raters’
means will differ greatly. Similarly, linear transformations of scores (such as con-
verting raw scores to z scores) will not change the correlation between those data
and another variable. A second common misconception is that sample size (N)
has a direct relationship to the size of r—a misconception that often results in er-
roneous interpretations of reliability and validity coefficients (Goodwin & Good-
win, 1999). Although small samples can result in unstable or inaccurate results
(Hinkle, Wiersma, & Jurs, 2003), the size of N itself has no direct bearing on the
size of the calculated value of r.
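The invariance of r under linear transformation is easy to confirm numerically. In the sketch below (Python with NumPy; the simulated scores are invented for illustration), converting X to z scores, a linear transformation, leaves the correlation with Y unchanged to machine precision:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(50, 10, 200)       # raw scores on some measure
y = x + rng.normal(0, 10, 200)    # a correlated second variable

r_raw = np.corrcoef(x, y)[0, 1]
z = (x - x.mean()) / x.std()      # linear transformation: z scores
r_z = np.corrcoef(z, y)[0, 1]     # identical to r_raw
```

The same invariance holds for any transformation of the form aX + b with a > 0, which is why two raters with very different means can still correlate highly.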
Another misconception students and researchers sometimes develop is that a
correlation can be interpreted as a proportion or a percentage; therefore, understanding the difference between r and r² is a very useful way to prevent this misconception. The limitations of statistical significance tests—particularly in terms of
the ease with which a correlation can be found to be statistically significant when
the sample size is very large—is another important aspect of the study of correla-
tion; distinguishing between statistical and practical significance can be crucial.
Finally, no discussion of correlation is complete without emphasizing that the cor-
relations found in correlational research studies cannot be interpreted as causal re-
lationships between two variables. However, as one of the reviewers of this article
pointed out, in an experimental study where random assignment is used, a correla-
tion (point biserial) can be computed. In that case, a causal inference can be drawn
for the relationship between the grouping variable and the outcome variable.
Given that correlations are so widely used in research in education and the be-
havioral sciences—as well as in measurement research aimed at estimating va-
lidity and reliability—it is critical that students have knowledge of the important
(and sometimes subtle) factors that can affect the size of r. Knowledge of the role
these factors play is also very helpful when students or researchers find unex-
pectedly low correlations in their research. An unexpectedly low correlation
might be “explained” by one or more of the factors that affect the size of r; know-
ing this, a researcher would be encouraged to continue with his or her line of re-
search rather than abandon it under the mistaken impression that there is no re-
lationship between the variables of interest. It is also important to note that some
factors—such as outliers and sample characteristics—can result in spuriously
high correlations. In all cases, researchers should be advised to carefully consid-
er possible contributing factors when interpreting correlational results.
REFERENCES
Abrami, P. C., Cholmsky, P., & Gordon, R. (2001). Statistical analysis for the social sciences: An in-
teractive approach. Needham Heights, MA: Allyn & Bacon.
Aron, A., & Aron, E. N. (1994). Statistics for psychology. Englewood Cliffs, NJ: Prentice-Hall.
Aron, A., & Aron, E. N. (2003). Statistics for psychology (3rd ed.). Englewood Cliffs, NJ: Prentice-
Hall.
Brase, C. H., & Brase, C. P. (1999). Understanding statistics: Concepts and methods (6th ed.).
Boston: Houghton Mifflin.
Carroll, J. B. (1961). The nature of the data, or how to choose a correlation coefficient. Psychome-
trika, 26, 247–272.
Cohen, B. H. (2001). Explaining psychological statistics (2nd ed.). New York: Wiley.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Fort Worth, TX:
Harcourt Brace Jovanovich.
Fancher, R. E. (1985). The intelligence men. New York: Norton.
Field, A. (2000). Discovering statistics using SPSS for Windows. London: Sage.
Glass, G. V., & Hopkins, K. D. (1996). Statistical methods in education and psychology (3rd ed.).