Scatterplots and Correlation: Some Notes
Scatterplots
Correlation
Suppose we have data on two quantitative variables, x and y, for n individuals. The correlation r measures the direction and strength of the linear relationship between x and y.
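\[
r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)
\]
Here \bar{x} and s_x are the mean and standard deviation of the x values, and \bar{y} and s_y are the mean and standard deviation of the y values. In words, r is an average of the products of the standardized observations in x and in y for the n individuals.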
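As a minimal sketch of the formula above, the following Python function standardizes each observation and averages the products using the n - 1 divisor. The function name `correlation` and the sample data are assumptions made for illustration; they do not come from the notes.

```python
import math

def correlation(xs, ys):
    """Sample correlation r, computed directly from the definition."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations use the n - 1 divisor, matching the formula.
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sum the products of the standardized values, then divide by n - 1.
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Made-up illustrative data (hypothetical, not from the notes).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [2.0, 4.0, 5.0, 4.0, 5.0]
r = correlation(hours, score)
print(round(r, 4))
```

If either variable is constant, its standard deviation is zero and this sketch raises a ZeroDivisionError; the correlation is undefined in that case.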
Properties
- Correlation makes no use of the distinction between explanatory and response variables.
- Correlation requires that both variables be quantitative.
- The correlation does not change when we change the units of measurement of x, y, or both.
- The correlation r itself has no units; it is just a number.
- Positive r indicates positive association, and negative r indicates negative association.
- The correlation r is always a number between -1 and 1. Values near 0 indicate a weak linear
  relationship. The strength of the relationship increases as r moves away from 0 toward -1 or 1.
  The extreme values -1 and 1 occur only when the points in a scatterplot lie exactly along a straight line.
- Correlation measures the strength of only linear relationships. It does not describe curved
  relationships, no matter how strong they are.
- Like the mean and standard deviation, the correlation is not resistant to outlying observations.
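Several of the properties above can be checked numerically. The sketch below is one way to do so, assuming made-up data and a hypothetical helper `pearson_r`; it uses an algebraically equivalent form of the correlation formula in which the n - 1 factors cancel.

```python
import math

def pearson_r(xs, ys):
    """Correlation via sums of squares (equivalent to the definition)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up, nearly linear data (hypothetical).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
r_base = pearson_r(x, y)

# No explanatory/response distinction: swapping x and y leaves r unchanged.
r_swapped = pearson_r(y, x)

# Changing units (rescaling x, shifting and rescaling y) leaves r unchanged.
x_rescaled = [2.54 * v for v in x]
y_rescaled = [32 + 1.8 * v for v in y]
r_units = pearson_r(x_rescaled, y_rescaled)

# A perfect but curved relationship (y = x^2 on a symmetric range) gives r = 0.
x_sym = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [v ** 2 for v in x_sym]
r_curved = pearson_r(x_sym, y_curved)

# Not resistant: a single outlying point changes r drastically.
r_outlier = pearson_r(x + [20], y + [0])

print(r_base, r_swapped, r_units, r_curved, r_outlier)
```

The curved example is the important caution: y is completely determined by x, yet r reports no linear association at all.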
Note: Correlation alone is not a complete description of bivariate data. Always report the means and
standard deviations of both x and y along with the correlation.
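One way to follow this advice in practice is to report all five numbers together. The sketch below assumes a hypothetical helper `describe_bivariate` and made-up data; it relies only on `statistics.mean` and `statistics.stdev` from the Python standard library.

```python
import statistics

def describe_bivariate(xs, ys):
    """Return the means, standard deviations, and correlation of two samples."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    # statistics.stdev is the sample standard deviation (n - 1 divisor).
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    return {"mean_x": x_bar, "sd_x": s_x, "mean_y": y_bar, "sd_y": s_y, "r": r}

# Made-up illustrative data (hypothetical, not from the notes).
summary = describe_bivariate([1.0, 2.0, 3.0, 4.0, 5.0],
                             [2.0, 4.0, 5.0, 4.0, 5.0])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```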