Pearson's Correlation Coefficient
• A high correlation between two variables does not mean that
one necessarily causes the other. For causation, one variable
has to depend consistently on the other. Ask yourself: would
changing x bring about a change in y?
• For example, your fatigue during the summer months may be
influenced by the warm temperature of the day. However, feeling
very fatigued on some other day tells you nothing about the
temperature on that day.
• There are several ways to determine the value of r from a data
set.
• Remember that you should not calculate Pearson's correlation
coefficient until you have determined that it is appropriate.
First use a scatter plot to establish whether the data indicate
a linear relationship. If a linear relationship looks possible,
then calculating the value of r is appropriate.
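• To compute the value of r, we can use either of the following formulas:

$$ r = \frac{s_{xy}}{s_x s_y} \qquad \text{or} \qquad r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} $$

Rearranging the first formula gives the covariance:

$$ s_{xy} = r \cdot s_x s_y $$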
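To see why the two formulas for r give the same value, write the covariance and the standard deviations in terms of deviations from the means. The sketch below assumes the divide-by-n definitions of these quantities; the same cancellation happens with n − 1, provided all three use the same divisor.

```latex
s_{xy} = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y}), \qquad
s_x = \sqrt{\frac{1}{n}\sum (x - \bar{x})^2}, \qquad
s_y = \sqrt{\frac{1}{n}\sum (y - \bar{y})^2}

r = \frac{s_{xy}}{s_x s_y}
  = \frac{\frac{1}{n}\sum (x - \bar{x})(y - \bar{y})}
         {\frac{1}{n}\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}
  = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}
```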
x    2    3    4    6    8    9   10
y   21   19   18   17   15   13   12
First we input the data into the GDC and draw a scatter plot to
determine if it is appropriate to use Pearson's correlation
coefficient.
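As a cross-check on the GDC work, the deviation-sum formula for r can be applied to the x- and y-values in the table above. The following is a minimal Python sketch, not part of the original lesson; the function name `pearson_r` is an illustrative choice.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation-sum formula (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the two means
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the squared-deviation sums
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Data from the table above
x = [2, 3, 4, 6, 8, 9, 10]
y = [21, 19, 18, 17, 15, 13, 12]

r = pearson_r(x, y)
print(round(r, 3))  # -0.987
```

A value this close to −1 indicates a strong negative linear relationship: y falls steadily as x rises.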
$$ r = \frac{s_{xy}}{s_x s_y}, \qquad s_{xy} = r \cdot s_x s_y $$
• Example 3:
The following table represents the final averages of ten
students in math and science. Given that the covariance is
s_xy = 465.66, calculate the product-moment correlation
coefficient, r, correct to 2 decimal places.
WARNING!
We want s_x and s_y, but on the calculator we use the values
labelled σx and σy.
σx ≈
σy ≈
r = s_xy / (s_x s_y) = _____________________ ≈
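The warning matters because calculators and software usually report two different standard deviations. The sketch below assumes that the lesson's s_x and s_xy are the divide-by-n (population) versions, which is what σx denotes on the GDC. Because the Example 3 table is not reproduced here, it reuses the x- and y-values from the earlier table. In Python's standard `statistics` module, `pstdev` divides by n and `stdev` divides by n − 1.

```python
import statistics

x = [2, 3, 4, 6, 8, 9, 10]
y = [21, 19, 18, 17, 15, 13, 12]
n = len(x)

x_bar = statistics.fmean(x)
y_bar = statistics.fmean(y)

# Covariance with divisor n (assumed to match the lesson's s_xy)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / n

# Population standard deviations (divisor n): the sigma values
sigma_x = statistics.pstdev(x)
sigma_y = statistics.pstdev(y)

# Sample standard deviations (divisor n - 1)
sample_sx = statistics.stdev(x)
sample_sy = statistics.stdev(y)

r_consistent = s_xy / (sigma_x * sigma_y)  # same divisor throughout
r_mixed = s_xy / (sample_sx * sample_sy)   # divisors do not match

print(round(sigma_x, 3), round(sample_sx, 3))     # 2.878 3.109
print(round(r_consistent, 3), round(r_mixed, 3))  # -0.987 -0.846
```

Mixing a divide-by-n covariance with divide-by-(n − 1) standard deviations shrinks the size of r by a factor of (n − 1)/n, so the σ values are the ones to pair with a covariance of that kind.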
Now, use the GDC to check this value of r. Use Stat, Calc, and
LinReg.
Comments about the findings: