
Week 12 Slides_Notes

Week 12 of PSY201 focuses on correlation, including hypothesis testing with the Pearson correlation coefficient and interpreting the relationship between two continuous variables. Upcoming assignments and final exam details are provided, along with a review session for the final exam. The document outlines steps for identifying which statistical tests to use and includes practice problems related to correlation.


#Statistics/Weeks 12

CORRELATION

PSY201 LEC 0201: Statistics I


Dr. Patricia Y. Sanchez
Week 12
OVERVIEW

• Intro to Correlation
• Hypothesis Testing with r
• Interpreting Correlation
• Wrap up!

2
COMING UP!

• Science Communication Assignment


• Initial responses due Friday, Nov 22 at 11:59pm
• Responses due by Tuesday, Dec 3 at 11:59pm
• Make sure your RLJs are all in!
• All due by Dec 3
• Final exam info
• Tuesday, Dec 10 9am-12pm
• In-person, two different classrooms
• See syllabus for breakdown (also on next slide)
3
FINAL EXAM ROOM SPLIT

• Last Names A-HO: Room MS 2170 (Medical Sciences Building)

• Last Names HU-ZZ: Room MS 3153 (Medical Sciences Building)

4
FINAL EXAM REVIEW SESSION

• “EXAM JAM” happening in Sid Smith on Dec 4

• I will be holding a review session for this section of PSY201


• Wednesday, Dec 4 from 10am-11:30am in SS 1071
• Strictly optional
• I’m not planning on recording, BUT I’ll be reviewing materials that
you’ll have access to
• In other words, this is a review session → I won’t be introducing
anything new at this point!
5
COMING UP!

• Course evaluations now available on Quercus!


• They close Wed, Dec 4

• See announcement on Quercus for more info

6
IN YOUR STATISTICAL TOOLBOX…

Descriptive Statistics

• Tables
• Graphs
• Measures of Central Tendency
• Measures of Variability

Inferential Statistics

• Standard normal curve
• Distribution of sample means
• z-test for one sample mean
• t-test for one sample mean
• t-test for two independent sample means
• t-test for two related sample means (new)
• One-way ANOVA (new)

7
STEPS TO IDENTIFYING WHICH TEST TO USE

Is there an IV?

• No → Must be a z-score, z-test, or one-sample t-test
• Yes → How many levels?
  • 2 → Between or within subjects?
    • Between → Independent samples t-test
    • Within → Paired samples t-test
  • 3+ → ANOVA
8
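The flowchart can be read as a small decision procedure. A minimal sketch in Python, assuming hypothetical names (`choose_test`, `has_iv`, `levels`, `design`) that are not from the course materials:

```python
def choose_test(has_iv, levels=None, design=None):
    """Walk the flowchart: IV? -> number of levels -> between or within."""
    if not has_iv:
        return "z-score, z-test, or one-sample t-test"
    if levels is None or levels < 2:
        raise ValueError("an IV needs at least 2 levels")
    if levels >= 3:
        return "ANOVA"
    # Exactly two levels: the design decides which t-test applies.
    if design == "between":
        return "independent samples t-test"
    if design == "within":
        return "paired samples t-test"
    raise ValueError("design must be 'between' or 'within'")


print(choose_test(True, levels=2, design="within"))
```

Correlation does not appear in this tree because it relates two measured variables instead of comparing groups.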
INTRO TO CORRELATION

9
SO FAR WE HAVE BEEN COMPARING GROUPS…

• We’ve compared two groups:


• Independent samples t-test
• Related samples t-test
• We’ve compared 3 groups:
• One-way ANOVA

• But what about two continuous variables (not categorical)? What if
we want to look at everyone, and not just groups?
10
RESEARCH PROBLEM

• What is the relationship between hours studying and scores on a
quiz?
• Conduct a non-experimental study
• n=6 students
• Measure hours studying for an exam (X)
• Record each student’s quiz score (Y)
• Examine associations between hours studying and quiz
scores
• Is study time associated with quiz scores?
11
RESEARCH PROBLEM

• Correlation
• Direction and strength of an association between two variables
(X,Y)
• Typically (but not only) used in non-experimental research
• Variables are measured, not manipulated
• Other examples
• Relationship between stressful life events (X) and number of
illness symptoms (Y)
• Relationship between years of education (X) and yearly income (Y)
12
TOOLS FOR CORRELATION

• The Scatterplot
• A figure
• Shows association between two variables

• The Pearson correlation coefficient


• A statistic
• Describes the direction and strength of a linear association
between two continuous variables

13
THE SCATTERPLOT

• Hours studying and quiz scores


Student   Study Hours (X)   Test Score (Y)
A         1                 1
B         1                 3
C         3                 4
D         4                 5
E         6                 4
F         7                 6

n=6 people, 6 pairs of scores
14
THE SCATTERPLOT

• Hours studying and quiz scores


Student   Study Hours (X)   Test Score (Y)
A         1                 1
B         1                 3
C         3                 4
D         4                 5
E         6                 4
F         7                 6

15
SEEING RELATIONSHIPS

16
SUMMARIZING RELATIONSHIPS

17
SUMMARIZING RELATIONSHIPS

18
SUMMARIZING RELATIONSHIPS

19
SUMMARIZING RELATIONSHIPS

20
DESCRIBING RELATIONSHIPS

• When we talk about statistical relationships, we begin by assessing
the __________________, or degree to which two variables vary together

• This statistic is used as the basis for the correlation coefficient,
a statistic that measures the relationship between variables.
• Pearson’s product-moment correlation: r
• Spearman’s rank-order correlation: rs
• Point-biserial correlation: rpb

21
THE CORRELATION COEFFICIENT: BASICS

• Pearson Correlation Coefficient


• Symbol: r
• Ranges from -1.0 to +1.0
• Sign (+/-) indicates “direction” of relationship
• Value indicates “strength” of relationship

• Some general guidelines:


• ±.10 is weak
• ±.30 is moderate
• ±.50 is strong
22
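One way to apply these guidelines in code is to treat each value as a lower cutoff on |r|. That reading is an assumption (the slide gives guidelines, not strict boundaries), and the function name `describe_r` and the "negligible" label are made up for illustration:

```python
def describe_r(r):
    """Return (direction, strength) for a Pearson r, using the slide's guidelines as cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")
    size = abs(r)  # strength ignores the sign
    if size >= 0.50:
        strength = "strong"
    elif size >= 0.30:
        strength = "moderate"
    elif size >= 0.10:
        strength = "weak"
    else:
        strength = "negligible"  # label assumed; the slide does not name this range
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength


print(describe_r(-0.35))
```

Values outside -1.0 to +1.0 raise an error, since r cannot fall outside that range.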
THE CORRELATION COEFFICIENT

Positive Correlation
X = Temperature
Y = Beer sales

23
THE CORRELATION COEFFICIENT

Negative Correlation
X = Temperature
Y = Coffee sales

24
THE CORRELATION COEFFICIENT

• Perfect Negative Correlation: r = -1.00
• No Linear Trend: r = 0.00
• Strong Positive Correlation: r = +0.90
• Weak Negative Correlation: r = -0.40

25
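COMPUTING R

$r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}}$

$r = \frac{\text{covariance of } X \text{ and } Y}{\text{variance of } X \text{ and } Y}$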

26
COVARIABILITY OF X AND Y

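[Venn diagram: two overlapping circles, X and Y. The overlap (XY) is the covariance between X and Y; the non-overlapping parts are the variance in X alone and the variance in Y alone.]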

27
COMPUTING R

• Computational formulas for Pearson r

$r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$

• Where:
• SP = “Sum of products” (similar to SS, but for covariance)
• SS = “Sum of squares”

$SP = \sum XY - \frac{\sum X \sum Y}{n}$

$SS_X = \sum X^2 - \frac{(\sum X)^2}{n}$

$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n}$
28
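The computational formulas translate almost line for line into code. A sketch, with hypothetical helper names and toy data that is not from the lecture:

```python
from math import sqrt


def sum_of_squares(values):
    """SS = sum of squared scores minus (sum of scores) squared over n."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n


def sum_of_products(xs, ys):
    """SP = sum of XY minus (sum of X times sum of Y) over n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def pearson_r(xs, ys):
    """r = SP / sqrt(SSX * SSY) for paired scores."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    return sum_of_products(xs, ys) / sqrt(sum_of_squares(xs) * sum_of_squares(ys))


# Toy data on a perfect straight line, so r should come out as 1.0
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

If either variable has no variability (SS = 0), the division fails, which reflects that r is undefined in that case.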
LEARNING CHECK

• A scatterplot shows a set of data points that fit very loosely
around a line that slopes down to the right. Which of the
following values would be closest to the correlation for these
data?

• A. 0.75
• B. 0.35
• C. -0.75
• D. -0.35
29
LEARNING CHECK

• Decide if each of the following statements is True or False.

• A set of n=10 pairs of X and Y scores has ΣX = ΣY = ΣXY = 20.
For this set of scores, SP = -20.
• If the Y variable decreases when the X variable decreases,
their correlation is negative.

31
HYPOTHESIS TESTING WITH R

34
HYPOTHESIS TESTING FOR R

• State research question


• Is there a significant linear association between X & Y?
• Is r significantly different from zero?
• ⍴ = “rho” the population parameter
• r = sample statistic

35
HYPOTHESIS TESTING FOR R

• Step 1: Statistical Hypotheses for r


• Almost always two-tailed (non-directional)
• H0: ⍴ = 0
• H1: ⍴ ≠ 0
• One-tailed upper (directional)
• H0: ⍴ ≤ 0
• H1: ⍴ > 0
• One-tailed lower (directional)
• H0: ⍴ ≥ 0
• H1: ⍴ < 0
36
HYPOTHESIS TESTING FOR R

• Step 2: Find critical value of r (Table)


• Need 3 pieces of info:
• α
• One-tailed or two-tailed?
• Degrees of freedom: df = n-2

37
HYPOTHESIS TESTING FOR R

38
HYPOTHESIS TESTING FOR R

• Step 2: Find critical value of r (Table)


• Need 3 pieces of info:
• α
• One-tailed or two-tailed?
• Degrees of freedom: df = n-2

• Step 3: Compute observed r


• Step 4: Make a decision
• Reject H0 if observed r exceeds rcritical
• Step 5: Summarize and report findings
39
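Step 4 is a simple comparison once rcritical has been looked up in the table. A sketch, assuming "exceeds" means strictly greater in magnitude (and in the predicted direction for a one-tailed test); the function name, the `tail` labels, and the example numbers are hypothetical:

```python
def reject_null(r_obs, r_crit, tail="two"):
    """Return True when the observed r falls beyond the critical r."""
    r_crit = abs(r_crit)  # tables list the critical value as a magnitude
    if tail == "two":
        return abs(r_obs) > r_crit
    if tail == "upper":
        return r_obs > r_crit
    if tail == "lower":
        return r_obs < -r_crit
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


# Hypothetical values for illustration only
print(reject_null(r_obs=-0.50, r_crit=0.40, tail="two"))
```

The critical value itself still comes from the table, using α, the number of tails, and df = n-2.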
LET’S PRACTICE!

• Research Question
• Is there a significant linear association between hours
studying and quiz score?
• Is r significantly different from zero?
• Step 1: Statistical Hypotheses
• H0: ⍴ = 0
• H1: ⍴ ≠ 0
• Step 2: Find rcritical in Table
• α = .05
• Two-tailed
• df = n-2; df = 6 - 2 = 4
40
LET’S PRACTICE!

41
LET’S PRACTICE!

• Research Question
• Is there a significant linear association between hours
studying and quiz score?
• Is r significantly different from zero?
• Step 1: Statistical Hypotheses
• H0: ⍴ = 0
• H1: ⍴ ≠ 0
• Step 2: Find rcritical in Table
• α = .05
• Two-tailed
• df = n-2; df = 6 - 2 = 4
• From table → rcrit = ±.811
42
LET’S PRACTICE!

• Step 3: Compute observed r


• Steps in computing r:
• Compute SSX
• Compute SSY
• Compute SP
• Compute r

43
LET’S PRACTICE!

• Hours studying and quiz scores

$SS_X = \sum X^2 - \frac{(\sum X)^2}{n}$

Student   Study Hours (X)   Test Score (Y)
A         1                 1
B         1                 3
C         3                 4
D         4                 5
E         6                 4
F         7                 6

n=6   ΣX = 22
44
LET’S PRACTICE!

• Hours studying and quiz scores

$SS_X = \sum X^2 - \frac{(\sum X)^2}{n}$

Student   Study Hours (X)   Test Score (Y)   X²
A         1                 1                1
B         1                 3                1
C         3                 4                9
D         4                 5                16
E         6                 4                36
F         7                 6                49

n=6   ΣX = 22   ΣX² = 112
45


LET’S PRACTICE!

• Step 3.1: Compute SSX

$SS_X = \sum X^2 - \frac{(\sum X)^2}{n}$

$SS_X = 112 - \frac{22^2}{6} = 31.333$
46
LET’S PRACTICE!

• Hours studying and quiz scores

$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n}$

Student   Study Hours (X)   Test Score (Y)   X²
A         1                 1                1
B         1                 3                1
C         3                 4                9
D         4                 5                16
E         6                 4                36
F         7                 6                49

n=6   ΣX = 22   ΣY = 23   ΣX² = 112
47


LET’S PRACTICE!

• Hours studying and quiz scores

$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n}$

Student   Study Hours (X)   Test Score (Y)   X²   Y²
A         1                 1                1    1
B         1                 3                1    9
C         3                 4                9    16
D         4                 5                16   25
E         6                 4                36   16
F         7                 6                49   36

n=6   ΣX = 22   ΣY = 23   ΣX² = 112   ΣY² = 103
48


LET’S PRACTICE!

• Step 3.2: Compute SSY

$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n}$

$SS_Y = 103 - \frac{23^2}{6} = 14.833$

49
LET’S PRACTICE!

• Hours studying and quiz scores

$SP = \sum XY - \frac{\sum X \sum Y}{n}$

Student   Study Hours (X)   Test Score (Y)   X²   Y²
A         1                 1                1    1
B         1                 3                1    9
C         3                 4                9    16
D         4                 5                16   25
E         6                 4                36   16
F         7                 6                49   36

n=6   ΣX = 22   ΣY = 23   ΣX² = 112   ΣY² = 103
50


LET’S PRACTICE!

• Hours studying and quiz scores

$SP = \sum XY - \frac{\sum X \sum Y}{n}$

Student   Study Hours (X)   Test Score (Y)   X²   Y²   XY
A         1                 1                1    1    1
B         1                 3                1    9    3
C         3                 4                9    16   12
D         4                 5                16   25   20
E         6                 4                36   16   24
F         7                 6                49   36   42

n=6   ΣX = 22   ΣY = 23   ΣX² = 112   ΣY² = 103   ΣXY = 102
51


LET’S PRACTICE!

• Step 3.3: Compute SP

$SP = \sum XY - \frac{\sum X \sum Y}{n}$

$SP = 102 - \frac{(22)(23)}{6} = 17.667$

52
LET’S PRACTICE!

• Step 3.4: Finally, compute r!

$r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$

$r = \frac{17.667}{\sqrt{(31.333)(14.833)}} = \frac{17.667}{21.559} = +.819$

53
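Steps 3.1 to 3.4 can be double-checked in a few lines of Python. This sketch uses the six (X, Y) pairs from the table; the variable names are illustrative:

```python
from math import sqrt

study_hours = [1, 1, 3, 4, 6, 7]  # X, students A to F
quiz_scores = [1, 3, 4, 5, 4, 6]  # Y, students A to F
n = len(study_hours)

# Column totals from the bottom row of the table
sum_x = sum(study_hours)
sum_y = sum(quiz_scores)
sum_x2 = sum(x * x for x in study_hours)
sum_y2 = sum(y * y for y in quiz_scores)
sum_xy = sum(x * y for x, y in zip(study_hours, quiz_scores))

ss_x = sum_x2 - sum_x ** 2 / n     # Step 3.1
ss_y = sum_y2 - sum_y ** 2 / n     # Step 3.2
sp = sum_xy - sum_x * sum_y / n    # Step 3.3
r = sp / sqrt(ss_x * ss_y)         # Step 3.4

print(f"SSX = {ss_x:.3f}, SSY = {ss_y:.3f}, SP = {sp:.3f}, r = {r:.3f}")
# prints: SSX = 31.333, SSY = 14.833, SP = 17.667, r = 0.819
```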
LET’S PRACTICE!

• Step 4: Make a Decision


• Reject H0: robs (+.819) exceeds rcrit (±.811)

• Step 5. Summarize and report findings


• “There was a statistically significant positive correlation
between hours studying and quiz scores, r(4) = .82, p < .05,
two-tailed, r² = .67. Students who studied longer earned
higher scores on the quiz.”
• Notice: No causal language!
54
LET’S PRACTICE!

• Compute r²
• Effect size
• r² = .819² = .67
• 67% of the variance in quiz scores is explained by hours
studying (and vice versa)

55
REPORTING AN R

• A closer look…

r(4) = .82, p < .05, two-tailed, r² = .67

• r: test statistic
• (4): degrees of freedom
• .82: observed value
• p < .05: significance, relative to the alpha level
• Sig? p < α
• Nonsig? p > α
• two-tailed: one- or two-tailed?
• r² = .67: effect size
ASSUMPTIONS FOR PEARSON’S R

• Quantitative data (continuous)


• Independent observations
• Random sampling
• Linear relationship

57
LEARNING CHECK

• Decide if each of the following statements is True or False.

• You could use the Pearson correlation to examine the
relationship between people’s age and size of popcorn (S, M, L)
they purchase at the movies.
• The direction of the relationship between X and Y is determined
by the sum of the products.

58
INTERPRETING FINDINGS

61
PROCEED WITH CAUTION…

• 1. Correlation is sensitive to outliers


• 2. Correlation is only appropriate for describing linear relationships
• 3. Correlation is sensitive to restriction of range (lack of generalization)
• 4. Beware of heterogeneous samples
• 5. Correlation does not imply causation

62
1. SENSITIVE TO OUTLIERS

• An outlier is an extremely deviant individual in the sample


• Characterized by a much larger (or smaller) score than all others in the sample
• In a scatterplot, the point is clearly different from all the other points
• Outliers produce a disproportionately large impact on the correlation coefficient
63
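A quick way to see this is to add one extreme point to the study-hours data and recompute r. The seventh student below (10 hours, quiz score of 0) is hypothetical and was invented only for this illustration:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the computational formulas: SP / sqrt(SSX * SSY)."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / sqrt(ss_x * ss_y)


hours = [1, 1, 3, 4, 6, 7]   # X for students A to F
scores = [1, 3, 4, 5, 4, 6]  # Y for students A to F

r_without = pearson_r(hours, scores)
# Hypothetical outlier: studied 10 hours but scored 0
r_with = pearson_r(hours + [10], scores + [0])

print(f"without outlier: r = {r_without:.2f}")
print(f"with outlier:    r = {r_with:.2f}")
```

With this single added point, r drops from about +.82 to slightly below zero.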
2. LINEAR RELATIONSHIPS ONLY

64
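A small demonstration of why r only captures linear association: in the made-up data below, Y is completely determined by X (Y = X²), yet r comes out as zero. The data and names are illustrative assumptions, not from the lecture:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the computational formulas: SP / sqrt(SSX * SSY)."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / sqrt(ss_x * ss_y)


xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # a perfect U-shaped (curvilinear) relationship

print(pearson_r(xs, ys))
```

The positive and negative products cancel, so SP = 0 even though the relationship is perfectly predictable. A scatterplot would reveal the curve that r misses.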
3. RESTRICTION OF RANGE

65
4. HETEROGENEOUS SAMPLES

66
5. CORRELATION IS NOT CAUSATION

67
BIG GAP IN STATISTICAL LITERACY ON CORRELATION

68
BIG GAP IN STATISTICAL LITERACY ON CORRELATION

69
BIG GAP IN STATISTICAL LITERACY ON CORRELATION

70
BIG GAP IN STATISTICAL LITERACY ON CORRELATION

71
SO, WHAT HAVE WE LEARNED?

72
73
CORRELATION PRACTICE

74
CORRELATION ACTIVITY

• On the next slides, you will be given a headline and some “facts” from
the article. Answer the following questions:

• 1. Does the data support the headline?


• 2. What are some “third-variable” or alternative explanations?
• 3. How could you reword the headline?

75
1. HEADLINE:
“DIET OF FISH ‘CAN PREVENT’ TEEN VIOLENCE”

• Participants were a group of 3-year-olds given an “enriched diet,
exercise, and cognitive stimulation.” They were compared to a
control group who did not go through this same program.
• By age 23 they were 64% less likely than a control group of children
not on the program to have criminal records.
• Assume, of course, that the enriched diet included fish.
• Note, also, that the media article does not mention what the other
kids ate or did.
76
2. HEADLINE:
“HIGHER BEER PRICES ‘CUT GONORRHEA RATES’”

• The research suggests “that raising the price of a six-pack of beer by
20 cents would cut gonorrhea rates by almost 9%.”
• Researchers considered gonorrhea rates from 1981 to 1995 among
teens and young adults in states that raised the legal drinking age or
increased the state beer tax.
• “Of the 36 beer tax increases that we reviewed, gonorrhea rates
declined among teens aged 15 to 19 in 24 instances. Among young
adults aged 20 to 24, they declined in 26 instances.”
• Important side note: 1981 is also when the CDC recognized AIDS and
HIV; condoms protect against both HIV and gonorrhea.
77
3. HEADLINE:
“FEAR OF HELL MAKES US RICHER, FEDS SAY”

• Studied 35 countries including the United States, Japan, and Turkey.


• Found that religion shed some “useful light.”
• In countries where a large percentage of the population believes in hell,
there seems to be less corruption and a higher standard of living.
• For instance, 71% of the U.S. population believe in hell and the
country boasts the world’s highest per capita income.

78
NAME THAT CORRELATION!

79
NAME THAT CORRELATION!

80
NAME THAT CORRELATION!

81
NAME THAT CORRELATION!

82
WHAT’S WRONG WITH THIS PICTURE?

83
WHAT’S WRONG WITH THIS PICTURE?

84
WHAT’S WRONG WITH THIS PICTURE?

85
WHAT’S WRONG WITH THIS RESEARCH?

• “The data showed a strong and highly significant positive correlation
between date of onset of sexual activity and current level of sexual
activity (r = 0.75, p < .01), suggesting that teenagers who begin having
sex at an earlier age are more promiscuous in college as a result.”

• “The negative correlation coefficient shows that there is no
relationship between these traits.”

• “The correlation was significant (r = -1.22)…”


86
WRAPPING UP

All remaining problem sets (see Quercus and MindTap)

All RLJs due by Dec 3

Science Communication responses due Dec 3

87
88
