Unit 4 Study Guide

Unit 4.

Evaluating association claims


Overview
Each self-paced unit of the course includes readings, two lessons, a knowledge check,
and an assignment. Please use the checklist below to navigate through this week’s
activities.

Readings and preparation


¨ This Unit 4 Study Guide
¨ From the free online textbook: Research Methods in Psychology. Focus on:
¨ Section 57. Understanding null hypothesis significance testing.
¨ Section 59. Additional considerations
¨ Section 28. Overview of non-experimental research
¨ Section 29. Correlational research
¨ Section 30. Complex correlations – focus on “Assessing Relationships Among
Multiple Variables”, specifically the content on how to read a correlation matrix.
(For now, you can ignore the sections on factor analysis, causal relationships,
and regression).
¨ Illustrative article: Mehl et al. (2007) Are Women Really More Talkative than Men?
¨ Illustrative article: Mehl et al. (2010) Eavesdropping on happiness…

Lesson 4A. Understanding statistical inference


 4A.1. Video: Understanding statistical significance [19:43 min]
 4A.2. Try it! Interpreting statistical results [~10 min]
 4A.3. Video: Statistical significance in context [16:31 min]
 4A.4. Video: Understanding correlation coefficients [11:54 min]
 4A.5. Try it! Interpreting correlation coefficients [~ 10 min]

Lesson 4B. Reading and understanding research articles


 4B.1. Video: Effect sizes and interpretation [24:49 min]
 4B.2. Video: Types of peer reviewed articles [3:53 min]
 4B.3. Video: Reading empirical journal articles [6:53 min]
 4B.4. Try it! Parts of a research article [~20 min]

Assignments
 Unit 4 Knowledge Check
 Media Assignment: Steps 1 & 2
 Library lab
Unit 4 Learning Outcomes
By the end of this unit you should be able to:
 Recognize a variety of descriptive statistics.
 Recognize the relationship between margin of error and confidence intervals.
 Interpret statistical significance using a confidence interval approach and a
p-values approach.
 Assess the statistical validity of group comparisons and correlational claims.
 Interpret Cohen’s d and Pearson’s correlation coefficient r effect size indicators.
 Differentiate between Type 1 and Type 2 errors.
 Recognize the importance of margin of error, statistical significance, effect sizes,
and power analysis in relation to one another.

Infographic: Evaluating statistical validity

In this unit, we will introduce a framework for evaluating statistical validity. This
framework focuses on asking questions about:
 Margin of error. In acknowledging that data from samples can only provide a
rough point estimate of the true population parameter, what is the estimated
margin of error / confidence interval around the estimate?
 Statistical significance. Can an effect still be detected, even once this margin of
error has been accounted for?
 Effect size. If the effects are “statistically significant”, what is the general size of
the effect? How does the size of the effect impact interpretation?
 Statistical power. Is the sample size large enough to trust the indicators of
statistical significance / effect size? What are the risks of Type 1 and Type 2
error?
 Direct replication. Have these results been directly replicated with independent
samples? Do the results replicate? Can we trust the overall pattern of results?

PSYB70 Unit 3 Study Guide, Page 2


Lesson 4A. Understanding statistical inference
4A.1. Understanding statistical significance

Illustrative article: Mehl et al. (2007)


To help us learn about statistical interpretation, we will be focusing on the statistical
results presented in Mehl et al.'s (2007) article focused on testing the research question,
"Are women really more talkative than men?"

“Significance testing”

 Prior to analyzing their data, researchers must decide what decision rules they
will use to test their hypotheses.

 Null hypothesis significance testing (NHST) includes a set of decision rules that
can help a researcher use the margin of error to determine if an observed effect
is extreme enough to “reject the null hypothesis” and conclude that the
researcher’s alternative hypothesis is supported.

Is it “statistically significant”?
 An effect is considered “statistically significant” if an effect can be detected
(effect ≠ 0), even when the margin of error is factored in (the effect shines
through the “noise”).
 An effect is considered “not statistically significant” if the margin of error is so
great that it calls into question whether an effect exists or not (the “noise” drowns
out any effects).

What do we mean by an “effect”?


 An effect is the specific outcome that you are testing.
 Group comparisons: A type of effect that compares two or more groups.
 Correlation: A type of effect that examines the association between variables.

It all starts with the assumption that there is “no effect”


 Significance testing starts with an assumption that there is no effect. This is
called the null hypothesis (or null effect). It is the opposite of the researcher’s
hypothesis that there is an effect.
 The null hypothesis is the starting assumption that there is no effect (i.e., there
are no group differences, there is no association), i.e., effect = 0.
o Women and men do not differ in the total words spoken per day.
o Small talk is not associated with well-being.
 The researcher’s hypothesis is that there is an effect, i.e., there are group
differences or an association, i.e., effect ≠ 0.
o Women and men do differ in the total words spoken per day.
o Small talk is associated with well-being.

Approaches
In our course, we are going to use two general approaches for assessing statistical
significance: (a) the confidence interval approach and (b) the p-values approach. Follow
along in-class or with the video to see explanations and examples of both approaches.

The confidence interval approach


The researcher starts by using hand calculations or a computer program to construct a
confidence interval around an effect (e.g., the mean difference or a correlation
coefficient).

The researcher then asks: does the confidence interval around the effect include zero?
 If no, the results are statistically significant (effect ≠ 0)
 If yes, the results are not statistically significant (an effect of 0 cannot be ruled out)

If the confidence interval of the effect does not include “0”:


 The results are “statistically significant”
 Reject the null hypothesis
 The researcher can take the next steps to explore their hypothesis.

If the confidence interval of the effect does include “0”:


 The results are “not statistically significant”
 Fail to reject the null hypothesis
 The researcher’s hypothesis is not supported
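
The confidence interval approach above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any of the course articles: the scores are invented, the function names are hypothetical, and the interval uses a normal approximation (1.96 standard errors for a 95% confidence level), whereas small samples would normally call for a t-based interval.

```python
import math
import statistics

def mean_difference_ci(group_a, group_b, z_crit=1.96):
    """Return (difference, lower, upper) for the difference between two means.

    Normal approximation: difference +/- z_crit * standard error.
    z_crit = 1.96 corresponds to a 95% confidence level.
    """
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Standard error of the difference between two independent means
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    margin_of_error = z_crit * se
    return diff, diff - margin_of_error, diff + margin_of_error

def includes_zero(lower, upper):
    """True if the confidence interval contains an effect of 0."""
    return lower <= 0 <= upper

# Hypothetical scores, invented for illustration only
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [11, 14, 13, 12, 12, 15, 10, 13]

diff, lower, upper = mean_difference_ci(group_a, group_b)
print(f"difference = {diff:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
if includes_zero(lower, upper):
    print("Not statistically significant: fail to reject the null hypothesis")
else:
    print("Statistically significant: reject the null hypothesis")
```

With these invented scores the interval spans zero, so the decision rule says to fail to reject the null hypothesis.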

The p-values approach

Step 1. Set the significance level


Similar to the confidence interval approach, the researcher starts by identifying the
level of confidence (e.g., 95% confidence level; 99% confidence level – see the
lesson on margin of error for a discussion of confidence levels).

This confidence level is used to identify the “significance level” (expressed using
alpha, α). If a researcher adopts a 95% confidence level, this produces a 5%
significance level (α = .05). If a researcher adopts a 99% confidence level, this
produces a 1% significance level (α = .01), as outlined below.

95% confidence level (α = .05)


 A 95% confidence interval defines a range of values likely to capture the
“true” value in 95 out of 100 samples.
 5% risk of error
 Significance level = 5%
 α = .05

99% confidence level (α = .01)


 A 99% confidence interval allows us to feel more confident in our sample,
but at the cost of widening the confidence interval around each point
estimate.
 1% risk of error
 Significance level = 1%
 α = .01

Most psychologists adopt a 95% confidence level (α = .05)


 A 95% confidence level strikes the balance between precision and
likelihood of error.

Step 2. Calculate the effect and p-value


The researcher can then use hand-calculations or a computer software program to
calculate the effect (e.g., the mean difference or a correlation coefficient, etc.) and the
probability value that an effect this large would emerge if the effect were actually zero.
This probability value is called the p-value.
 P-value: The calculated probability that an effect at least this large would
emerge if the true effect were actually zero.

Step 3. Compare the p-value to the significance level
If the probability value that an effect this large would emerge if the effect were actually
zero (i.e., the p-value) is lower than the significance level, the results can be considered
“statistically significant”.

 If the p-value is less than alpha (e.g., α = .05 and p < .05)
o The results are “statistically significant”
o Reject the null hypothesis
o The researcher can take the next steps to explore their hypothesis.

 If the p-value is greater than alpha (e.g., α = .05 and p > .05)
o The results are “not statistically significant”
o Fail to reject the null hypothesis
o The researcher’s hypothesis is not supported
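
Steps 1 to 3 can be sketched as follows. This is only an illustration under stated assumptions: the effect and its standard error are invented numbers, the function names are hypothetical, and the p-value comes from a normal (z) approximation rather than the t-tests that statistical software usually reports.

```python
from statistics import NormalDist

def two_sided_p_value(effect, standard_error):
    """p-value for the null hypothesis 'effect = 0' (normal approximation)."""
    z = effect / standard_error
    # Probability of a result at least this far from zero, in either direction
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Step 3: compare the p-value to the significance level (alpha)."""
    if p_value < alpha:
        return "statistically significant: reject the null hypothesis"
    return "not statistically significant: fail to reject the null hypothesis"

# Step 1: set the significance level (95% confidence level -> alpha = .05)
alpha = 0.05
# Step 2: hypothetical effect and standard error, invented for illustration
p = two_sided_p_value(effect=0.625, standard_error=0.915)
print(f"p = {p:.3f} -> {decide(p, alpha)}")
```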

Two approaches; same foundation


Although we have presented two approaches for assessing statistical significance, they
are both rooted in the same foundation. As such, we can use one (i.e., p-values) to
make inferences about the other (confidence intervals) and vice versa.

If p < α, then the confidence interval of the effect does not include “0”.
If the confidence interval of the effect does not include “0”, then p < α.
 The 95% confidence interval of the effect does not include “0”.
 The probability of observing an effect this large if the effect were “0” is < α.
 The results are “statistically significant”
 The researcher can take the next steps to explore their hypothesis.

If p > α, then the confidence interval of the effect does include “0”
If the confidence interval of the effect does include “0”, then p > α.
 The 95% confidence interval of the effect does include “0”.
 The probability of observing an effect this large if the effect were “0” is > α.
 The results are “not statistically significant”
 Fail to reject the null hypothesis
 The researcher’s hypothesis is not supported
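
To see why the two approaches agree, the sketch below computes both a confidence interval and a p-value from the same effect and standard error. It assumes a simple z-based test, and the effect and standard error pairs are invented; the point is only that "p < α" and "the interval excludes 0" give the same answer when both are built from the same ingredients.

```python
from statistics import NormalDist

def ci_and_p(effect, standard_error, alpha=0.05):
    """Return ((lower, upper), p) for a z-based test of 'effect = 0'."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = .05
    margin_of_error = z_crit * standard_error
    p = 2 * (1 - norm.cdf(abs(effect / standard_error)))
    return (effect - margin_of_error, effect + margin_of_error), p

# Hypothetical (effect, standard error) pairs, invented for illustration
examples = [(0.5, 0.2), (0.5, 0.4), (-0.3, 0.1), (0.1, 0.3)]
for effect, se in examples:
    (lower, upper), p = ci_and_p(effect, se)
    excludes_zero = not (lower <= 0 <= upper)
    print(f"effect = {effect:+.1f}: CI = ({lower:.2f}, {upper:.2f}), p = {p:.3f}, "
          f"CI excludes 0: {excludes_zero}, p < .05: {p < 0.05}")
```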

4A.2. Try it! Interpreting statistical significance

Follow along in-class or online to test your understanding of the article published by
Mehl et al. (2007). (See Quercus to access the article).

 Mehl, M. R., Vazire, S., Ramírez-Esparza, N., Slatcher, R. B., & Pennebaker,
J. W. (2007). Are women really more talkative than men? Science, 317(5834),
82-82.

Understanding effect size


 Statistical significance can tell you if an effect is likely to be different from zero.
However, it does not tell a researcher the size of the effect.
 As such, researchers need to also consider the size of an effect.
 An effect size considers the size of the group difference and/or the strength of
the association.
 Cohen’s d is an effect size used to compare the means across two groups.

Cohen's d
Small | 0.20 |
Medium | 0.50 |
Large | 0.80 |
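
As a rough illustration of what Cohen's d measures, the sketch below divides the difference between two group means by their pooled standard deviation, which is one common way to compute d. The scores are invented and the function name is hypothetical.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_difference / math.sqrt(pooled_variance)

# Hypothetical scores, invented for illustration only
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [11, 14, 13, 12, 12, 15, 10, 13]

d = cohens_d(group_a, group_b)
print(f"Cohen's d = {d:.2f}")
```

For these invented scores d falls between the small (0.20) and medium (0.50) benchmarks in the table above.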

When people hear about the difference in two groups being “statistically significant”,
they likely picture the image in the lower right below (labeled Cohen’s d = 3.0). But in
reality, most effects in psychology are small (Cohen’s d = .20) to medium (Cohen’s d
= .50) effects, and very rarely large effects (Cohen’s d = .80). Therefore, for most
findings in psychology, the differences between groups are subtle, not huge.

4A.3. Statistical significance in context

Cautions about statistical significance

 Statistical significance only tells you if the effect is likely to be different from “0” in
the population that the sample represents.
 Statistical significance tells us nothing about the size of the effect. (With large
enough sample sizes, even very small effects might be statistically significant).
 Tests of statistical significance are not at all reliable when sample sizes are low.
 Because the data from samples are merely estimates of the true population
parameters, we are always at risk of making an error.

Two types of statistical error

What the data suggests | In real life: The null is TRUE (there is no effect) | In real life: The null is FALSE (there is an effect) |
The data suggests that there is no effect | Correct conclusion | Type 2 error |
The data suggests that there is an effect | Type 1 error | Correct conclusion |

 Type 1 error (false positive): This occurs when the researcher concludes that
there is an effect, when in reality there is not one.
 Type 2 error (false negative): This occurs when the researcher concludes that
there is not an effect, but the reality is that there is one.

Statistical significance testing controls for Type 1 error

 Statistical significance testing is designed to prevent researchers from making
Type 1 errors.
 The stricter that you are in trying to avoid Type 1 error, the greater your chance
of making a Type 2 error (and vice versa).
 Adopting a 99% confidence level decreases the risk of Type 1 error (false
positive), but increases the risk of Type 2 error (false negative).
 Adopting a 90% confidence level decreases the risk of Type 2 error (false
negative), but increases the risk of Type 1 error (false positive).
 A 95% confidence level strikes the balance between Type 1 and Type 2 error.
 Increasing sample size also helps to control for both Type 1 and Type 2 error.
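
One way to see the link between the confidence level and Type 1 error is to simulate many studies in which the null hypothesis is true and count how often the result is still "significant". The sketch below is a hypothetical simulation, not part of the course materials: it assumes two groups drawn from the same normal population with a known standard deviation of 1 and uses a z-test.

```python
import math
import random
from statistics import NormalDist, fmean

def false_positive_rate(alpha, n_per_group=30, n_studies=2000, seed=1):
    """Share of simulated studies that look 'significant' when the null is TRUE."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(2 / n_per_group)   # both groups have a known SD of 1
    false_positives = 0
    for _ in range(n_studies):
        # Both groups are drawn from the same population, so the true effect is 0
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / se
        if abs(z) > z_crit:
            false_positives += 1      # a Type 1 error
    return false_positives / n_studies

for confidence_level in (0.90, 0.95, 0.99):
    alpha = 1 - confidence_level
    print(f"{confidence_level:.0%} confidence level: "
          f"false positive rate = {false_positive_rate(alpha):.3f}")
```

The simulated false positive rate should land close to alpha in each case, and it shrinks as the confidence level becomes stricter.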

Interpreting significant effects


 An effect different from the null hypothesis can be detected
 BUT, there is always a possibility of Type 1 error due to:
o Sampling error: The sampling method could be biased.
o Low power: Low sample size could result in an inflated estimate.
o Measurement error: Unreliable measurements could inflate effects.
o p-hacking: Explorative or biased analysis could inflate the effects.

Interpreting null effects


 An effect different from the null hypothesis cannot be detected
 BUT, there is always a possibility of Type 2 error due to:
o Sampling error: The sampling method could be biased.
o Low power: Low sample size could mask real effects.
o Measurement error: Unreliable measurements could mask real effects.
o High variability: A lot of variability could mask real effects.

Statistical power
What about Type 2 error?
• Statistical power: The probability a study will detect an effect if that effect
actually exists.
• Power analysis: A calculation of the sample size that would be needed to
statistically detect an effect at a given significance level and desired level of power.

Understanding statistical power


 Statistical power is affected by sample size and effect size.
• Larger effects are easier to detect than smaller ones.
• Larger sample sizes make it easier to detect an effect.
• Smaller sample sizes increase the risk of both Type 1 and Type 2 errors.
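
The relationship between effect size, sample size, and power can be sketched with a simple approximation. The code below assumes a two-sided z-test comparing two equally sized groups, with the effect expressed as Cohen's d; the function names are hypothetical, and dedicated power-analysis software (which typically uses t-tests) may give slightly different sample sizes.

```python
import math
from statistics import NormalDist

def power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)   # expected z if the true effect is d
    # Probability that the observed z lands beyond either critical value
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

def n_per_group_needed(d, power=0.80, alpha=0.05):
    """Smallest n per group whose approximate power reaches the target."""
    n = 2
    while power_two_groups(d, n, alpha) < power:
        n += 1
    return n

for d in (0.20, 0.50, 0.80):
    print(f"d = {d:.2f}: about {n_per_group_needed(d)} participants per group "
          f"for 80% power at alpha = .05")
```

Smaller effects need far larger samples to reach the same power, which is the point of running a power analysis before collecting data.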

The importance of sample size


 Increasing sample size is a key way to reduce statistical errors!

NHST cautions
 An over-reliance on null hypothesis significance testing led to:
o Questionable data mining and data exploration practices.
o An increase in false positive results being published.
o A failure to replicate some core findings in the literature.

Psychology's replication crisis


 The replication crisis in psychology (and other fields of study) arose when
key scientific findings could not be independently replicated by other research
teams.
o Researchers ignored important rules and assumptions around null
hypothesis significance testing (NHST).
o Researchers failed to interpret their findings within the context of
important caveats and limitations to NHST.

Limits to NHST
 Not reliable at all when sample sizes are low.
 Tells one nothing about the size of the effect.

4A.4. Understanding correlation coefficients

Correlation: A type of effect that examines the association between two variables.

Illustrative article: Mehl, M. R., Vazire, S., Holleran, S. E., & Clark, C. S. (2010).
Eavesdropping on happiness: Well-being is related to having less small talk and more
substantive conversations. Psychological science, 21(4), 539-541.

Describing and visualizing association

Interpreting association
 Bar graph – A visualization of the differences between groups (often expressed
as mean differences).
 Scatterplot – A visualization of the correlation between variables.
 Positive correlation – the two variables co-vary in the same direction
o as one variable increases, the other variable increases
o as one variable decreases, the other variable decreases
 Negative correlation – the two variables co-vary in opposite directions
o as one variable increases, the other variable decreases
o as one variable decreases, the other variable increases

Interpreting correlation coefficients


 Correlation coefficient – A numerical representation of the correlation that
varies between -1 and +1.
 Direction of the correlation – the sign of the correlation indicates its direction
o Positive sign (0 to +1) indicates a positive correlation
o Negative sign (-1 to 0) indicates a negative correlation
 Strength of the correlation – the size of correlation indicates its strength
o Correlations closer to |0| are weaker correlations
o Correlations closer to |1| are strong correlations
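
The direction and strength of a correlation can be illustrated by computing Pearson's r by hand. The sketch below uses invented paired scores and hypothetical variable names; it simply shows that the sign of r gives the direction and the absolute value gives the strength.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # How much the two variables vary together
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_variation / (spread_x * spread_y)

def describe(r):
    """Report the direction (sign) and strength (absolute value) of r."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "no"
    return f"r = {r:+.2f}: {direction} correlation, strength |r| = {abs(r):.2f}"

# Hypothetical paired scores, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 60, 61, 70, 75]
hours_gaming = [5, 4, 3, 2, 1]

print(describe(pearson_r(hours_studied, test_score)))
print(describe(pearson_r(hours_studied, hours_gaming)))
```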

Reading correlation tables

 Line up the row and the column of the correlational table to find the correlation.

 What can we infer about these results?

Inferential statistics
• Inferential statistics help researchers determine if an effect is strong enough to
be detected above and beyond the assumed amount of sampling error.

The asterisk as a shorthand for significance


 The asterisk (*) is commonly used as a shorthand for presenting p-values.
 Look at table notes for details, but often the following shorthand is used:
o * p < .05
o ** p < .01
o *** p < .001
o No asterisk = not statistically significant
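
The shorthand above amounts to a small lookup rule. The sketch below encodes the common convention listed here; individual articles may use different cut-offs, so the table notes always take priority. The function name is hypothetical.

```python
def significance_stars(p):
    """Translate a p-value into the common asterisk shorthand."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""   # no asterisk = not statistically significant

# Hypothetical p-values, invented for illustration only
for p in (0.20, 0.04, 0.008, 0.0004):
    print(f"p = {p}: '{significance_stars(p)}'")
```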

The p-values for correlation coefficients are interpreted the same way:
 If the p-value is less than alpha (e.g., p < .05):
o The 95% confidence interval of the effect does not include “0”.
o The researcher can reject the null hypothesis
o The results are “statistically significant”.
o The researcher can take the next steps to explore their hypothesis.
 If the p-value is greater than alpha (e.g., p > .05):
o The 95% confidence interval of the effect does include “0”.
o The researcher cannot reject the null hypothesis
o The results are not “statistically significant”.
o The researcher must conclude their hypothesis is not supported.
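
The same logic can be tried out for a correlation coefficient. The sketch below builds an approximate 95% confidence interval around r using the Fisher z-transformation, which is one standard approximation and not necessarily the method used in the illustrative article. The r value and sample sizes are invented for illustration.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval for Pearson's r (Fisher z-transformation)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z = math.atanh(r)                 # move r onto an approximately normal scale
    se = 1 / math.sqrt(n - 3)         # standard error on the transformed scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical correlation of r = .30 observed in a small and a large sample
for n in (20, 200):
    lower, upper = correlation_ci(0.30, n)
    significant = not (lower <= 0 <= upper)
    print(f"n = {n}: 95% CI = ({lower:.2f}, {upper:.2f}), "
          f"statistically significant: {significant}")
```

The same correlation can be non-significant in a small sample and significant in a large one, because the margin of error shrinks as the sample grows.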

4A.5. Try it! Interpreting correlation coefficients


Use this practice article critique to assess your understanding of the illustrative article by
Mehl et al. (2010).

Lesson 4B. Reading and understanding research articles

4B.1. Effect sizes and interpretation

How big is the effect?


 Effect size: considers the strength of the relationship or effect between two or
more variables.

Effect size | Cohen's d | Correlation r |
Small | 0.20 | .10 |
Medium | 0.50 | .30 |
Large | 0.80 | .50 |
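
The benchmarks in the table can be turned into a rough labeling rule. The sketch below assumes that a value counts as "small", "medium", or "large" once its absolute size reaches the corresponding benchmark; that cut-off rule is an assumption made for illustration, since the benchmarks are guidelines rather than strict boundaries.

```python
# Benchmarks from the table above, listed from largest to smallest
BENCHMARKS = {
    "d": [(0.80, "large"), (0.50, "medium"), (0.20, "small")],
    "r": [(0.50, "large"), (0.30, "medium"), (0.10, "small")],
}

def effect_size_label(value, metric):
    """Label an effect size by the largest benchmark it reaches (sign is ignored)."""
    for cutoff, label in BENCHMARKS[metric]:
        if abs(value) >= cutoff:
            return label
    return "below the small benchmark"

# Example values, used only to show how the lookup behaves
print(effect_size_label(0.28, "r"))
print(effect_size_label(-0.33, "r"))
print(effect_size_label(0.50, "d"))
```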

Are the conclusions valid?


 Do women talk more than men?
o Mehl et al. (2007) found a mean difference of 546 words spoken between
the 186 female and 210 male students, but concluded that “the data fail to
reveal a reliable sex difference in daily word use” (p. 82).

 Is talking linked to greater wellbeing?


o Mehl et al. (2010) found that the correlation between well-being and
substantive talk was r = .28, and the correlation with small talk was
r = -.33. They concluded that “higher well-being was associated with having
less small talk and having more substantive conversation” (p. 539-540).

Do the effects replicate?


 Within any given study it is impossible to know if the results are “true” or due to
Type 1 or Type 2 error.
 Too many factors influence the results:
o The size of the “true” effect, BUT ALSO:
o Sampling error (the sampling method may be biased).
o Low power (low sample size may exaggerate the estimate).
o Measurement error (unreliable or insensitive measurements).
o p-hacking (explorative or biased analysis of the data).
 Replication. When an effect is independently replicated across multiple studies,
one can feel more confident that it is a “true” effect.

Meta-analysis
 A meta-analysis averages the effect size for each study to calculate an
overall effect size.

Calculating the overall effect
 A meta-analysis averages the effect size for each study to calculate an
overall effect size.
 Sometimes different studies will be given different weights in the calculation
based on the sample size, quality of the design, etc.
o Smaller study marker = lower weight
o Larger study marker = higher weight
 The distribution of the effect sizes, along with the overall effect size can be
compared to a line of ‘no effect’ |. This line defines the null hypothesis.
o Values to the right of the line of no effect represent studies with a positive
effect size (+r, +d, etc.)
o Values to the left of the line of no effect represent studies with a negative
effect size (-r, -d, etc.)
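
The weighted averaging described above can be sketched as follows. This is a simplified illustration with invented studies: it uses each study's sample size as its weight, whereas published meta-analyses often use more refined weights (for example, based on the precision of each estimate). The function name is hypothetical.

```python
def weighted_mean_effect(studies):
    """Overall effect size from a list of (effect_size, weight) pairs."""
    total_weight = sum(weight for _, weight in studies)
    return sum(effect * weight for effect, weight in studies) / total_weight

# Hypothetical studies: (effect size d, sample size used as the weight)
studies = [(0.40, 30), (0.10, 300), (-0.05, 120), (0.25, 50)]

overall = weighted_mean_effect(studies)
unweighted = sum(effect for effect, _ in studies) / len(studies)

print(f"weighted overall effect = {overall:.3f}")
print(f"unweighted average      = {unweighted:.3f}")
# Positive values fall to the right of the line of no effect (0), negative to the left
print("right of the line of no effect" if overall > 0 else "left of or on the line")
```

Because the largest invented study found a small effect, the weighted overall effect is pulled closer to zero than the simple average.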

Interpreting Forest Plots

4B.2. Types of peer reviewed articles

Three types of peer reviewed articles

Follow along with the lecture to identify the term that goes with each description.
Then use the 'try it' exercise to test your understanding:

__________________ An article that publishes the purpose, methods, and
results of one or more original research studies.

__________________ An article that summarizes key theoretical trends across
multiple studies on the same topic.

__________________ An article that averages the statistical results of multiple
studies on a topic to get an overall effect.

4B.3. Reading empirical journal articles
 Overview. Within the field of psychology, empirical research reports are typically
prepared in "APA-style". Having a consistent manuscript style for empirical
reports makes it easier for journal editors to quickly adapt a manuscript into an
empirical article that fits the style of that particular journal. In this video, I discuss
the key components of a typical research article within psychology.

 See Chapter XI in your textbook for more information about APA-style


 Section 48. APA style
 Section 49. Writing a research report in APA style

Indexing information
 Authors: Who wrote the article? Which organizations are they affiliated with?
 The authors of a paper are listed in a specific order, usually with the
principal investigator (or lead researcher) listed first and collaborators and
other authors/members of the team listed afterwards.
 Publication year: What year was the article published?
 Article title: The title of the paper communicates the main topic area of the
research.
 Journal: In what journal is the article published?
 Volume number: Most journals will publish multiple editions of the journal
each year. The volume number keeps track of each edition.
 Page numbers: Lists the pages of the journal on which the article
appears.
 Digital object identifier: Each article is assigned a unique alphanumeric
identifier which makes it easier to track each article.

Abstract
 Abstract: concise summary of an article, about 120-150 words long.
 Topic and focus
 Key research methods
 Major results
 Useful for making decisions about which articles to read.

Introduction
 Discusses the theoretical foundation for the research.
 Describes what is currently known about a topic.
 Offers an explanation for the existing evidence.
 Identifies “gaps” in the evidence and makes predictions.
 Discusses how the researcher will test the theory.
 The introduction typically ends with a clear statement of the research question
and/or research hypotheses.
 A hypothesis is a specific, testable statement of the result that the
researcher expects to observe from the data if the proposed theory is true.

Method

 Describes the procedures, participants, design, and variables of a study.
 Discusses what special materials or apparatus were used to conduct the study.

Results
 Reports the results of the study.

Discussion
 Focuses on providing an interpretation of the results, a critical evaluation of the
strengths and weaknesses of the method, and ideas for future research.
 The critical evaluation of the study often discusses the choices and trade-offs
that had to be made between the different types of validity (construct validity,
external validity, internal validity, and statistical validity).

References
 Citing means to indicate the source of information when that information is used
within a paper (i.e., “in-text”).
 References: All of the sources cited in the text must appear in a list of references
at the end of your paper.

Assignments
Knowledge Check
Use the Knowledge Check to assess your understanding of this unit’s content.
 Tip 1: Work towards mastery. Keep re-doing the assignment and learning from
your mistakes until you have earned 100% on the knowledge check and you
feel confident in your understanding of the content.
 Tip 2: If there is content from this unit or questions from the knowledge check
that you do not understand, post your questions on the Q&A discussion board
and/or attend one of our online help tutorials (see Quercus for details).
 Tip 3: Feel free to return to the knowledge checks from earlier units to review key
terms and concepts and keep them fresh in your mind.

Media Assignment: Steps 1 & 2


If you submitted an article for Step 1A (find a media article and submit it for approval),
please wait until the Teaching Assistant approves your article. Once the TA approves
your media article, Step 1B will unlock, allowing you to post the approved headline on
the discussion board and react to the headlines posted by your classmates (see
Quercus for deadlines).
 Step 1B. Post the approved headline on the discussion board.
 Step 2. React to five headlines posted by your classmates.

Library lab
The library lab is designed to teach you how to use PsycINFO to find different types of
research articles (empirical, meta-analysis, review) and then use the indexing
information from these articles to correctly cite and reference those sources. Although
you can complete the Library Lab at any point in the term, you have been provided with
designated time to complete the lab during the fifth week of the course. With that said,
you have until the last day of class to complete the assignment, so please feel free to
complete the assignment any time between now and the last day of class.

Midterm Test 1
Midterm Test 1 will cover the content from Units 1 – 4. Information about the Midterm
Test, including the exact date and time of the test, can be found on Quercus. In that
same folder, you can find access to a practice test and study tips.
