Methods notes

The document outlines various statistical methods and assumptions related to hypothesis testing, including the central limit theorem, confidence intervals, t-tests, ANOVA, and experimental design. It discusses the importance of understanding statistical concepts such as Type 1 and Type 2 errors, effect sizes, and the handling of missing data. Additionally, it highlights the significance of ensuring internal and external validity in research designs and the limitations of correlational research.



Central limit theorem—the distribution of sample means approaches the
normal distribution as sample size grows, so the sample mean can serve
as our estimate of the population mean (which we usually can't observe
directly)

Confidence interval needs sample mean, sd, and sample size. 95% of
the time, the CI of a sample will include the population mean.
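A minimal Python sketch of this calculation, using the large-sample z approximation (the 1.96 critical value and the sample values are illustrative assumptions):

```python
import math
import statistics

def ci_95(sample):
    """95% CI for the mean using the large-sample z approximation.

    Assumes n > 30 so the 1.96 critical value applies; for smaller
    samples, substitute the t critical value for n - 1 degrees of freedom.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample SD (n - 1 denominator)
    margin = 1.96 * s / math.sqrt(n)      # critical value * standard error
    return (mean - margin, mean + margin)

lo, hi = ci_95(list(range(1, 101)))       # sample whose mean is 50.5
```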

Z stat represents the coverage of the confidence interval in terms of
SDs, e.g., 95% ≈ 2 because roughly 2 SD (more precisely 1.96) from the
mean covers 95% of the distribution. The value of the z-score tells you
how many standard deviations you are away from the mean. A positive
z-score indicates the raw score is higher than the mean; for example, a
z-score of +1 is 1 standard deviation above the mean. A negative
z-score indicates the raw score is below the mean. The Z stat is for
more than 30 observations
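The z-score itself is just the raw score's distance from the mean in SD units, sketched here (the values are illustrative):

```python
def z_score(x, mu, sigma):
    """How many population SDs the raw score x lies from the mean mu."""
    return (x - mu) / sigma

z_score(115, 100, 15)   # +1.0: one SD above the mean
z_score(85, 100, 15)    # -1.0: one SD below the mean
```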

Sigma is sd of pop; S is sd of sample

T stat is for smaller samples and is determined by degrees of freedom
(n − 1). It looks like the Z distribution at n = 30; otherwise it is more
kurtotic. The t-value measures the size of the difference relative to the
variation in your sample data. Put another way, T is simply the
calculated difference represented in units of standard error. The greater
the magnitude of T, the greater the evidence against the null hypothesis.
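That definition—the difference expressed in units of standard error—can be sketched directly (the sample values and hypothesized mean are illustrative):

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / standard error of the mean.

    Compare against the t distribution with n - 1 degrees of freedom.
    """
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se

t = one_sample_t([5, 6, 7, 8, 9], mu0=5)   # about 2.83
```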

Question: if we’re using sample mean to calculate z stat, and we’re
using a 95% CI, aren’t we misdefining the estimate of sigma once
every 20 times? And then building everything off that
misrepresentation without ever validating against the population?

Assumptions for one-sample t-test:
⁃ one continuous dependent variable
⁃ Data are not correlated
⁃ No significant outliers (don’t just remove them!)
⁃ Data are normally distributed (not completely necessary if
sample is large—test is robust to violations of normality)

Type 1 error: incorrectly reject null


Type 2 error: incorrectly fail to reject null

Assumptions for Paired-samples t-test


⁃ one continuous dependent variable
⁃ One dv with two related groups
⁃ No significant outliers
⁃ Differences between groups normally distributed

Paired observations = two time points of measurement or two
conditions

Effect size = absolute value of the mean difference divided by the SD.
Use Cohen’s d guide to assess size: small ≈ .2, medium ≈ .5, large ≈ .8
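A sketch of that formula for paired differences, with Cohen's cutoffs (the sample values are illustrative):

```python
import statistics

def cohens_d(diffs):
    """|mean difference| / SD of the differences (paired-samples form)."""
    return abs(statistics.mean(diffs)) / statistics.stdev(diffs)

def size_label(d):
    """Rough label per Cohen's guide: .2 small, .5 medium, .8 large."""
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"
```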

One way ANOVA assumptions

⁃ one continuous DV
⁃ One IV with 2+ categorical groups (levels)
⁃ Independence of observations (no one is in more than one group)
⁃ No significant outliers
⁃ DV is normally distributed (the test is robust, but only if the
non-normality is in the same direction across groups)
⁃ Homogeneity of variance

Question: What situations are best for each common transformation?

An F statistic is a value you get when you run an ANOVA test or
a regression analysis to find out if the means of two or more
populations are significantly different. It’s similar to a T statistic from a
t-test: a t-test will tell you if a single variable is statistically significant,
and an F test will tell you if a group of variables are jointly significant.
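For the one-way case, F is the between-group mean square over the within-group mean square; a from-scratch Python sketch (the group values are illustrative):

```python
import statistics

def one_way_f(groups):
    """F = MS_between / MS_within for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = statistics.mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])   # ≈ 13.0
```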

Levene’s test (should not be significant) tests for homogeneity of
variance.

Question: LSD v Tukey v Bonferroni?

Partial eta sq is basically sample level effect size.

Shapiro-Wilk test (should not be significant) for normality

Two-way ANOVA assumptions


⁃ one continuous dv
⁃ Independence of observations (no one is in more than one group)
⁃ No significant outliers
⁃ Balanced design (equal sample sizes at each level—robust to
this)
⁃ Homogeneity of variance

One-way repeated measures ANOVA assumptions


⁃ one continuous dv
⁃ One IV with three or more levels
⁃ Sphericity
⁃ No significant outliers
⁃ Normal distribution at each level (robust to this)

Mauchly’s test of sphericity (should not be significant)

Multicollinearity is measured by tolerance and VIF (a VIF of 10+ is an issue)
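For the two-predictor case, VIF reduces to 1 / (1 − r²); a sketch with a hand-rolled correlation so it stays self-contained (the predictor values are illustrative):

```python
import math

def correlation(x, y):
    """Pearson r between two predictors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def vif(x1, x2):
    """VIF for either predictor. With more predictors, r**2 becomes the
    R-squared from regressing one predictor on all the others.
    Tolerance is the complement: 1 - r**2."""
    r = correlation(x1, x2)
    return 1 / (1 - r ** 2)

vif([1, 2, 3, 4], [1, 3, 2, 4])   # r = .8, so VIF = 1 / .36 ≈ 2.78
```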


reliability is the ceiling to validity

The logarithm and square root transformations are commonly used for
positive data, and the multiplicative inverse (reciprocal) transformation
can be used for non-zero data. The power transformation is a family of
transformations parameterized by a value λ that includes the
logarithm, square root, and multiplicative inverse as special cases.
If values are naturally restricted to be in the range 0 to 1, not including
the end-points, then a logit transformation may be appropriate: this
yields values in the range (−∞,∞).
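These transformations can be sketched as one-liners, with their domains noted in the comments:

```python
import math

def log_transform(x):
    return math.log(x)            # positive data only

def sqrt_transform(x):
    return math.sqrt(x)           # non-negative data

def reciprocal(x):
    return 1 / x                  # non-zero data

def logit(p):
    return math.log(p / (1 - p))  # data in (0, 1); output in (-inf, inf)
```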

OLS assumptions:
⁃ Weak exogeneity. This essentially means that the predictor
variables x can be treated as fixed values, rather than random
variables. This means, for example, that the predictor variables are
assumed to be error-free—that is, not contaminated with measurement
errors.
⁃ Linearity. This means that the mean of the response variable is
a linear combination of the parameters (regression coefficients) and
the predictor variables.
⁃ Constant variance (a.k.a. homoscedasticity). This means that
the variance of the errors does not depend on the values of the
predictor variables.
⁃ Independence of errors. This assumes that the errors of the
response variables are uncorrelated with each other.
⁃ Lack of perfect multicollinearity in the predictors.
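For the single-predictor case, the OLS coefficients fall out of the normal equations directly; a sketch (the x/y values are illustrative):

```python
def ols_simple(x, y):
    """Least-squares slope and intercept: b = cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

ols_simple([0, 1, 2, 3], [1, 3, 5, 7])   # (2.0, 1.0): y = 2x + 1
```
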
More than 20% of data missing from a case, delete case. If less,
replace with median of similar responses (from that person).

More than 20% of data missing from a variable, delete variable. If less,
replace with median of similar responses (from that case, as above).
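The 20% rule for a single case can be sketched as follows (the threshold parameter and the use of None as the missing marker are illustrative choices):

```python
import statistics

def handle_missing_case(responses, threshold=0.2):
    """Drop the case if too much is missing; otherwise impute the median
    of that person's remaining responses (per the rule above)."""
    missing = sum(1 for r in responses if r is None)
    if missing / len(responses) > threshold:
        return None                        # delete the case
    med = statistics.median(r for r in responses if r is not None)
    return [med if r is None else r for r in responses]

handle_missing_case([1, 2, None, 4, 5, 3, 2, 4, 5, 1])   # imputes 3
```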

Alternatively, impute a “missing value marker” like -99, or any other
marker value extremely unlikely in the data set. (Enter in Variables
view—under Missing, click None, change to Discrete and choose a value.)

Q: doesn’t this regress to the median and potentially conflate items?

The best way to deal with missing values is to avoid them in the first
place. If you can prevent missing values, then you don’t have to handle
them later. Missing values are often a result of:
1. Survey fatigue, which can be prevented by using a shorter survey
and short questions.
2. Question ambiguity, which can be prevented by thoroughly
pretesting your measures.
3. Response reluctance, which can be prevented by careful wording
so that no question feels threatening.
4. Unfinished survey, which cannot be prevented, but you can still
put your most important questions (such as DVs) at the beginning.
A univariate outlier is an unusual, unexpected, or out of scope value for
a single variable (whereas multivariate outliers are unusual or
unexpected cases with regards to the relationship between multiple
variables)

An unengaged respondent is someone who does not pay attention when
they fill out your survey. Such a respondent does harm to your data by
providing erroneous and invalid response values. Each of that person’s
responses can be considered an outlier, because they are all erroneous.
In such a case, you are justified in removing the entire row of data. You
can identify an unengaged respondent using one or more of the
following strategies:
 Failure to notice reverse-coded questions
 Unrealistically short survey duration (e.g., finished 60 questions
in 2 minutes)
 Patterned responses (e.g., 3, 3, 3, 3, 3, 3, 3, 3… or 1, 1, 1, 1, 2, 2,
2, 2, 3, 3, 3, 3)
 Failure to respond correctly to attention traps (e.g., they answer
strongly agree to “A square is a circle”)
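Two of these checks can be automated; a sketch (the 2-minute floor is an illustrative threshold, not a standard):

```python
def flag_unengaged(responses, duration_seconds, min_seconds=120):
    """Flag a respondent for unrealistically short duration or fully
    patterned (zero-variance) responses, e.g. 3, 3, 3, 3, ..."""
    reasons = []
    if duration_seconds < min_seconds:
        reasons.append("too fast")
    if len(set(responses)) == 1:
        reasons.append("straight-lining")
    return reasons

flag_unengaged([3] * 20, duration_seconds=90)   # both flags raised
```
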
Purpose of experimental design:
1. Answer research questions
2. Control variance
 Experimental variance
 Variance of the dependent variable influenced by the
independent variable
 This refers to your IV & this is what we want to maximize (in
general)
 Make the conditions of the IV as different as possible from one
another while still preserving external validity
 Extraneous variance
 Homogenize the extraneous variable
o If you think GMA will be an issue, you can include only
those with the same IQ scores
o Pros and Cons?
o If something does not vary, it cannot impact results
o May not generalize beyond this homogenous group
 Randomize
 Make it an IV or additional condition
 Good with demographics that can’t be randomized
 Allows for tests of interactions
 Note that if it can’t be assigned, then it’s not a causal link and
there could be intervening variables
 Matching
 Pair individuals on the extraneous variable
 Still randomly assign pair to tx or control
 Need to have evidence that the matching variable (e.g.,
personality) has a strong correlation to the DV, otherwise it’s a
waste and/or misleading
 Often lose subjects and this only increases as number of
traits/attributes matched increase
 Error variance

Experimental Designs
 Single group design
o XO
o OXO
o X is an intervention
o O is an observation
o In S&S speak, these are “one-shot case study” and “one-group
pretest-posttest design”
 Single Group Threats
o History
o Maturation
o Testing (pre-post design only)
o Did subjects “learn” how to answer questions
o Instrumentation (pre-post design only)
o Did any change occur during the study in the way the
dependent variable was measured?
o Mortality
o Regression to the mean
 Static Group comparison
o XO
o O
o X is an intervention
o O is an observation
o Largely controls for history, testing, instrumentation, and
regression to the mean
o Other Internal validity threats persist
 Completely randomized design
o R X O vs. R O
o R O X O vs. R O O
o True experimental designs
o X is an intervention
o O is an observation
o R is random assignment
o What can’t be addressed is whether randomization did its job
(top), effect of pretest (bottom), or whether there is an
interaction b/w pretest and IV (bottom)
o Solomon Group Design

o Can examine effect of IV, pretesting alone, interaction b/w
pretest and IV, and randomization (O1 vs. O3)
 Randomized block design
o Certain characteristics can’t be randomized and might be seen
as a nuisance variable (e.g., job type, sex, health)
o Doesn’t immediately randomly assign participants to
experimental conditions
o We first sort them into a characteristic or blocking variable
that we expect to play a role

 Latin square design
o Weird
 Matched pairs design
o Comparing 2 treatments with same experimental units (i.e.,
participants)
o Everybody gets both treatments
o Randomize treatment order & then compare performance on
DV
 Similar experimental units design (matched samples)
o Similar to a block design, but the block is a continuous variable or a set of variables
o Sleep deprivation and test performance
o Can’t have same subject in both sleep deprivation and control group (they would
know what’s on test on second try)
o May match them on some nuisance variable (like in block designs)

Multi-group threats to internal validity


• Selection and all its interactions
– Selection x History
– Selection x Maturation
– Selection x Testing
– Selection x Instrumentation
– Selection x Mortality
– Selection x Regression
• All boils down to whether the groups were equivalent from the start

Social interaction threats to internal validity


• These only apply to multi-group designs
• Trick is to keep the groups separated as much as possible, or you could end up with…
– Diffusion or Imitation of Treatment
– Compensatory rivalry
– Resentful demoralization
– Compensatory equalization (by researcher/3rd party)
• Why would these also be a threat to external validity?
External validity
• Results generalize to and across individuals, settings, and times.
• Divides into
– Population validity
• Does the sample match the population
• The more populations and groups it generalizes to, the greater the
external population validity
– Ecological validity
• degree that a result generalizes across settings (realism)
• Some (e.g., Highhouse) place this slightly outside of EV

Correlational Research: Major Features

• No independent variables are manipulated
• Two or more variables are measured and a relationship established
• Correlational research does not show causality
• Don’t confuse statistics with research design
• Correlation coefficients (a statistic) can be used in correlational or
experimental research designs (although they are more commonly used
in correlational designs)

Directionality of Effect Problem

Third Variable Problem

Simulations
• Robustness of a technique or finding can be tested via simulation
– You create the population parameters
– Sample characteristics (e.g., N)
– Run repeated trials to determine difference between sample statistics and
population parameters
• Monte Carlo studies are dependent on the representativeness of the conditions
modeled
• one potential concern is that “the constructed model may not reflect real-world
conditions.” If true, “even the most elegantly designed study may not be informative if the
conditions included are not relevant to the type of data one typically encounters in
practice” (Bandalos & Gagne, 2012, p. 96).
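The simulation loop described above can be sketched like this (the population parameters, sample size, and trial count are all made-up inputs):

```python
import random
import statistics

def mean_bias(pop_mean=100, pop_sd=15, n=30, trials=2000, seed=1):
    """Set the population parameters, draw repeated samples of size n,
    and measure how far the average sample mean drifts from truth."""
    rng = random.Random(seed)
    sample_means = [
        statistics.mean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.mean(sample_means) - pop_mean

bias = mean_bias()   # should be close to zero
```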
