Methods notes

The document outlines various statistical methods and assumptions related to hypothesis testing, including the central limit theorem, confidence intervals, t-tests, ANOVA, and experimental design. It discusses the importance of understanding statistical concepts such as Type 1 and Type 2 errors, effect sizes, and the handling of missing data. Additionally, it highlights the significance of ensuring internal and external validity in research designs and the limitations of correlational research.



Central limit theorem—the distribution of sample means approaches the
normal distribution as sample size grows, so the sample mean can serve
as our estimate of the population mean (which we usually can't observe
directly)

Confidence interval needs sample mean, sd, and sample size. 95% of
the time, the CI of a sample will include the population mean.
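A minimal Python sketch of this calculation, using the large-sample z approximation (the 1.96 critical value and the sample values are illustrative assumptions):

```python
import math
import statistics

def ci_95(sample):
    """95% CI for the mean using the large-sample z approximation.

    Assumes n > 30 so the 1.96 critical value applies; for smaller
    samples, substitute the t critical value for n - 1 degrees of freedom.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample SD (n - 1 denominator)
    margin = 1.96 * s / math.sqrt(n)      # critical value * standard error
    return (mean - margin, mean + margin)

lo, hi = ci_95(list(range(1, 101)))       # sample whose mean is 50.5
```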

Z stat represents the coverage of the confidence interval in terms of
SDs, e.g., 95% ≈ 2 because roughly 2 SD (more precisely 1.96) from the
mean covers 95% of the distribution. The value of the z-score tells you
how many standard deviations you are away from the mean. A positive
z-score indicates the raw score is higher than the mean; for example, a
z-score of +1 is 1 standard deviation above the mean. A negative
z-score indicates the raw score is below the mean. The Z stat is for
more than 30 observations
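The z-score itself is just the raw score's distance from the mean in SD units, sketched here (the values are illustrative):

```python
def z_score(x, mu, sigma):
    """How many population SDs the raw score x lies from the mean mu."""
    return (x - mu) / sigma

z_score(115, 100, 15)   # +1.0: one SD above the mean
z_score(85, 100, 15)    # -1.0: one SD below the mean
```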

Sigma is sd of pop; S is sd of sample

T stat is for smaller samples and is determined by degrees of freedom
(n − 1). It looks like the Z distribution at n = 30; otherwise it is more
kurtotic. The t-value measures the size of the difference relative to the
variation in your sample data. Put another way, T is simply the
calculated difference represented in units of standard error. The greater
the magnitude of T, the greater the evidence against the null hypothesis.
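That definition—the difference expressed in units of standard error—can be sketched directly (the sample values and hypothesized mean are illustrative):

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / standard error of the mean.

    Compare against the t distribution with n - 1 degrees of freedom.
    """
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se

t = one_sample_t([5, 6, 7, 8, 9], mu0=5)   # about 2.83
```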

Question: if we’re using sample mean to calculate z stat, and we’re
using a 95% CI, aren’t we misdefining the estimate of sigma once
every 20 times? And then building everything off that
misrepresentation without ever validating against the population?

Assumptions for one-sample t-test:
⁃ one continuous dependent variable
⁃ Data are not correlated
⁃ No significant outliers (don’t just remove them!)
⁃ Data are normally distributed (not completely necessary if
sample is large—test is robust to violations of normality)

Type 1 error: incorrectly reject null


Type 2 error: incorrectly fail to reject null

Assumptions for Paired-samples t-test


⁃ one continuous dependent variable
⁃ One dv with two related groups
⁃ No significant outliers
⁃ Differences between groups normally distributed

Paired observations = two time points of measurement or two
conditions

Effect size = absolute value of the mean difference divided by the SD.
Use Cohen’s d guide to assess size: small ≈ .2, medium ≈ .5, large ≈ .8
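A sketch of that formula for paired differences, with Cohen's cutoffs (the sample values are illustrative):

```python
import statistics

def cohens_d(diffs):
    """|mean difference| / SD of the differences (paired-samples form)."""
    return abs(statistics.mean(diffs)) / statistics.stdev(diffs)

def size_label(d):
    """Rough label per Cohen's guide: .2 small, .5 medium, .8 large."""
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"
```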

One way ANOVA assumptions

⁃ one continuous DV
⁃ One IV with 2+ categorical groups (levels)
⁃ Independence of observations (no one is in more than one group)
⁃ No significant outliers
⁃ DV is normally distributed (the test is robust, but only if the
non-normality is in the same direction across groups)
⁃ Homogeneity of variance

Question: What situations are best for each common transformation?

An F statistic is a value you get when you run an ANOVA test or
a regression analysis to find out if the means of two or more
populations are significantly different. It’s similar to a T statistic from a
t-test: a t-test will tell you if a single variable is statistically significant,
and an F test will tell you if a group of variables are jointly significant.
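For the one-way case, F is the between-group mean square over the within-group mean square; a from-scratch Python sketch (the group values are illustrative):

```python
import statistics

def one_way_f(groups):
    """F = MS_between / MS_within for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = statistics.mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])   # ≈ 13.0
```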

Levene’s test (should not be significant) tests for homogeneity of
variance.

Question: LSD v Tukey v Bonferroni?

Partial eta sq is basically sample level effect size.

Shapiro-Wilk test (should not be significant) for normality

Two-way ANOVA assumptions


⁃ one continuous dv
⁃ Independence of observations (no one is in more than one group)
⁃ No significant outliers
⁃ Balanced design (equal sample sizes at each level—robust to
this)
⁃ Homogeneity of variance

One-way repeated measures ANOVA assumptions


⁃ one continuous dv
⁃ One IV with three or more levels
⁃ Sphericity
⁃ No significant outliers
⁃ Normal distribution at each level (robust to this)

Mauchly’s test of sphericity (should not be significant)

Multicollinearity is measured by tolerance and VIF (a VIF of 10+ is an issue)
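For the two-predictor case, VIF reduces to 1 / (1 − r²); a sketch with a hand-rolled correlation so it stays self-contained (the predictor values are illustrative):

```python
import math

def correlation(x, y):
    """Pearson r between two predictors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def vif(x1, x2):
    """VIF for either predictor. With more predictors, r**2 becomes the
    R-squared from regressing one predictor on all the others.
    Tolerance is the complement: 1 - r**2."""
    r = correlation(x1, x2)
    return 1 / (1 - r ** 2)

vif([1, 2, 3, 4], [1, 3, 2, 4])   # r = .8, so VIF = 1 / .36 ≈ 2.78
```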


reliability is the ceiling to validity

The logarithm and square root transformations are commonly used for
positive data, and the multiplicative inverse (reciprocal) transformation
can be used for non-zero data. The power transformation is a family of
transformations parameterized by a value λ that includes the
logarithm, square root, and multiplicative inverse as special cases.
If values are naturally restricted to be in the range 0 to 1, not including
the end-points, then a logit transformation may be appropriate: this
yields values in the range (−∞,∞).
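These transformations can be sketched as one-liners, with their domains noted in the comments:

```python
import math

def log_transform(x):
    return math.log(x)            # positive data only

def sqrt_transform(x):
    return math.sqrt(x)           # non-negative data

def reciprocal(x):
    return 1 / x                  # non-zero data

def logit(p):
    return math.log(p / (1 - p))  # data in (0, 1); output in (-inf, inf)
```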

OLS assumptions:
⁃ Weak exogeneity. This essentially means that the predictor
variables x can be treated as fixed values, rather than random
variables. This means, for example, that the predictor variables are
assumed to be error-free—that is, not contaminated with measurement
errors.
⁃ Linearity. This means that the mean of the response variable is
a linear combination of the parameters (regression coefficients) and
the predictor variables.
⁃ Constant variance (a.k.a. homoscedasticity). This means that
the variance of the errors does not depend on the values of the
predictor variables.
⁃ Independence of errors. This assumes that the errors of the
response variables are uncorrelated with each other.
⁃ Lack of perfect multicollinearity in the predictors.
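For the single-predictor case, the OLS coefficients fall out of the normal equations directly; a sketch (the x/y values are illustrative):

```python
def ols_simple(x, y):
    """Least-squares slope and intercept: b = cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

ols_simple([0, 1, 2, 3], [1, 3, 5, 7])   # (2.0, 1.0): y = 2x + 1
```
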
More than 20% of data missing from a case, delete case. If less,
replace with median of similar responses (from that person).

More than 20% of data missing from a variable, delete variable. If less,
replace with median of similar responses (from that case, as above).
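The 20% rule for a single case can be sketched as follows (the threshold parameter and the use of None as the missing marker are illustrative choices):

```python
import statistics

def handle_missing_case(responses, threshold=0.2):
    """Drop the case if too much is missing; otherwise impute the median
    of that person's remaining responses (per the rule above)."""
    missing = sum(1 for r in responses if r is None)
    if missing / len(responses) > threshold:
        return None                        # delete the case
    med = statistics.median(r for r in responses if r is not None)
    return [med if r is None else r for r in responses]

handle_missing_case([1, 2, None, 4, 5, 3, 2, 4, 5, 1])   # imputes 3
```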

Alternatively, impute a “missing value marker” like -99, or any other
marker value extremely unlikely in the data set. (Enter in Variables
view—under Missing, click None, change to Discrete and choose a value.)

Q: doesn’t this regress to the median and potentially conflate items?

The best way to deal with missing values is to avoid them in the first
place. If you can prevent missing values, then you don’t have to handle
them later. Missing values are often a result of:
1. Survey fatigue, which can be prevented by using a shorter survey
and short questions.
2. Question ambiguity, which can be prevented by thoroughly
pretesting your measures.
3. Response reluctance, which can be prevented by careful wording
so that no question feels threatening.
4. Unfinished survey, which cannot be prevented, but you can still
put your most important questions (such as DVs) at the beginning.
A univariate outlier is an unusual, unexpected, or out of scope value for
a single variable (whereas multivariate outliers are unusual or
unexpected cases with regards to the relationship between multiple
variables)

An unengaged respondent is someone who does not pay attention when
they fill out your survey. Such a respondent does harm to your data by
providing erroneous and invalid response values. Each of that person’s
responses can be considered an outlier, because they are all erroneous.
In such a case, you are justified in removing the entire row of data. You
can identify an unengaged respondent using one or more of the
following strategies:
 Failure to notice reverse-coded questions
 Unrealistically short survey duration (e.g., finished 60 questions
in 2 minutes)
 Patterned responses (e.g., 3, 3, 3, 3, 3, 3, 3, 3… or 1, 1, 1, 1, 2, 2,
2, 2, 3, 3, 3, 3)
 Failure to respond correctly to attention traps (e.g., they answer
strongly agree to “A square is a circle”)
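Two of these checks can be automated; a sketch (the 2-minute floor is an illustrative threshold, not a standard):

```python
def flag_unengaged(responses, duration_seconds, min_seconds=120):
    """Flag a respondent for unrealistically short duration or fully
    patterned (zero-variance) responses, e.g. 3, 3, 3, 3, ..."""
    reasons = []
    if duration_seconds < min_seconds:
        reasons.append("too fast")
    if len(set(responses)) == 1:
        reasons.append("straight-lining")
    return reasons

flag_unengaged([3] * 20, duration_seconds=90)   # both flags raised
```
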
Purpose of experimental design:
1. Answer research questions
2. Control variance
 Experimental variance
 Variance of the dependent variable influenced by the
independent variable
 This refers to your IV & this is what we want to maximize (in
general)
 Make the conditions of the IV as different as possible from one
another while still preserving external validity
 Extraneous variance
 Homogenize the extraneous variable
o If you think GMA will be an issue, you can include only
those with the same IQ scores
o Pros and Cons?
o If something does not vary, it cannot impact results
o May not generalize beyond this homogenous group
 Randomize
 Make it an IV or additional condition
 Good with demographics that can’t be randomized
 Allows for tests of interactions
 Note that if it can’t be assigned, then it’s not a causal link and
there could be intervening variables
 Matching
 Pair individuals on the extraneous variable
 Still randomly assign pair to tx or control
 Need to have evidence that the matching variable (e.g.,
personality) has a strong correlation to the DV, otherwise it’s a
waste and/or misleading
 Often lose subjects and this only increases as number of
traits/attributes matched increase
 Error variance

Experimental Designs
 Single group design
o XO
o OXO
o X is an intervention
o O is an observation
o In S&S speak, these are “one-shot case study” and “one-group
pretest-posttest design”
 Single Group Threats
o History
o Maturation
o Testing (pre-post design only)
o Did subjects “learn” how to answer questions
o Instrumentation (pre-post design only)
o Did any change occur during the study in the way the
dependent variable was measured?
o Mortality
o Regression to the mean
 Static Group comparison
o XO
o O
o X is an intervention
o O is an observation
o Largely controls for history, testing, instrumentation, and
regression to the mean
o Other Internal validity threats persist
 Completely randomized design
o R X O vs. R O
o R O X O vs. R O O
o True experimental designs
o X is an intervention
o O is an observation
o R is random assignment
o What can’t be addressed is whether randomization did its job
(top), effect of pretest (bottom), or whether there is an
interaction b/w pretest and IV (bottom)
o Solomon Group Design

o Can examine effect of IV, pretesting alone, interaction b/w
pretest and IV, and randomization (O1 vs. O3)
 Randomized block design
o Certain characteristics can’t be randomized and might be seen
as a nuisance variable (e.g., job type, sex, health)
o Doesn’t immediately randomly assign participants to
experimental conditions
o We first sort them into a characteristic or blocking variable
that we expect to play a role

 Latin square design
o Weird
 Matched pairs design
o Comparing 2 treatments with same experimental units (i.e.,
participants)
o Everybody gets both treatments
o Randomize treatment order & then compare performance on
DV
 Similar experimental units design (matched samples)
o Similar to a block design, but the block is a continuous variable or a set of variables
o Sleep deprivation and test performance
o Can’t have same subject in both sleep deprivation and control group (they would
know what’s on test on second try)
o May match them on some nuisance variable (like in block designs)

Multi-group threats to internal validity


• Selection and all its interactions
– Selection x History
– Selection x Maturation
– Selection x Testing
– Selection x Instrumentation
– Selection x Mortality
– Selection x Regression
• All boils down to whether the groups were equivalent from the start

Social interaction threats to internal validity


• These only apply to multi-group designs
• Trick is to keep the groups separated as much as possible, or you could end up with…
– Diffusion or Imitation of Treatment
– Compensatory rivalry
– Resentful demoralization
– Compensatory equalization (by researcher/3rd party)
• Why would these also be a threat to external validity?
External validity
• Results generalize to and across individuals, settings, and times.
• Divides into
– Population validity
• Does the sample match the population
• The more populations and groups it generalizes to, the greater the
external population validity
– Ecological validity
• degree that a result generalizes across settings (realism)
• Some (e.g., Highhouse) place this slightly outside of EV

Correlational Research: Major Features

• No independent variables are manipulated
• Two or more variables are measured and a relationship established
• Correlational research does not show causality
• Don’t confuse statistics with research design
• Correlation coefficients (a statistic) can be used in correlational or
experimental research designs (although they are more commonly used
in correlational designs)

Directionality of Effect Problem

Third Variable Problem

Simulations
• Robustness of a technique or finding can be tested via simulation
– You create the population parameters
– Sample characteristics (e.g., N)
– Run repeated trials to determine difference between sample statistics and
population parameters
• Monte Carlo studies are dependent on the representativeness of the conditions
modeled
• one potential concern is that “the constructed model may not reflect real-world
conditions.” If true, “even the most elegantly designed study may not be informative if the
conditions included are not relevant to the type of data one typically encounters in
practice” (Bandalos & Gagne, 2012, p. 96).
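The simulation loop described above can be sketched like this (the population parameters, sample size, and trial count are all made-up inputs):

```python
import random
import statistics

def mean_bias(pop_mean=100, pop_sd=15, n=30, trials=2000, seed=1):
    """Set the population parameters, draw repeated samples of size n,
    and measure how far the average sample mean drifts from truth."""
    rng = random.Random(seed)
    sample_means = [
        statistics.mean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.mean(sample_means) - pop_mean

bias = mean_bias()   # should be close to zero
```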
