Practical Missing Data Analysis in SPSS
(v17 onwards)
Peter T. Donnan
Professor of Epidemiology and Biostatistics
Objectives
How to impute missing values in SPSS, specifically multiple imputation (MI)
How to implement analyses with multiply imputed values
Interpretation of the output
Practical tips
Example data
From a trial of pedometers + advice vs advice only vs controls in sedentary elderly women
Follow-up at 3 and 6 months
Main outcome measure of activity from accelerometer counts
210 randomised / 170 at 3 months
Comparison of completers at 3 months and drop-outs

      Completers          Dropped out at        Chi-squared or
      (n = 172)           3 months (n = 32)     t-test p-value
      77.1 (5.0)          78.5 (5.6)            0.137
      130695 (47991)      113381 (50444)        0.065
      8.69 (2.25)         7.41 (2.86)           0.028
      199.59 (306.74)     404.29 (1289.54)      0.402
      58 (85.3%)          10 (14.7%)            0.052
      52 (77.6%)          15 (22.4%)
      62 (92.5%)          5 (7.5%)
      48 (76.2%)          15 (23.8%)            0.033
No    124 (87.9%)         17 (12.1%)
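
The two categorical comparisons above (the rows reported as counts and percentages) can be checked by hand with a Pearson chi-squared test. The following is a minimal sketch in Python rather than SPSS, using only the counts from the table. The function names are illustrative, and the p-value helper only covers 1 or 2 degrees of freedom, which is all these two comparisons need. No continuity correction is applied.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

def chi2_p_value(stat, df):
    """Upper-tail p-value; this sketch implements only df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

# Counts as [completers, dropped out], one row per category, taken from the table.
three_level = [[58, 10], [52, 15], [62, 5]]
two_level = [[48, 15], [124, 17]]

for name, table in [("three-level", three_level), ("two-level", two_level)]:
    stat, df = pearson_chi2(table)
    print(name, round(stat, 2), df, round(chi2_p_value(stat, df), 3))
```

Both p-values agree with the table to three decimal places, which supports reading them as uncorrected Pearson chi-squared tests.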
Execution of MI in SPSS
Assuming the data are missing at random (MAR), we can use the available data to predict the missing values
In SPSS: Analyze > Multiple Imputation > Impute Missing Data Values
Execution of MI in SPSS
Enter ALL variables you think are associated with missingness
Note the default number of imputations = 5
Create a new dataset to store the results
Note the icon indicating procedures that allow MI analysis
Execution of MI in SPSS
Automatic method lets SPSS choose
Custom gives more flexibility
Can include all 2-way interactions
Linear regression model prediction
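
To make the idea of regression-based imputation concrete, here is a minimal sketch in Python rather than SPSS. It is a simplified stand-in, not the SPSS algorithm: it fits one linear regression to the complete cases and fills each missing value with the prediction plus random residual noise, repeated m = 5 times. Proper MI also draws the regression parameters afresh for each imputation, which this sketch omits. The data, the function names and the single predictor are all hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual SD)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return a, b, sd

def impute(xs, ys, m=5, seed=1):
    """Return m completed copies of ys; None marks a missing value."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    completed = []
    for _ in range(m):
        # Observed values are kept; missing ones get prediction + random noise.
        completed.append([
            y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)
        ])
    return completed

# Hypothetical data: baseline activity (x) and 3-month activity (y), two y values missing.
baseline = [10.0, 12.0, 9.0, 15.0, 11.0, 14.0, 8.0, 13.0]
followup = [11.0, 13.5, 9.5, None, 12.0, 15.0, None, 13.0]
datasets = impute(baseline, followup)
print(len(datasets), "completed datasets")
```

Each completed dataset has different imputed values in the two missing positions, and that between-dataset variation is what later feeds into the pooled standard errors.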
Execution of MI in SPSS
List of variables chosen
Define each variable as imputed, as a predictor, or BOTH
N.B. Recommend including the OUTCOME as both a predictor and a variable to be imputed
Output of MI in SPSS
Results presented for the original data and for each imputed dataset separately
Pooled
                     B        SE      t        Sig.    Fraction missing
Constant             15607    7808    1.999    0.047   0.173
AccelVM1a            0.852    0.051   16.630   0.000   0.124
Pedometer Group      11310    6131    1.845    0.066   0.138
Advice only          17536    6526    2.687    0.009   0.266

Larger effect sizes in both groups
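
Pooled estimates of this kind are conventionally obtained with Rubin's rules: average the coefficient across the imputed datasets, then combine the within-imputation and between-imputation variances. The sketch below is in Python rather than SPSS and uses made-up numbers, not the trial results. The function name and dictionary keys are illustrative. The quantity `lambda` is the share of total variance due to missingness; the fraction of missing information that SPSS reports is a related quantity and may include a small-sample adjustment, so it need not match exactly.

```python
import math
import statistics

def pool(estimates, std_errors):
    """Combine one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)                    # pooled estimate
    within = statistics.fmean([se ** 2 for se in std_errors])
    between = statistics.variance(estimates)               # sample variance (m - 1)
    total = within + (1 + 1 / m) * between
    return {
        "estimate": q_bar,
        "se": math.sqrt(total),
        "lambda": (1 + 1 / m) * between / total,
    }

# Hypothetical coefficient and standard error from each of 5 imputed datasets.
result = pool([10.0, 12.0, 11.0, 13.0, 9.0], [2.0, 2.0, 2.0, 2.0, 2.0])
print(result)
```

The pooled standard error is larger than any single-dataset standard error here because it also carries the between-imputation variance, i.e. the uncertainty about the missing values themselves.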
Interpretation
Compare the pooled results with the original results as a form of sensitivity analysis
If the results are similar, this suggests the original results are fairly robust
Consider whether MAR is a reasonable assumption
Consider whether you have included all factors (including the outcome) related to the missingness in the imputation model, as this is a crucial assumption
Summary
Consider assumptions of MI