Practical Missing Data Analysis in SPSS (v17 onwards)
Peter T. Donnan
Professor of Epidemiology and Biostatistics
Objectives
How to impute missing values in SPSS, specifically multiple imputation (MI)
How to implement analyses with multiply imputed values
Interpretation of the output
Practical tips
Example data
From a trial of pedometers + advice vs. advice vs. controls in sedentary elderly women
Follow-up at 3 and 6 months
Main outcome measure of activity from accelerometer counts
210 randomised / 170 at 3 months
Example data: Pedometer trial
Read in data: SPSS Study databse.sav
Main outcome is:
3-month activity: AccelVM2
Baseline activity: AccelVM1a
Trial arm represented by two dummy variables:
Grp1 = Pedometer vs. control
Grp2 = Advice vs. control
Main analysis: Pedometer trial
Regression on 3-month activity, adjusting for baseline activity and two dummy variables representing trial arm contrasts
Main analysis: Pedometer trial
Note that n = 170, with 40 missing, in the complete case analysis, so there is potential for bias
Missing at Random (MAR)
Prob(Missing) is:
1) independent of unobserved data, but
2) dependent on observed data
Essentially, the observed data are a random sample of the full data within each stratum
MAR is a weaker assumption than MCAR (missing completely at random)
If MAR is assumed, many methods are available to impute data using the observed data.
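
In symbols, the MAR condition above can be sketched as follows. The notation is an assumption introduced here and does not appear on the slides: R is the missingness indicator, and Y_obs and Y_mis are the observed and unobserved parts of the data. MCAR is the stronger condition in which the right-hand side does not depend on Y_obs either.

```latex
% MAR: missingness depends only on what was observed
\[
  \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = \Pr(R \mid Y_{\mathrm{obs}})
\]
% MCAR: missingness depends on neither part of the data
\[
  \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = \Pr(R)
\]
```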
Comparison of completers at 3 months and drop-outs

                                         Completers        Dropped out at      Chi-squared or
                                         (n = 172)         3 months (n = 32)   t-test p-value
Age, mean (SD)                           77.1 (5.0)        78.5 (5.6)          0.137
Accelerometer VM, mean (SD)              130695 (47991)    113381 (50444)      0.065
Limb function, mean (SD)                 8.69 (2.25)       7.41 (2.86)         0.028
NHS costs previous 3 months, mean (SD)   199.59 (306.74)   404.29 (1289.54)    0.402
Trial arm, N (%)                                                               0.052
  Pedometer group                        58 (85.3%)        10 (14.7%)
  BCI group                              52 (77.6%)        15 (22.4%)
  Control group                          62 (92.5%)        5 (7.5%)
Stairs difficult, N (%)                                                        0.033
  Yes                                    48 (76.2%)        15 (23.8%)
  No                                     124 (87.9%)       17 (12.1%)
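
The chi-squared comparisons in the drop-out table above can be checked by hand. The following is a minimal sketch, assuming a Pearson chi-squared test without continuity correction (the slides do not say which variant was used). The function names are hypothetical, and the counts are the completer and drop-out numbers per trial arm and per stairs category from the comparison.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_p_value(stat, df):
    """Upper-tail p-value, using the closed forms that exist for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or 2")

# Each row is (completers, dropped out).
trial_arm = [[58, 10], [52, 15], [62, 5]]  # Pedometer, BCI, Control
stairs = [[48, 15], [124, 17]]             # Stairs difficult: Yes, No

p_arm = chi2_p_value(*pearson_chi2(trial_arm))
p_stairs = chi2_p_value(*pearson_chi2(stairs))
print(f"trial arm: p = {p_arm:.3f}")
print(f"stairs difficult: p = {p_stairs:.3f}")
```

Under these assumptions the two p-values agree, to three decimals, with the 0.052 and 0.033 reported in the comparison.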
Execution of MI in SPSS
So, assuming MAR, we can use the available data to predict missing values in SPSS:
Analyze > Multiple Imputation > Impute Missing Data Values
Enter ALL variables you think are associated with missingness
Note default number of imputations = 5
Create a new dataset to store results
Note the icon indicating procedures that allow MI analysis
Automatic method lets SPSS choose
Custom gives more flexibility
Can include all 2-way interactions
Linear Regression model prediction
List of variables chosen
Define each variable for imputation, as a predictor, or BOTH
N.B. Recommend including the OUTCOME in both roles (imputed and predictor)
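
The idea of using observed data to predict missing values can be illustrated outside SPSS. The sketch below is a deliberately simplified stochastic regression imputation with a single predictor. It is not the algorithm SPSS runs, and a proper MI procedure would also draw the regression parameters afresh for each imputation. The function names and all data values are hypothetical.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual SD)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, math.sqrt(rss / (len(xs) - 2))

def impute_outcome(baseline, outcome, m=5, seed=2024):
    """Return m completed copies of `outcome`, filling None from `baseline`."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(baseline, outcome) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    datasets = []
    for _ in range(m):
        # Prediction plus random residual noise, so each copy differs.
        completed = [
            y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(baseline, outcome)
        ]
        datasets.append(completed)
    return datasets

# Made-up values for illustration only (not the trial data).
baseline = [100.0, 120.0, 90.0, 150.0, 130.0, 110.0, 95.0, 140.0]
followup = [105.0, 118.0, None, 160.0, 128.0, None, 99.0, 139.0]

imputed = impute_outcome(baseline, followup)
for k, data in enumerate(imputed, start=1):
    print(k, [round(v, 1) for v in data])
```

Observed values are left untouched in every copy. Only the missing entries vary between copies, and that variation is what later carries the uncertainty about the missing data into the pooled results.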
Output of MI in SPSS
Note the main interest is in the outcome VM2, but other factors with missing values are also imputed
Step 2 - Using imputed datasets in analysis
Note the new dataset has the IMPUTATION number as its first column and contains, in order, the original dataset (n = 210) with IMPUTATION = 0 and, concatenated below it, a further 5 new datasets (each n = 210), now with imputed values, with IMPUTATION = 1 to 5
Most analyses can now be implemented if the fossil shell spiral symbol is present
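
The stacked layout described above (original data first, imputed copies below, each labelled by imputation number) can be pictured with a small sketch. The rows, field order and function name are made up for illustration and use only two participants and two imputations.

```python
from collections import defaultdict

# Made-up stacked rows: (imputation number, participant id, 3-month activity).
# None marks a missing value in the original data (imputation 0).
stacked = [
    (0, 1, 105.0), (0, 2, None),
    (1, 1, 105.0), (1, 2, 98.4),
    (2, 1, 105.0), (2, 2, 101.7),
]

def split_by_imputation(rows):
    """Group stacked rows into one dataset per imputation number."""
    datasets = defaultdict(list)
    for imputation, participant, value in rows:
        datasets[imputation].append((participant, value))
    return dict(datasets)

datasets = split_by_imputation(stacked)
original = datasets[0]                                      # imputation 0: original data
completed = {k: v for k, v in datasets.items() if k > 0}    # imputations 1 and above
print(sorted(completed))
```

An analysis run once per completed dataset, and then pooled, is what SPSS automates when it recognises the imputation number column.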
Repeat Main analysis
Need Pooled Results
Procedure exactly the same as before
SPSS will do the pooled analysis if the icon (above) is present in the drop-down menu
Pooled Analysis in SPSS
Results presented for the original data and for each imputed dataset separately
Results of pooled analysis from 5 imputed datasets

Model (pooled)     B        SE      t        Sig.    Fraction missing
Constant           15607    7808    1.999    0.047   0.173
AccelVM1a          0.852    0.051   16.630   0.000   0.124
Pedometer Group    11310    6131    1.845    0.066   0.138
Advice only        17536    6526    2.687    0.009   0.266

Larger effect sizes in both groups
Greater power gives more significance
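
Pooling across imputed datasets is conventionally done with Rubin's rules. The sketch below assumes that convention and is not taken from the slides. The function name and the per-imputation estimates and standard errors are hypothetical, and the variance share it reports is the simple version without the small-sample adjustment that software may apply.

```python
import math
import statistics

def pool_rubin(estimates, standard_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean([se ** 2 for se in standard_errors])  # mean squared SE
    between = statistics.variance(estimates)  # variance of estimates across imputations
    total = within + (1 + 1 / m) * between
    return {
        "estimate": pooled,
        "se": math.sqrt(total),
        "t": pooled / math.sqrt(total),
        # Share of the total variance attributable to the missing data.
        "missing_share": (1 + 1 / m) * between / total,
    }

# Made-up coefficient and SE from each of 5 imputed datasets.
estimates = [10.0, 12.0, 11.0, 13.0, 9.0]
standard_errors = [2.0, 2.0, 2.0, 2.0, 2.0]

result = pool_rubin(estimates, standard_errors)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ between imputations. In the pooled results above, each test statistic is the coefficient divided by its standard error, for example 15607 / 7808 ≈ 1.999 for the constant.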
Interpretation
Compare pooled results with the original as a form of sensitivity analysis
If the results are similar, this suggests the original results are fairly robust
Consider whether MAR is a reasonable assumption
Consider whether you have included all factors (including the outcome) related to the missingness in the imputation model, as this is a crucial assumption
Summary
SPSS now includes multiple imputation in its armoury
Consider the assumptions of MI
Compare results under different assumptions to assess robustness of results
If the MAR assumption is o.k., then MI provides results that are less biased than complete case analysis
