Introduction To SEM Using SAS
Structural Equation
Modeling
Course Notes
An Introduction to Structural Equation Modeling Course Notes was developed by Werner Wothke, Ph.D.,
of the American Institute for Research. Additional contributions were made by Bob Lucas and Paul
Marovich. Editing and production support was provided by the Curriculum Development and Support
Department.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
Book code E70366, course code LWBAWW, prepared date 21-Feb-07 LWBAWW_002
ISBN 978-1-59994-400-5
Table of Contents
1.1 Introduction
1.7 Conclusions
Course Description
This session focuses on structural equation modeling (SEM), a statistical technique that combines
elements of traditional multivariate models, such as regression analysis, factor analysis, and simultaneous
equation modeling. SEM can explicitly account for less than perfect reliability of the observed variables,
providing analyses of attenuation and estimation bias due to measurement error. The SEM approach is
sometimes also called causal modeling because competing models can be postulated about the data and
tested against each other. Many applications of SEM can be found in the social sciences, where
measurement error and uncertain causal conditions are commonly encountered. This presentation
demonstrates the structural equation modeling approach with several sets of empirical textbook data. The
final example demonstrates a more sophisticated re-analysis of one of the earlier data sets.
To learn more…
A full curriculum of general and statistical instructor-based training is available
at any of the Institute’s training facilities. Institute instructors can also provide
on-site training.
For information on other courses in the curriculum, contact the SAS Education
Division at 1-800-333-7660, or send e-mail to training@sas.com. You can also
find this information on the Web at support.sas.com/training/ as well as in the
Training Course Catalog.
For a list of other SAS books that relate to the topics covered in this
Course Notes, USA customers can contact our SAS Publishing Department at
1-800-727-3228 or send e-mail to sasbook@sas.com. Customers outside the
USA, please contact your local SAS office.
Also, see the Publications Catalog on the Web at www.sas.com/pubs for a
complete list of books and a convenient order form.
Chapter 1 Introduction to Structural
Equation Modeling
1.1 Introduction
Course Outline
1. Welcome to the Webcast
2. Structural Equation Modeling—Overview
3. Two Easy Examples
a. Regression Analysis
b. Factor Analysis
4. Confirmatory Models and Assessing Fit
5. More Advanced Examples
a. Structural Equation Model (Incl. Measurement
Model)
b. Effects of Errors-in-Measurement on Regression
6. Conclusion
1.2 Structural Equation Modeling – Overview
SEM—Some Origins
Psychology — Factor Analysis:
Spearman (1904), Thurstone (1935, 1947)
Human Genetics — Regression Analysis:
Galton (1889)
Biology — Path Modeling:
S. Wright (1934)
Economics — Simultaneous Equation Modeling:
Haavelmo (1943), Koopmans (1953), Wold (1954)
Statistics — Method of Maximum Likelihood Estimation:
R.A. Fisher (1921), Lawley (1940)
Synthesis into Modern SEM and Factor Analysis:
Jöreskog (1970), Lawley & Maxwell (1971), Goldberger
& Duncan (1973)
1.3 Example 1: Regression Analysis
[Path diagram: Performance is predicted by ValueOrientation and JobSatisfaction, among others; e is the prediction error.]
LINEQS
<equation>, … , <equation>;
STD
<variance-terms>;
COV
<covariance-terms>;
RUN;
Note: VAR is an optional statement to select and reorder variables from <inputfile>.
LINEQS
<equation>, … , <equation>;
STD
<variance-terms>;
COV
<covariance-terms>;
RUN;
Note: Put all model equations in the LINEQS section, separated by commas.
[Path diagram: Performance is regressed on Knowledge (b1), ValueOrientation (b2), and JobSatisfaction (b3); the prediction error e has variance e_var.]
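As a hedged illustration of how the regression in Example 1 could be written both ways, the sketch below pairs a PROC REG call with a LINEQS specification that reuses the parameter names from the diagram (b1, b2, b3, e_var). The data set name SalesData is a placeholder that does not come from the course, and the COVARIANCE option is an assumption made so that PROC CALIS analyzes covariances rather than correlations.

```sas
/* Sketch only: SalesData is a hypothetical data set name. */

/* Ordinary least squares regression */
proc reg data=SalesData;
   model Performance = Knowledge ValueOrientation JobSatisfaction;
run;
quit;

/* The same regression written as a one-equation LINEQS model */
proc calis data=SalesData covariance;
   lineqs
      Performance = b1 Knowledge
                  + b2 ValueOrientation
                  + b3 JobSatisfaction
                  + e;
   std
      e = e_var;      /* variance of the prediction error */
run;
```

With a single equation and no latent variables, the two procedures should produce comparable regression weights, which is the comparison the example is built around.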
Optimization Results
Iterations 2
...
Max Abs Gradient Element 4.348156E-14
...
GCONV2 convergence criterion satisfied.
Example 1: Summary
Tasks accomplished:
1. Set up a multiple regression model with both
PROC REG and PROC CALIS.
2. Estimated the regression parameters both ways
3. Verified that the results were comparable
4. Inspected iterative model fitting by PROC CALIS
1.4 Example 2: Factor Analysis
[Path diagram: factor analysis of the Holzinger and Swineford (1939) data, Grant-White High School, N=145. Speed is measured by StraightOrCurvedCapitals (e7), Addition (e8), and CountingDots (e9); Verbal is measured by ParagraphComprehension (e4), SentenceCompletion (e5), and WordMeaning (e6).]
LINEQS
VisualPerception = a1 F_Visual + e1,
PaperFormBoard = a2 F_Visual + e2,
FlagsLozenges = a3 F_Visual + e3,
Note: Factor names start with ‘F’ in PROC CALIS.
[Path diagram: the Visual, Speed, and Verbal factors, each with variance 1.0, connected by the correlations phi1, phi2, and phi3.]
STD
F_Visual F_Verbal F_Speed = 1.0 1.0 1.0,
...;
COV
F_Visual F_Verbal F_Speed = phi1 phi2 phi3;
Note: Factor variances are 1.0 for a correlation matrix.
Note: Factor correlation terms go into the COV section.
STD
...,
e1 e2 e3 e4 e5 e6 e7 e8 e9 = e_var1 e_var2 e_var3
e_var4 e_var5 e_var6 e_var7 e_var8 e_var9;
Note: In the STD section, the list of residual terms (e1 to e9) is followed by the list of their variances.
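The LINEQS, STD, and COV fragments of Example 2 can be assembled into one PROC CALIS step, sketched below under stated assumptions. Only a1 to a3, phi1 to phi3, and e_var1 to e_var9 appear in the slides; the loading names a4 to a9, the variable spellings ParagraphComprehension and SentenceCompletion, and the data set name HolzingerGW are placeholders introduced for illustration.

```sas
/* Sketch only. HolzingerGW, a4-a9, and the spellings
   ParagraphComprehension and SentenceCompletion are assumptions. */
proc calis data=HolzingerGW;
   lineqs
      VisualPerception         = a1 F_Visual + e1,
      PaperFormBoard           = a2 F_Visual + e2,
      FlagsLozenges            = a3 F_Visual + e3,
      ParagraphComprehension   = a4 F_Verbal + e4,
      SentenceCompletion       = a5 F_Verbal + e5,
      WordMeaning              = a6 F_Verbal + e6,
      StraightOrCurvedCapitals = a7 F_Speed  + e7,
      Addition                 = a8 F_Speed  + e8,
      CountingDots             = a9 F_Speed  + e9;
   std
      /* factor variances fixed at 1.0 */
      F_Visual F_Verbal F_Speed = 1.0 1.0 1.0,
      /* residual terms, then their variance parameters */
      e1 e2 e3 e4 e5 e6 e7 e8 e9 = e_var1 e_var2 e_var3
                                   e_var4 e_var5 e_var6
                                   e_var7 e_var8 e_var9;
   cov
      /* factor correlations */
      F_Visual F_Verbal F_Speed = phi1 phi2 phi3;
run;
```

This layout gives 9 loadings, 9 residual variances, and 3 factor correlations as free parameters.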
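\[
\chi^2_{ML} = (N - 1)\left[\operatorname{trace}\left(S\hat{\Sigma}^{-1}\right) - p + \ln\left|\hat{\Sigma}\right| - \ln\left|S\right|\right]
\]
Here S is the sample covariance matrix, Σ̂ is the matrix predicted by the model, p is the number of observed variables, and N is the sample size.
Note: The bracketed term is never negative. The term is zero only when the match is exact.
This gives the test statistic for the null hypothesis that the predicted
matrix Σ̂ has the specified model structure against the alternative
that Σ̂ is unconstrained.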
Parameter Estimates 21
Functions (Observations) 45
DF = 45 - 21 = 24
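The degrees-of-freedom count can be reproduced with a short DATA step, sketched here as an illustration rather than as part of the course code. It assumes the 21 parameters break down into 9 loadings, 3 factor correlations, and 9 residual variances, and it assumes that the value 48.05 shown on the accompanying plot is the chi-square statistic of this model.

```sas
data _null_;
   nvars   = 9;                        /* observed variables              */
   moments = nvars * (nvars + 1) / 2;  /* distinct variances/covariances  */
   nparms  = 9 + 3 + 9;                /* loadings + factor correlations
                                          + residual variances            */
   df      = moments - nparms;         /* degrees of freedom              */

   /* Assumption: 48.05 is the model chi-square read off the plot.        */
   p_value = 1 - probchi(48.05, df);   /* upper-tail probability          */

   put moments= nparms= df= p_value=;
run;
```

The first three quantities match the 45, 21, and 24 reported above; a small p_value would indicate that the model does not fit.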
[Plot: horizontal axis from 0 to 50, vertical axis up to 0.05; the value 48.05 is marked.]
[Table: residual covariances, divided by their approximate standard error. The columns that survive extraction include WordMeaning, StraightOrCurvedCapitals, Addition, and CountingDots.]
[Path diagram: Visual, Speed, and Verbal factors. Speed is measured by StraightOrCurvedCapitals (e7), Addition (e8), and CountingDots (e9); Verbal is measured by ParagraphComprehension (e4), SentenceCompletion (e5), and WordMeaning (e6). Factor analysis, N=145, Holzinger and Swineford (1939), Grant-White High School.]
Nested Models
Suppose there are two models for the same data:
A. a base model with q1 free parameters
B. a more general model with the same q1 free
parameters, plus an additional set of q2 free
parameters
Models A and B are considered to be nested. The nesting
relationship is in the parameters—Model A can be
thought to be a more constrained version of Model B.
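One common way to exploit the nesting relationship is the chi-square difference test, sketched here as a general fact about nested models rather than as a quotation from the course. The subscripts A and B refer to the two models defined above.

```latex
\Delta\chi^2 = \chi^2_A - \chi^2_B,
\qquad
\Delta df = df_A - df_B = q_2
```

If the more constrained Model A holds, the difference is approximately chi-square distributed with q2 degrees of freedom, so a large difference suggests that the q2 additional parameters of Model B are needed.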
[Path diagram: standardized estimates and r² values for the fitted three-factor model (Visual, Speed, Verbal) with residuals e4 to e9. The individual values cannot be reliably assigned to paths from the extracted text.]
Note (beginning lost in extraction): ... correlated; the data support the notion of separate abilities.
Example 2: Summary
Tasks accomplished:
1. Set up a theory-driven factor model for nine variables,
in other words, a model containing latent or
unobserved variables
2. Estimated parameters and determined that the first
model did not fit the data
3. Determined the source of the misfit by residual
analysis and modification indices
4. Modified the model accordingly and estimated its
parameters
5. Accepted the fit of the new model and interpreted the
results
1.5 Example 3: Structural Equation Model
Note: The _name_ column has been removed here to save space.
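[Path diagram: latent variables F_Alienation67, F_Alienation71, and F_SES66. The alienation indicators carry residuals e1 to e4; F_SES66 is measured by YearsOfSchool66 (e5) and SocioEcoIndex (e6); d1 and d2 are the disturbances of F_Alienation67 and F_Alienation71.]
Note: SES 66 is a leading indicator.
Note: A disturbance is the prediction error of a latent endogenous variable. Its name must start with the letter ‘d’.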
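[Path diagram: the alienation model with loadings shown. One loading per latent variable is fixed at 1.0; p1 and p2 are the free loadings on F_Alienation67 and F_Alienation71. F_SES66 is measured by YearsOfSchool66 (e5, loading 1.0) and SocioEcoIndex (e6); d1 and d2 are the disturbances.]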
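A model of this shape could be written in LINEQS form roughly as follows. This is a sketch under several assumptions: the data set name Alienation, the loading name s1, the structural weights b1 to b3, and the variance parameter names are hypothetical, and the direction of the structural paths (SES to both alienation factors, plus alienation in 1967 to alienation in 1971) is inferred from the diagrams. Only p1, p2, the fixed 1.0 loadings, and the e and d names come from the slides.

```sas
/* Sketch only: data set name and the parameters s1, b1-b3, v_SES,
   e_var1-e_var6, d_var1, d_var2 are hypothetical. */
proc calis data=Alienation covariance;
   lineqs
      /* measurement model */
      Anomia67           = 1.0 F_Alienation67 + e1,
      Powerlessness67    = p1  F_Alienation67 + e2,
      Anomia71           = 1.0 F_Alienation71 + e3,
      Powerlessness71    = p2  F_Alienation71 + e4,
      YearsOfSchool66    = 1.0 F_SES66        + e5,
      SocioEconomicIndex = s1  F_SES66        + e6,
      /* structural model: regressions among latent variables */
      F_Alienation67     = b1 F_SES66 + d1,
      F_Alienation71     = b2 F_SES66 + b3 F_Alienation67 + d2;
   std
      e1 e2 e3 e4 e5 e6 = e_var1 e_var2 e_var3 e_var4 e_var5 e_var6,
      d1 d2             = d_var1 d_var2,
      F_SES66           = v_SES;
run;
```

Constraining p1 and p2 to share one parameter name would be one way to express a time-invariant measurement model, since identically named parameters are estimated as equal.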
AIC = χ² − 2·df
Consistent Akaike's Information Criterion (CAIC)
This is another criterion, similar to AIC, for selecting the best model
among alternatives. CAIC imposes a stricter penalty on model
complexity when sample sizes are large.
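For orientation, the criteria can be written side by side. The AIC line restates the formula above; the CAIC and SBC lines follow the definitions I understand PROC CALIS to use and should be checked against the procedure documentation. That the third criterion is Schwarz's Bayesian Criterion (SBC) is an assumption, since its slide text did not survive.

```latex
\begin{aligned}
\mathrm{AIC}  &= \chi^2 - 2\,df \\
\mathrm{CAIC} &= \chi^2 - (\ln N + 1)\,df \\
\mathrm{SBC}  &= \chi^2 - \ln N \cdot df
\end{aligned}
```

In each case smaller values are preferred. Because the multiplier of df grows with ln N in CAIC and SBC, each degree of freedom saved by a more parsimonious model counts for more as the sample gets larger.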
Notes:
Each of the three information criteria favors the time-invariant
model.
We would expect this model to replicate or cross-validate well
with new sample data.
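[Path diagram: unstandardized estimates for the alienation model. Values legible in the extracted text: -.58 and -.22 near the F_SES66 paths, and 6.79, 1.00, 5.23, and 2.81 around F_SES66 and its indicators YearsOfSchool66 (e5) and SocioEcoIndex (e6).]
Note: Large positive autoregressive effect of Alienation.
Note: But note the negative regression weights between Alienation and SES!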
[Path diagram: standardized estimates; r² = .71 for YearsOfSchool66 (e5) and r² = .41 for SocioEcoIndex (e6).]
Note: 50% of the variance of Alienation is determined by “history”.
+ 1.0000 d2
Note: Cool, regressions among unobserved variables!
                     Powerlessness71   YearsOfSchool66   SocioEconomicIndex
Anomia67                -0.883389142       1.217289084         -1.113169201
Powerlessness67          0.051352815      -1.270143495          1.143759617
Anomia71                -0.736453922       0.055115253         -1.413361725
Powerlessness71          0.033733409       0.515612093          0.442256742
YearsOfSchool66          0.515612093       0.000000000          0.000000000
SocioEconomicIndex       0.442256742       0.000000000          0.000000000
Example 3: Summary
Tasks accomplished:
1. Set up several competing models for time-dependent
variables, conceptually and with PROC CALIS
2. Models included measurement and structural
components
3. Some models were time-invariant, some had
autocorrelated residuals
4. Models were compared by chi-square statistics and
information criteria
5. Picked a winning model and interpreted the results
1.6 Example 4: Effects of Errors-in-Measurement on Regression
[Path diagram: errors-in-measurement model. F_Knowledge, F_ValueOrientation, F_Satisfaction, and F_Performance are each measured by two split-half indicators (for example, Performance_1 and Performance_2). Loadings are fixed at 1, with 1.2 shown for Satisfaction_2; residual variances are labeled alpha, beta, gamma, and delta.]
Comment:
The model fit is acceptable.
+ 0.0617*F_Satisfaction + 1.0000 d1
Std Err   0.0588 b3
t Value   1.0490
Variable              Parameter   Estimate   Standard Error   t Value
F_Knowledge           v_K          0.03170          0.00801      3.96
F_ValueOrientation    v_VO         0.07740          0.01850      4.18
F_Satisfaction        v_S          0.05850          0.01094      5.35
e1                    alpha        0.00745          0.00107      6.96
e2                    alpha        0.00745          0.00107      6.96
e3                    beta         0.04050          0.00582      6.96
e4                    beta         0.04050          0.00582      6.96
e5                    gamma        0.08755          0.01257      6.96
e6                    gamma        0.08755          0.01257      6.96
e7                    delta        0.03517          0.00505      6.96
e8                    delta        0.03517          0.00505      6.96
d1                    v_d1         0.00565          0.00213      2.66
Note: Hypothetical R-square for 100% reliable variables, up from 0.40.
Example 4: Summary
Tasks accomplished:
1. Set up model to study effect of measurement error in
regression
2. Used split-half version of original variables as parallel
tests
3. Fixed parameters according to measurement model
(just typed in their fixed values)
4. Obtained an acceptable model
5. Found that predictability of JobPerformance could
potentially be as high as R-square=0.67
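The R-square figure for the latent outcome can be related to the disturbance variance d1 with the usual definition of explained variance. The block below is a sketch under that assumption; the total variance of F_Performance is not reported in the surviving output, so the second line only shows what the reported values (v_d1 = 0.00565, R-square = 0.67) would imply.

```latex
\begin{aligned}
R^2 &= 1 - \frac{\operatorname{Var}(d_1)}{\operatorname{Var}(F_{\mathrm{Performance}})} \\
\operatorname{Var}(F_{\mathrm{Performance}}) &\approx \frac{0.00565}{1 - 0.67} \approx 0.017
\end{aligned}
```

Because 0.67 is a rounded value, the implied variance is approximate.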
1.7 Conclusions
Conclusions
Course accomplishments:
1. Introduced Structural Equation Modeling in relation
to regression analysis, factor analysis, simultaneous
equations
2. Showed how to set up Structural Equation Models
with PROC CALIS
3. Discussed model fit by comparing covariance
matrices, and considered chi-square statistics,
information criteria, and residual analysis
4. Demonstrated several different types of modeling
applications
Comments
Several components of the standard SEM curriculum
were omitted due to time constraints:
Model identification
Non-recursive models
Multi-group analyses
Model replication
Power analysis
Current Trends
Current trends in SEM methodology research:
1. Statistical models and methodologies for missing data
2. Combinations of latent trait and latent class
approaches
3. Bayesian models to deal with small sample sizes
4. Non-linear measurement and structural models
(such as IRT)
5. Extensions for non-random sampling, such as
multi-level models