Analisis Jalur

MATERI KE-1
PATH ANALYSIS
In such a scenario, the model becomes complex and path analysis comes handy in
such situations. Path analysis is an extension of multiple regression. It allows for the
analysis of more complicated models. It is helpful in examining situations where
there are multiple intermediate dependent variables and in situations where Z is
dependent on variable Y, which in turn is dependent on variable X. It can compare
different models to determine which one best fits the data.
Path analysis was earlier also known as ‘causal modeling’; however, after strong
criticism people refrain from using the term because it’s not possible to establish
causal relationships using statistical techniques. Causal relationships can only be
established through experimental designs. Path analysis can be used to disprove a
model that suggests a causal relationship among variables; however, it cannot be
used to prove that a causal relation exist among variables.
Let’s understand the terminology used in the path analysis. We don’t variables as
independent or dependent here; rather, we call them exogenous or endogenous
variables. Exogenous variables (independent variables in the world of regression)
are variables which have arrows starting from them but none pointing towards them.
Endogenous variables have at least one variable pointing towards them. The reason
for such a nomenclature is that the factors that cause or influence exogenous
variables exist outside the system while, the factors that cause endogenous
variables exist within the system. In the above image, X is an exogenous variable;
while, Y and Z are endogenous variables. A typical path diagram is as shown below.
In the above figure, A, B, C, D and E are exogenous variables; while, I and O are
endogenous variables. ‘d’ is a disturbance term which is analogous to residuals in
regression.
Now, let’s go through the assumptions that we need to consider before we use path
analysis. Since, path analysis is an extension of multiple regression, most of
assumptions of multiple regression hold true for path analysis as well.
1. All the variables should have linear relations among each other.
2. Endogenous variable should be continuous. In case of ordinal data, minimum
number of categories should be five.
3. There should be no interaction among variables. In case of any interaction, a
separate term or variable can be added that reflects the interaction between the
two variables.
4. Disturbance terms are uncorrelated or covariance among the disturbance terms is
zero.
Now, let’s move a step ahead and understand the implementation of path analysis in
R. We will first try out with a toy example and then take a standard dataset available
in R.
install.packages("lavaan")
install.packages("OpenMx")
install.packages("semPlot")
install.packages("GGally")
install.packages("corrplot")
library(lavaan)
library(semPlot)
library(OpenMx)
library(GGally)
library(corrplot)
Now, let’s create our own dataset and try out path analysis. Please note that the rationale for
doing this exercise is to develop intuition to understand path analysis.
For examples:
# Let's create our own dataset and play around that first
set.seed(11)
a = 0.5
b = 5
c = 7
d = 2.5
x1 = rnorm(20, mean = 0, sd = 1)
x2 = rnorm(20, mean = 0, sd = 1)
x3 = runif(20, min = 2, max = 5)
Y = a*x1 + b*x2
Z = c*x3 + d*Y
data1 = cbind(x1, x2, x3, Y, Z)
head(data1, n = 10)
> head(data1, n = 10)

x1 x2 x3 Y Z
[1,] -0.59103110 -0.68251762 2.152597 -3.70810366 5.797922
[2,] 0.02659437 -0.01585819 3.488896 -0.06599378 24.257289
[3,] -1.51655310 -0.44260479 3.524391 -2.97130048 17.242488
[4,] -1.36265335 0.35255750 2.707776 1.08146082 21.658085
[5,] 1.17848916 0.07317058 4.441204 0.95509749 33.476170
[6,] -0.93415132 0.00715880 3.257310 -0.43128166 21.722969
[7,] 1.32360565 -0.18760011 2.574199 -0.27619773 17.328901
[8,] 0.62491779 -0.76570065 3.946699 -3.51604433 18.836781
[9,] -0.04572296 -0.22105682 4.439842 -1.12814558 28.258531
[10,] -1.00412058 -0.98358859 2.676505 -5.42000323 5.185524
Now, we have created this dataset. Let’s see the correlation matrix for these variables. This
will tell us how strongly and which all variables are correlated to each other.
> cor1 = cor(data1)

> corrplot(cor1, method = 'square')
The above chart shows us that Y is very strongly correlate with X2; while, Z is
strongly correlated with X2 and Y. The impact of X1 on Y is not as strong as that of
X2.
model1 = 'Z ~ x1 + x2 + x3 + Y
Y ~ x1 + x2'
fit1 = cfa(model1, data = data1)
summary(fit1, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
> summary(fit1, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)

** WARNING ** lavaan (0.6-1) did NOT converge after 90 iterations
** WARNING ** Estimates below are most likely unreliable
Number of observations 20
Estimator ML
Model Fit Test Statistic NA
Degrees of freedom NA
P-value NA
Parameter Estimates:
Information Expected
Information saturated (h1) model Structured
Standard Errors Standard
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Z ~
x1 0.721 NA 0.721 0.072
x2 0.328 NA 0.328 0.028
x3 1.915 NA 1.915 0.179
Y 1.998 NA 1.998 0.867
Y ~
x1 0.500 NA 0.500 0.115
x2 5.000 NA 5.000 0.968
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Z 14.773 NA 14.773 0.215
.Y 0.000 NA 0.000 0.000
R-Square:
Estimate
Z 0.785
Y 1.000
> semPaths(fit1, 'std', layout = 'circle')

The above plot shows us that Z is strongly dependent on Y and weakly dependent
on X3 and X1. Y is strongly dependent on X2 and weakly dependent on X1. This is
the same intuition that we have built earlier in this article. This is the beauty of path
analysis and this is how analysis can be used.
The values between the lines are path coefficients. Path coefficients are
standardized regression coefficients, similar to beta coefficients of multiple
regression. These path coefficients should be statistically significant, which can be
checked from the summary output (we will see this in the next example).
Let’s move to our second example. In this example, we will use standard dataset
‘mtcars’ available in R.
# Let's take second example where we take standard dataset 'mtcars'
available in R
data2 = mtcars
head(data2, n = 10)
> head(data2, n = 10)

mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
model2 = 'mpg ~ hp + gear + cyl + disp + carb + am + wt

hp ~ cyl + disp + carb'
fit2 = cfa(model2, data = data2)
> summary(fit2)
lavaan (0.6-1) converged normally after 62 iterations
Number of observations 32
Estimator ML
Model Fit Test Statistic 7.901
Degrees of freedom 3
P-value (Chi-square) 0.048
Information saturated (h1) model Structured
Standard Errors Standard
Regressions:
Estimate Std.Err z-value P(>|z|)
mpg ~
hp -0.022 0.016 -1.388 0.165
gear 0.586 1.247 0.470 0.638
cyl -0.848 0.710 -1.194 0.232
disp 0.006 0.012 0.512 0.609
carb -0.472 0.620 -0.761 0.446
am 1.624 1.542 1.053 0.292
wt -2.671 1.267 -2.109 0.035
hp ~
cyl 7.717 6.554 1.177 0.239
disp 0.233 0.087 2.666 0.008
carb 20.273 3.405 5.954 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.mpg 5.011 1.253 4.000 0.000
.hp 644.737 161.184 4.000 0.000
In the above summary output, we can see that wt is a significant variable for mpg at
5 percent level; while, dsp and crb are significant variables for hp. ‘Hp’ itself is not a
significant variable for mpg. We will examine this model using a path diagram using
semPlot package.
> semPaths(fit2, 'std', 'est', curveAdjacent = TRUE, style = "lisrel")
The above plot shows that mpg is strongly dependent on wt; while, hp is strongly
dependent on dsp and crb. There is a weak relation between hp and mpg. Same
inference was derived from the above output.
semPaths function can be used to create above chart in multiple ways. You can go
through the documentation for semPaths and explore different options.
There are few considerations that you should keep in mind while doing path analysis.
Path analysis is very sensitive to omission or addition of variables in the model. Any
omission of relevant variable or addition of extra variable in the model may
significantly impact the results. Also, path analysis is a technique for testing out
models and not building them. If you were to use path analysis in building models,
then you may end with endless combination of different models and choosing the
right model may not be possible. So, path analysis can be used to test a specific
model or compare multiple models to choose the best possible.
There are numerous other ways you can use path analysis. We would love to hear
your experiences of using path analysis in different contexts. Please share your
examples and experiences in the comments section below.
path analysis is structural equation modeling (SEM). There are a few packages to do
SEM in R, like: lavaan, SEM.
a simple example;
x3 affects both x1 and x2 and x2 affects x1
##############R-code##############
library(lavaan)
model1<-'x3 ~ x1 + x2
x2 ~ x1'
fit1<-sem(model1)
#Summary of the fitted model
summary(fit1)
#check the coefficients
coef(fit1)
#and as dataframe
parameterEstimates(fit1)
############end R-code############
for more details and examples on lavaan package
try http://users.ugent.be/~yrosseel/lavaan/lavaanIntroduction.pdf
MATERI KE-2
How to Run Path Analysis with R
For this path analysis practice exercise, I continue to use the election
data I used in the previous post. Instead of using some datasets that I
am not quite familiar with, using my own data really helps make my
learning experience more relatable and personal,
Step 1: Install and load lavaan package

It looks like one R package I can use to perform path analysis is
called lavaan by Yves Rosseel (many thanks!). Before doing anything else,
I have to install and load lavaan package. This step is now kind of
simple. I just type the following.
1 install.packages("lavaan")
2 library(lavaan)
Step 2: Specify a model

To run path analysis with lavaan, I have to next specify a model to
estimate. Path analysis necessarily involves some kind of indirect effects
from X to Y mediated by a third variable Z or Zs.
In this practice exercise, I have 2 mediating variables, party affiliation

and political interest, which carry the prior effects of age, sex, race,
education, and income on support for Trump.
In other words, age, sex, race, education, and income are specified to
predict party affiliation and political interest, respectively. Then, party
affiliation and political interest, respectively, are specified to predict
support for Trump. It might be that people who are older, males,
Caucasians, less educated, and less rich may lean toward Republicans
and have more interest in this election, which in turn predict support for
Donald Trump.
 Disclaimer: the model I outline here is not based on any theory. It’s more
of a post-hoc model. When I first ran path analysis, I included political
ideology in the model as another mediating variable. But, this model did
not fit the data well. For some reason, when I removed political ideology,
the model fit the data well. So, I just decided to use the above model for
pretty much this reason. It’s always good to see good fit indices!
Now, with lavaan, it looks like I have to first store a model in a new
variable, which I label model1. Each mediator and the final outcome
variable are placed on the left-hand side, followed by tilde (~). Then, I
place predictors of each mediator and the outcome variable on the right-
hand side. A model is enveloped with single quotes (‘ ‘). So, I type the
following
1 model1 <- 'party ~ age + sex + race + educ + inco
2 inter ~ age + sex + race + educ + inco
3 suppt ~ party + inter'
Step 3: Run analysis!

Once a model is specified, I estimate the model. Doing so seems pretty
straightforward. I can use the sem() function. And inside the parentheses,
I type model1 – the variable that I stored the model I specified – and
data.
1 results1 <- sem(model1, data=election)
When I run this code, R will store results in another new variable that I
create, results1. To see the results stored in results1, I use the summary ()
function and enter results1 inside the parentheses.
1 summary(results1)
Then, I get the following results.
> summary(results1)
l a v a a n ( 0 . 5 - 2 2 ) c o n v e rged normally after 36

iterations
Used Total
Nu m b e r o f o b s e r v a t i o n s 630 677
Estimator ML
Mi n i m u m F u n c t i o n T e s t Statistic 13.542
De g r e e s o f f r e e d o m 6
P- v a l u e ( C h i- s q u a r e ) 0.035
St a n d a r d E r r o r s S t a n d ard
Regressions:
Es t i m a t e S t d . E r r z- v alue P(>|z|)
pa r t y ~
ag e 0 . 0 5 6 0 . 0 4 7 1 . 1 8 5 0.236
se x 0 . 2 3 9 0 . 1 6 0 1 . 4 9 1 0.136
ra c e - 1 . 1 8 8 0 . 1 8 5 - 6 .408 0.000
ed u c 0 . 1 4 5 0 . 0 5 1 2 . 8 1 9 0.005
in c o - 0 . 1 2 5 0 . 0 3 9 - 3 .228 0.001
in t e r ~
ag e 0 . 1 8 1 0 . 0 4 4 4 . 1 4 4 0.000
se x - 0. 1 8 5 0 . 1 4 8 - 1 . 248 0.212
ra c e - 0 . 0 3 4 0 . 1 7 1 - 0 .198 0.843
ed u c 0 . 0 1 8 0 . 0 4 7 0 . 3 8 7 0.699
in c o 0 . 1 0 4 0 . 0 3 6 2 . 9 0 0 0.004
su p p t ~
pa r t y -0 . 5 6 7 0 . 0 2 2 - 25.368 0.000
in t e r 0 . 1 4 5 0 . 0 2 5 5 . 8 78 0.000
Variances:
Es t i m a t e S t d . E r r z- v alue P(>|z|)
.p a r t y 3 . 4 7 9 0 . 1 9 6 1 7 .748 0.000
.i n t e r 2 . 9 7 3 0 . 1 6 8 1 7 .748 0.000
.s u p p t 1 . 1 9 9 0 . 0 6 8 1 7 .748 0.000
These results indicate that respondents who were Caucasians and who
had higher income were stronger Republicans. In contrast, those who
had higher education were stronger Democrats.
Age and income had a positive relationship with political interest. Older
and richer respondents showed higher levels of interest in politics. Then,
stronger Republicans and more politically interested individuals more
strongly supported Trump.
Before moving forward!

Before moving forward, I need to consider a few things.
1. The coefficients I get are all unstandardized. If I want to see the
relative importance of each predictor and directly compare which
variable plays a bigger or smaller role in this model, maybe I need
standardized coefficients.
2. Maybe, I need to know how much variance in support for Trump this
model accounts for.
3. Looking at the model fit index – Minimum Function Test Statistic

(namely chi-square statistics) and its corresponding p-value – it’s
significant. This means the model was actually significantly different from
the data. Or, the model did not fit the data well. Maybe, I need to modify
the model a little bit – free up some parameters to improve the fit. How
do I do that? With lavaan, I can request modification indices.
4. Finally, the above result only shows one model fit index: Minimum
Function Test Statistic (chi-square). But, chi-square tends to be sensitive
to sample size. When data have large sample size, chi-square tends to
be significant (indicating that the model is significantly different from
data, instead of approximating data). So, I may end up making an
erroneous conclusion. I need to see other model fit indices.
What if I need these statistics? I learned I can add a few things in

the summary () function.
1 summary(results1, standardized=TRUE, fit.measures=TRUE, rsq=TRUE, modindices=TRUE)
standardized=TRUE means I am requesting standardized coefficients.
 fit.measures=TRUE means I am requesting other model fit indices.
 modindices=TRUE means I am requesting modification indices
Then, I get the following results!
> s u m m a r y ( r e s u l t s 1 , s t andardized=TRUE,
f i t . m e a s u r e s = T R U E , r s q =TRUE, modindices=TRUE)

iterations
Used Total
Estimator ML
M o d e l t e s t b a s e l i n e m o del:
M i n i m u m F u n c t i o n T e s t Statistic 566.979
De g r e e s o f f r e e d o m 1 8
P- v a l u e 0 . 0 0 0
U s e r m o d e l v e r s u s b a s e line model:
C o m p a r a t i v e F i t I n d e x (CFI) 0.986
Tu c k e r -L e w i s I n d e x ( TLI) 0.959
L o g l i k e l i h o o d a n d I n f o rmation Criteria:
L o g l i k e l i h o o d u s e r m o d el (H0) -7954.627
Lo g l i k e l i h o o d u n r e s t r icted model (H1) - 7947.856
N u m b e r o f f r e e p a r a m e t ers 15
Ak a i k e ( A I C ) 1 5 9 3 9 . 2 5 4
Ba y e s i a n ( B I C ) 1 6 0 0 5 . 940
Sa m p l e -s i z e a d j u s t e d Bayesian (BIC) 15958.316
R o o t M e a n S q u a r e E r r o r of Approximation:
RMSEA 0.045
90 P e r c e n t C o n f i d e n c e Interval 0.011 0.077
P- v a l u e R M S E A < = 0 . 0 5 0.558
S t a n d a r d i z e d R o o t M e a n Square Residual:
SRMR 0.019
Regressions:
Es t i m a t e S t d . E r r z- v alue P(>|z|) Std.lv Std.all
pa r t y ~
ag e 0 . 0 5 6 0 . 0 4 7 1 . 1 8 5 0.236 0.056 0.047
se x 0 . 2 3 9 0 . 1 6 0 1 . 4 9 1 0.136 0.239 0.061
ra c e - 1 . 1 8 8 0 . 1 8 5 - 6 .408 0.000 -1.188 - 0.270
ed u c 0 . 1 4 5 0 . 0 5 1 2 . 8 1 9 0.005 0.145 0.122
in c o - 0 . 1 2 5 0 . 0 3 9 - 3 .228 0.001 -0.125 - 0.138
in t e r ~
ag e 0 . 1 8 1 0 . 0 4 4 4 . 1 4 4 0.000 0.181 0.167
se x - 0. 1 8 5 0 . 1 4 8 - 1 . 248 0.212 - 0.185 - 0.052
ra c e - 0 . 0 3 4 0 . 1 7 1 - 0 .198 0.843 -0.034 - 0.009
ed u c 0 . 0 1 8 0 . 0 4 7 0 . 3 8 7 0.699 0.018 0.017
in c o 0 . 1 0 4 0 . 0 3 6 2 . 9 0 0 0.004 0.104 0.126
su p p t ~
pa r t y -0 . 5 6 7 0 . 0 2 2 - 25.368 0.000 -0.567 - 0.701
in t e r 0 . 1 4 5 0 . 0 2 5 5 . 8 7 8 0.000 0.145 0.162

Variances:
.p a r t y 3 . 4 7 9 0 . 1 9 6 1 7 .748 0.000 3.479 0.914
.i n t e r 2 . 9 7 3 0 . 1 6 8 1 7 .748 0.000 2.973 0.948
.s u p p t 1 . 1 9 9 0 . 0 6 8 1 7 .748 0.000 1.199 0.481
R -S q u a r e :
Es t i m a t e
pa r t y 0 . 0 8 6
in t e r 0 . 0 5 2
su p p t 0 . 5 1 9
Modification Indices:
l h s o p r h s m i e p c s e p c .lv sepc.all sepc.nox
1 6 a g e ~ ~ a g e 0 . 0 0 0 0 . 000 0.000 0.000 0.000
1 7 a g e ~ ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000
1 8 a g e ~ ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000
1 9 a g e ~ ~ e d u c 0 . 0 0 0 0 .000 0. 000 0.000 0.000

2 0 a g e ~ ~ i n c o 0 . 0 0 0 0 .000 0.000 0.000 0.000
2 1 s e x ~ ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000
2 2 s e x ~ ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000
2 3 s e x ~ ~ e d u c 0 . 0 0 0 0 .000 0.000 0.000 0.000
2 4 s e x ~ ~ i n c o 0 . 0 0 0 0 .000 0.000 0.000 0.000
2 5 r a c e ~ ~ r a c e 0 . 0 0 0 0.000 0.000 0.000 0.000
2 6 r a c e ~ ~ e d u c 0 . 0 0 0 0.000 0.000 0.000 0.000
2 7 r a c e ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000
2 8 e d u c ~ ~ e d u c 0 . 0 0 0 0.000 0.000 0.000 0.000
2 9 e d u c ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000
3 0 i n c o ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000
3 1 p a r t y ~ ~ i n t e r 0 . 1 0 0 -0.041 - 0.041 - 0.012 -0.012
3 2 p a r t y ~ ~ s u p p t 3 . 2 1 5 0.498 0.498 0.162 0.162
3 3 i n t e r ~ ~ s u p p t 4 . 5 2 4 0.699 0.699 0.250 0.250
3 4 p a r t y ~ i n t e r 0 . 1 0 0 - 0.014 -0.014 -0.012 -0.012
3 5 p a r t y ~ s u p p t 1 . 4 9 0 0.223 0.223 0.181 0.181
3 6 i n t e r ~ p a r t y 0 . 1 0 0 - 0.012 -0.012 -0.013 -0.013
3 7 i n t e r ~ s u p p t 0 . 6 3 7 0.050 0.050 0.045 0.045
3 8 s u p p t ~ a g e 1 . 0 4 9 - 0.028 - 0.028 -0.029 -0.018
3 9 s u p p t ~ s e x 0 . 0 5 6 - 0.021 - 0.021 -0.007 -0.013

4 0 s u p p t ~ r a c e 1 . 8 2 2 0.137 0.137 0.038 0.087
4 1 s u p p t ~ e d u c 9 . 2 8 3 -0.082 -0.082 - 0.085 - 0.052
4 2 s u p p t ~ i n c o 4 . 2 2 6 -0.042 -0.042 - 0.058 - 0.027
4 3 a g e ~ p a r t y 0 . 0 0 0 0 .000 0.000 0.000 0.000
4 4 a g e ~ i n t e r 0 . 0 0 0 0 .000 0.000 0.000 0.000
4 5 a g e ~ s u p p t 0 . 8 3 4 - 0.051 - 0.051 -0.049 -0.049
4 6 a g e ~ s e x 0 . 0 0 0 0 . 0 00 0.000 0.0 00 0.000
4 7 a g e ~ r a c e 0 . 0 0 0 0 . 000 0.000 0.000 0.000
4 8 a g e ~ e d u c 0 . 0 0 0 0 . 000 0.000 0.000 0.000
4 9 a g e ~ i n c o 0 . 0 0 0 0 . 000 0.000 0.000 0.000
5 0 s e x ~ p a r t y 0 . 0 0 0 0 .000 0.000 0.000 0.000
5 1 s e x ~ i n t e r 0 . 0 0 0 0 .000 0.000 0.000 0.000
5 2 s e x ~ s u p p t 0 . 8 5 5 - 0.015 - 0.015 -0.047 -0.047
5 3 s e x ~ a g e 0 . 0 0 0 0 . 0 00 0.000 0.000 0.000
5 4 s e x ~ r a c e 0 . 0 0 0 0 . 000 0.000 0.000 0.000
5 5 s e x ~ e d u c 0 . 0 0 0 0 . 000 0.000 0.000 0.000
5 6 s e x ~ i n c o 0 . 0 0 0 0 . 000 0.000 0.000 0.000
5 7 r a c e ~ p a r t y 0 . 0 0 0 0.000 0.000 0.000 0.000
5 8 r a c e ~ i n t e r 0 . 0 0 0 0.000 0.000 0.000 0.000
5 9 r a c e ~ s u p p t 2 . 2 8 7 0.021 0.021 0.075 0.075

6 0 r a c e ~ a g e 0 . 0 0 0 0 . 000 0.000 0.000 0.000
6 1 r a c e ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000
6 2 r a c e ~ e d u c 0 . 0 0 0 0 .000 0.000 0.000 0.000
6 3 r a c e ~ i n c o 0 . 0 0 0 0 .000 0.000 0.000 0.000
6 4 e d u c ~ p a r t y 0 . 0 0 0 0.000 0.000 0.000 0.000
6 5 e d u c ~ i n t e r 0 . 0 0 0 0.000 0.000 0.000 0.000
6 6 e d u c ~ s u p p t 4 . 2 9 7 -0.105 -0.105 - 0.101 - 0.101
6 7 e d u c ~ a g e 0 . 0 0 0 0 . 000 0.000 0.000 0.000
6 8 e d u c ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000
6 9 e d u c ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000
7 0 e d u c ~ i n c o 0 . 0 0 0 0 .000 0.000 0.000 0.000
7 1 i n c o ~ p a r t y 0 . 0 0 0 0.000 0.000 0.000 0.000
7 2 i n c o ~ i n t e r 0 . 0 0 0 0.000 0.000 0.000 0.000
7 3 i n c o ~ s u p p t 0 . 6 0 7 -0.052 -0.052 - 0.038 - 0.038
7 4 i n c o ~ a g e 0 . 0 0 0 0 . 000 0.000 0.000 0 .000
7 5 i n c o ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000
7 6 i n c o ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000
7 7 i n c o ~ e d u c 0 . 0 0 0 0 .000 0.000 0.000 0.000
Looking at the fit indices, the model seems pretty good.

Step 4: Test indirect effects
When I see the results under “Regressions,” I just see path coefficients. I
do not see specific significance tests for the indirect effects of age, sex,
race, education, and income on support for Trump through party
affiliation and political interest.
To test the indirect effects with lavaan, apparently I need to give labels
to each parameter and use those labels in a model syntax. Then, I use
the “:=” operator to define new parameters. So, I type the following.
1
2 model2 <- 'party ~ a1*age + a2*sex + a3*race + a4*educ + a5*inco
3 inter ~ a6*age + a7*sex + a8*race + a9*educ + a10*inco
4 suppc ~ b1*party + b2*inter + c1*age + c2*sex + c3*race + c4*educ + c5*inco
a1b1 := a1*b1
5 a2b1 := a2*b1
6 a3b1 := a3*b1
7 a4b1 := a4*b1
8 a5b1 := a5*b1
9 a6b2 := a6*b2
a7b2 := a7*b2
10 a8b2 := a8*b2
11 a9b2 := a9*b2
12 a10b2 := a10*b2
13 total := c1 + c2 + c3 + c4+ c5 + (a1*b1) + (a2*b1) + (a3*b1) + (a4*b1) + (a5*b1) +
14
What the above code means is that I am assigning a label to each

parameter to estimate. I use a for paths from exogeneous variables to
mediators, use b for paths from mediators to the outcome variable,
and c for direct paths from exogeneous variables to the outcome
variable.
For example, the path from age to party gets a1. With *, this path is
labeled a1*age. Apparently, I need to use this asterisk (*) between a1
and the variable name. I put this label on the right-hand side of one line. On the left-hand
side of the same line, I type a1b1 (indirect effect of age on support for Trump through party
affiliation) – this is just a descriptive label that I give for this path – followed by the := operator.
The path from sex to party is now labeled a2*sex. Again, that is placed on the right-hand side of one
line. On the left-hand side of the same line, I type a2b1, followed by the := operator.
The path from party to suppt (support for Trump) gets b1*party. The
path from inter to suppt gets b2*inter. The direct path from age to suppt
gets c1*age, and so forth. And, everything is enveloped with single
quotes (‘ ‘). Hope this is all correct!
I run this code and use the sem() function next.

1 results2 <- sem(model2, data=election)
Then see results stored in results2 – I remove modindices=TRUE,
because it makes the output a bit too lengthy.
1 summary(results2, standardized=TRUE, fit.measures=TRUE, rsq=TRUE)
Now, I get the following results.
> r e s u l t s 2 < - s e m ( m o d e l2, data=election)
> s u m m a r y ( r e s u l t s 2 , s t andardized=TRUE,
f i t . m e a s u r e s = T R U E , r s q =TRUE)

iterations
Us e d T o t a l
Es t i m a t o r M L
M o d e l t e s t b a s e l i n e m o del:
De g r e e s o f f r e e d o m 1 8
P- v a l u e 0 . 0 0 0
U s e r m o d e l v e r s u s b a s e line model:
Co m p a r a t i v e F i t I n d e x (CFI) 1.000
Tu c k e r -L e w i s I n d e x ( TLI) 1.029
L o g l i k e l i h o o d a n d I n f o rmation Criteria:
Lo g l i k e l i h o o d u s e r m o del (H0) - 7941.289
Lo g l i k e l i h o o d u n r e s t r icted model (H1) - 7941.239
Nu m b e r o f f r e e p a r a m e ters 20
Ak a i k e ( A I C ) 1 5 9 2 2 . 5 7 8
Ba y e s i a n ( B I C ) 1 6 0 1 1 . 492
Sa m p l e -s i z e a d j u s t e d Bayesian (BIC) 15947.994
R o o t M e a n S q u a r e E r r o r of Approximation:
RM S E A 0 . 0 0 0
90 P e r c e n t C o n f i d e n c e Interval 0.000 0.073
P- v a l u e R M S E A < = 0 . 0 5 0.884
S t a n d a r d i z e d R o o t M e a n Square Residual:
SR M R 0 . 0 0 2
In f o r m a t i o n E x p e c t e d
Regressions:
pa r t y ~
ag e ( a 1 ) 0 . 0 5 6 0 . 0 4 7 1.185 0.236 0.056 0.047
se x ( a 2 ) 0 . 2 3 9 0 . 1 6 0 1.491 0.136 0.239 0.061
ra c e ( a 3 ) - 1 . 1 8 8 0 . 1 85 -6.408 0.000 -1.188 -0.270

ed u c ( a 4 ) 0 . 1 4 5 0 . 0 5 1 2.819 0.005 0.145 0.122
in c o ( a 5 ) - 0 . 1 2 5 0 . 0 39 -3.228 0.001 -0.125 -0.138
in t e r ~
ag e (a 6 ) 0 . 1 8 1 0 . 0 4 4 4.144 0.000 0.181 0.167
se x ( a 7 ) - 0 . 1 8 5 0 . 1 4 8 - 1.248 0.212 - 0.185 - 0.052
ra c e ( a 8 ) - 0 . 0 3 4 0 . 1 71 -0.198 0.843 -0.034 -0.009
ed u c ( a 9 ) 0 . 0 1 8 0 . 0 4 7 0.387 0.699 0.018 0.017
in c o ( a 1 0 ) 0 . 1 0 4 0 . 0 3 6 2.900 0.004 0.104 0.126
su p p c ~
pa r t y ( b 1 ) 0 . 5 7 1 0 . 0 2 3 24.937 0.000 0.571 0.706
in t e r ( b 2 ) 0 . 0 7 8 0 . 0 2 5 3.155 0.002 0.078 0.088
ag e ( c 1 ) - 0 . 0 3 4 0 . 0 2 7 - 1.236 0.217 - 0.034 - 0.035
se x ( c 2 ) 0 . 1 1 3 0 . 0 9 2 1.226 0.220 0.113 0.036
ra c e ( c 3 ) - 0 . 2 6 8 0 . 1 10 -2.436 0.015 -0.268 -0.075
ed u c ( c 4 ) 0 . 0 2 1 0 . 0 3 0 0.714 0.475 0.021 0.022
in c o ( c 5 ) 0 . 0 0 4 0 . 0 2 3 0.180 0.857 0.004 0.006
Variances:
.p a r t y 3 . 4 7 9 0 . 1 9 6 1 7 .748 0.000 3.479 0.914

.i n t e r 2 . 9 7 3 0 . 1 6 8 1 7 .748 0.000 2.973 0.948
.s u p p c 1 . 15 0 0 . 0 6 5 1 7.748 0.000 1.150 0.461
R -S q u a r e :
Es t i m a t e
pa r t y 0 . 0 8 6
in t e r 0 . 0 5 2
su p p c 0 . 5 3 9
Defined Parameters:
a1 b 1 0 . 0 3 2 0 . 0 2 7 1 . 1 8 4 0.236 0.032 0.033
a2 b 1 0 . 1 3 6 0 . 0 9 2 1 . 4 8 9 0.137 0.136 0.043
a3 b 1 - 0 . 6 7 8 0 . 1 0 9 - 6 .207 0.000 -0.678 - 0.190
a4 b 1 0 . 0 8 3 0 . 0 2 9 2 . 8 0 2 0.005 0.083 0.086
a5 b 1 - 0 . 0 7 2 0 . 0 2 2 - 3 .201 0.001 -0.072 - 0.098
a6 b 2 0 . 0 1 4 0 . 0 0 6 2 . 5 1 1 0.012 0.014 0.015
a7 b 2 - 0 . 0 1 4 0 . 0 1 2 - 1 .161 0.246 -0.014 - 0.005
a8 b 2 - 0 . 0 0 3 0 . 0 1 3 - 0 .198 0.843 -0 .003 - 0.001
a9 b 2 0 . 0 0 1 0 . 0 0 4 0 . 3 8 4 0.701 0.001 0.001

a1 0 b 2 0 . 0 0 8 0 . 0 0 4 2 . 1 35 0.033 0.008 0.011
to t a l -0 . 6 5 6 0 . 1 6 5 - 3.982 0.000 - 0.656 -0.151
>
When I see the results at the very bottom of the output, I see statistical
significance of each indirect effect specified. For example, the indirect
effect of age on support for Trump through party affiliation (a1b1) is not
significaint (p = .236). However, education has an indirect effect on
support for Trump through party affiliation (b = .083, p = .005).
Generate confidence intervals (CIs)

It seems the summary() function can provide basic information. But, if I
want more detailed information, then I can use the parameterEstimates()
function. Based on the lavaan package page, I type the following.
1 parameterEstiamtes(results2, ci = TRUE, level = 0.95, boot.ci.type = "perc", stand
With the above code, I am requesting 95% percentile bootstrap confidence

intervals, as well as standardized coefficients. Then, I get the following. I
think lhs in the header refers to left-hand side column, and rhs right-hand
side column. Op refers to operator
> p a r a m e t e r E s t i m a t e s ( r esults2, ci=TRUE, level=0.95,

b o o t . c i . t y p e = " p e r c " , s tandardized=TRUE)
lh s o p r h s l a b e l e s t se z pvalue ci.lower ci.upper

std.lv std.all std.nox
1 p a r t y ~ a g e a 1 0 . 0 5 6 0.047 1.185 0.236 - 0.037 0.148

0.056 0.047 0.029
2 p a r t y ~ s e x a 2 0 . 2 3 9 0.160 1.491 0.136 - 0.075 0.552

0.239 0.061 0.122
3 p a r t y ~ r a c e a 3 - 1 . 188 0.185 - 6.408 0.000 -1.551 -
0 . 8 2 5 - 1 . 1 8 8 - 0 . 2 7 0 - 0.609
4 p a r t y ~ e d u c a 4 0 . 1 4 5 0.051 2.819 0.005 0.044 0.245

0.145 0.122 0.074
5 p a r t y ~ i n c o a 5 - 0 . 125 0.039 - 3.228 0.001 -0.202 -

0 . 0 4 9 - 0 . 1 2 5 - 0 . 1 3 8 - 0.064
6 i n t e r ~ a g e a 6 0 . 1 8 1 0.044 4.144 0.000 0.095 0.266

0.181 0.167 0.102
7 i n t e r ~ s e x a 7 -0 . 1 85 0.148 -1.248 0.212 - 0.475

0 . 1 0 5 - 0 . 1 8 5 - 0 . 0 5 2 - 0.104
8 i n t e r ~ r a c e a 8 - 0 . 034 0.171 - 0.198 0.843 -0.370

0 . 3 0 2 - 0 . 0 3 4 - 0 . 0 0 9 - 0.019
9 i n t e r ~ e d u c a 9 0 . 0 1 8 0.047 0.387 0.699 -0.075

0.111 0.018 0.017 0.010
1 0 i n t e r ~ i n c o a 1 0 0 . 104 0.036 2.900 0.004 0.034

0.175 0.104 0.126 0.059
1 1 s u p p c ~ p a r t y b 1 0 . 571 0.023 24.937 0.000 0.526

0.616 0.571 0.706 0.706
1 2 s u p p c ~ i n t e r b 2 0 . 078 0.025 3.155 0.002 0.030

0.127 0.078 0.088 0.088
1 3 s u p p c ~ a g e c 1 - 0 . 034 0.027 - 1.236 0.217 -0.088

0 . 0 2 0 - 0 . 0 3 4 - 0 . 0 3 5 - 0.022
1 4 s u p p c ~ s e x c 2 0 . 1 1 3 0.092 1.226 0.220 -0.068

0.294 0.113 0.036 0.072
1 5 s u p p c ~ r a c e c 3 - 0 .268 0.110 -2.436 0.015 -0.483 -

0 . 0 5 2 - 0 . 2 6 8 - 0 . 0 7 5 - 0.170
1 6 s u p p c ~ e d u c c 4 0 . 0 21 0.030 0.714 0.475 - 0.037

0.079 0.021 0.022 0.013
1 7 s u p p c ~ i n c o c 5 0 . 0 04 0.023 0.180 0.857 - 0.040
0.049 0.004 0.006 0.003
1 8 p a r t y ~ ~ p a r t y 3 . 4 7 9 0.196 17.748 0.000 3.095

3 . 8 6 4 3 . 4 7 9 0 . 9 1 4 0 . 9 14
1 9 i n t e r ~ ~ i n t e r 2 . 9 7 3 0.168 17.748 0.000 2.645

3.302 2.973 0.948 0.948
2 0 s u p p c ~ ~ s u p p c 1 . 1 5 0 0.065 17.748 0.000 1.023

1.276 1.150 0.461 0.461
2 1 a g e ~ ~ a g e 2 . 6 7 2 0 . 000 NA NA 2.672 2.672 2.672

1.000 2.672
2 2 a g e ~ ~ s e x 0 . 0 5 3 0 . 000 NA NA 0.053 0.053 0.053

0.064 0.053
2 3 a g e ~ ~ r a c e 0 . 1 6 4 0 .000 NA NA 0.164 0.164 0.164

0.227 0.164
2 4 a g e ~ ~ e d u c 0 . 3 0 7 0 .000 NA NA 0.307 0.307 0.307

0.115 0.307
2 5 a g e ~ ~ i n c o 0 . 3 0 8 0 .000 NA NA 0.308 0.308 0.308

0.088 0.308
2 6 s e x ~ ~ s e x 0 . 2 5 0 0 . 000 NA NA 0.250 0.250 0.250

1.000 0.250
2 7 s e x ~ ~ r a c e 0 . 0 8 1 0 .000 NA NA 0.081 0.081 0.081

0.364 0.081
2 8 s e x ~ ~ e d u c - 0 . 0 6 1 0.000 NA NA -0.061 - 0.061 -

0.061 -0.075 -0.061
2 9 s e x ~ ~ i n c o - 0 . 0 5 6 0.000 NA NA -0.056 - 0.056 -

0.056 -0.053 -0.056
3 0 r a c e ~ ~ r a c e 0 . 1 9 6 0.000 NA NA 0.196 0.196 0.196

1.000 0.196
3 1 r a c e ~ ~ e d u c - 0 . 0 5 1 0.000 NA NA -0.051 -0.051 -
0.051 -0.071 -0.051
3 2 r a c e ~ ~ i n c o 0 . 0 1 5 0.000 NA NA 0.015 0.015 0.015

0.015 0.015
3 3 e d u c ~ ~ e d u c 2 . 6 8 9 0.000 NA NA 2.689 2.689 2.689

1.000 2.689
3 4 e d u c ~ ~ i n c o 1 . 5 9 0 0. 000 NA NA 1.590 1.590 1.590

0.451 1.590
3 5 i n c o ~ ~ i n c o 4 . 6 1 6 0.000 NA NA 4.616 4.616 4.616

1.000 4.616
3 6 a 1 b 1 : = a 1 * b 1 a 1 b 1 0.032 0.027 1.184 0.236 - 0.021

0.085 0.032 0.033 0.020
3 7 a 2 b 1 : = a 2 * b 1 a 2 b 1 0.136 0.092 1.489 0.137 - 0.043

0.316 0.136 0.043 0.086
3 8 a 3 b 1 : = a 3 * b 1 a 3 b 1 -0.678 0.109 -6.207 0.000 -

0 . 8 9 3 - 0 . 4 6 4 - 0 . 6 7 8 - 0.190 -0.430
3 9 a 4 b 1 : = a 4 * b 1 a 4 b 1 0.083 0.029 2.802 0.005 0.025

0.140 0.083 0.086 0.052
4 0 a 5 b 1 : = a 5 * b 1 a 5 b 1 -0.072 0.022 -3.201 0.001 -

0 . 1 1 6 - 0 . 0 2 8 - 0 . 0 7 2 - 0.098 -0.045
4 1 a 6 b 2 : = a 6 * b 2 a 6 b 2 0.014 0.006 2.511 0.012 0.003

0.025 0.014 0.015 0.009
4 2 a 7 b 2 : = a 7 * b 2 a 7 b 2 -0.014 0.012 -1.161 0.246 -

0 . 0 3 9 0 . 0 1 0 -0 . 0 1 4 - 0 .005 - 0.009
4 3 a 8 b 2 : = a 8 * b 2 a 8 b 2 -0.003 0.013 -0.198 0.843 -

0 . 0 2 9 0 . 0 2 4 -0 . 0 0 3 - 0 .001 - 0.002
4 4 a 9 b 2 : = a 9 * b 2 a 9 b 2 0.001 0.004 0.384 0.701 - 0.006

0.009 0.001 0.001 0.001
4 5 a 1 0 b 2 : = a 1 0 * b 2 a 1 0 b2 0.008 0.004 2.135 0.033
0 . 0 0 1 0 . 0 1 6 0 . 0 0 8 0 . 0 1 1 0.005
46 total :=
c 1 + c 2 + c 3 + c 4 + c 5 + ( a 1 * b 1 ) +(a2*b1)+(a3*b1)+(a4*b1)+(a5*b1
) + ( a 6 * b 2 ) + ( a 7 * b 2 ) + ( a 8 * b2)+(a9*b2)+(a10*b2) total -
0 . 6 5 6 0 . 1 6 5 -3 . 9 8 2 0 . 0 00 -0.979 -0.333 - 0.656 - 0.151
- 0. 4 1 6
>
Wrapping Up
I think there are some other things I should do, such as analyzing
localized residuals and replicating the results with existing SEM
packages. But, with lavaan, I learned I can do many things.
When I was reading Webpages on lavaan, I could only find mediation

examples involving only one mediator. But, with the above code, I could
run mediation analysis with multiple mediators. I hope this is all correct –
at least R did not give me any error messages.

Analisis Jalur

Uploaded by

Copyright:

Available Formats

Analisis Jalur

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Analisis Jalur

Uploaded by

Copyright:

Available Formats

MATERI KE-1

> head(data1, n = 10)

> cor1 = cor(data1)

> summary(fit1, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)

> semPaths(fit1, 'std', layout = 'circle')

> head(data2, n = 10)

model2 = 'mpg ~ hp + gear + cyl + disp + carb + am + wt

Step 1: Install and load lavaan package

Step 2: Specify a model

In this practice exercise, I have 2 mediating variables, party affiliation

Step 3: Run analysis!

Then, I get the following results.

l a v a a n ( 0 . 5 - 2 2 ) c o n v e rged normally after 36

Before moving forward!

3. Looking at the model fit index – Minimum Function Test Statistic

What if I need these statistics? I learned I can add a few things in

l a v a a n ( 0 . 5 - 2 2 ) c o n v e rged normally after 36

Lo g l i k e l i h o o d u n r e s t r icted model (H1) - 7947.856

Sa m p l e -s i z e a d j u s t e d Bayesian (BIC) 15958.316

90 P e r c e n t C o n f i d e n c e Interval 0.011 0.077

Es t i m a t e S t d . E r r z- v alue P(>|z|) Std.lv Std.all

ag e 0 . 0 5 6 0 . 0 4 7 1 . 1 8 5 0.236 0.056 0.047

se x 0 . 2 3 9 0 . 1 6 0 1 . 4 9 1 0.136 0.239 0.061

ra c e - 1 . 1 8 8 0 . 1 8 5 - 6 .408 0.000 -1.188 - 0.270

ed u c 0 . 1 4 5 0 . 0 5 1 2 . 8 1 9 0.005 0.145 0.122

in c o - 0 . 1 2 5 0 . 0 3 9 - 3 .228 0.001 -0.125 - 0.138

ag e 0 . 1 8 1 0 . 0 4 4 4 . 1 4 4 0.000 0.181 0.167

se x - 0. 1 8 5 0 . 1 4 8 - 1 . 248 0.212 - 0.185 - 0.052

ra c e - 0 . 0 3 4 0 . 1 7 1 - 0 .198 0.843 -0.034 - 0.009

ed u c 0 . 0 1 8 0 . 0 4 7 0 . 3 8 7 0.699 0.018 0.017

in c o 0 . 1 0 4 0 . 0 3 6 2 . 9 0 0 0.004 0.104 0.126

pa r t y -0 . 5 6 7 0 . 0 2 2 - 25.368 0.000 -0.567 - 0.701

in t e r 0 . 1 4 5 0 . 0 2 5 5 . 8 7 8 0.000 0.145 0.162

Es t i m a t e S t d . E r r z- v alue P(>|z|) Std.lv Std.all

.p a r t y 3 . 4 7 9 0 . 1 9 6 1 7 .748 0.000 3.479 0.914

.i n t e r 2 . 9 7 3 0 . 1 6 8 1 7 .748 0.000 2.973 0.948

.s u p p t 1 . 1 9 9 0 . 0 6 8 1 7 .748 0.000 1.199 0.481

l h s o p r h s m i e p c s e p c .lv sepc.all sepc.nox

1 6 a g e ~ ~ a g e 0 . 0 0 0 0 . 000 0.000 0.000 0.000

1 7 a g e ~ ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000

1 8 a g e ~ ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000

1 9 a g e ~ ~ e d u c 0 . 0 0 0 0 .000 0. 000 0.000 0.000

2 1 s e x ~ ~ s e x 0 . 0 0 0 0 . 000 0.000 0.000 0.000

2 2 s e x ~ ~ r a c e 0 . 0 0 0 0 .000 0.000 0.000 0.000

2 3 s e x ~ ~ e d u c 0 . 0 0 0 0 .000 0.000 0.000 0.000

2 4 s e x ~ ~ i n c o 0 . 0 0 0 0 .000 0.000 0.000 0.000

2 5 r a c e ~ ~ r a c e 0 . 0 0 0 0.000 0.000 0.000 0.000

2 6 r a c e ~ ~ e d u c 0 . 0 0 0 0.000 0.000 0.000 0.000

2 7 r a c e ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000

2 8 e d u c ~ ~ e d u c 0 . 0 0 0 0.000 0.000 0.000 0.000

2 9 e d u c ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000

3 0 i n c o ~ ~ i n c o 0 . 0 0 0 0.000 0.000 0.000 0.000

3 1 p a r t y ~ ~ i n t e r 0 . 1 0 0 -0.041 - 0.041 - 0.012 -0.012

3 2 p a r t y ~ ~ s u p p t 3 . 2 1 5 0.498 0.498 0.162 0.162

3 3 i n t e r ~ ~ s u p p t 4 . 5 2 4 0.699 0.699 0.250 0.250

3 4 p a r t y ~ i n t e r 0 . 1 0 0 - 0.014 -0.014 -0.012 -0.012

3 5 p a r t y ~ s u p p t 1 . 4 9 0 0.223 0.223 0.181 0.181

3 6 i n t e r ~ p a r t y 0 . 1 0 0 - 0.012 -0.012 -0.013 -0.013

3 7 i n t e r ~ s u p p t 0 . 6 3 7 0.050 0.050 0.045 0.045

3 8 s u p p t ~ a g e 1 . 0 4 9 - 0.028 - 0.028 -0.029 -0.018