Recursive and Non-Recursive Models
So here's an example where we have X1 causing X2, and there's a disturbance term for that
equation. X2 in turn causes Y1, where we have a disturbance term as well, and in that equation
we also have X3, so Y1 is regressed on X2 and X3. Here we see that all of the causal effects
are going in one direction and none of the disturbances are correlated, so this would be a
recursive model. A non-recursive model, on the other hand, will have some kind of feedback
loop: here we have X1 causing Y1 and Y1 causing X1. So these are reciprocal effects, and this
is actually quite a plausible kind of causal mechanism. There are many examples of situations
where we would expect two variables to be causing each other. We can think, for example, of
economic perceptions: the more people perceive that the economy is doing well, the more they
will support the government, and the more people support the government, the more they may
think that the economy is doing well. So there are many examples where we would want to
estimate this type of model. We also see here that we have a correlation between the two
disturbance terms, the errors in those structural equations. That's implied by the reciprocal
cause and effect between Y1 and X1, which means that the disturbances must be correlated.
Now there are some grey areas, and this results in what we refer to as partially recursive
models. Here we see that we have a correlation between the disturbance terms but we don't
have any direct effects amongst the endogenous variables in this model, the endogenous
variables here being Y1 and X1. So in this case, in terms of identification, we can treat this
as a recursive model. But here, in this diagram, we have a direct effect as well as the
correlation, so this would be treated as a non-recursive model. Recursive versus non-recursive
model status is important for identification, but that's not terribly interesting from an
analytical perspective. Recursivity is also important because a recursive model is always
identified and it's simple to estimate: we can estimate recursive models using a set of OLS
regressions. But that simplicity comes at a cost, because we can't estimate the more complex
kinds of models that we would often want to.
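To make the point about OLS concrete, here is a minimal sketch in Python. It simulates the
recursive example described earlier (X1 causing X2, and X2 and X3 causing Y1, with independent
disturbances) and estimates each structural equation with its own OLS regression. The
coefficient values, the sample size and the helper name ols are assumptions made up for the
illustration, not values from the lecture.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20_000  # hypothetical sample size

# Hypothetical recursive system: x1 -> x2 -> y1, with x3 also affecting y1.
# The disturbances are drawn independently, so none of them are correlated.
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y1 = 0.7 * x2 + 0.3 * x3 + rng.normal(size=n)

# One OLS regression per structural equation.
b_x2 = ols(x2, x1)       # equation for x2
b_y1 = ols(y1, x2, x3)   # equation for y1

print("x2 on x1:", b_x2[1])                  # should be close to 0.5
print("y1 on x2, x3:", b_y1[1], b_y1[2])     # should be close to 0.7 and 0.3
```

Because every effect runs in one direction and the disturbances are independent, each
regression on its own should recover the simulated path coefficients up to sampling error.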
So introducing a non-recursive model means that we have more flexibility in the kinds of
model specifications that we can use, and this is actually a large part of the reason why many
analysts want to use structural equation modelling software, because it's very easy to specify
this kind of model. But we have to be aware that just because we can specify a model as a path
diagram and generate some parameter estimates, that doesn't mean that we can trust those
estimates. So non-recursive models, despite being more flexible, can also be challenging in
terms of identification, and will often require us to include other variables in the model
which may not be of direct substantive interest but which we need nonetheless for
identification purposes. Being empirically identified doesn't necessarily mean that we can
trust the parameter estimates. In particular, if we want to have unbiased estimates of the
paths, that is, the arrows running between two variables in a model, we have to make some
quite strict assumptions about the variables in the model.
So in particular, in this sort of context with reciprocal effects, we need to assume that we
have some exogenous variables in the model that we can treat as instrumental variables. This
is another important idea for understanding non-recursive models: the idea of an instrumental
variable. To understand what we mean by an instrumental variable in this context, it's useful
first to understand another concept, which is that of an endogenous regressor. Here we've got
a simple path diagram to help understand what we mean by an endogenous regressor. We have Y1
regressed on X1 and we want to estimate beta; we'd ideally like to treat beta as the causal
effect of X1 on Y1. But we also see here that we have a covariance, or correlation, between
the disturbance term in this equation and X1, which is the predictor. Now we know from our OLS
classes that this is an assumption that we have to make in OLS: that we don't have a
correlation between the error term and the predictors. If there is such a correlation then we
have what's referred to as an endogenous regressor, which here is X1. This can be for a number
of reasons, but it will often be because of some unobserved variable that we should have in
our model that is related to both X1 and Y1, or it may be because of simultaneous causal
effects, where X1 is causing Y1 and Y1 is causing X1, the sort of reciprocal effects that
we're interested in here, which would generate this correlation. So when we have this kind of
situation, we need an instrumental variable for X1 if we want to be able to interpret the beta
coefficient as the causal effect of X1 on Y1.
So an instrumental variable is a variable that's going to deal with this endogenous regressor
problem, and it does this by introducing exogenous variability into the endogenous regressor.
To have the properties of an instrumental variable, which we refer to as Z in this context,
the instrument must cause the endogenous regressor but not cause the outcome other than
through that regressor. Now there are lots of different examples of instrumental variables
that have been used in the empirical literature and we'll come on to some of those, but one
good way of thinking about an instrumental variable is the assignment variable in a randomized
controlled trial: the randomization which determines whether someone is allocated to the
treatment or to the control condition. This is a perfect instrumental variable because it's
very strongly correlated with whether you are in the treatment or the control group but,
being random, it is unrelated to the outcome other than through the treatment. So that's a
good way of thinking about what an instrumental variable is, and the sorts of variables that
we will be looking to use as instruments should come as close as possible to that sort of
randomization variable.
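The endogenous regressor problem, and what an instrument does about it, can be sketched with a
small simulation. This is an illustration under stated assumptions rather than an analysis of
real data: the unobserved common cause u, the effect sizes and the sample size are all made
up, and the instrument z is constructed so that it affects X1 but has no direct effect on Y1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000      # hypothetical sample size
beta = 0.5      # simulated causal effect of x1 on y1

u = rng.normal(size=n)   # unobserved variable related to both x1 and y1
z = rng.normal(size=n)   # instrument: affects x1, no direct effect on y1

x1 = 0.8 * z + u + rng.normal(size=n)
y1 = beta * x1 + u + rng.normal(size=n)   # disturbance (u + noise) correlates with x1

def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]

# OLS slope: biased, because x1 is correlated with the disturbance through u.
beta_ols = cov(x1, y1) / cov(x1, x1)

# Simple instrumental variable estimator: uses only the variation in x1 that
# comes from z, which is unrelated to the disturbance by construction.
beta_iv = cov(z, y1) / cov(z, x1)

print("OLS estimate:", beta_ols)   # noticeably above 0.5 in this setup
print("IV estimate: ", beta_iv)    # close to 0.5
```

In this setup the OLS slope overstates the effect because u pushes X1 and Y1 in the same
direction, while the ratio of covariances with the instrument should land near the simulated
value of beta.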
So this is what we're looking for in terms of a path diagram. Here we've got an endogenous
regressor X1 and we need an instrument, which is Z1 here, which causes X1 but doesn't cause Y1
other than through its effect on X1. So you can see it has an indirect effect on Y1 but not a
direct effect. So this would be an instrumental variable. As I said, there are many papers,
particularly in economics, which have used natural variability, natural experiments if you
like. One example is the Vietnam draft lottery, in which the selection of US citizens to be
drafted was done on the basis of a random lottery. So if you want to assess the effect of
going to Vietnam on later outcomes like your earnings, your education, your mental health and
so on, then you can use that initial draft lottery as an instrument for going to the Vietnam
War. Another one that's been used is proximity to your nearest college for studying the effects
of education on earnings. Obviously if you just look at the relationship between education
and earnings, there are many unobserved variables that would mean that you couldn't just take
the simple correlation between education and earnings as a causal effect. But if you can use
something like proximity to a college, that can have a direct effect on education but no
effect on your earnings other than through its effect on education. The third example might
be variability in the compulsory schooling age. This can vary across geographic boundaries:
US states, for example, have different compulsory schooling ages, and in the UK there was an
increase in the compulsory schooling age from 15 to 16 in 1973. This can be used as an
instrument for, again, the effects of education on later outcomes such as earnings, because
the policy change introduced random variability into how much schooling people obtained, but
it wouldn't have had any direct effect on earnings. So those are some examples of
instrumental variables, and they should give you an idea that a variable has to meet some
quite strict requirements to be a good instrumental variable. Even for these three quite
well-known examples there have been criticisms as to whether they really are valid instruments.
So, because non-recursive models are easy to specify, here's an example, again using the
European Social Survey data, where we're looking at the relationship between life
satisfaction, happiness and social trust. Scholars have been interested in what the
relationship is here, and this model specifies reciprocal causality between these variables.
Now if you just tried to estimate that model without the two exogenous variables at the bottom
of the diagram, whether you're married and your earnings, it would be unidentified. So these
variables are acting as instrumental variables in the model. But it's not really plausible to
assume that they are valid instruments, because we have to assume that neither of them has a
direct effect on the other latent variable in this model. Each one only causes one latent
variable, but it's not really reasonable to assume that your income is not related to your
level of social trust; we know that's an implausible assumption. So we have to be careful:
just because we can estimate a non-recursive structural equation model and get parameter
estimates, we still have to check the assumptions that are needed for identification and
assess whether we can really trust the estimates.
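The warning above can be illustrated with a rough simulation. This is a sketch with made-up
numbers, not a re-analysis of the European Social Survey model: the variables are treated as
observed rather than latent, two-stage least squares stands in for the SEM estimator, and the
assignment of married to the trust equation and income to the satisfaction equation is an
assumption for the example. The function names and all coefficients are hypothetical.

```python
import numpy as np

def design(*cols):
    """Design matrix with an intercept column followed by the given columns."""
    return np.column_stack([np.ones(len(cols[0])), *cols])

def two_stage_ls(y, endog, exog, instrument):
    """2SLS slope on `endog` in the equation for y.

    `exog` is the exogenous predictor included in this equation;
    `instrument` is the exogenous variable excluded from it.
    """
    # Stage 1: project the endogenous regressor on all exogenous variables.
    Z = design(exog, instrument)
    fitted = Z @ np.linalg.lstsq(Z, endog, rcond=None)[0]
    # Stage 2: regress y on the fitted values and the included exogenous variable.
    X = design(fitted, exog)
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def simulate(n, income_on_trust, rng):
    """Simulate reciprocal effects between satisfaction and trust (hypothetical values)."""
    b_st, b_ts = 0.3, 0.4   # trust -> satisfaction, satisfaction -> trust
    married = rng.binomial(1, 0.5, size=n).astype(float)
    income = rng.normal(size=n)
    # Structural equations:
    #   satisfaction = b_st * trust + 0.5 * income + e_s
    #   trust        = b_ts * satisfaction + 0.5 * married
    #                  + income_on_trust * income + e_t
    a_s = 0.5 * income + rng.normal(size=n)
    a_t = 0.5 * married + income_on_trust * income + rng.normal(size=n)
    # Solve the two simultaneous equations (reduced form).
    d = 1.0 - b_st * b_ts
    satisfaction = (a_s + b_st * a_t) / d
    trust = (a_t + b_ts * a_s) / d
    return satisfaction, trust, married, income

rng = np.random.default_rng(2)
n = 200_000  # hypothetical sample size

# Case 1: both exclusion restrictions hold (income has no direct effect on trust).
s, t, married, income = simulate(n, income_on_trust=0.0, rng=rng)
b_ts_valid = two_stage_ls(t, endog=s, exog=married, instrument=income)
b_st_valid = two_stage_ls(s, endog=t, exog=income, instrument=married)

# Case 2: income also affects trust directly, so it is not a valid instrument.
s, t, married, income = simulate(n, income_on_trust=0.3, rng=rng)
b_ts_violated = two_stage_ls(t, endog=s, exog=married, instrument=income)

print("valid instruments:   ", b_ts_valid, b_st_valid)   # close to 0.4 and 0.3
print("invalid instrument:  ", b_ts_violated)            # well above 0.4
```

In both cases the model is identified and the software returns an estimate. Only in the first
case does that estimate reflect the simulated effect, which is the reason the identifying
assumptions have to be checked on substantive grounds.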