Best Linear Predictor
1 Prediction
(Source: Amemiya, ch. 4)
Best Linear Predictor: a motivation for linear univariate regression
Consider two random variables X and Y . What is the “best” predictor of Y , among all the
possible linear functions of X?
“Best” linear predictor minimizes the mean squared error of prediction:
min_{α,β} E[Y − α − βX]²   (1)
Solving:
β∗ = Cov(X, Y)/VX   (2)
α∗ = EY − β∗EX
Define Ŷ ≡ α∗ + β∗X (the b.l.p.) and the prediction error U ≡ Y − Ŷ. Then:
• EU = 0 (immediate from α∗ = EY − β∗EX)
• V Ŷ = (β∗)²VX = Cov²(X, Y)/VX = ρ²XY · VY
Hence, the b.l.p. accounts for a ρ²XY proportion of the variance in Y; in this sense, the
correlation measures the linear relationship between Y and X.
Moreover, the prediction error is uncorrelated with the b.l.p.:
Cov(Ŷ, U) = Cov(Ŷ, Y − Ŷ)
= E[(Ŷ − EŶ)(Y − Ŷ − EY + EŶ)]
= E[(Ŷ − EŶ)(Y − EY)] − E[(Ŷ − EŶ)(Ŷ − EŶ)]
= Cov(Ŷ, Y) − V Ŷ
= E[(α∗ + β∗X − α∗ − β∗EX)(Y − EY)] − V Ŷ   (3)
= β∗E[(X − EX)(Y − EY)] − V Ŷ
= β∗Cov(X, Y) − V Ŷ
= Cov²(X, Y)/VX − Cov²(X, Y)/VX
= 0.
Similarly, Cov(X, U ) = 0.
Hence, for any random variable X, the random variable Y can be written as the sum of
a part which is a linear function of X, and a part which is uncorrelated with X. This
decomposition of Y is done when you regress Y on X.
Finally, note that (obviously) the BLP of the BLP – that is, the best linear predictor given
X of the BLP of Y given X – is just the BLP itself. There is no gain, in predicting Y , by
iterating the procedure.
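As a quick numerical check (not part of the original notes), the following sketch simulates a joint distribution for (X, Y) — the particular data-generating process and variable names are illustrative assumptions — and verifies that the prediction error U has mean zero, is uncorrelated with X and Ŷ, and that Ŷ accounts for a ρ²XY share of VY:

```python
# Sketch: verify the b.l.p. decomposition Y = Yhat + U numerically.
# The DGP below is an illustrative assumption; any joint distribution works.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(2.0, 1.5, size=n)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 1.0, size=n)

beta_star = np.cov(X, Y, ddof=0)[0, 1] / np.var(X)   # Cov(X,Y)/VX, eq. (2)
alpha_star = Y.mean() - beta_star * X.mean()         # EY - beta* EX
Yhat = alpha_star + beta_star * X                    # best linear predictor
U = Y - Yhat                                         # prediction error

rho2 = np.corrcoef(X, Y)[0, 1] ** 2
print(U.mean())                         # ~ 0  (EU = 0)
print(np.cov(X, U, ddof=0)[0, 1])       # ~ 0  (Cov(X, U) = 0)
print(np.cov(Yhat, U, ddof=0)[0, 1])    # ~ 0  (Cov(Yhat, U) = 0)
print(np.var(Yhat) / np.var(Y), rho2)   # both ~ rho^2: share of VY explained
```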
■■■
Note: in practice, with a finite sample of Y, X, the minimization problem (1) is infeasible.
In practice, we minimize the sample counterpart
min_{α,β} Σi (Yi − α − βXi)²   (4)
which is the objective function in ordinary least squares regression. The OLS values for α
and β are the sample versions of Eq. (2).
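A minimal sketch of this equivalence (the data-generating process and names are made up for illustration): the least-squares solution of (4) coincides with the sample analogues of Eq. (2).

```python
# Sketch: OLS coefficients equal the sample versions of eq. (2).
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.uniform(0, 10, size=n)
Y = 2.0 - 0.7 * X + rng.normal(0, 2, size=n)   # illustrative DGP

# Sample analogues of eq. (2)
beta_hat = np.cov(X, Y, ddof=0)[0, 1] / np.var(X)
alpha_hat = Y.mean() - beta_hat * X.mean()

# Direct least-squares solution of (4)
A = np.column_stack([np.ones(n), X])
(alpha_ls, beta_ls), *_ = np.linalg.lstsq(A, Y, rcond=None)

print(alpha_hat, beta_hat)   # identical (up to rounding) to:
print(alpha_ls, beta_ls)
```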
■■■
Next we consider some intuition for least-squares regression. Go back to the population
problem. Assume that the “true” model describing the generation of the Y process is:
Y = α0 + β0X + ϵ,   Eϵ = 0.   (5)
What we mean by “true model” is that this is a causal model, in the sense that a one-unit
increase in X would raise Y by β0 units. (In the previous section, we just assumed that Y
and X move jointly, so there was no sense in which changes in X “cause” changes in Y.)
Question: under what assumptions does doing least-squares on Y, X (which leads to the
best linear predictor from the previous section) recover the true model, i.e. α∗ = α0 and
β∗ = β0?
• For α∗:
α∗ = EY − β∗EX
= α0 + β0EX + Eϵ − β∗EX
= α0 + (β0 − β∗)EX,
which is equal to α0 if β0 = β∗.
• For β∗:
β∗ = Cov(α0 + β0X + ϵ, X) / VarX
= (1/VarX) · {E[X(α0 + β0X + ϵ)] − EX · E[α0 + β0X + ϵ]}
= (1/VarX) · {α0EX + β0EX² + E[ϵX] − α0EX − β0(EX)² − EX·Eϵ}
= (1/VarX) · {β0[EX² − (EX)²] + E[ϵX]},
which is equal to β0 if
E[ϵX] = 0.
This is an “exogeneity” assumption: (roughly) that X and the disturbance term ϵ are
uncorrelated. Under this assumption, the best linear predictors from the infeasible problem
(1) coincide with the true values α0, β0. Correspondingly, it turns out that the feasible
finite-sample least-squares estimates from (4) are “good” (in some sense) estimators of α0,
β0.
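The following hedged simulation sketch (the DGP is an illustrative assumption, not from the notes) shows the point: when ϵ is independent of X the OLS slope is close to β0, while when X is correlated with ϵ the slope converges to β∗ ≠ β0.

```python
# Sketch: OLS recovers beta0 under E[eps * X] = 0, and is biased otherwise.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
alpha0, beta0 = 1.0, 2.0

def ols_slope(x, y):
    """Sample analogue of Cov(X, Y)/VX."""
    return np.cov(x, y, ddof=0)[0, 1] / np.var(x)

# Exogenous case: eps independent of X
X = rng.normal(size=n)
eps = rng.normal(size=n)
Y = alpha0 + beta0 * X + eps
print(ols_slope(X, Y))          # ~ 2.0

# Endogenous case: X correlated with eps, so E[eps * X] != 0
eps = rng.normal(size=n)
X = rng.normal(size=n) + 0.8 * eps
Y = alpha0 + beta0 * X + eps
print(ols_slope(X, Y))          # biased away from 2.0 (beta* != beta0)
```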
■■■
Best prediction
We now generalize the above results to general (not just linear) prediction. What if we
don’t restrict ourselves to linear functions of X? What general function of X is the optimal
predictor of Y? The problem is now
min_{ϕ(·)} E[Y − ϕ(X)]².
Note:
E[Y − ϕ(X)]² = E[(Y − E(Y|X)) + (E(Y|X) − ϕ(X))]²
= E[Y − E(Y|X)]² + 2E[(Y − E(Y|X))(E(Y|X) − ϕ(X))] + E[E(Y|X) − ϕ(X)]².   (6)
Define U ≡ Y − E(Y|X). By the law of iterated expectations,
EU = EX EY|X [Y − E(Y|X)] = EX [0] = 0,
E[U · E(Y|X)] = EX [E(Y|X) · EY|X U] = EX [E(Y|X) · 0] = 0,
and in fact E[U · g(X)] = 0 for any function g(X). Hence the middle (cross) term in (6) is
zero for every ϕ, and the objective is minimized by setting the last term to zero: the best
predictor of Y is the conditional expectation, ϕ(X) = E(Y|X).
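A small illustrative sketch (the quadratic DGP is an assumption chosen for clarity): when the relationship is nonlinear, the conditional expectation E(Y|X) achieves a strictly smaller mean squared error than the best linear predictor.

```python
# Sketch: best predictor E(Y|X) vs. best *linear* predictor for a nonlinear DGP.
import numpy as np

rng = np.random.default_rng(3)
n = 500_000
X = rng.normal(size=n)
Y = X**2 + rng.normal(size=n)          # here E(Y|X) = X^2

# Best linear predictor
beta = np.cov(X, Y, ddof=0)[0, 1] / np.var(X)
alpha = Y.mean() - beta * X.mean()
mse_blp = np.mean((Y - (alpha + beta * X)) ** 2)

# Best predictor: the conditional expectation E(Y|X) = X^2
mse_bp = np.mean((Y - X**2) ** 2)

print(mse_blp)   # ~ 3.0 (equals VY: Cov(X, X^2) = 0 for standard normal X)
print(mse_bp)    # ~ 1.0 (the noise variance)
```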
Projection theorem: Let S be a linear space of random variables with finite second
moments. Then Ŝ is the projection of Y onto S if and only if Ŝ ∈ S and the orthogonality
condition
E[(Y − Ŝ)S] = 0 for all S ∈ S   (7)
is satisfied.
Proof: For any S ∈ S, write
E(Y − S)² = E[(Y − Ŝ) + (Ŝ − S)]² = E(Y − Ŝ)² + 2E[(Y − Ŝ)(Ŝ − S)] + E(Ŝ − S)².
If Ŝ satisfies the orthogonality condition, then the middle term is zero (note that Ŝ − S ∈ S),
and we conclude that E(Y − S)² ≥ E(Y − Ŝ)², with strict inequality unless E(Ŝ − S)² = 0.
Thus the orthogonality condition implies that Ŝ is a projection, and also that it is unique.
Conversely, if Ŝ is a projection, consider any S ∈ S and any α; since Ŝ + αS ∈ S,
E[Y − (Ŝ + αS)]² − E(Y − Ŝ)² = −2αE[(Y − Ŝ)S] + α²ES²
must be nonnegative for every α. But the parabola −2αE[(Y − Ŝ)S] + α²ES² is nonnegative
for all α iff E[(Y − Ŝ)S] = 0, i.e. iff the orthogonality condition is satisfied. ■
In the b.l.p. case: S is the space of all linear (affine) functions a + bX of X, and the
orthogonality condition (7) implies that both Cov(U, X) = 0 and Cov(U, Ŷ) = 0 (because
both X and Ŷ are in S), which we showed directly above. More generally, the projection
theorem implies that Cov(U, g(X)) = 0 for any linear function g(X) of X.
In the b.p. case: S is the space of all transformations of X, say g(X) with finite second
moments (i.e., Eg(X)2 < ∞).
■■■
Obviously, the projection of the projection is just the projection itself. Letting PS Y denote
the projection of Y on the space S, we have that PS [PS Y ] = PS Y . That is, projections are
idempotent operators; idempotency is even a defining feature of a projection. (You will use
this fact ubiquitously next quarter.)
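As a finite-sample illustration (a sketch; the design matrix below is arbitrary simulated data), the hat matrix H = X(X′X)⁻¹X′ is the sample analogue of the projection operator onto the column space of X, and it is idempotent:

```python
# Sketch: the hat (projection) matrix is idempotent, so P_S[P_S y] = P_S y.
import numpy as np

rng = np.random.default_rng(4)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T    # projection (hat) matrix
y_hat = H @ y                           # projection of y onto col(X)

print(np.allclose(H @ H, H))            # True: H is idempotent
print(np.allclose(H @ y_hat, y_hat))    # True: projecting the projection changes nothing
```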
y = βd + ϵ,   (8)
y = β′ d(z, x) + ϵ,
where the notation d(z, x) makes explicit that the treatment d depends on both the instru-
ment z (the “exogenous” variation) and other factors x (which are correlated with ϵ, leading
d itself to be correlated with the disturbance).
In the special case when we have a binary auxiliary variable Z ∈ {0, 1}, we obtain the
following estimator:
{E[Y|Z = 1] − E[Y|Z = 0]} / {E[D|Z = 1] − E[D|Z = 0]}.
This is the classical Wald estimator. A number of the treatment effect estimators we consider
below take this form, for different choices of the auxiliary variable Z.
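A minimal sketch of the Wald estimator on simulated data (the DGP, with a constant treatment effect of 2, is an illustrative assumption, not from the notes):

```python
# Sketch: Wald estimator with a binary auxiliary variable Z.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
Z = rng.integers(0, 2, size=n)                             # binary instrument
D = (rng.uniform(size=n) < 0.2 + 0.5 * Z).astype(float)    # Z shifts treatment probability
Y = 1.0 + 2.0 * D + rng.normal(size=n)                     # true effect of D is 2

wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (D[Z == 1].mean() - D[Z == 0].mean())
print(wald)    # ~ 2.0
```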
3 Cross-sectional approaches
Here we consider the situation where each individual in the dataset is only observed once.
We also restrict attention to the binary treatment case. (Most common case for policy
evaluation.)
A naive estimator of the ATE is just the difference in conditional means, E[Y|D = 1] − E[Y|D = 0].
This is obviously not a good thing to do unless Y0, Y1 ⊥ D – that is, unless treatment
is randomly assigned (as it would be in a controlled lab setting, or in a tightly controlled
field experiment). Under random assignment,
E[Y|D = 1] − E[Y|D = 0] = E[Y1|D = 1] − E[Y0|D = 0] = E[Y1] − E[Y0] = E[Y1 − Y0].
Otherwise, typically E[Y|D = 0] = E[Y0|D = 0] ̸= E[Y0], and similarly for E[Y|D = 1].
3.2 Selection on observables
• Selection on observables assumption: (Y0, Y1) ⊥ D | Z, where Z is a vector of observed
covariates.
• Define the propensity score Q ≡ Prob(D = 1|Z). This can be estimated for each individual
in the sample. Hence we assume that we observe (Y, D, Z, Q) for everyone in the sample.
Remember that Q is just a function of Z.
• Rosenbaum and Rubin (1983) theorem: under the selection on observables assump-
tion, we also have (Y0 , Y1 ) ⊥ D|Q.
Proof: We want to show that P(D, Y1, Y0|Q) = P(D|Q)P(Y1, Y0|Q). By the definition of
conditional probability, P(D, Y1, Y0|Q) = P(D|Y1, Y0, Q)P(Y1, Y0|Q), so it suffices to show
P(D|Y1, Y0, Q) = P(D|Q). Since D is binary, we can focus on showing P(D = 1|Y1, Y0, Q) =
P(D = 1|Q). Note that
P(D = 1|Y1, Y0, Q) = E[D|Y1, Y0, Q] = E{E[D|Y1, Y0, Z] | Y1, Y0, Q} = E{E[D|Z] | Y1, Y0, Q} = E[Q|Y1, Y0, Q] = Q,
where the third equality uses the selection on observables assumption and the iterated
expectation uses the fact that Q is a function of Z. Similarly, P(D = 1|Q) = E{E[D|Z] | Q} =
E[Q|Q] = Q, so the two coincide. ■
3.2.1 Inverse PS weighting
• Main result: E(Y1) = E[D·Y / Q]   (Horvitz-Thompson estimator).
Proof: Since D·Y = D·Y1 and Q is a function of Z,
E[D·Y/Q] = E{E[D·Y/Q | Z]} = E{(1/Q) E[D·Y1 | Z]}
= E{(1/Q) E[E(D·Y1 | Z, D) | Z]}
= E{(1/Q) E[D · E(Y1 | Z, D) | Z]}
= E{(1/Q) E[D · E(Y1 | Z) | Z]}
= E{(E(Y1|Z)/Q) · E[D|Z]}
= E{(E(Y1|Z)/Q) · Q} = E[E(Y1|Z)] = E(Y1),
where the fourth line uses the selection on observables assumption (Y1 ⊥ D | Z). ■
Similarly, E(Y0) = E[(1 − D)·Y / (1 − Q)].
• This is inverse propensity score weighting. Intuitively, in the case of E(Y1), you weight
each individual in the treated sample by the inverse of the probability Q of that individual
being treated, so that treated individuals who were unlikely to be treated count for more.
• The derivation above requires Q > 0 (and, for E(Y0), Q < 1) with probability one.
This is known as the overlap assumption. Practically, it implies that for any Z,
individuals with those covariates have a nonzero chance of being treated. Obviously,
if there is any set of Z with positive probability for which Q = 0, then this set must
be excluded from the expectation above, and so it is invalid to interpret it as the
unconditional mean of Y1.
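The following sketch (simulated data; the logistic propensity score and outcome equations are illustrative assumptions) implements the Horvitz-Thompson weighting above and contrasts it with the naive difference in means, which is biased here because Z drives both treatment and outcomes:

```python
# Sketch: inverse propensity score weighting recovers E(Y1) - E(Y0).
import numpy as np

rng = np.random.default_rng(6)
n = 500_000
Z = rng.normal(size=n)                          # observed covariate
Q = 1.0 / (1.0 + np.exp(-Z))                    # propensity score P(D=1|Z), in (0,1)
D = (rng.uniform(size=n) < Q).astype(float)
Y1 = 2.0 + Z + rng.normal(size=n)               # potential outcomes depend on Z
Y0 = 0.0 + Z + rng.normal(size=n)
Y = D * Y1 + (1 - D) * Y0

EY1_ipw = np.mean(D * Y / Q)                    # Horvitz-Thompson estimate of E(Y1)
EY0_ipw = np.mean((1 - D) * Y / (1 - Q))        # estimate of E(Y0)
print(EY1_ipw - EY0_ipw)                        # ~ 2.0, the ATE
print(Y[D == 1].mean() - Y[D == 0].mean())      # naive difference in means, biased upward
```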
3.2.2 PS matching
This is essentially dimension reduction: instead of matching on the full covariate vector Z,
one matches on the scalar propensity score Q. Let FQ denote the distribution of propensity scores. We
have that
∫ {E[Y|D = 1, Q] − E[Y|D = 0, Q]} dFQ = ∫ {E[Y1|D = 1, Q] − E[Y0|D = 0, Q]} dFQ
= ∫ {E[Y1|Q] − E[Y0|Q]} dFQ
= E[Y1 − Y0],
which is the average treatment effect. The penultimate equality uses the Rosenbaum-Rubin
theorem.
This is “matching” in the sense that, for each value of Q, you compare the average outcome
of treated vs. untreated individuals with that Q. There are many variants, depending on
how individuals in the treated and untreated samples are matched; a simple one is sketched
below.
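One simple variant — stratifying on the propensity score (here treated as known) and averaging within-stratum treated/untreated differences over dFQ — is sketched below on simulated data; the DGP and the choice of 20 quantile bins are illustrative assumptions:

```python
# Sketch: propensity-score stratification estimate of the ATE.
import numpy as np

rng = np.random.default_rng(7)
n = 500_000
Z = rng.normal(size=n)
Q = 1.0 / (1.0 + np.exp(-Z))                    # propensity score
D = (rng.uniform(size=n) < Q).astype(float)
Y = 2.0 * D + Z + rng.normal(size=n)            # treatment effect = 2

# Stratify on Q: within each bin, compare treated vs. untreated means,
# then average the within-bin differences with weights dF_Q.
bins = np.quantile(Q, np.linspace(0, 1, 21))
idx = np.clip(np.digitize(Q, bins) - 1, 0, 19)
diffs, weights = [], []
for b in range(20):
    mask = idx == b
    treated, control = Y[mask & (D == 1)], Y[mask & (D == 0)]
    if len(treated) and len(control):
        diffs.append(treated.mean() - control.mean())
        weights.append(mask.mean())              # weight of the bin under F_Q
print(np.average(diffs, weights=weights))        # ~ 2.0, the ATE
```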
Example: Angrist and Lavy (1999): y is test scores, d is class size, z is an indicator for
whether total enrollment was “just above” a multiple of 40. Maimonides’ rule states (roughly)
that no class size should exceed forty, so that if enrollment (treated as exogenous) is “just
below” a multiple of 40, class sizes will be bigger, whereas if enrollment is “just above” a
multiple of 40, class sizes will be smaller. They restrict their sample to school-cohorts where
total enrollment was within +/− 5 of a multiple of 40.
This leads to a Wald-type estimator.¹
• Assumption A2 (Independence): Y1 , Y0 , D1 , D0 ⊥ Z
• A3 (“rank”): E[D1 − D0 ] ̸= 0.
¹ See Hahn, Todd, and van der Klaauw (2001).
• The observed treatment and outcome are:
– D = (1 − Z)D0 + ZD1
– Y = (1 − D)Y0 + DY1 (by the exclusion restriction, Z does not enter directly)
• Main result: the Wald estimator {E[Y|Z = 1] − E[Y|Z = 0]} / {E[D|Z = 1] − E[D|Z = 0]}
estimates E[Y1 − Y0 | D1 > D0].
Proof: using the independence and exclusion assumptions, together with monotonicity (D1 ≥ D0), we have
E[Y|Z = 1] − E[Y|Z = 0] = E[Y0 + D1(Y1 − Y0)] − E[Y0 + D0(Y1 − Y0)] = E[(D1 − D0)(Y1 − Y0)] = E[Y1 − Y0 | D1 > D0] · P(D1 > D0),
E[D|Z = 1] − E[D|Z = 0] = E[D1 − D0] = P(D1 > D0),
and dividing gives the result. ■
Here, the Wald estimator measures the average effect of d on y for those for whom a change
in z from 0 to 1 would have affected the treatment d. This insight is known by several terms,
including local IV and local average treatment effect (LATE) (see Angrist and Imbens (1994)
for more details).
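A hedged simulation sketch of this result (the population shares of always-takers, never-takers, and compliers, and the effect sizes, are illustrative assumptions): the Wald estimator recovers the compliers' effect rather than the unconditional ATE.

```python
# Sketch: with heterogeneous effects, the Wald estimator recovers the LATE.
import numpy as np

rng = np.random.default_rng(8)
n = 1_000_000
Z = rng.integers(0, 2, size=n)

# Potential treatments: always-takers (D0=D1=1), never-takers (0,0), compliers (0,1)
type_ = rng.choice(["always", "never", "complier"], size=n, p=[0.2, 0.3, 0.5])
D0 = (type_ == "always").astype(float)
D1 = ((type_ == "always") | (type_ == "complier")).astype(float)
D = (1 - Z) * D0 + Z * D1

# Heterogeneous effects: compliers' effect is 1, others' is 3
effect = np.where(type_ == "complier", 1.0, 3.0)
Y0 = rng.normal(size=n)
Y1 = Y0 + effect
Y = (1 - D) * Y0 + D * Y1

wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (D[Z == 1].mean() - D[Z == 0].mean())
print(wald)                # ~ 1.0 = E[Y1 - Y0 | D1 > D0], the compliers' effect
print((Y1 - Y0).mean())    # ~ 2.0 = the unconditional ATE
```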
Examples: Angrist (1990): y is lifetime earnings, d is Vietnam-era military service, z is
draft lottery number. Angrist and Krueger (1991): y is earnings, d is years of schooling, z is
quarter of birth (which, via compulsory schooling laws, shifts schooling).
4 Panel data
In panel data, one observes the same individual over several time periods, including (ideally)
periods both before and after a policy change. For example, d is often a policy change which
affects some states but not others.
In this richer data environment, one can estimate the effect of the policy change while
controlling arbitrarily for individual-specific heterogeneity, as well as for time-specific effects.
This is the difference-in-difference approach.
Abstractly, consider outcome variables indexed by the triple (i, t, d), with i, t, d ∈ {0, 1} (all
binary). Here i denotes a subsample, with i = 1 being the treated subsample. t denotes time
period, with t = 1 denoting the period when individuals in subsample i = 1 are treated. d
is the treatment variable, as before. Of the eight possible combinations, we only observe
Y000 , Y010 , Y100 , Y111 .
The DID is typically obtained by linear regression. Consider the following linear model:
Yitd = α + γ·t + δ·i + β·d + εitd,
so that the observable cell means are E[Y000] = α, E[Y010] = α + γ, E[Y100] = α + δ, and
E[Y111] = α + γ + δ + β (with E[Y110] = α + γ + δ the unobserved counterfactual), and
the DID, {E[Y111] − E[Y100]} − {E[Y010] − E[Y000]}, equals β. Taking first differences
within each subsample gives the regression
∆yi = γ + β∆di + ηi,
with η ⊥ ∆di . By running this regression, the estimated β̂ is an estimate of the DID.
In the regression context, it is easy to control for additional variables Zit which also affect
outcomes.
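A sketch of the DID on simulated two-period, two-group data (all parameter values and names are illustrative assumptions); it uses the equivalent dummies-plus-interaction regression and checks it against the difference in differences of the four cell means:

```python
# Sketch: difference-in-differences via regression and via cell means.
import numpy as np

rng = np.random.default_rng(9)
n = 50_000                                    # observations per (group, period) cell
beta_true = 1.5                               # treatment effect

rows = []
for i in (0, 1):                              # group: i=1 is treated in t=1
    for t in (0, 1):
        d = float(i == 1 and t == 1)
        y = 0.5 + 0.8 * t + 1.2 * i + beta_true * d + rng.normal(size=n)
        rows.append(np.column_stack([np.full(n, i), np.full(n, t), np.full(n, d), y]))
data = np.vstack(rows)
i_, t_, d_, y_ = data.T

# Regress y on a constant, group dummy, time dummy, and treatment (interaction)
Xmat = np.column_stack([np.ones(len(y_)), i_, t_, d_])
coef, *_ = np.linalg.lstsq(Xmat, y_, rcond=None)
print(coef[-1])                               # ~ 1.5, the DID estimate of beta

# Equivalently, the difference in differences of the four observed cell means
cell = lambda i, t: y_[(i_ == i) & (t_ == t)].mean()
print((cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0)))   # ~ 1.5
```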
There are many examples of this approach. Two are:
Card and Krueger (1994): y is employment, d is the minimum wage (looking for evidence of
general-equilibrium effects of the minimum wage). They exploit a policy shift which raised
the minimum wage in New Jersey but not in Pennsylvania; the sample is fast-food restaurants
on the NJ/Pennsylvania border.
Kim and Singal (1993): y is price, d is the concentration of a particular flight market. They
exploit the merger of Northwest and Republic airlines, which affected only markets (so we
hope) in which Northwest or Republic offered flights.
References
Angrist, J. (1990): “Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence
from Social Security Administrative Records,” American Economic Review, 80, 313–336.
Angrist, J., and W. Evans (1990): “Lifetime Earnings and the Vietnam Era Draft
Lottery: Evidence from Social Security Administrative Records,” American Economic
Review, 80, 313–336.
Angrist, J., and G. Imbens (1994): “Identification and Estimation of Local Average
Treatment Effects,” Econometrica, 62, 467–476.
Angrist, J., and A. Krueger (1991): “Does Compulsory School Attendance Affect
Schooling and Earnings?,” Quarterly Journal of Economics, 106, 979–1014.
Angrist, J., and V. Lavy (1999): “Using Maimonides’ Rule to Estimate the Effect of
Class Size on Scholastic Achievement,” Quarterly Journal of Economics, 114, 533–575.
Angrist, J., and J. Pischke (2009): Mostly Harmless Econometrics. Princeton University
Press.
Card, D., and A. Krueger (1994): “Minimum Wages and Employment: A Case Study
of the Fast-Food Industry in New Jersey and Pennsylvania,” American Economic Review,
84, 772–93.
Hahn, J., P. Todd, and W. van der Klaauw (2001): “Identification and Estimation of
Treatment Effects with a Regression-Discontinuity Design,” Econometrica, 69, 201–209.
Kim, E., and V. Singal (1993): “Mergers and Market Power: Evidence from the Airline
Industry,” American Economic Review, 83, 549–569.
Rosenbaum, P., and D. Rubin (1983): “The central role of the propensity score in
observational studies of causal effects,” Biometrika, 70, 41–55.