Econometrics Cheat Sheet
By Marcelo Moreno - University King Juan Carlos

Basic concepts

Definitions
Econometrics - a social science discipline whose objective is to quantify the relationships between economic agents, test economic theories, and evaluate and implement government and business policies.
Econometric model - a simplified representation of reality used to explain economic phenomena.

Data types
Cross section - data taken at a given moment in time, a static "photo". Order does not matter.
Time series - observations of one or more variables across time. Order does matter.
Panel data - a time series for each observation of a cross section.
Pooled cross sections - combines cross sections from different time periods.

Phases of an econometric model
1. Specification.  2. Estimation.  3. Validation.  4. Utilization.

Regression analysis
Studies and predicts the mean value of a variable (the dependent variable, y) on the basis of fixed values of other variables (the independent variables, the x's). In econometrics it is common to use Ordinary Least Squares (OLS) for regression analysis.

Correlation analysis
Correlation analysis does not distinguish between dependent and independent variables.
- The simple correlation measures the degree of linear association between two variables.
- The partial correlation measures the degree of linear association between two variables while controlling for a third variable.

Assumptions and properties

Econometric model assumptions
Under these assumptions, the OLS estimators of the parameters have good properties. Extended Gauss-Markov assumptions:
1. Linearity in parameters. y must be a linear function of the β's.
2. Random sampling. The sample has been randomly drawn from the population. (Only meaningful when the data are a cross section.)
3. No perfect collinearity.
   - There are no constant independent variables: Var(x) ≠ 0.
   - There is no exact linear relation between independent variables.
4. Conditional mean zero and zero correlation.
   - There are no systematic errors: E(u | x1, ..., xk) = E(u) = 0.
   - No relevant variables are left out of the model: Cov(xj, u) = 0 for any j = 1, ..., k.
5. Homoscedasticity. The variability of the residuals is the same for all levels of x: Var(u | x1, ..., xk) = σ².
6. No autocorrelation. The residuals do not contain information about other residuals: Corr(ut, us | X) = 0 for any t ≠ s. (Only meaningful when the data are a time series.)
7. Normality. The residuals are independent and identically distributed: u ~ N(0, σ²).
8. Data size. The number of observations available must be greater than the (k + 1) parameters to estimate. (Not relevant in asymptotic settings.)

Asymptotic properties of OLS
Under the econometric model assumptions and the Central Limit Theorem:
- Hold 1 to 4: OLS is unbiased, E(β̂j) = βj.
- Hold 1 to 4: OLS is consistent, plim(β̂j) = βj.
- Hold 1 to 5: asymptotic normality of OLS (then 7 is necessarily satisfied): u ~a N(0, σ²).
- Hold 1 to 7: OLS is BLUE (Best Linear Unbiased Estimator) or efficient; hypothesis tests and confidence intervals can then be carried out reliably.

Ordinary Least Squares

Objective - minimize the Sum of Squared Residuals (SSR): min Σ ûi², where ûi = yi − ŷi (sums run over i = 1, ..., n).

Simple regression model
Equation: yi = β0 + β1 x1i + ui
Estimation: ŷi = β̂0 + β̂1 x1i
Where:
β̂0 = ȳ − β̂1 x̄
β̂1 = Cov(y, x) / Var(x)
(Figure: scatter of y against x with the fitted line; β0 is the intercept and β1 the slope.)
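
A minimal Python/numpy sketch of these simple-regression formulas on simulated data (the variable names and simulated values are illustrative, not from the cheat sheet):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 0.5 * x + rng.normal(size=100)   # simulated data: beta0 = 2, beta1 = 0.5

# OLS estimates from the moment formulas above
beta1_hat = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)   # Cov(y, x) / Var(x)
beta0_hat = y.mean() - beta1_hat * x.mean()                  # ybar - beta1_hat * xbar

y_hat = beta0_hat + beta1_hat * x
residuals = y - y_hat
print(beta0_hat, beta1_hat)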

Multiple regression model
Equation: yi = β0 + β1 x1i + ... + βk xki + ui
Estimation: ŷi = β̂0 + β̂1 x1i + ... + β̂k xki
Where:
β̂0 = ȳ − β̂1 x̄1 − ... − β̂k x̄k
β̂j = Cov(y, resid(xj)) / Var(resid(xj)), where resid(xj) is the residual from regressing xj on the other regressors.
In matrix form: β̂ = (Xᵀ X)⁻¹ Xᵀ Y
(Figure: regression plane of y on x1 and x2 with intercept β0.)
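
A short numpy sketch of the matrix-form estimator β̂ = (XᵀX)⁻¹XᵀY, again on simulated data (names are illustrative):

import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # first column of ones for beta0
beta = np.array([1.0, 0.5, -0.3])
y = X @ beta + rng.normal(size=n)

# beta_hat = (X'X)^(-1) X'y ; np.linalg.solve is numerically safer than an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat
print(beta_hat)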

Interpretation of coefficients

Model        Dependent  Independent  β1 interpretation
Level-level  y          x            Δy = β1 Δx
Level-log    y          log(x)       Δy = (β1/100) %Δx
Log-level    log(y)     x            %Δy = (100 β1) Δx
Log-log      log(y)     log(x)       %Δy = β1 %Δx
Quadratic    y          x + x²       Δy = (β1 + 2 β2 x) Δx

Error measures
Sum of Squared Residuals: SSR = Σ ûi²
Explained Sum of Squares: SSE = Σ (ŷi − ȳ)²
Total Sum of Squares:     SST = Σ (yi − ȳ)² = SSE + SSR
Standard error (se) of the regression: σ̂ = sqrt( Σ ûi² / (n − k − 1) )
Root of the Mean Squared Error: sqrt( Σ (ûi − ū)² / n )
Mean Absolute Error: Σ |ûi| / n

Goodness of fit, R-squared
R² is a measure of the goodness of fit (how well the OLS regression fits the data):
R² = Σ (ŷi − ȳ)² / Σ (yi − ȳ)² = SSE/SST = 1 − SSR/SST
- It measures the percentage of the variation of y that is linearly explained by the variations of the x's.
- It takes values between 0 (no linear explanation of the variation of y) and 1 (total explanation of the variation of y).
- When the number of regressors increases, the R² increases as well, whether or not the new variables are relevant.
To eliminate the last problem, there is an R² corrected by degrees of freedom (the adjusted R²):
R̄² = 1 − [Σ ûi² / (n − k − 1)] / [Σ (yi − ȳ)² / (n − 1)] = 1 − (n − 1)/(n − k − 1) · (1 − R²)
For big sample sizes: R̄² ≈ R².
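
A minimal sketch computing the error measures and both R-squared versions for an OLS fit on simulated data (all names and values are illustrative):

import numpy as np

rng = np.random.default_rng(2)
n, k = 150, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.8, 0.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

SSR = np.sum(u_hat ** 2)                        # sum of squared residuals
SST = np.sum((y - y.mean()) ** 2)               # total sum of squares
SSE = SST - SSR                                 # explained sum of squares
sigma_hat = np.sqrt(SSR / (n - k - 1))          # standard error of the regression
r2 = 1 - SSR / SST
r2_adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)   # adjusted R-squared
print(r2, r2_adj, sigma_hat)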

Hypothesis testing

The basics of hypothesis testing
A hypothesis test is a rule designed to determine, from a sample, whether or not there is evidence to reject a hypothesis made about one or more population parameters.
Elements of a hypothesis test:
- Null hypothesis (H0): the hypothesis to be tested.
- Alternative hypothesis (H1): the hypothesis that cannot be rejected when the null hypothesis is rejected.
- Test statistic: a random variable with a known distribution that allows us to decide whether or not to reject the null hypothesis.
- Significance level (α): the probability of rejecting the null hypothesis when it is true (type I error). It is chosen by whoever conducts the test; commonly 0.10, 0.05, 0.01 or 0.001.
- Critical value: the value that, for a given α, determines whether or not the null hypothesis is rejected.
- p-value: the highest level of significance at which we do not reject the null hypothesis (H0). The rule is: if the p-value is lower than α, there is evidence at that α to reject the null hypothesis (and accept the alternative instead).

Individual contrasts
Under the premise of normality of the residuals, test whether a given parameter is significantly different from a given value θ.
- H0: βj = θ
- H1: βj ≠ θ
Under H0:  t = (β̂j − θ) / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject the null hypothesis.

Individual significance contrasts - test whether a given parameter is significantly different from zero.
- H0: βj = 0
- H1: βj ≠ 0
Under H0:  t = β̂j / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject the null hypothesis.

Confidence intervals
Under the normality-of-residuals requirement, confidence intervals at the 1 − α confidence level can be calculated:
β̂j ∓ t(n−k−1, α/2) · se(β̂j)
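
A sketch of the individual significance test and confidence interval for one coefficient, using numpy and scipy.stats on simulated data (names and values are illustrative):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, k = 120, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.4, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = np.sum(u_hat ** 2) / (n - k - 1)
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)   # Var(beta_hat) = sigma^2 (X'X)^-1
se = np.sqrt(np.diag(var_beta))

j, alpha = 1, 0.05
t_stat = beta_hat[j] / se[j]                                 # H0: beta_j = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - k - 1)
ci = (beta_hat[j] - t_crit * se[j], beta_hat[j] + t_crit * se[j])
print(t_stat, p_value, ci)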

The F contrast
It uses a non-restricted model and a restricted model to test hypotheses about the parameters.
- Non-restricted model: the model on which we want to test the hypothesis.
- Restricted model: the model on which the hypothesis we want to test has been imposed.
Then, looking at the errors, there are:
- SRN = Σ ûN,i²: the sum of the squared OLS residuals of the non-restricted model.
- SRR = Σ ûR,i²: the sum of the squared OLS residuals of the restricted model.
Then:
F = [(SRR − SRN)/q] / [SRN/(n − K − 1)] ~ F(q, n−K−1)
where K is the number of parameters of the non-restricted model and q is the number of linear hypotheses. If F > F(q, n−K−1, α), there is evidence to reject the null hypothesis.
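
A sketch of the F contrast comparing a restricted and a non-restricted model; here the (illustrative) restriction drops the last two regressors of a simulated model:

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])   # non-restricted: constant + 3 regressors
y = X @ np.array([1.0, 0.6, 0.0, 0.0]) + rng.normal(size=n)

def ols_ssr(X, y):
    # sum of squared OLS residuals
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    return np.sum((y - X @ beta) ** 2)

K = X.shape[1] - 1              # slope parameters in the non-restricted model
q = 2                           # H0: beta_2 = beta_3 = 0 (two linear restrictions)
SRN = ols_ssr(X, y)             # non-restricted
SRR = ols_ssr(X[:, :2], y)      # restricted: drop the last two columns

F = ((SRR - SRN) / q) / (SRN / (n - K - 1))
p_value = stats.f.sf(F, q, n - K - 1)
print(F, p_value)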

Dummy variables and structural change
Dummy (or binary) variables are used for qualitative information: sex, civil status, etc.
- A dummy variable takes the value 1 in a given category and 0 in the rest.
- Dummy variables are used to analyze and model structural changes in the model parameters.
If a qualitative variable has m categories, only m − 1 dummy variables have to be included.

Structural change
Structural change refers to modifications in the values of the model parameters across different sub-populations. The position of the dummy variable matters:
- On the constant, the associated parameter represents the difference in means between the categories.
- On a parameter that determines the slope of the regression line, the associated parameter represents the difference in the effect between the categories.

The Chow structural contrast
When we want to analyze the existence of structural changes in all the model parameters, it is more common to use a particular expression of the F contrast known as the Chow contrast.
It defines two non-restricted models (with structural change):
yi = β0A + β1A x1i + ... + βkA xki + ui   from sub-sample A
yi = β0B + β1B x1i + ... + βkB xki + ui   from sub-sample B
and a restricted model (without structural change):
yi = β0 + β1 x1i + ... + βk xki + ui
with the restriction H0: βjA = βjB for j = 0, 1, ..., k (there is no structural difference).
- Let SRN be the sum of the squared OLS residuals of the non-restricted models: SRN = SRA + SRB.
- Let SRR be the sum of the squared OLS residuals of the restricted model.
Then:
F = [(SRR − SRN)/(k + 1)] / [SRN/(n − 2(k + 1))] ~ F(k+1, n−2(k+1))
If F > F(k+1, n−2(k+1), α), there is evidence to reject the null hypothesis.
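
A sketch of the Chow contrast for a single (illustrative) split point on simulated data, reusing the same SSR idea:

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 240, 1
x = rng.normal(size=n)
group_b = np.arange(n) >= n // 2                           # sub-sample split (illustrative)
y = 1.0 + 0.5 * x + 0.8 * group_b + rng.normal(size=n)     # intercept shifts in sub-sample B

def ssr(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    return np.sum((y - X @ beta) ** 2)

SRA = ssr(x[~group_b], y[~group_b])   # non-restricted, sub-sample A
SRB = ssr(x[group_b], y[group_b])     # non-restricted, sub-sample B
SRN = SRA + SRB
SRR = ssr(x, y)                       # restricted: pooled model, no structural change

F = ((SRR - SRN) / (k + 1)) / (SRN / (n - 2 * (k + 1)))
p_value = stats.f.sf(F, k + 1, n - 2 * (k + 1))
print(F, p_value)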

Multicollinearity
If there is exact multicollinearity, the OLS equation system cannot be solved because it has infinite solutions.
- Approximate multicollinearity: one or more variables are almost constant, or there is an approximate linear relation between them. In this context there is not a problem, given the classic requirements of OLS, and the inference is valid. But there are some empirical consequences:
  – Small variations in the sample can induce big variations in the OLS estimations.
  – The variance of the OLS estimators of the collinear x's, Var(β̂j), increases, so the inference about the parameter is affected: the estimation of the parameter is very imprecise (big confidence interval).
The Variance Inflation Factor is calculated to analyze multicollinearity problems:
VIF(β̂j) = 1 / (1 − Rj²)
where Rj² is the R-squared of regressing xj on the other regressors. It indicates the increase in Var(β̂j) caused by the multicollinearity.
- If it is bigger than 10, it indicates that there are multicollinearity problems.
- From 4 onwards, it is advisable to analyze in more detail whether there might be multicollinearity.
One typical characteristic of multicollinearity is that the regression coefficients of the model are not individually different from zero (because of the high variances), but jointly they are different from zero.
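
A compact sketch of the VIF computation for each regressor, assuming a design matrix with a constant column (all names and the simulated collinearity are illustrative):

import numpy as np

rng = np.random.default_rng(6)
n = 300
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)     # x2 is highly collinear with x1 (illustrative)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x3])

def vif(X, j):
    # VIF of regressor j: regress column j on the remaining columns (constant included)
    others = np.delete(X, j, axis=1)
    beta = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    resid = X[:, j] - others @ beta
    r2_j = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return 1 / (1 - r2_j)

print([round(vif(X, j), 1) for j in range(1, X.shape[1])])   # skip the constant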

Heteroscedasticity
The residuals ui of the population regression function do not have the same variance σ²:
Var(u|x) = Var(y|x) ≠ σ²
Consequences
- The estimators are still unbiased.
- The estimators are still consistent.
- The variance estimations of the estimators are biased: the construction of confidence intervals and the hypothesis contrasts is not reliable.
In this context, OLS is no longer the linear unbiased estimator of minimum variance. There is an alternative linear unbiased estimator of minimum variance, weighted least squares (WLS) or generalized least squares (GLS).
Detection
- Plots. Look for structures in plots of the squared residuals.
- Contrasts: Park test, Goldfeld-Quandt, Bartlett, Breusch-Pagan, CUSUMQ, Spearman, White. White test null hypothesis: H0 = homoscedasticity.
Correction
- When the variance structure is known, use weighted least squares.
- When the variance structure is not known, make assumptions about its possible structure and apply feasible weighted least squares.
- Supposing that σi² is proportional to xi², divide the model by xi.
- A new model specification, for example a logarithmic transformation.
- Standard errors corrected for heteroscedasticity by White's method.

Auto-correlation
The residual of any observation, ui, is correlated with the residual of any other observation: the observations are not independent, E(ui, uj) ≠ 0 for i ≠ j. The "natural" context of this phenomenon is time series.
Consequences
OLS estimators are no longer efficient; because they are not efficient, the variance estimations of the estimators are biased, and hypothesis contrasts and confidence intervals are not reliable.
Detection
- Graphic residual analysis. There are auto-correlation structures that can be identified in a plot of ut against ut−1, for example AR(+) and AR(−) patterns.
- Formal contrasts: Breusch-Godfrey. It allows:
  – dynamic models;
  – ut following an auto-regressive model of order ρ;
  – moving averages of the error term.
  H0: no auto-correlation.  H1: ut ~ AR(ρ) or ut ~ MA(q).
Correction

Prediction
Two types of prediction:
- Prediction of the mean value of y for a specific value of x.
- Prediction of an individual value of y for a specific value of x.
If the values of the variables (x) are close to their mean values (x̄), the confidence interval amplitude will be smaller.
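
As a closing example, a sketch of the two formal contrasts mentioned in the Detection subsections above, implemented directly as auxiliary regressions with numpy/scipy: a Breusch-Pagan-type check (squared residuals on the regressors) and a Breusch-Godfrey-type check with one lag (residuals on the regressors and the lagged residual). These are simplified versions for illustration; all data and names are assumptions of the sketch.

import numpy as np
from scipy import stats

def ols(X, y):
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    return beta, y - X @ beta

rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=n)
u = rng.normal(size=n) * (1 + 0.8 * np.abs(x))        # heteroscedastic errors (illustrative)
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(n), x])
_, u_hat = ols(X, y)

# Breusch-Pagan-type contrast: regress squared residuals on the regressors, LM = n * R^2
_, e_bp = ols(X, u_hat ** 2)
r2_bp = 1 - np.sum(e_bp ** 2) / np.sum((u_hat ** 2 - (u_hat ** 2).mean()) ** 2)
lm_bp = n * r2_bp
print("BP:", lm_bp, stats.chi2.sf(lm_bp, df=1))        # df = regressors excluding the constant

# Breusch-Godfrey-type contrast with one lag: regress residuals on X and the lagged residual
Xg = np.column_stack([X[1:], u_hat[:-1]])
_, e_bg = ols(Xg, u_hat[1:])
r2_bg = 1 - np.sum(e_bg ** 2) / np.sum((u_hat[1:] - u_hat[1:].mean()) ** 2)
lm_bg = (n - 1) * r2_bg
print("BG:", lm_bg, stats.chi2.sf(lm_bg, df=1))        # df = number of lags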