Basic Regression Analysis
Categories of econometrics
Functions of econometrics
Measurement of economic relations
In many cases we can apply the various econometric techniques in
order to obtain estimates of the individual coefficients of the
economic relationships, from which we may evaluate other
parameters of economic theory.
Verification of economic theory
Econometrics aims primarily at the verification of economic
theories. In this case we can say that the purpose of the research
analysis is to obtain empirical evidence to test the explanatory
power of economic theories, that is, to decide how well they explain
the observed behavior of economic units.
Forecasting
In formulating policy decisions it is essential to be able to forecast
the value of the economic magnitudes. Such forecasts will enable
the policy makers to judge whether it is necessary to take any
measure in order to influence the relevant economic variable.
Regression analysis
The term regression was introduced by Francis Galton. Regression analysis
is concerned with the study of the dependence of one variable, the dependent
variable, on one or more explanatory variables.
3. Zero mean value of ui.
The positive ui values cancel out the negative ui values so that their mean
effect on Y is zero.
4. Homoscedasticity or equal variance of ui.
Given the value of X, the variance
of ui is the same for all observations. That is, the conditional variances of ui
are identical.
Symbolically, we have
var(ui | Xi) = E[ui − E(ui | Xi)]²
= E(ui² | Xi) because of Assumption 3
= σ²
where var stands for variance.
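A small simulation can illustrate this assumption. The following sketch (the data and variable names are hypothetical, not from the text) draws disturbances at several fixed values of X and shows that their variance does not depend on X:

```python
import numpy as np

# Minimal simulation sketch of homoscedasticity: at every fixed value of X,
# the disturbances u_i share the same variance sigma^2.
rng = np.random.default_rng(0)
sigma2 = 4.0                       # the common variance sigma^2 (hypothetical)
x_values = [1.0, 2.0, 3.0]

variances = []
for x in x_values:
    # Draw many disturbances at this fixed X; their spread does not depend on X.
    u = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=100_000)
    variances.append(u.var())

print(variances)   # each entry is close to sigma2 = 4.0
```

Under heteroscedasticity, by contrast, the scale passed to the draw would change with X, and the three sample variances would differ systematically.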
Diagrammatically, homoscedasticity means that the spread of the ui values
around the regression line is the same at every value of X.
There must not be any relation between the residual term and the X variable
which is to say that they are uncorrelated. This is to mean that the variables
left unaccounted for in the residual should have no relationship with the
variable X included in the model.
7. The number of observations n must be greater than the number of
parameters to be estimated. Alternatively, the number of
observations n must be greater than the number of explanatory
variables.
8. Variability in X values.
The X values in a given sample must not all be the same. Technically, var (X)
must be a finite positive number.
X cannot be a constant within a given sample, since we are interested in how
variation in X affects the variation in Y. If all values of X are identical, it
will be impossible to estimate the parameters.
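This can be seen directly from the formula for the OLS slope, which is cov(X, Y)/var(X). The sketch below (hypothetical data) computes the slope for a sample with variation in X, and then shows that var(X) = 0 when all X values are identical, so the slope would require dividing by zero:

```python
import numpy as np

# The OLS slope is cov(X, Y) / var(X); it is undefined when var(X) = 0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x                  # exact linear relation, hypothetical data

var_x = ((x - x.mean()) ** 2).mean()
cov_xy = ((x - x.mean()) * (y - y.mean())).mean()
slope = cov_xy / var_x             # well defined because var(X) > 0
print(slope)                       # 3.0

x_const = np.full(5, 2.0)          # all X values identical
var_x_const = ((x_const - x_const.mean()) ** 2).mean()
print(var_x_const)                 # 0.0: the slope cannot be computed
```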
9. The regression model is correctly specified. Alternatively, there is no
specification bias or error in the model used in empirical analysis.
By omitting important variables from the model, or by choosing the wrong
functional form, or by making wrong stochastic assumptions about the
variables of the model, the validity of interpreting the estimated regression
will be highly questionable.
10. There is no perfect multicollinearity.
That is, there are no perfect linear relationships among the explanatory
variables.
This assumption relates to multiple regression models. It requires that in the
regression function we include only those variables that are not exact linear
functions of one or more other variables in the model.
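The consequence of violating this assumption can be sketched as follows (hypothetical data): when one regressor is an exact linear function of another, the X'X matrix is singular and the OLS estimates are not uniquely determined.

```python
import numpy as np

# Perfect multicollinearity: x2 is an exact linear function of x1,
# so the X'X matrix loses rank and cannot be inverted.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1                          # exact linear function of x1
X = np.column_stack([np.ones_like(x1), x1, x2])

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))      # 2, not 3: X'X is singular
```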
Regression through the origin vs. the conventional model
In the model with the intercept term absent, we use raw sums of squares and
cross products, but in the intercept-present model, we use adjusted sums of
squares and cross products.
The df for computing σ̂² is (n − 1) in the first case and (n − 2) in the second
case.
The sum of the residuals, Σûi, which is always zero for the model with the
intercept term, need not be zero when that term is absent.
The coefficient of determination is always nonnegative for the conventional
model, but can on occasion turn out to be negative for the interceptless model.
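These differences can be verified numerically. The sketch below (all data hypothetical) fits the same sample with and without an intercept, using plain least squares, and compares the residual sums:

```python
import numpy as np

# Compare the intercept-present and intercept-absent fits on the same data.
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 50)
y = 5.0 + 2.0 * x + rng.normal(0, 1, size=50)   # true intercept is nonzero

# Intercept-present (conventional) model: regress y on [1, x].
X_with = np.column_stack([np.ones_like(x), x])
b_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)
resid_with = y - X_with @ b_with

# Intercept-absent model (regression through the origin): regress y on x alone.
b_zero = (x @ y) / (x @ x)
resid_zero = y - b_zero * x

print(resid_with.sum())    # ~0 by construction
print(resid_zero.sum())    # generally not zero
```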
Scaling and Units of Measurement
The units and scale in which the regressand and the regressor(s) are
expressed are very important, because the interpretation of regression
coefficients critically depends on them.
Yi = β1 + β2Xi + ui
Now rescale Yi and Xi using the constants w1 and w2, called the scale factors:
Yi* = β1* + β2*Xi* + ui*
where
Yi* = w1Yi
Xi* = w2Xi
If the scaling factors w1 = w2, the slope coefficient and its standard
error remain unaffected; the intercept and its standard error are
both multiplied by w1
If the X scale is not changed (i.e., w2 = 1) and the Y scale is
changed by the factor w1, the slope as well as the intercept
coefficients and their respective standard errors are all multiplied
by the same w1 factor.
If the Y scale remains unchanged (i.e., w1 = 1) but the X scale is
changed by the factor w2, the slope coefficient and its standard
error are multiplied by the factor (1/w2) but the intercept coefficient
and its standard error remain unaffected.
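The three scaling rules above can be checked numerically. The sketch below (hypothetical data) fits the same model before and after rescaling and compares the coefficients:

```python
import numpy as np

# Verify how rescaling Y or X changes the OLS intercept and slope.
rng = np.random.default_rng(2)
x = np.linspace(0, 10, 100)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2, size=100)

def ols(x, y):
    """Return (intercept, slope) from a simple OLS fit of y on [1, x]."""
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

b1, b2 = ols(x, y)          # original coefficients
w1, w2 = 100.0, 10.0        # hypothetical scale factors

# Y rescaled by w1 (w2 = 1): both coefficients are multiplied by w1.
a1, a2 = ols(x, w1 * y)
print(a1 / b1, a2 / b2)     # both ratios ~100

# X rescaled by w2 (w1 = 1): slope multiplied by 1/w2, intercept unchanged.
c1, c2 = ols(w2 * x, y)
print(c1 / b1, c2 / b2)     # ratios ~1 and ~0.1
```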
Dummy Variables
Qualitative variables usually indicate the presence or absence of a quality
or an attribute, such as male or female, black or white, Democrat or
Republican. We can construct artificial variables that take on values of 1 or
0, 1 indicating the presence of that attribute and 0 indicating the absence of
that attribute.
For example, 1 may indicate that a person is female and 0 may designate a
male. Variables that assume such 0 and 1 values are called dummy
variables.
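A minimal sketch of such a regression, with hypothetical data, shows how the dummy coefficient is interpreted: the intercept is the mean of the 0 group, and the dummy coefficient is the difference between the two group means.

```python
import numpy as np

# Regress Y on an intercept and a 0/1 dummy D (1 = female, 0 = male).
y_male = np.array([10.0, 12.0, 11.0])     # D = 0 observations (hypothetical)
y_female = np.array([14.0, 15.0, 16.0])   # D = 1 observations (hypothetical)

y = np.concatenate([y_male, y_female])
d = np.array([0, 0, 0, 1, 1, 1], dtype=float)

X = np.column_stack([np.ones_like(d), d])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b[0])   # 11.0: the mean of the D = 0 group
print(b[1])   # 4.0: female mean (15) minus male mean (11)
```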
Dummy variable trap
Where you have a dummy variable for each category or group and also an
intercept, you have a case of perfect collinearity, that is, exact linear
relationships among the variables. This situation is called the dummy
variable trap. If a qualitative variable has m categories, introduce only
(m − 1) dummy variables.
ANOVA models
Regression models which contain regressors that are all exclusively dummy,
or qualitative, in nature are called analysis of variance (ANOVA) models.
ANCOVA models
Regression models containing an admixture of quantitative and qualitative
variables are called analysis of covariance (ANCOVA) models.
QUALITATIVE RESPONSE REGRESSION MODELS
There are three approaches to developing a probability model for a binary
response variable:
1. The linear probability model (LPM)
2. The logit model
3. The probit model
Problems of LPM
Nonfulfillment of 0 ≤ E(Yi | X) ≤ 1
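This problem is easy to demonstrate: fitting a 0/1 outcome by ordinary least squares does not constrain the fitted values to lie between 0 and 1. The sketch below (hypothetical data) shows fitted "probabilities" falling outside that range:

```python
import numpy as np

# Fit a linear probability model (OLS on a binary response) and inspect
# the fitted values, which are not restricted to the [0, 1] interval.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0])   # binary response

X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ b

print(fitted)   # some fitted "probabilities" fall below 0 or above 1
```

The logit and probit models avoid this by passing the linear predictor through a cumulative distribution function, which maps it into (0, 1) by construction.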