Econometrics (Bigb3n)
Econometrics deals with the measurement of economic relationships. The term 'econometrics' is
formed from two words of Greek origin, oikonomia (economy), and metron (measure).
Econometrics is a combination of economic theory, mathematical economics and statistics, but it
is completely distinct from each one of these three branches of science.
It is the unification of all three that is powerful, and it is this unification that constitutes econometrics.
Thus ECONOMETRICS may be considered as the integration of economics, mathematics and statistics for the purpose of providing numerical values for the parameters of economic relationships (for example, elasticities, marginal propensities) and verifying economic theories.
Starting from the relationships of economic theory, we express them in mathematical terms (i.e. we build a model) so that they can be measured. We then use specific methods, called econometric methods, in order to obtain numerical estimates of the coefficients of the economic relationships. Econometric methods are statistical methods specifically adapted to the peculiarities of economic phenomena. The most important characteristic of economic relationships is that they contain a random element (the stochastic error term or disturbance term) which, however, is ignored by economic theory and mathematical economics, both of which postulate exact (deterministic) relationships between the various economic variables.
An example will make the above clear. Economic theory postulates that the demand for a
commodity depends on its price, on the prices of other commodities, on consumers' income and
on tastes. This is an exact relationship, because it implies that demand is completely
determined by the above four factors. No other factor, except those explicitly mentioned,
influences the demand. In mathematical economics we express the above abstract economic relationship of demand in mathematical form. Thus we may write the demand equation as
Q = b0 + b1P + b2Po + b3Y + b4T
where Q is the quantity demanded, P is the commodity's own price, Po is the prices of other commodities, Y is consumers' income and T is tastes.
Yet it is common knowledge that in economic life many more factors may affect
demand. The invention of a new product, a war, professional changes, institutional changes,
changes in law, changes in income distribution, massive population movements (migration),
etc., are examples of such factors. Furthermore, human behavior is inherently erratic. We are influenced by rumors, dreams, prejudices, traditions and other psychological and sociological factors, which make us behave differently even though the conditions in the market (prices) and our incomes remain the same. In econometrics, the influence of these 'other' factors is taken into account by the introduction into the economic relationships of a random variable, u, known as the stochastic (disturbance) term.
Hence, the initial exact model now becomes an econometric model:
Q = b0 + b1P + b2Po + b3Y + b4T + u
where the random variable u covers all the other unaccounted-for factors that affect demand (Q).
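As a rough illustration, the sketch below (Python, with made-up coefficients and simulated data; none of the numbers come from the notes) generates observations from this demand model and shows that the gap between observed demand and its deterministic part is exactly the disturbance u:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Hypothetical exogenous data: own price, other prices, income, a tastes index
P = rng.uniform(1, 10, n)
Po = rng.uniform(1, 10, n)
Y = rng.uniform(20, 100, n)
T = rng.uniform(0, 1, n)

# Made-up structural coefficients, purely for illustration
b0, b1, b2, b3, b4 = 50.0, -2.0, 0.8, 0.3, 5.0

u = rng.normal(0, 3, n)                      # stochastic disturbance term
Q_exact = b0 + b1*P + b2*Po + b3*Y + b4*T    # deterministic part (mathematical economics)
Q = Q_exact + u                              # econometric model: exact part + random element

print(Q[:5] - Q_exact[:5])                   # the gap is exactly the unobserved u
```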
GOALS OF ECONOMETRICS
1. Analysis, i.e. testing economic theory.
2. Policy-making, i.e. supplying numerical estimates of the coefficients of economic relationships, which may then be used for decision-making.
3. Forecasting, i.e. using the numerical estimates of the coefficients in order to forecast the future values of the economic magnitudes.
TYPES OF DATA
1. Time Series Data: A time series is a set of observations on the values that a variable takes at different times. Such data may be collected at regular time intervals: daily (e.g. stock prices, weather reports), weekly (e.g. money supply figures), monthly [e.g. the unemployment rate, the Consumer Price Index (CPI)], quarterly (e.g. GDP), annually (e.g. government budgets), quinquennially, that is, every 5 years (e.g. the census of manufactures), or decennially (e.g. the census of population). For example, a country's GDP figures for 2000-2024 form a time series. Time series observations are denoted by the subscript t.
2. Cross-Section Data: Cross-section data are data on one or more variables collected at the same point in time, for example, data on inflation for five countries taken for just one year. Cross-section observations are denoted by the subscript i.
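The sketch below (Python with pandas; hypothetical numbers) puts the two data types side by side, with t indexing time and i indexing units:

```python
import pandas as pd

# Time series: one variable (GDP) for one country, indexed by time t
gdp_t = pd.Series([1.2, 1.4, 1.5, 1.7],
                  index=pd.period_range("2021", periods=4, freq="Y"),
                  name="GDP")

# Cross-section: inflation for five countries in a single year, indexed by unit i
inflation_i = pd.Series([3.1, 7.4, 2.2, 5.0, 9.8],
                        index=["Country A", "Country B", "Country C",
                               "Country D", "Country E"],
                        name="Inflation_2024")

print(gdp_t, inflation_i, sep="\n\n")
```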
METHODOLOGY OF ECONOMETRIC RESEARCH
STAGE A: SPECIFICATION OF THE MODEL
The first step in any econometric research is the specification of the model with which one will attempt the measurement of the phenomenon being analyzed. This stage is also known as the formulation of the maintained hypothesis. In this stage, the VARIABLES OF THE MODEL and the MATHEMATICAL FORM OF THE MODEL are determined.
STAGE B: ESTIMATION OF THE MODEL
After the formulation of the model, one should obtain estimates of its parameters. The second stage includes the estimation of the model by means of the appropriate econometric method. This stage is known as the testing of the maintained hypothesis. Here, the data for the estimation of the model are gathered, the identification condition of the function is examined, the degree of correlation between the explanatory variables is examined, and the appropriate econometric technique for estimation is chosen.
STAGE C: EVALUATION OF THE ESTIMATES
Once the model has been estimated, one should proceed with the evaluation of the estimates, that is to say, decide on the basis of certain criteria whether the estimates are satisfactory and reliable. Three major criteria must be satisfied to underpin the validity, accuracy and reliability of the estimates:
First, the economic a priori criteria, which are determined by economic theory. Second, the statistical criteria, determined by statistical theory. Third, the econometric criteria, determined by econometric theory.
STAGE D: EVALUATION OF THE FORECASTING POWER OF THE MODEL
The final stage of any econometric research is concerned with the evaluation of the forecasting validity of the model. Estimates are useful because they help in decision-making. A model, after the estimation of its parameters, can be used in forecasting the values of economic variables. The econometrician must ascertain how good the forecasts are expected to be; in other words, he must test the forecasting power of the model.
IMPORTANT NOTE: Stages A and C are the most important for any econometric research. They require the skills of an economist with experience of the functioning of the economic system. Stages B and D, on the other hand, are technical and require knowledge of theoretical econometrics.
REGRESSION ANALYSIS
Regression analysis is concerned with describing and evaluating the relationship between a given variable, usually called the dependent variable, and one or more other variables, usually known as the independent variable(s).
Some alternative names for the dependent variable (the Y variable) are: explained variable, predictand, regressand, response, endogenous, outcome, controlled variable, effect variable.
Other names for the independent variables (the X variables) are: explanatory variable, predictor, regressor, stimulus, exogenous, covariate, control variable, cause variable.
REGRESSION VS CAUSATION
Although regression analysis deals with the dependence of one variable on other variables, it
does not necessarily imply causation. For example, in a crop yield-rainfall scenario, crop yield (the dependent variable) depends on rainfall (the independent variable), but rainfall does not depend on crop yield; i.e. X causes Y, but Y does not cause X. This means that there exists only one-way causation, not two-way causation, here.
REGRESSION VS CORRELATION
The primary objective of correlation analysis is to measure the strength or degree of linear
association between two variables. For example, we may be interested in finding the correlation
(coefficient) between smoking and lung cancer, between scores on statistics and mathematics
examinations, between high school grades and college grades, and so on. But in regression
analysis, we try to estimate or predict the average value of one variable on the basis of the fixed
values of other variables. Thus, we may want to know whether we can predict the average
score on a statistics examination by knowing a student’s score on a mathematics examination.
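The contrast can be seen in a few lines of Python (hypothetical exam scores): correlation gives a single symmetric measure of association, while regression predicts the average statistics score for a fixed mathematics score:

```python
import numpy as np

# Hypothetical scores for 8 students: mathematics (X) and statistics (Y)
math_score = np.array([45, 52, 60, 63, 70, 74, 80, 88])
stat_score = np.array([40, 50, 55, 65, 68, 75, 78, 90])

# Correlation analysis: a single symmetric measure of linear association
r = np.corrcoef(math_score, stat_score)[0, 1]

# Regression analysis: predict the average statistics score for a fixed maths score
b, a = np.polyfit(math_score, stat_score, 1)   # slope b, intercept a
predicted = a + b * 75                         # expected statistics score when maths = 75

print(f"r = {r:.3f}; predicted statistics score at maths = 75: {predicted:.1f}")
```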
If we are studying the dependence of a variable on only a single explanatory variable, such as
that of consumption expenditure on real income, such a study is known as simple, or
two-variable (bivariate) regression analysis. However, if we are studying the dependence of one
variable on more than one explanatory variable, as in the effect of rainfall, temperature,
sunshine, and fertilizer on crop yield, it is known as multiple (multivariate) regression analysis. In
other words, in simple regression there is only one explanatory variable, whereas in multiple
regression there is more than one explanatory variable.
ORDINARY LEAST SQUARES (OLS)
This is the commonest method used to estimate the parameters in a linear regression model. It minimizes the sum of the squared differences between the observed values of the dependent variable (Y) and the values of Y predicted by the model. It is used to find the line of best fit.
For a simple regression, the general equation is
Y = a + bX.
However, this equation is completely deterministic. Is this realistic? No. So what we do is add a random disturbance term, u, into the equation.
Now we have the new statistical/econometric equation:
Y = a + bX + u
PRF VS SRF
The population regression function (PRF) is a description of the model that is thought to be generating the actual data, i.e. the true relationship between the variables (the true values of a and b).
The sample regression function (SRF), on the other hand, is the relationship estimated from sample data, used to infer the likely values of the PRF parameters (i.e. hat(a) and hat(b)).
Since the PRF parameters are usually unknown, we superimpose the SRF on the PRF to obtain:
Y = hat(a) + hat(b)X + hat(u)
which is the same as Y = hat(Y) + hat(u),
interpreted as Actual = Estimated + Error.
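A minimal sketch (Python; data simulated under assumed true values a = 2 and b = 0.5) of estimating the SRF by OLS and checking the decomposition Actual = Estimated + Error:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.uniform(0, 10, n)
u = rng.normal(0, 1, n)
Y = 2.0 + 0.5 * X + u          # PRF with (normally unknown) true values a = 2, b = 0.5

# OLS estimators:
#   hat(b) = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2),  hat(a) = Ybar - hat(b)*Xbar
b_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a_hat = Y.mean() - b_hat * X.mean()

Y_hat = a_hat + b_hat * X      # SRF: fitted (estimated) values
u_hat = Y - Y_hat              # residuals

print(a_hat, b_hat)                    # should be close to 2 and 0.5
print(np.allclose(Y, Y_hat + u_hat))   # Actual = Estimated + Error -> True
```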
LINEARITY
In order to use OLS, we need a model which is linear in the parameters (a and b). It does not necessarily have to be linear in the variables (Y and X).
Linearity in parameters means the degree (power) of each parameter of the model is 1.
Linearity in variables means the degree (power) of each variable of the model is 1.
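For instance (illustrative equations, not from the notes), the following contrasts the two kinds of linearity:

```latex
\begin{align*}
Y &= a + bX       && \text{linear in both parameters and variables}\\
Y &= a + b\ln X   && \text{linear in parameters, nonlinear in the variable (OLS still applies)}\\
Y &= a + b^{2}X   && \text{nonlinear in the parameter } b \text{ (OLS does not apply directly)}
\end{align*}
```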
PROPERTIES OF OLS
Under the classical assumptions, the OLS estimators are BLUE (Best Linear Unbiased Estimators): they are linear functions of the observations, they are unbiased, and they have the minimum variance among all linear unbiased estimators (the Gauss-Markov theorem).
TIME SERIES ANALYSIS
Time series analysis is a specific way of analyzing a sequence of data points collected over an
interval of time. In time series analysis, analysts record data points at consistent intervals over a
set period of time rather than just recording the data points intermittently or randomly
Time series analysis is used for non-stationary data, that is, things that are constantly fluctuating over time or are affected by time. Industries like finance, retail, and economics frequently use time series analysis because currency values and sales are always changing. Stock market analysis is an excellent example of time series analysis in action, especially with automated trading algorithms. Likewise, time series analysis is ideal for forecasting weather changes, helping meteorologists predict everything from tomorrow's weather report to future years of climate change.
Examples of time series analysis in action include: Weather data, Rainfall measurements,
Temperature readings, Heart rate monitoring (EKG), Brain monitoring (EEG), Quarterly sales,
Stock prices, Automated stock trading, Industry forecasts, Interest rates
USES OF TIME SERIES ANALYSIS
1. For forecasting
2. For prediction
3. To understand the underlying cause or systematic pattern over time
4. To show the likely changes in data behavior
5. To determine the likelihood of future events
DATA CLASSIFICATION
Stock time series data means measuring attributes at a certain point in time, like a static
snapshot of the information as it was.
Flow time series data means measuring the activity of the attributes over a certain period,
which is generally part of the total whole and makes up a portion of the results.
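A small sketch (Python with pandas; made-up daily deposits) of the distinction: the monthly flow is the activity summed over the month, while the stock is the balance observed at month end, a snapshot:

```python
import numpy as np
import pandas as pd

# Made-up daily deposits into an account over one quarter
days = pd.date_range("2024-01-01", "2024-03-31", freq="D")
deposits = pd.Series(np.random.default_rng(2).integers(0, 100, len(days)),
                     index=days, name="daily_deposit")

# Flow: total activity over each month; Stock: balance snapshot at month end
flow = deposits.resample("ME").sum()              # "ME" = month end ("M" on older pandas)
stock = deposits.cumsum().resample("ME").last()

print(pd.DataFrame({"flow": flow, "stock": stock}))
```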
DATA VARIATIONS
Functional analysis can pick out the patterns and relationships within the data to identify
notable events.
Trend analysis means determining consistent movement in a certain direction. There are two
types of trends: deterministic, where we can find the underlying cause, and stochastic, which is
random and unexplainable.
Seasonal variation describes events that occur at specific and regular intervals during the
course of a year. Serial dependence occurs when data points close together in time tend to be
related.
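Trend and seasonal variation can be separated with a classical decomposition; the sketch below (Python, simulated monthly data with an assumed linear trend and 12-month cycle) uses statsmodels' seasonal_decompose:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Simulated monthly series: linear trend + 12-month seasonal cycle + noise
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(3).normal(0, 2, 96), index=idx)

result = seasonal_decompose(y, model="additive", period=12)
print(result.trend.dropna().head())   # estimated trend component
print(result.seasonal.head(12))       # repeating seasonal pattern
```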
A unit root is a feature of some stochastic processes that indicates nonstationarity: a unit root process is nonstationary, but its first difference is stationary (the process is said to be integrated of order one).
A unit root test tests whether a time series variable is non-stationary and possesses a unit root. The null hypothesis is generally defined as the presence of a unit root, and the alternative hypothesis is either stationarity, trend stationarity or an explosive root, depending on the test used, i.e.
H0: There is a unit root
H1: There is no unit root
In general, the approach to unit root testing implicitly assumes that the time series to be tested
can be written as,
Yt = Dt + Zt + Et
Dt is the deterministic component (trend, seasonal component, etc.)
Zt is the stochastic component.
Et is the stationary error process.
The task of the test is to determine whether the stochastic component contains a unit root or is
stationary.
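As a concrete example, the augmented Dickey-Fuller (ADF) test in statsmodels applies exactly this H0/H1 pair; the sketch below (Python, simulated series) runs it on a random walk (which has a unit root) and on white noise (which does not):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(4)
random_walk = np.cumsum(rng.normal(size=500))   # has a unit root (nonstationary)
white_noise = rng.normal(size=500)              # stationary

for name, series in [("random walk", random_walk), ("white noise", white_noise)]:
    stat, pvalue, *_ = adfuller(series)
    # H0: unit root present; a small p-value rejects H0, i.e. the series is stationary
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
```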
FACTORS AFFECTING STATIONARITY TEST RESULTS
1. Impact of Time Series Length on Test Results: Sensitivity to series length varies across tests. The length of the time series must not influence a test's ability to provide unbiased results: for any given length, a test should ideally yield an unbiased outcome, so it is necessary to study the impact of length on the test results. This provides substantial information about a test's efficacy when dealing with time series of different lengths. The effect of time series length on test results can be noted by comparing the test results and critical values for various lengths, which helps to assess how reliably a test declares a time series stationary or nonstationary.
2. Impact of Time Series Clustering on Test Results: Some stationarity tests function by dividing the time series into various fragments. All these fragments are compared and analyzed, and test results are obtained. Among the considered tests, Levene's, KW, and two-way KS tests examine the time series by dividing it into groups or clusters. These fragments, groups, or clusters are created by taking parts of the time series without any intermixing of data. The size of the group can cause notable variations in test results, so it is crucial to know the appropriate group size for each test in order to obtain accurate and unbiased results. Considering a very small or very large group size may lead to significant discrepancies in test results.
3. Impact of Time Series Facets on Test Results: Trend and volatility effects in a seasonal time series account for its nonstationarity and might bias the test results [32]. It is also possible that the tests may overlook these effects and fail to yield unbiased results. It is vital to notice the changes in test results and critical values due to the trend effect, since this facet causes nonstationarity in a time series. Such an analysis can help in understanding how impactful a trend effect can be in making a test biased.
PHILLIPS-PERRON TEST
The Phillips–Perron test (named after Peter C. B. Phillips and Pierre Perron) is a unit root test.
That is, it is used in time series analysis to test the null hypothesis that a time series is
integrated of order 1.
Like the augmented Dickey–Fuller test, the Phillips–Perron test addresses the issue that the
process generating data might have a higher order of autocorrelation than is admitted in the test
equation.
The test is robust with respect to unspecified autocorrelation and heteroscedasticity in the
disturbance process of the test equation.
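One readily available implementation is in the third-party arch package (the sketch below, in Python, assumes arch is installed; the class and attribute names are taken from that library):

```python
import numpy as np
from arch.unitroot import PhillipsPerron

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(size=500))   # random walk: integrated of order 1

pp = PhillipsPerron(y)                # H0: the series has a unit root
print(pp.stat, pp.pvalue)             # test statistic and p-value
print(pp.summary())                   # full results table
```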
COINTEGRATION
A cointegration test is used to establish if there is a correlation between several time series in
the long term. The concept was first introduced by Nobel laureates Robert Engle and Clive
Granger in 1987 after British economist Paul Newbold and Granger published the spurious
regression concept.
Cointegration tests identify scenarios where two or more non-stationary time series are
integrated together in a way that they cannot deviate from equilibrium in the long term. The tests
are used to identify the degree of sensitivity of two variables to the same average price over a
specified period of time.
1. Engle-Granger Test: This is a two-step, residual-based test. One non-stationary series is first regressed on the other(s) using OLS, and the residuals of that regression are then tested for stationarity (e.g. with an augmented Dickey-Fuller test); stationary residuals indicate cointegration. The method can detect at most one cointegrating relationship, and errors from the first step are carried forward into the second.
2. Johansen Test: The Johansen test is used to test cointegrating relationships between several non-stationary time series. Compared to the Engle-Granger test, the Johansen test allows for more than one cointegrating relationship. However, it relies on asymptotic properties (a large sample size), since a small sample would produce unreliable results. Using this test to find cointegration of several time series avoids the issues created when errors are carried forward to the next step. Johansen's test comes in two main forms: the trace test and the maximum eigenvalue test.
Trace Tests
Trace tests evaluate the number of cointegrating relationships (independent linear combinations) in the time series data, K, against a hypothesized value K0:
H0: K = K0
H1: K > K0
When using the trace test to test for cointegration in a sample, we set K0 to zero and test whether the null hypothesis is rejected. If it is rejected, we can deduce that a cointegration relationship exists in the sample. The null hypothesis must therefore be rejected to confirm the existence of a cointegration relationship in the sample.
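A minimal sketch (Python; two simulated I(1) series sharing a common stochastic trend) of the Johansen trace test using statsmodels' coint_johansen:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(6)
n = 500
common = np.cumsum(rng.normal(size=n))     # shared stochastic trend
x = common + rng.normal(size=n)            # both series are I(1) ...
y = 0.5 * common + rng.normal(size=n)      # ... and share the trend, so cointegrated

result = coint_johansen(np.column_stack([x, y]), det_order=0, k_ar_diff=1)

# result.lr1: trace statistics for H0: K = K0 (K0 = 0, 1, ...)
# result.cvt: corresponding 90%/95%/99% critical values
print(result.lr1)
print(result.cvt)   # reject H0: K = 0 if result.lr1[0] > result.cvt[0, 1] (95% level)
```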
SIMULTANEOUS EQUATION MODELS
When explanatory variables are not purely exogenous, in the sense that their values are determined within the system, certain consequences for econometric analysis arise:
- You cannot specify a single equation as a complete model of the relationship between the variables.
- You cannot use OLS to obtain estimates that have BLUE properties.
- Asymptotic unbiasedness cannot be achieved, in the sense that even an increase in sample size will not eliminate the specification bias, and the OLS estimates are not consistent. This kind of bias is also called simultaneous equation bias and can only be cured by appropriately specifying a simultaneous equation model.
A SIMULTANEOUS EQUATION MODEL is a model made up of more than one equation having variables that are jointly dependent; that is, certain explanatory variables are also endogenous in the system.
For example, if market price of a stock (M) is a function of financial leverage (L), Investment (I),
Investor psychology (P); and in turn, investor psychology is a function of current and one period
lag of market price. It would be inappropriate to use a single equation model as a specification of the
relationship. The correct specification is:
M𝑡 = f(L𝑡, I𝑡, P𝑡)
P𝑡 = f(M𝑡, M𝑡−1)
Which can be expressed econometrically as:
M𝑡 = a0 + a1L𝑡 + a2I𝑡 + a3P𝑡 + μ𝑡
P𝑡 = b0 + b1M𝑡 + b2M𝑡−1 + e𝑡
1. The dependent variables M and P are endogenous (determined within the system)
2. P𝑡, though an explanatory variable, is not purely independent.
3. L𝑡, I𝑡, and M𝑡−1 are predetermined variables. They are also explanatory variables; they are exogenous, determined outside the system, and taken as given.
4. μ𝑡 and e𝑡 are error/stochastic variables
5. The simultaneous equation system is a structural model in the sense that you have endogenous variables that are functions of other endogenous variables and predetermined variables. In a complete structural model, you have n endogenous variables and n equations. The model above is a complete structural model.
6. a0, a1, a2, a3, b0, b1, and b2 are structural parameters and can be used to express the direct effects of the explanatory variables on the dependent variables.
7. Note, however, that the indirect effects of explanatory variables on the dependent variables cannot be expressed by the structural parameters but only through a solution of the simultaneous equations. For instance, the effect of M𝑡−1 on M𝑡 can be determined only through its effect on P𝑡.
REDUCED FORM: A structural model can be estimated by expressing each of its endogenous variables as a function of the predetermined variables only. For example, substituting the P𝑡 equation into the M𝑡 equation and solving for M𝑡 gives:
M𝑡 = π0 + π1L𝑡 + π2I𝑡 + π3M𝑡−1 + v𝑡
where the π coefficients are combinations of the structural parameters (e.g. π1 = a1/(1 − a3b1)) and v𝑡 is a composite error term.
The reduced form coefficients measure the total effects (direct and indirect) of the predetermined variables on the respective dependent variables, while the structural parameters measure only the direct effects.
OLS can be used to estimate the reduced form parameters (total effects), but:
- observe that this may not yield the direct and indirect effects separately;
- also, the effects of explanatory variables which are themselves endogenous may not be known directly.
RECURSIVE FORM: Also known as a triangular system, this is a structural model in which the first equation has a dependent variable (y1) as a function of predetermined variables (x𝑖) only; the second equation has another dependent variable (y2) as a function of the first dependent variable (y1) and other predetermined variables only; and so on.
You can safely use OLS to estimate the structural parameters here, because the μ𝑖 are independent of the explanatory variables in each particular equation.
ILLUSTRATIONS
A Researcher is considering a topic that has to do with the effect of dividend policy decisions on
market value (P) of firms in Nigeria. A scan of literature shows that five key factors are important
in explaining P, namely Dividend Payout rate (d), Earnings Rate (e), Investment Opportunity (I),
Management Efficiency (m) and economic growth rate (g). While other factors are as given, d is
said to depend on e, I, and P, just as I depends on e and g. Required: specify the structural model of the relationship.
Solution
Structural model:
P = f(d,e,I,m,g)
d = f(e, I, P)
I = f(e, g)
ESTIMATION METHODS
● Ordinary Least Squares Method: OLS can only be used to estimate simultaneous
equation coefficients when transformed to reduced or recursive form such that all
explanatory variables are truly exogenous. Otherwise the estimates will be biased and
inconsistent.
● Indirect Least Squares: ILS is a single equation method and can be used to estimate the structural parameters of a simultaneous equation model where the equations are exactly identified (a worked sketch follows the steps below). The steps are as follows:
1. Derive reduced form of the model
2. Obtain estimates of reduced form coefficients Π𝑖 of each equation using OLS method
3. Form a system of coefficient relationships by making Π𝑖 functions of the structural
parameters.
4. Solve for structural parameters accordingly
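A minimal sketch of these four steps (Python, on a hypothetical exactly identified demand/supply model; the model and all numbers are assumptions for illustration, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Hypothetical structural model (the supply equation is exactly identified,
# because income Yinc appears in demand but is excluded from supply):
#   Demand: Q = a0 + a1*P + a2*Yinc + u
#   Supply: Q = b0 + b1*P + v
a0, a1, a2 = 100.0, -2.0, 0.5
b0, b1 = 10.0, 1.5

Yinc = rng.uniform(50, 150, n)
u, v = rng.normal(0, 2, n), rng.normal(0, 2, n)

# Equilibrium values of the endogenous variables P and Q
P = (a0 - b0 + a2 * Yinc + u - v) / (b1 - a1)
Q = b0 + b1 * P + v

def ols(x, y):
    """Slope and intercept from a simple OLS regression of y on x."""
    slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return slope, y.mean() - slope * x.mean()

# Steps 1-2: estimate the reduced form (P and Q on the predetermined Yinc) by OLS
pi1, pi0 = ols(Yinc, P)   # P = pi0 + pi1*Yinc + w1
pi3, pi2 = ols(Yinc, Q)   # Q = pi2 + pi3*Yinc + w2

# Steps 3-4: relate reduced-form to structural coefficients and solve
b1_hat = pi3 / pi1            # since pi3 = b1*pi1
b0_hat = pi2 - b1_hat * pi0   # since pi2 = b0 + b1*pi0
print(b1_hat, b0_hat)         # should be close to 1.5 and 10.0
```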
IDENTIFICATION PROBLEM
• UNDER-IDENTIFIED MODEL: The statistical form of at least one of the functions of the model is not unique.
• IDENTIFIED MODELS: Here, all functions of the model are unique, and they may come in two forms:
- EXACTLY IDENTIFIED
- OVER-IDENTIFIED
• ORDER CONDITION
An equation in which the number of excluded variables is greater than or equal to the number of endogenous variables less one is said to satisfy the order condition for identification (a small worked check follows below):
(TV − EV) ≥ (END − 1)
Where:
TV = total number of variables in the system
EV = number of variables in the equation
END = number of endogenous variables
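A tiny helper (Python; hypothetical and for illustration only, since the order condition is a necessary rather than sufficient condition) applying the rule to the market-price model above, which has TV = 5 variables {M, P, L, I, M(t−1)} and END = 2 endogenous variables {M, P}:

```python
def order_condition(total_vars: int, eq_vars: int, n_endog: int) -> str:
    """Apply the order condition (TV - EV) >= (END - 1) to one equation."""
    excluded, needed = total_vars - eq_vars, n_endog - 1
    if excluded < needed:
        return "under-identified"
    return "exactly identified" if excluded == needed else "over-identified"

# Market-price model: TV = 5 {M, P, L, I, M(t-1)}, END = 2 {M, P}
print(order_condition(5, 4, 2))  # M-equation uses {M, L, I, P}   -> exactly identified
print(order_condition(5, 3, 2))  # P-equation uses {P, M, M(t-1)} -> over-identified
```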