Multiple Linear Regression Model (MLRM)
The multivariate relation gives rise to a richer array of inference questions and reduces the chance of omitted-variable bias in estimation relative to the two-variable equation. The specification of the k-variable model is
Yi = β1 + β2X2i + β3X3i + ⋯ + βkXki + Ui,  i = 1, 2, …, n --[1]
This equation identifies k − 1 explanatory variables (regressors), namely X2, X3, …, Xk, that are thought to influence the dependent variable (regressand) Y. The subscript i indicates the ith population member (observation). U is the stochastic disturbance; it captures the randomness of the relationship between the regressand and the regressors and contains the unobserved factors affecting Y.
In matrix notation,
Y = Xβ + U --(1a)
where X is the n×k matrix whose ith row is (1, X2i, X3i, …, Xki), Y = (Y1, …, Yn)′ is the n×1 vector of observations on the regressand, β = (β1, …, βk)′ is the k×1 coefficient vector and U = (U1, …, Un)′ is the n×1 disturbance vector.
Assumptions
A1 Linearity
The relationship in [1] is linear in the coefficient parameters (βj); however, Y and the Xs may be various transformations of the underlying variables of interest. This assumption simply defines the multiple linear regression model.
A2 No Perfect Collinearity
There is no exact linear relationship among the independent variables; that is, there is no exact multicollinearity problem. Technically, this assumption says that the X matrix has full column rank, i.e. ρ(X) = k, and it is known as the identification condition. Linear independence of the columns of X is required for unique determination of the estimates of the βj.
A3 Zero Conditional Mean of the Disturbances
This assumption states that no observations on X convey information about the expected value of the disturbance, which we write as
E[U|X] = 0, i.e. E[Ui|X] = 0 for every i = 1, 2, …, n.
The zero conditional mean implies that the unconditional mean is also zero, since
E[Ui] = EX[E[Ui|X]] = EX[0] = 0 for all i.
Moreover, cov(Ui, X) = cov(E[Ui|X], X) = cov(0, X) = 0 for all i. Taking conditional expectations of [1a] and using A3 therefore gives
E[Y|X] = Xβ --[1b]
Assumptions A1 and A3 comprise the linear regression model. The regression of Y on X is the conditional mean of Y given X, so assumption A3 makes Xβ the conditional mean function. Secondly, assumption A3 indicates that the unobserved factors in U and the explanatory variables are uncorrelated; that is, the explanatory variables are exogenous. Violation of this assumption creates the problems of model misspecification and endogeneity.
A4 Spherical Disturbances
This assumption states that the disturbances are homoscedastic and non-autocorrelated. It can be written as
E[UU′|X] = σ²In
This assumption has two parts: (a) the variance of Ui is constant for all i, and (b) the covariance of Ui and Uj is zero for all i ≠ j. Violation of part (a) creates the problem of heteroscedasticity and violation of part (b) creates the problem of autocorrelation. When the disturbance term satisfies assumption A4, it is called a spherical disturbance.
A5 Data Generation
This assumption states that the ultimate source of the data in X is statistically and economically unrelated to the source of U. In other words, (Yi, X2i, X3i, …, Xki), i = 1, 2, …, n are independently and identically distributed. This assumption holds automatically if the data are collected by simple random sampling.
The primary objective in estimating an MLRM is to estimate the conditional mean of Y given X, together with a confidence interval. For this purpose we need to estimate the k coefficient parameters β and the conditional variance of U, i.e. σ². So we estimate k + 1 parameters in model [1a].
If the unknown vector β in [1a] is replaced by some estimate β̂, we can write the residual vector
e = Y − Xβ̂
or
Y = Xβ̂ + e --[2] (sample model)
The OLS principle is to choose β̂ to minimise the residual sum of squares, i.e. minimise
RSS = e′e = (Y − Xβ̂)′(Y − Xβ̂) --[3]
Since the sum of squared deviations of a variable is smallest when taken about its mean, minimising the sum of squared residuals is how we recover the conditional mean of Y. Expanding [3],
RSS = Y′Y − Y′Xβ̂ − β̂′X′Y + β̂′X′Xβ̂
= Y′Y − 2β̂′X′Y + β̂′X′Xβ̂
[Since Y′Xβ̂ and β̂′X′Y are scalars and one is the transpose of the other, they are equal.]
The first-order condition is
∂RSS/∂β̂ = −2X′Y + 2X′Xβ̂ = 0 --[4] (the k normal equations)
or β̂ = (X′X)⁻¹X′Y --[5]
This is the OLS estimate of the β vector; the expression shows how the OLS estimate of β is related to the data.
The second-order condition requires ∂²RSS/∂β̂∂β̂′ = 2X′X to be positive definite, which is satisfied because X has full column rank by assumption A2.
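A minimal NumPy sketch of equation [5] on simulated data (the variable names and the simulated design below are illustrative assumptions, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3                      # n observations, k coefficients (incl. intercept)

# Simulated data: X has a column of ones plus k - 1 regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + rng.normal(size=n)          # U ~ N(0, 1)

# OLS estimate from the normal equations [4]-[5]: solve (X'X) b = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)
```

Solving the normal equations with np.linalg.solve (or np.linalg.lstsq) is numerically preferable to forming (X′X)⁻¹ explicitly, though the algebra is the same.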
Premultiplying the sample model [2] by (X′X)⁻¹X′ and using [5] gives
β̂ = (X′X)⁻¹X′Xβ̂ + (X′X)⁻¹X′e
or 0 = (X′X)⁻¹X′e
or X′e = 0 --[6] since (X′X)⁻¹ ≠ 0
The first element of Eq. (6) is i′e = Σ ei = 0 --[6a], because the first column of X is a column of ones. Thus the residuals have zero mean, and the regression plane passes through the point of means in k-dimensional space. The remaining elements of Eq. (6) are of the form
Σi Xji ei = 0 --[6b], j = 2, 3, …, k
This condition means that each regressor has zero sample covariance (and hence zero sample correlation) with the residuals.
[Sample covariance of Xj and e: cov(Xj, e) = (1/n)Σi(Xji − X̄j)ei = (1/n)Σi Xji ei − X̄j(1/n)Σi ei = 0, using [6a] and [6b].]
This in turn implies that Ŷ (= Xβ̂), the vector of fitted values of Y, is uncorrelated with e, for
Ŷ′e = (Xβ̂)′e = β̂′X′e = 0 --[7]
Further, Y = Ŷ + e implies Ȳ = Ŷ̄ + ē = Ŷ̄, since ē = 0; the fitted values have the same mean as Y.
The zero covariance between the regressors and the residuals underlies the decomposition of the sum of squares. Decomposing the Y vector into the part explained by the regression and the unexplained part,
𝑌 = 𝑌̂ + 𝑒 = 𝑋𝛽̂ + 𝑒
it follows that
𝑌 / 𝑌 = (𝑋𝛽̂ + 𝑒)/ (𝑋𝛽̂ + 𝑒) = 𝛽̂ / 𝑋 / 𝑋𝛽̂ + 𝑒 / 𝑒
However, Y′Y is the sum of squares of the actual Y values. We are actually interested in analysing the variation in Y, measured by the sum of squared deviations from the sample mean, i.e.
Σi (Yi − Ȳ)² = Y′Y − nȲ²
Thus, subtracting nȲ² from each side of the previous decomposition gives the revised decomposition
Y′Y − nȲ² = β̂′X′Xβ̂ − nȲ² + e′e
or Y′Y − nȲ² = Ŷ′Ŷ − nŶ̄² + e′e, since Ȳ = Ŷ̄
i.e. TSS = ESS + RSS
where TSS indicates the total sum of squares in Y, and ESS and RSS the explained and residual (unexplained) sums of squares.
An alternative approach is to begin by expressing all the data in the form of deviations from the sample means. We have the sample regression function for the ith observation
Yi = β̂1 + β̂2X2i + ⋯ + β̂kXki + ei [from equation 2]
and, averaging over the sample and using [6a],
Ȳ = β̂1 + β̂2X̄2 + ⋯ + β̂kX̄k
Subtracting the second equation from the first gives
yi = β̂2x2i + ⋯ + β̂kxki + ei --[10]
This is called the deviation form of the k-variable LRM, where lowercase letters denote deviations from sample means. The intercept β̂1 disappears from the deviation form of the equation, but it may be recovered from [6a]:
β̂1 = Ȳ − β̂2X̄2 − ⋯ − β̂kX̄k --[10a]
The least-squares slope coefficients are identical in both forms of the regression equation, [2] and [10], as are the residuals.
Collecting all n observations, the deviation form of the equation may be written compactly using a transformation matrix
A = In − (1/n) ii′ --[11]
where i is the n×1 column of ones.
Premultiplying the matrix form of [2] by A gives the deviation form in matrix notation:
AY = AXβ̂ + Ae
or AY = A[i X2][β̂1 ; α̂] + Ae
Here X2 is the n×(k−1) matrix of observations on the explanatory variables and α̂ is the (k−1)×1 column vector of slope estimates. Premultiplication by A transforms a vector of n observations into deviations from its mean, so Ae = e (the residuals already have zero mean) and Ai = 0. Therefore
y = xα̂ + e --[12]
where y = AY and x = AX2.
Premultiplying [12] by x′ gives
x′y = x′xα̂ + x′e
Since x′e = X2′A′e = X2′e = 0 by [6b], this is the set of (k−1) normal equations, so
α̂ = (x′x)⁻¹x′y --[12a]
From [12] we can also write y′y = α̂′x′xα̂ + e′e, or TSS = ESS + RSS.
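The deviation-form algebra in [11]–[12a] can be checked numerically. The following sketch (made-up data; all names are illustrative) verifies that the deviation-form slopes coincide with the full OLS slopes and that TSS = ESS + RSS:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([4.0, 1.5, -2.0]) + rng.normal(size=n)

A = np.eye(n) - np.ones((n, n)) / n            # A = I - (1/n) ii'   [11]
y = A @ Y                                       # deviations of Y
x = A @ X[:, 1:]                                # deviations of X2, ..., Xk

alpha_hat = np.linalg.solve(x.T @ x, x.T @ y)   # slope estimates   [12a]
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)    # full OLS          [5]
print(np.allclose(alpha_hat, beta_hat[1:]))     # slopes coincide -> True

# TSS = ESS + RSS in deviation form
e = y - x @ alpha_hat
print(y @ y, alpha_hat @ x.T @ x @ alpha_hat + e @ e)
```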
We now turn to the estimation of σ². Substituting β̂ = (X′X)⁻¹X′Y into the residual vector implies
e = Y − X(X′X)⁻¹X′Y = [I − X(X′X)⁻¹X′]Y = MY
where M = I − X(X′X)⁻¹X′ is a symmetric and idempotent matrix. Since MX = 0, we also have e = M(Xβ + U) = MU, so e′e = U′MU and
E(e′e|X) = E(U′MU|X) = tr(M·E[UU′|X]) = σ² tr(M)
= σ² tr[In − X(X′X)⁻¹X′] = σ²[n − tr(X(X′X)⁻¹X′)] = σ²[n − tr((X′X)⁻¹X′X)]
= σ²[n − tr(Ik)] = σ²(n − k)
[The trace of a square matrix is the sum of its diagonal elements; tr(AB) = tr(BA) and tr(ABC) = tr(BCA) whenever the products are defined, even if A, B, C are not themselves square.]
Therefore E[e′e/(n − k)] = σ², so
s² = e′e/(n − k) --[13]
is an unbiased estimator of σ². The square root of s² measures the standard deviation of the Y values about the regression plane and is often referred to as the standard error of the regression (SER). The SER is used as a measure of the fit of the regression. The divisor n − k (rather than n) corrects for the downward bias introduced by estimating k − 1 slope coefficients and one intercept parameter. When n is large, the effect of the degrees-of-freedom adjustment is negligible.
The SER is an absolute measure of goodness of fit and depends on the unit of measurement of Y. It measures the spread of the observations around the regression line, so the higher the SER, the lower the goodness of fit, and vice versa. In other words, a large spread means that predictions of Y made using the selected X variables will often be wrong by a large amount.
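A small simulation (illustrative only, with an assumed true σ² = 4) showing the unbiasedness of s² in [13]: averaging s² over many artificial samples reproduces the true σ².

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma2 = 50, 3, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 0.5, -1.0])

s2_draws = []
for _ in range(5000):
    Y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
    s2_draws.append(e @ e / (n - k))            # s^2 = e'e/(n - k)   [13]

print(np.mean(s2_draws))                        # close to sigma2 = 4.0 (unbiasedness)
print(np.sqrt(np.mean(s2_draws)))               # roughly the true sigma = 2 (the SER scale)
```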
Other measures of the Goodness of Fit
The goodness of fit of a linear regression model measures how well the estimated model fits a given set of data, or how well it can explain the population. It is, however, difficult to come up with a perfect measure of goodness of fit for an econometric model. A regression model fits well if the dependent variable is explained more by the regressors than by the residual. The coefficient of determination R², defined as the square of the multiple correlation coefficient, is a common measure of the goodness of fit of a regression model:
R² = ESS/TSS = 1 − RSS/TSS --[14]
Thus R² measures the proportion of the total variation in Y explained by the linear combination of the regressors. It is the square of the simple correlation coefficient between Y and Ŷ. A high value of R² indicates that we can predict individual outcomes of Y with considerable accuracy on the basis of the estimated model.
TSS = RSS when the best-fitted regression has no regressor, only an intercept. If we add regressors to the model, RSS falls, i.e. TSS ≥ RSS. At one extreme, R² = 0: the regression plane is horizontal, implying no change in Y with a change in X; in other words, the Xs have no explanatory power. At the other extreme, R² = 1 indicates that all the data points lie on the fitted hyperplane and RSS = 0. Thus, in general, 0 ≤ R² ≤ 1.
R² is widely used as a measure of goodness of fit, but it is difficult to say how large R² needs to be for a model to be considered good. The value of R² never decreases with the addition of explanatory variables: if an added explanatory variable is totally irrelevant, the ESS simply remains unchanged. This is the basic limitation of using R² as an indicator of goodness of fit. Secondly, R² is sensitive to extreme values, so it is not robust. Thirdly, R² may be negative or greater than one if the intercept term is not included, in which case it is not a good measure of the fit of the regression.
If the intercept is dropped from the sample model Y = Xβ̂ + e (2), the residuals no longer sum to zero and the decomposition TSS = ESS + RSS breaks down, which is why R² can fall outside the unit interval; for this reason, and on theoretical grounds, we do not formulate a linear regression model without an intercept term.
Adjusted R²
To account for the degrees of freedom used up when regressors are added, the adjusted coefficient of determination is defined as
R̄² = 1 − [RSS/(n − k)] / [TSS/(n − 1)] = 1 − (n − 1)RSS/[(n − k)TSS] = 1 − [(n − 1)/(n − k)](1 − R²) --[16]
R̄² = R² when k = 1, that is, when the regression contains only an intercept and no explanatory variable, or one explanatory variable and no intercept (which is very rare). In the MLRM, as k increases, (n − 1)/(n − k) increases while (1 − R²) falls. The ratio (n − 1)/(n − k) is the penalty for using more regressors in a model, and the rise in R² is the benefit of adding a regressor. Whether adding regressors improves the explanatory power of the model depends on the trade-off between R² and the penalty (n − 1)/(n − k). Therefore, adjusted R² need not increase with the number of explanatory variables: if the contribution of an additional regressor to the estimated model exceeds the loss of degrees of freedom, R̄² rises; otherwise it declines, as when the additional explanatory variable has no explanatory power.
So, clearly, R̄² ≤ R². It is noted that R̄² may be negative when R² < (k − 1)/(n − 1).
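The following sketch (simulated data; the helper name r2_adj_r2 is hypothetical) illustrates [14] and [16]: adding an irrelevant regressor cannot lower R² but can lower R̄².

```python
import numpy as np

def r2_adj_r2(X, Y):
    """Return (R2, adjusted R2) for an OLS fit where X includes an intercept."""
    n, k = X.shape
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
    tss = np.sum((Y - Y.mean()) ** 2)
    r2 = 1 - (e @ e) / tss
    r2_bar = 1 - (n - 1) / (n - k) * (1 - r2)        # equation [16]
    return r2, r2_bar

rng = np.random.default_rng(3)
n = 40
x2 = rng.normal(size=n)
Y = 2 + 1.5 * x2 + rng.normal(size=n)
junk = rng.normal(size=n)                             # irrelevant regressor

X_small = np.column_stack([np.ones(n), x2])
X_big = np.column_stack([np.ones(n), x2, junk])

print(r2_adj_r2(X_small, Y))   # (R2, adj R2) for the relevant model
print(r2_adj_r2(X_big, Y))     # R2 never falls; adjusted R2 typically falls here
```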
We have estimated the regression coefficients β and examined the properties of the OLS estimators. Let us now see how to use these estimators to test various hypotheses about β. Consider the following examples of typical hypotheses about β.
i] H0: βj = 0. This hypothesis states that the regressor Xj has no effect on Y. It is a very common test, often referred to as a significance test.
ii] H0: βj = βj0, where βj0 is some specific value. If, for example, βj denotes an income elasticity, one might wish to test βj = 1.
iii] H0: β2 + β3 = 1, a linear restriction involving two coefficients.
iv] H0: β2 = β3, i.e. β2 − β3 = 0: two regressors have the same coefficient.
v] H0: β2 = β3 = ⋯ = βk = 0: none of the regressors has any influence on Y (the test of the overall significance of the regression).
vi] H0: β2 = 0, where β is partitioned into two subvectors β1 and β2, containing respectively k1 and k2 (= k − k1) elements. This sets up the hypothesis that a specified subset of the regressors plays no role in the determination of Y.
vii] H0: β2 + β3 = 1, β4 + β6 = 0, β5 + β6 = 0: we may test several linear restrictions jointly.
All these hypotheses are special cases of the general linear hypothesis
H0: Rβ = r or Rβ − r = 0 --[1]
where R is a q×k matrix of known constants (q < k) and r is a q×1 vector of known constants; each row of R, together with the corresponding element of r, imposes one linear restriction on β.
The general test may then be specialized to deal with any specific application. Given the OLS estimator β̂ = (X′X)⁻¹X′Y = β + (X′X)⁻¹X′U, an obvious step is to compute the vector (Rβ̂ − r). This vector measures the discrepancy between expectation and observation. If this vector is, in some sense, "large," it casts doubt on the null hypothesis; conversely, if it is "small," it tends not to contradict the null. As in all conventional testing procedures, the distinction between large and small is determined from the relevant sampling distribution under the null, in this case the distribution of Rβ̂ under the null hypothesis Rβ = r.
Var(Rβ̂) = E[(Rβ̂ − Rβ)(Rβ̂ − Rβ)′] = R·E[(β̂ − β)(β̂ − β)′]·R′ = σ²R(X′X)⁻¹R′ --[3]
We have assumed that U is a spherical disturbance. Now we further assume that the Ui are normally distributed, i.e. U ~ N(0, σ²I). Since β̂ is a linear function of the U vector, β̂ follows a normal distribution; and since Rβ̂ is a linear function of β̂,
Rβ̂ ~ N(Rβ, σ²R(X′X)⁻¹R′), which implies Rβ̂ − Rβ ~ N(0, σ²R(X′X)⁻¹R′)
Under the null hypothesis Rβ = r, so under the null
Rβ̂ − r ~ N(0, σ²R(X′X)⁻¹R′)
With this formulation we can say
(Rβ̂ − r)′(σ²R(X′X)⁻¹R′)⁻¹(Rβ̂ − r) ~ χ²(q) --[4]
[χ²(q) is the sum of squares of q independent standard normal variates.]
The distribution in [4] is derived from the sampling distribution of β̂. The only problem hindering practical application of Eq. (4) is the presence of the unknown σ². However,
e′e/σ² ~ χ²(n − k) --[5]
which is independent of β̂.
Thus the ratio of [4] to [5] gives a suitable statistic in which the unknown σ² cancels:
[(Rβ̂ − r)′(σ²R(X′X)⁻¹R′)⁻¹(Rβ̂ − r)] / [e′e/σ²]
Finally, dividing the numerator and denominator by their respective degrees of freedom, we get
[(Rβ̂ − r)′(R(X′X)⁻¹R′)⁻¹(Rβ̂ − r)/q] / [e′e/(n − k)] ~ F(q, n − k) --[6]
Equivalently, the denominator of [6] is s², where s² = e′e/(n − k) and var-cov(β̂) = s²(X′X)⁻¹. Suppose Cij denotes the (i, j)th element of (X′X)⁻¹; then s²Cjj = Var(β̂j) and s²Cjt = cov(β̂j, β̂t), j, t = 1, 2, …, k.
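A numerical sketch of the general statistic [6] for a joint hypothesis Rβ = r (the data and the particular restrictions chosen below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 80, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 0.6, 0.4, 0.0])
Y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
e = Y - X @ beta_hat
s2 = e @ e / (n - k)                                    # [13]

# H0: beta2 + beta3 = 1 and beta4 = 0  ->  R beta = r with q = 2 restrictions
R = np.array([[0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
r = np.array([1.0, 0.0])
q = R.shape[0]

d = R @ beta_hat - r                                    # discrepancy vector
F = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / q / s2  # equation [6]
print(F)   # compare with the F(q, n - k) critical value
```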
Let us now consider the hypotheses one by one
i] H0: βj = 0. Under this null hypothesis, (Rβ̂ − r) picks out β̂j and R(X′X)⁻¹R′ picks out Cjj, the jth diagonal element of (X′X)⁻¹. Equation [6] then becomes
β̂j² / (s²Cjj) ~ F(1, n − k)
Taking the square root of the F(1, n − k) statistic, we get
β̂j/(s√Cjj) = β̂j/se(β̂j) ~ t(n − k) --[s1]
Thus the null hypothesis that Xj has no influence on Y is tested by dividing the estimated coefficient by its standard error, which gives a statistic that follows the t distribution with n − k degrees of freedom. If the calculated value exceeds (in absolute terms) the tabulated value at a specified level of significance, we reject the null hypothesis.
Similarly, we can test
ii] H0: βj = βj0 by (β̂j − βj0)/se(β̂j) ~ t(n − k) --[s2]
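A short sketch of [s1] and [s2] on simulated data (the variable names and trial value βj0 = 1 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([0.5, 1.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
e = Y - X @ beta_hat
s2 = e @ e / (n - k)
se = np.sqrt(s2 * np.diag(XtX_inv))      # se(beta_j) = s * sqrt(C_jj)

t_zero = beta_hat / se                   # [s1]: H0: beta_j = 0, one t per coefficient
print(t_zero)

beta_j0 = 1.0                            # [s2]: H0: beta_2 = 1 (index 1 is the first slope)
t_specific = (beta_hat[1] - beta_j0) / se[1]
print(t_specific)
```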
Confidence interval of 𝜷𝒋
Instead of testing a specific hypothesis about βj, we may compute a 95% confidence interval for βj. Because of random sampling error, it is impossible to learn the exact value of the true coefficient parameter βj using only the information in a sample. However, it is possible to use data from a random sample to construct a range that contains the true population parameter βj with a certain pre-specified probability (say 95%). This range is called a confidence interval and the specified probability is known as the confidence level.
Constructing a confidence interval by testing every possible value of βj as a null hypothesis would be impractical; fortunately there is a much easier approach. Using the t statistic for the hypothesis H0: βj = βj0, the trial value βj0 of βj is rejected at the 5% level of significance if |t| > 1.96 (for n − k > 120); otherwise we cannot reject the null at the 5% level. The null would not be rejected if
−1.96 ≤ t ≤ 1.96
i.e. −1.96 ≤ (β̂j − βj0)/se(β̂j) ≤ 1.96
or β̂j − 1.96 se(β̂j) ≤ βj0 ≤ β̂j + 1.96 se(β̂j)
Thus the set of values of βj0 that are not rejected at the 5% level of significance consists of the values within β̂j ± 1.96 se(β̂j); the 95% confidence interval for βj is therefore β̂j ± t.025 se(β̂j). Similarly, the 99% confidence interval for βj is β̂j ± 2.58 se(β̂j) for n − k > 120. The discussion so far has focused on two-sided confidence intervals; we could instead construct a one-sided confidence interval as the set of values of βj that cannot be rejected by a one-sided hypothesis test.
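A sketch of the 95% interval β̂j ± 1.96 se(β̂j) for a sample with n − k > 120 (simulated data, illustrative names):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 200, 3                            # n - k > 120, so 1.96 is an adequate critical value
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([2.0, -1.0, 0.7]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
e = Y - X @ beta_hat
se = np.sqrt((e @ e / (n - k)) * np.diag(XtX_inv))

lower = beta_hat - 1.96 * se             # 95% CI: beta_hat_j +/- 1.96 se(beta_hat_j)
upper = beta_hat + 1.96 * se
for j in range(k):
    print(f"beta_{j + 1}: [{lower[j]:.3f}, {upper[j]:.3f}]")
```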
iii] H0: β2 + β3 = 1 is tested by
(β̂2 + β̂3 − 1)/√var(β̂2 + β̂3) ~ t(n − k) --[s3]
where var(β̂2 + β̂3) = var(β̂2) + var(β̂3) + 2cov(β̂2, β̂3) = s²(C22 + C33 + 2C23). Alternatively, the 95% confidence interval for β2 + β3 is (β̂2 + β̂3) ± t.025 √var(β̂2 + β̂3).
iv] H0: β2 = β3, or β2 − β3 = 0, is tested by
(β̂2 − β̂3)/√var(β̂2 − β̂3) ~ t(n − k) --[s4]
where var(β̂2 − β̂3) = var(β̂2) + var(β̂3) − 2cov(β̂2, β̂3). The 95% confidence interval for β2 − β3 is (β̂2 − β̂3) ± t.025 √var(β̂2 − β̂3).
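A sketch of [s3] and [s4], computing var(β̂2 ± β̂3) from the elements of s²(X′X)⁻¹ (simulated data in which β2 + β3 = 1 holds by construction; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 90, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.6, 0.4]) + rng.normal(size=n)    # beta2 + beta3 = 1 is true

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
e = Y - X @ beta_hat
V = (e @ e / (n - k)) * XtX_inv                 # varcov(beta_hat) = s^2 (X'X)^(-1)

# indices 1 and 2 correspond to beta2 and beta3
var_sum = V[1, 1] + V[2, 2] + 2 * V[1, 2]       # var(b2 + b3)
t_s3 = (beta_hat[1] + beta_hat[2] - 1) / np.sqrt(var_sum)    # [s3]

var_diff = V[1, 1] + V[2, 2] - 2 * V[1, 2]      # var(b2 - b3)
t_s4 = (beta_hat[1] - beta_hat[2]) / np.sqrt(var_diff)       # [s4]
print(t_s3, t_s4)
```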
v] H0: β2 = β3 = ⋯ = βk = 0, i.e. none of the explanatory variables has any influence on Y (the test of the overall significance of the regression). Here R = [0 I(k−1)], r = 0 and q = k − 1. Partition X = [i X2], where i is the column of ones and X2 is the n×(k−1) matrix of explanatory variables. Then X′X is a partitioned matrix with blocks i′i = n, i′X2, X2′i and X2′X2; write its conformably partitioned inverse as
(X′X)⁻¹ = [B11 B12; B21 B22] (let)
Therefore R(X′X)⁻¹R′ = [0 I(k−1)] (X′X)⁻¹ [0 I(k−1)]′ = B22, the bottom right-hand block of (X′X)⁻¹.
By the partitioned-inverse formula, B22 = (X2′AX2)⁻¹ = (x′x)⁻¹, where A is the symmetric and idempotent matrix defined in [11] and AX2 = x gives the deviation form of the explanatory variables in our k-variable model. Under this null, (Rβ̂ − r) = α̂, the vector of slope estimates, so the statistic [6] becomes
(Rβ̂ − r)′(s²R(X′X)⁻¹R′)⁻¹(Rβ̂ − r)/q = [α̂′(x′x)α̂/(k − 1)] / [e′e/(n − k)] = [ESS/(k − 1)] / [RSS/(n − k)] ~ F(k − 1, n − k)
Or, dividing ESS and RSS by TSS,
[(ESS/TSS)/(k − 1)] / [(RSS/TSS)/(n − k)] ~ F(k − 1, n − k), i.e. [R²/(k − 1)] / [(1 − R²)/(n − k)] ~ F(k − 1, n − k) --[s5]
This test essentially asks whether the mean square due to regression is significantly larger than the residual mean square.
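A sketch of the overall significance test [s5] computed directly from R² (simulated data; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.8, -0.3, 0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ beta_hat
tss = np.sum((Y - Y.mean()) ** 2)
r2 = 1 - (e @ e) / tss

# [s5]: test H0: beta2 = ... = beta_k = 0
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))
print(F)    # compare with the F(k - 1, n - k) critical value
```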
vi] H0: β2 = 0. This hypothesis states that a specified subset of the regressor coefficients is a zero vector, in contrast with the previous example, where all regressor coefficients were hypothesized to be zero. Partition the regression equation as follows:
Y = [X1 X2][β̂1 ; β̂2] + e = X1β̂1 + X2β̂2 + e
where X1 has k1 columns, including the intercept column, X2 has k2 columns, and β̂1 and β̂2 are the corresponding subvectors of β̂.
Partition (X′X)⁻¹ conformably as
(X′X)⁻¹ = [X1′X1 X1′X2; X2′X1 X2′X2]⁻¹ = [B11 B12; B21 B22] (let)
Now R(X′X)⁻¹R′ picks out the square submatrix of order k2 in the bottom right-hand corner of (X′X)⁻¹: here R = [0(k2×k1) I(k2)], r = 0 and q = k2.
Therefore, as in case [v], we find R(X′X)⁻¹R′ = B22 = (X2′M2X2)⁻¹
where M2 = I − X1(X1′X1)⁻¹X1′ is a symmetric and idempotent matrix with M2X1 = 0 and M2e = e.
Further, M2Y gives the vector of residuals when Y is regressed on X1 alone. The numerator in [6] is then
β̂2′(X2′M2X2)β̂2 / k2
To understand the meaning of this numerator, consider the partitioned regression
Y = X1β̂1 + X2β̂2 + e
Premultiplying by M2 gives
M2Y = M2X1β̂1 + M2X2β̂2 + M2e
or M2Y = M2X2β̂2 + e
Premultiplying each side by its transpose, and using X2′M2e = X2′e = 0, gives
Y′M2Y = β̂2′(X2′M2X2)β̂2 + e′e
The term on the left of this equation, Y′M2Y, is the RSS when Y is regressed just on X1. The last term, e′e, is the RSS when Y is regressed on [X1 X2]. Thus the middle term measures the increment in ESS (or, equivalently, the reduction in RSS) when X2 is added to the set of regressors. In other words, β̂2′(X2′M2X2)β̂2 is the difference between the restricted RSS and the unrestricted RSS.
The hypothesis may thus be tested by running two separate regressions. First regress Y on X1 (a submatrix of X) and denote the RSS by RSSr; then run the regression on all the Xs, obtaining the RSS denoted, as usual, by RSSu. From Eq. (6) the test statistic is
[(RSSr − RSSu)/k2] / [RSSu/(n − k)] ~ F(k2, n − k), or equivalently [(Ru² − Rr²)/k2] / [(1 − Ru²)/(n − k)] ~ F(k2, n − k) --[s6]
where Ru² and Rr² indicate the coefficient of determination for the unrestricted regression and for the restricted model, respectively. Finally, we compare the calculated and tabulated values of F; if the calculated value is greater than the tabulated value, we reject the null hypothesis.
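A sketch of the subset test [s6] via the restricted and unrestricted regressions (simulated data in which the null is true by construction; the helper name rss is hypothetical):

```python
import numpy as np

def rss(X, Y):
    """Residual sum of squares from an OLS fit of Y on X."""
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
    return e @ e

rng = np.random.default_rng(9)
n = 120
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])    # retained regressors (k1 = 2)
X2 = rng.normal(size=(n, 2))                              # regressors under test (k2 = 2)
Y = X1 @ np.array([1.0, 0.5]) + rng.normal(size=n)        # H0: coefficients on X2 are zero is true

X_full = np.column_stack([X1, X2])
k, k2 = X_full.shape[1], X2.shape[1]

rss_r = rss(X1, Y)           # restricted: Y on X1 only
rss_u = rss(X_full, Y)       # unrestricted: Y on [X1 X2]
F = ((rss_r - rss_u) / k2) / (rss_u / (n - k))            # [s6]
print(F)    # compare with the F(k2, n - k) critical value
```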
Note that the hypothesis test in [v] is a special case of [vi] in which X1 contains only the intercept (the column of ones) and X2 includes all the explanatory variables. If we regress Y on the intercept term alone, TSS = RSS, i.e. RSSr = TSS and Rr² = 0; putting Rr² = 0 and k2 = k − 1 in [s6] we get [s5]. Thus [s6] is a general statistic which can be used for testing all the hypotheses stated above and all other linear restrictions.
Note that q may be calculated in several equivalent ways: (a) the number of rows in the R matrix; (b) the number of elements in the r vector; (c) the difference between the number of slope coefficients in the unrestricted and the restricted models; (d) the difference between the degrees of freedom attaching to the RSS in the restricted and unrestricted models.
Simple, Multiple and Partial Correlation
The simple correlation between two variables X1 and X2 measures the degree of linear association between them:
r12 = cov(X1, X2)/√[var(X1)·var(X2)]
It can be computed without making any reference to a structure of causal dependence, i.e. to a regression specification.
The multiple correlation coefficient is the correlation between the explained variable Y and the set of explanatory variables (X2, …, Xk). Its square is denoted by R²Y.23…k = ESS/TSS and is interpreted as the proportion of the sample variation in Y that is explained by the OLS regression; it is known as the coefficient of determination for the regression. It is equal to the squared correlation coefficient between the actual and fitted values of Y, i.e.
R²Y.23…k = ESS/TSS = r²YŶ
To illustrate, by definition r²YŶ = [cov(Y, Ŷ)]²/[var(Y)·var(Ŷ)]. Now cov(Y, Ŷ) = cov(Ŷ + e, Ŷ) = var(Ŷ), since cov(Ŷ, e) = 0 by [7]. Therefore
r²YŶ = var(Ŷ)/var(Y) = ESS/TSS
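A quick numerical check (simulated data, illustrative names) that R² equals the squared correlation between Y and Ŷ:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 70
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

Y_hat = X @ np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - Y_hat

r2 = 1 - (e @ e) / np.sum((Y - Y.mean()) ** 2)      # 1 - RSS/TSS
corr_sq = np.corrcoef(Y, Y_hat)[0, 1] ** 2          # squared corr(Y, Y_hat)
print(np.isclose(r2, corr_sq))                       # True
```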
The partial correlation coefficient between the explained variable Y and one explanatory variable, say Xj, measures their association holding all the other Xs constant. The square of the partial correlation coefficient is denoted by r²Yj.23…j−1,j+1…k, or simply by rj². It is in fact r²e1e2, where e1 denotes the residual when Y is regressed on all the Xs except Xj, and e2 denotes the residual when Xj is regressed on all the other Xs.
To illustrate the relation between partial and simple correlation coefficients, consider
Y = β1 + β2X2 + β3X3 + U
To find the partial correlation between Y and X2 we need the simple correlation between e1 and e2, where e1 is the residual when we regress Y on X3 and e2 is the residual when we regress X2 on X3; in deviation form,
y = âx3 + e1 and x2 = b̂x3 + e2
Applying OLS to these deviation-form equations gives
â = ry3·(sy/s3) and b̂ = r23·(s2/s3)
where sy, s2 and s3 denote the sample standard deviations of y, x2 and x3. Also Var(y) = Var(âx3) + var(e1) and Var(x2) = Var(b̂x3) + var(e2), so
var(e1) = sy² − [ry3·(sy/s3)]²·s3² = sy²[1 − r²y3]
and similarly var(e2) = s2²[1 − r²23].
𝑠𝑦 𝑠 𝑠𝑦 𝑠
=𝑐𝑜𝑣(𝑦𝑥2 ) − 𝑟𝑦3 𝑠 𝑐𝑜𝑣(𝑥3 𝑥2 ) − 𝑟23 𝑠2 𝑐𝑜𝑣(𝑦𝑥3 ) + 𝑟𝑦3 𝑠 𝑟23 𝑠2 𝑣𝑎𝑟(𝑥3 )
3 3 3 3
𝑠𝑦 𝑠2 (𝑟𝑦2 − 𝑟𝑦3 𝑟23 − 𝑟𝑦3 𝑟23 + 𝑟𝑦3 𝑟23 )= 𝑠𝑦 𝑠2 (𝑟𝑦2 − 𝑟𝑦3 𝑟23 )
𝒔𝒚 𝒔𝟐 (𝒓𝒚𝟐 −𝒓𝒚𝟑 𝒓𝟐𝟑 ) (𝒓𝒚𝟐 −𝒓𝒚𝟑 𝒓𝟐𝟑 )
Therefore, 𝒓𝟐𝒚𝟐.𝟑 = 𝒓𝟐𝟐 = 𝒓𝟐𝒆𝟏 𝒆𝟐 = =
√(𝒔𝟐𝟐 [𝟏−𝒓𝟐𝟐𝟑 ])(𝒔𝟐𝒚 [𝟏−𝒓𝟐𝒚𝟑 ]) √[𝟏−𝒓𝟐𝟐𝟑 ][𝟏−𝒓𝟐𝒚𝟑 ]
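A sketch verifying, on simulated data, that the residual-based and formula-based partial correlations coincide (the helper name residuals is hypothetical):

```python
import numpy as np

def residuals(v, X):
    """Residuals from an OLS fit of v on X (X includes an intercept column)."""
    return v - X @ np.linalg.solve(X.T @ X, X.T @ v)

rng = np.random.default_rng(11)
n = 200
x3 = rng.normal(size=n)
x2 = 0.6 * x3 + rng.normal(size=n)
Y = 1 + 0.5 * x2 + 0.8 * x3 + rng.normal(size=n)
ones = np.ones(n)

# Route 1: correlation between the two residual series
e1 = residuals(Y, np.column_stack([ones, x3]))     # Y on X3
e2 = residuals(x2, np.column_stack([ones, x3]))    # X2 on X3
r_partial_1 = np.corrcoef(e1, e2)[0, 1]

# Route 2: (r_y2 - r_y3 r_23) / sqrt((1 - r_y3^2)(1 - r_23^2))
r_y2 = np.corrcoef(Y, x2)[0, 1]
r_y3 = np.corrcoef(Y, x3)[0, 1]
r_23 = np.corrcoef(x2, x3)[0, 1]
r_partial_2 = (r_y2 - r_y3 * r_23) / np.sqrt((1 - r_y3**2) * (1 - r_23**2))

print(np.isclose(r_partial_1, r_partial_2))        # True
```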
Relationship among r²YXj, R²Y.23…k and rj²:
For the three-variable model the multiple and partial correlations are linked by 1 − R²Y.23 = (1 − r²y3)(1 − r²y2.3).
Corollary: if r23 = 0, i.e. X2 and X3 are uncorrelated, the regressors are said to be orthogonal. In this case
r²Y2.3 = r²y2/[1 − r²y3]
Therefore,
R²Y.23 = r²Y3 + r²Y2
We can show that rj² = F/(F + df) = t²/(t² + df), where F and t are the values of the test statistics for H0: βj = 0, i.e. for testing the partial influence of Xj on Y, and df = n − k denotes the degrees of freedom of the regression.
Consider H0: βj = 0. Under H0, i.e. with Xj excluded, we can derive RSSr. Adding Xj to the regression reduces the residual sum of squares to RSSu, and the squared partial correlation of Y and Xj is the proportional reduction in RSS:
rj² = (RSSr − RSSu)/RSSr
We know that for H0: βj = 0, q = 1 and the F statistic equals tj², i.e. (RSSr − RSSu)/[RSSu/(n − k)] = tj². Dividing the numerator and denominator of rj² by RSSu/(n − k),
rj² = tj²/[tj² + (n − k)] = tj²/(tj² + df)
Moreover, RSSr > RSSu, and two different estimates of σU² can be derived from these two models: sr² = RSSr/(n − k + 1) from the restricted model (the regression without Xj) and su² = RSSu/(n − k) from the unrestricted model.
We know that R̄² = 1 − sU²/[Σyi²/(n − 1)] = 1 − (n − 1)sU²/Σyi², so R̄² and sU² are inversely related: the model with the smaller estimate of σU² has the larger adjusted R².
If H0: βj = 0, then F = tj² and RSSr − RSSu = tj²·su², so, writing c = n − k,
sr²/su² = (c + tj²)/(1 + c)
Hence sr² ≤ su² if and only if (c + tj²)/(1 + c) ≤ 1, i.e. |tj| ≤ 1.
Again, sr² ≤ su² implies sr²/[TSS/(n − 1)] ≤ su²/[TSS/(n − 1)], i.e.
1 − sr²/[TSS/(n − 1)] ≥ 1 − su²/[TSS/(n − 1)]
that is, the adjusted R² of the restricted model is at least as large as that of the unrestricted model. Thus dropping a regressor whose |t| ratio is less than one raises the adjusted R², while dropping a regressor with |t| greater than one lowers it.
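A sketch (simulated data; helper names ols and adj_r2 are hypothetical) checking both results of this section: rj² = tj²/(tj² + df), and the link between |t| and the change in adjusted R² when a regressor is dropped.

```python
import numpy as np

def ols(X, v):
    """Return (coefficients, residuals) from OLS of v on X."""
    b = np.linalg.solve(X.T @ X, X.T @ v)
    return b, v - X @ b

rng = np.random.default_rng(12)
n = 60
x2, x3 = rng.normal(size=n), rng.normal(size=n)
Y = 1 + 0.8 * x2 + 0.1 * x3 + rng.normal(size=n)
ones = np.ones(n)

X_u = np.column_stack([ones, x2, x3])              # unrestricted model
X_r = np.column_stack([ones, x2])                  # restricted model (drop x3)
k = X_u.shape[1]

b_u, e_u = ols(X_u, Y)
_, e_r = ols(X_r, Y)
s2_u = e_u @ e_u / (n - k)
t3 = b_u[2] / np.sqrt(s2_u * np.linalg.inv(X_u.T @ X_u)[2, 2])   # t for H0: beta3 = 0

# Squared partial correlation of Y and x3: via residual regressions and via t^2/(t^2 + df)
r3 = np.corrcoef(e_r, ols(X_r, x3)[1])[0, 1]
print(r3**2, t3**2 / (t3**2 + (n - k)))            # the two numbers agree

def adj_r2(e, n_obs, n_par):
    return 1 - (e @ e / (n_obs - n_par)) / (np.sum((Y - Y.mean())**2) / (n_obs - 1))

# When |t3| < 1 the restricted model has the larger adjusted R2, and conversely
print(abs(t3), adj_r2(e_u, n, k), adj_r2(e_r, n, k - 1))
```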
If, for a regression, the overall F statistic is significant but most of the individual tj statistics are insignificant, suspect the presence of multicollinearity.