Lecture 3
Multiple Regression Analysis: Estimation
Prepared by Quanquan Liu
Fall 2024
Motivation
In the general case with k independent variables, we seek the estimates $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k$ in the equation
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k.$$
Given n observations, the OLS estimates, k + 1 of them, are chosen to minimize the sum of squared residuals:
$$\sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik}\right)^2.$$
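As a concrete illustration, here is a minimal numpy sketch of this minimization on simulated data; the sample size, coefficients, and data-generating process are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n observations on k = 2 independent variables.
n, k = 100, 2
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([0.5, -2.0]) + rng.normal(size=n)

# Prepend a column of ones so the intercept is one of the k + 1 estimates.
X1 = np.column_stack([np.ones(n), X])

# lstsq returns the coefficient vector minimizing the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

u_hat = y - X1 @ beta_hat
print(beta_hat)        # [beta0_hat, beta1_hat, beta2_hat]
print(u_hat @ u_hat)   # the minimized sum of squared residuals
```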
Define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares or sum of squared residuals (SSR) as
$$\mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \mathrm{SSE} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \mathrm{SSR} = \sum_{i=1}^{n} \hat{u}_i^2.$$
Using the same argument as in the simple regression case, we can show that
$$\mathrm{SST} = \mathrm{SSE} + \mathrm{SSR}.$$
Just as in the simple regression case, the R-squared is defined to be
$$R^2 = \frac{\mathrm{SSE}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}}.$$
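The decomposition and the two equivalent forms of the R-squared can be checked numerically; the sketch below uses simulated data with illustrative parameter values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum(u_hat ** 2)               # residual sum of squares

print(np.isclose(sst, sse + ssr))      # SST = SSE + SSR
print(sse / sst, 1 - ssr / sst)        # the two equivalent R-squared forms
```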
Ordinary Least Squares
The R-squared never decreases, and it usually increases, when another independent variable is added to a regression and the same set of observations is used for both regressions.
The sum of squared residuals never increases when additional regressors are added
to the model.
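Both monotonicity facts are easy to verify in a small simulated example (all names and values here are illustrative): even though x2 below is pure noise, adding it cannot raise the SSR or lower the R-squared.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)              # a pure-noise candidate regressor
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def ssr_and_r2(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    sst = np.sum((y - y.mean()) ** 2)
    return u @ u, 1 - (u @ u) / sst

ssr_small, r2_small = ssr_and_r2(np.column_stack([np.ones(n), x1]), y)
ssr_big, r2_big = ssr_and_r2(np.column_stack([np.ones(n), x1, x2]), y)

print(ssr_big <= ssr_small)   # SSR never increases
print(r2_big >= r2_small)     # R-squared never decreases
```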
Missing data can be an important practical issue: if observations are dropped because a newly added variable is missing for some of them, the two regressions no longer use the same sample, and their R-squareds are not comparable.
Adjusted R-squared ($\bar{R}^2$): A goodness-of-fit measure in multiple regression analysis that penalizes additional explanatory variables by using a degrees of freedom adjustment in estimating the error variance:
$$\bar{R}^2 = 1 - \frac{\mathrm{SSR}/(n - k - 1)}{\mathrm{SST}/(n - 1)}.$$
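A small simulated comparison (illustrative data): the ordinary R-squared rises mechanically when a pure-noise regressor is added, while the adjusted R-squared, which charges for the lost degree of freedom, typically falls.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                 # pure-noise regressor
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def r2_and_adjusted(X, y):
    k = X.shape[1] - 1                  # number of slope coefficients
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ssr / sst
    adj = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))
    return r2, adj

print(r2_and_adjusted(np.column_stack([np.ones(n), x1]), y))
print(r2_and_adjusted(np.column_stack([np.ones(n), x1, x2]), y))
# R-squared weakly rises; adjusted R-squared falls whenever the added
# regressor's absolute t statistic is below 1 (the typical noise case).
```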
The Expected Value of the OLS Estimators
Suppose the true model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, but we omit $x_2$ and obtain the estimated equation $\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$. To derive the bias, run a simple regression of $x_2$ on $x_1$:
$$\tilde{x}_2 = \tilde{\delta}_0 + \tilde{\delta}_1 x_1.$$
Plug into the true model:
$$y = (\beta_0 + \beta_2 \tilde{\delta}_0) + (\beta_1 + \beta_2 \tilde{\delta}_1)\,x_1 + \text{error},$$
which shows that $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$: the omitted variable bias is $\beta_2 \tilde{\delta}_1$.
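A Monte Carlo sketch of this result, with made-up parameter values: the population version of $\tilde{\delta}_1$ below is 0.6, so the simple regression slope should center on $\beta_1 + \beta_2 \cdot 0.6$ rather than on $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 2000
beta0, beta1, beta2 = 1.0, 1.0, 3.0
tilde_b1 = np.empty(reps)

for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.6 * x1 + rng.normal(size=n)   # population delta1 = 0.6
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    x1c = x1 - x1.mean()
    # Slope from the simple regression of y on x1 (x2 omitted):
    tilde_b1[r] = (x1c @ (y - y.mean())) / (x1c @ x1c)

print(tilde_b1.mean())   # close to beta1 + beta2 * 0.6 = 2.8, not beta1 = 1.0
```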
The Variance of the OLS Estimators
Under the Gauss–Markov assumptions,
$$\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{\mathrm{SST}_j (1 - R_j^2)}$$
for $j = 1, \dots, k$, where $\mathrm{SST}_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ is the total sample variation in $x_j$, and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables (including an intercept).
Increasing the sample size is thus a way to get more precise estimates.
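The formula can be cross-checked against the usual matrix expression $\hat{\sigma}^2 (X'X)^{-1}$; the sketch below (simulated data, illustrative parameters) builds $\mathrm{SST}_1$ and $R_1^2$ by hand.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 500, 2
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)           # x2 partly collinear with x1
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)   # estimated error variance

# SST_1 and R_1^2: regress x1 on the other regressors (here just x2).
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r2_1 = 1 - np.sum((x1 - Z @ g) ** 2) / sst1

print(np.sqrt(sigma2_hat / (sst1 * (1 - r2_1))))           # se(beta1_hat)
print(np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1]))  # matrix cross-check
```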
Compare the simple regression estimate $\tilde{\beta}_1$ (with $x_2$ omitted) to the multiple regression estimate $\hat{\beta}_1$. If $\beta_2 = 0$, both are unbiased and $\operatorname{Var}(\tilde{\beta}_1) \le \operatorname{Var}(\hat{\beta}_1)$, so $\tilde{\beta}_1$ is preferred. Do not include irrelevant regressors!
Case 3. $x_1$ and $x_2$ are correlated in the sample, and $\beta_2 \neq 0$. Then $\tilde{\beta}_1$ is biased while $\hat{\beta}_1$ is unbiased, but $\operatorname{Var}(\tilde{\beta}_1) < \operatorname{Var}(\hat{\beta}_1)$: there is a tradeoff between bias and variance.
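A simulation sketch of Case 3 (parameter values made up): with $x_1$ and $x_2$ strongly correlated and $\beta_2 \neq 0$, the simple regression slope is biased but less variable than the multiple regression slope.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 50, 5000
b_simple = np.empty(reps)   # tilde beta1: y on x1 only
b_multi = np.empty(reps)    # hat beta1: y on x1 and x2

for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)    # strongly correlated with x1
    y = 1.0 + 0.5 * x1 + 1.0 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    b_multi[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    x1c = x1 - x1.mean()
    b_simple[r] = (x1c @ (y - y.mean())) / (x1c @ x1c)

print(b_simple.mean(), b_multi.mean())  # tilde beta1 biased; hat beta1 near 0.5
print(b_simple.var(), b_multi.var())    # but Var(tilde) < Var(hat)
```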
$\hat{\sigma} = \sqrt{\mathrm{SSR}/(n - k - 1)}$ is the standard error of the regression (SER), which is an estimator of the standard deviation of the error term.
The SER can either decrease or increase when another independent variable is added to a regression: SSR falls, but so does the degrees of freedom $n - k - 1$.
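A quick illustration on simulated data (illustrative names and values): adding a noise regressor always weakly lowers SSR, but dividing by the smaller $n - k - 1$ can push the SER either way.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)     # candidate extra regressor (pure noise)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def ser(X, y):
    k = X.shape[1] - 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta) ** 2)
    return np.sqrt(ssr / (n - k - 1))

print(ser(np.column_stack([np.ones(n), x1]), y))
print(ser(np.column_stack([np.ones(n), x1, x2]), y))
# SSR falls, but so does n - k - 1, so the SER can move in either direction.
```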
Efficiency of OLS