Lecture 3. Multiple Regression Analysis: Estimation
Prepared by Quanquan Liu
Fall 2024
Motivation

 The Model with Two Independent Variables
 y = β0 + β1x1 + β2x2 + u
 Example. Wage and Education
 wage = β0 + β1educ + β2exper + u
 Because this model contains experience explicitly, we will be able to measure the effect of
education on wage, holding experience fixed.
 Zero conditional mean assumption: E(u | educ, exper) = 0.
 This implies that other factors affecting wage are not related on average to educ and
exper. If we think innate ability is part of u, we will need average ability levels to be
the same across all combinations of education and experience in the working
population.
Motivation

 The Model with Two Independent Variables

 Example. Consumption and Income
 cons = β0 + β1inc + β2inc² + u
 Multiple regression analysis is also useful for generalizing functional relationships between
variables.
 Δcons/Δinc = β1 + 2β2inc – the marginal effect of income on consumption depends on β2 as
well as on β1 and the level of income.
Motivation

 The Model with k Independent Variables

 The general multiple linear regression (MLR) model can be written in the population as
y = β0 + β1x1 + β2x2 + ... + βkxk + u,
 β0 is the intercept; β1 is the parameter associated with x1; β2 is the parameter associated
with x2, and so on.
 u is the error term; it contains factors other than x1, x2, ..., xk that affect y.
 Zero conditional mean assumption: E(u | x1, x2, ..., xk) = 0.
 All factors in the unobserved error term must be uncorrelated with the explanatory
variables.
 We have correctly accounted for the functional relationships between the explained
and explanatory variables.
Motivation

 The Model with k Independent Variables

 Example. CEO salary, sales and CEO tenure
 log(salary) = β0 + β1log(sales) + β2ceoten + β3ceoten² + u
 Model assumes a constant elasticity relationship between CEO salary and the sales of his
or her firm.
 The parameter β1 is the ceteris paribus elasticity of salary with respect to sales.
 Model assumes a quadratic relationship between CEO salary and his or her tenure with
the firm.
 The term “linear” means that the model is linear in the parameters, β0, β1, ..., βk, not in the variables.
Ordinary Least Squares

 In the general case with k independent variables, we seek estimates β̂0, β̂1, ..., β̂k in the equation
ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk.
 Given n observations {(xi1, xi2, ..., xik, yi): i = 1, ..., n}, the OLS estimates, k + 1 of them, are
chosen to minimize the sum of squared residuals:
∑ (yi − β̂0 − β̂1xi1 − ... − β̂kxik)², summed over i = 1, ..., n.
 First order conditions:
∑ (yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0 and ∑ xij(yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0 for j = 1, ..., k.
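A minimal numerical sketch of these formulas (not part of the original slides): simulated data with hypothetical coefficients, and the OLS estimates obtained by solving the first-order conditions (the normal equations) with numpy.

```python
import numpy as np

# Simulated data with hypothetical coefficients, for illustration only.
rng = np.random.default_rng(0)
n, k = 200, 2
x = rng.normal(size=(n, k))                 # n observations on k regressors
y = 1.0 + 0.5 * x[:, 0] - 0.3 * x[:, 1] + rng.normal(size=n)

# Design matrix with a column of ones: k + 1 parameters to estimate.
X = np.column_stack([np.ones(n), x])

# The first-order conditions are the normal equations X'X b = X'y;
# solving them minimizes the sum of squared residuals.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
ssr = np.sum((y - X @ beta_hat) ** 2)
print(beta_hat, ssr)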


Ordinary Least Squares

 The OLS regression line is
ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk.
 Written in terms of changes,
Δŷ = β̂1Δx1 + β̂2Δx2 + ... + β̂kΔxk.
 The coefficient β̂1 on x1 measures the change in ŷ due to a one-unit increase in x1, holding all other
independent variables fixed – partial effect. That is,
Δŷ = β̂1Δx1,
holding x2, x3, ..., xk fixed.
 partial effect: The effect of an explanatory variable on the dependent variable, holding other
factors in the regression model fixed.
Ordinary Least Squares

 Example. Determinants of College GPA
 Fitted equation: colGPA = 1.29 + .453 hsGPA + .0094 ACT, n = 141.
 There is a positive partial relationship between colGPA and hsGPA: holding ACT fixed,
another point on hsGPA is associated with .453 of a point on the college GPA.
 If we choose two “typical” students, A and B, and these students have the same ACT score, but
the high school GPA of Student A is one point higher than the high school GPA of Student B, then
we predict Student A to have a college GPA .453 higher than that of Student B.
 The sign on ACT implies that, while holding hsGPA fixed, a change in the ACT score of 10
points – a very large change – affects colGPA by less than one-tenth of a point.
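A sketch of how this regression could be run in Python with statsmodels; the file name gpa1.csv and its columns (colGPA, hsGPA, ACT) are assumptions made for illustration, not part of the slides.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data file assumed to contain colGPA, hsGPA, and ACT.
df = pd.read_csv("gpa1.csv")

# Each slope is a partial effect: the hsGPA coefficient holds ACT fixed,
# and the ACT coefficient holds hsGPA fixed.
fit = smf.ols("colGPA ~ hsGPA + ACT", data=df).fit()
print(fit.params)
```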
Ordinary Least Squares

 For observation i, the fitted value is
ŷi = β̂0 + β̂1xi1 + β̂2xi2 + ... + β̂kxik.
 The residual for observation i is
ûi = yi − ŷi.
 Properties:
 The residuals sum (and hence average) to zero, so the sample average of the fitted values equals ȳ.
 The sample covariance between each independent variable and the residuals is zero;
consequently, the sample covariance between the fitted values and the residuals is zero.
 The point (x̄1, x̄2, ..., x̄k, ȳ) is always on the OLS regression line.
Ordinary Least Squares

 Define the total sum of squares (SST), the explained sum of squares (SSE), and the residual
sum of squares or sum of squared residuals (SSR) as
SST = ∑(yi − ȳ)²,  SSE = ∑(ŷi − ȳ)²,  SSR = ∑ûi².
 Using the same argument as in the simple regression case, we can show that
SST = SSE + SSR.
 Just as in the simple regression case, the R-squared is defined to be
R² = SSE/SST = 1 − SSR/SST.
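A numerical sketch (simulated data, hypothetical coefficients) checking the residual properties above and the decomposition SST = SSE + SSR.

```python
import numpy as np

# Simulated data; all numbers are hypothetical.
rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.0 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat
u_hat = y - y_hat

# Residuals average to zero and are uncorrelated (in the sample) with each
# regressor and with the fitted values.
print(u_hat.mean(), X.T @ u_hat, u_hat @ y_hat)     # all ~0 up to rounding

# SST = SSE + SSR, and R-squared = SSE/SST = 1 - SSR/SST.
sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)
print(sst, sse + ssr, sse / sst, 1 - ssr / sst)
```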
Ordinary Least Squares

 R² never decreases, and it usually increases, when another independent variable is added to
a regression and the same set of observations is used for both regressions.
 The sum of squared residuals never increases when additional regressors are added
to the model.
 Missing data can be an important practical issue.
 Adjusted R²: A goodness-of-fit measure in multiple regression analysis that penalizes
additional explanatory variables by using a degrees of freedom adjustment in estimating
the error variance:
adjusted R² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)] = 1 − (1 − R²)(n − 1)/(n − k − 1).
 Adjusted R² can go up or down when a new independent variable is added to a regression.
 For small n and large k, adjusted R² can be substantially below R². In fact, if the usual R² is small, and n − k − 1 is
also small, adjusted R² can actually be negative!
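A sketch contrasting R² and adjusted R² when an irrelevant regressor is added; data and coefficients are simulated and purely illustrative.

```python
import numpy as np

# Simulated data: y depends on x1 only; x2 added later is irrelevant.
rng = np.random.default_rng(2)
n = 20
x1 = rng.normal(size=n)
y = 1.0 + 0.2 * x1 + rng.normal(size=n)

def r2_and_adj(y, X):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    ssr = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    n_obs, k1 = X.shape                  # k1 = k + 1 columns including intercept
    r2 = 1 - ssr / sst
    adj = 1 - (ssr / (n_obs - k1)) / (sst / (n_obs - 1))
    return r2, adj

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, rng.normal(size=n)])  # add an irrelevant x2
print(r2_and_adj(y, X_small))
print(r2_and_adj(y, X_big))   # R2 never falls, but adjusted R2 may fall
```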
The Expected Value of the OLS Estimators
 Assumption MLR.1. Linear in Parameters
 The model in the population can be written as
y = β0 + β1x1 + β2x2 + ... + βkxk + u,
where β0, β1, ..., βk are the unknown parameters (constants) of interest and u is an unobserved
random error or disturbance term.
 Assumption MLR.2. Random Sampling
 We have a random sample of n observations, {(xi1, xi2, ..., xik, yi): i = 1, ..., n}, following the population model in
Assumption MLR.1.
 For a randomly drawn observation i from the population, we have
yi = β0 + β1xi1 + β2xi2 + ... + βkxik + ui.
The Expected Value of the OLS Estimators
 Assumption MLR.3. No Perfect Collinearity
 In the sample (and therefore in the population), none of the independent variables is
constant, and there are no exact linear relationships among the independent variables.
 If an independent variable is an exact linear combination of the other independent variables, then
we say the model suffers from perfect collinearity, and it cannot be estimated by OLS.
 In the general regression model, there are k + 1 parameters, and MLR.3 also fails if n < k + 1.
 Assumption MLR.4. Zero Conditional Mean
 The error u has an expected value of zero given any values of the independent variables,
E(u | x1, x2, ..., xk) = 0.
 Fails if the functional relationship between the explained and explanatory variables is mis-specified.
 Fails if an important factor that is correlated with any of x1, x2, ..., xk is omitted.
 Fails if there is measurement error in an explanatory variable.
The Expected Value of the OLS Estimators
 When Assumption MLR.4 holds, we often say that we have exogenous explanatory variables.
 exogenous explanatory variable: An explanatory variable that is uncorrelated with the error term.
 If xj is correlated with u for any reason, then xj is said to be an endogenous explanatory variable.
 endogenous explanatory variable: An explanatory variable in a multiple regression model that is
correlated with the error term, either because of model misspecification, omitted variables, measurement
error, or simultaneity.
 Theorem 1. Unbiasedness of OLS
 Under Assumptions MLR.1, MLR.2, MLR.3 and MLR.4,
E(β̂j) = βj, j = 0, 1, ..., k,
for any values of the population parameter βj. In other words, the OLS estimators are unbiased
estimators of the population parameters.
The Expected Value of the OLS Estimators
 Including one or more irrelevant variables in a multiple regression model, or
overspecifying the model, does not affect the unbiasedness of the OLS estimators.
 However, including irrelevant variables may increase sampling variance.
 Excluding one or more relevant variables in a multiple regression model, or
underspecifying the model, generally causes the OLS estimators to be biased.
 Example. Omitted Variable Bias
 Suppose the true population model has two explanatory variables and an error term:
y = β0 + β1x1 + β2x2 + u.
Estimated equation: ŷ = β̂0 + β̂1x1 + β̂2x2.
 However, due to our ignorance or data unavailability, we estimate the model by excluding x2
and perform a simple regression of y on x1 only, obtaining the equation
ỹ = β̃0 + β̃1x1.
The Expected Value of the OLS Estimators
 To derive the bias, run a simple regression of x2 on x1:
x2 = δ0 + δ1x1 + v.
Estimated equation: x̃2 = δ̃0 + δ̃1x1.
 Plug into the true model:
y = β0 + β1x1 + β2(δ0 + δ1x1 + v) + u = (β0 + β2δ0) + (β1 + β2δ1)x1 + (β2v + u).
 Conclusion: All estimated coefficients in the simple regression of y on x1 will be biased.
The Expected Value of the OLS Estimators
 We have the algebraic relationship
β̃1 = β̂1 + β̂2δ̃1,
E(β̃1) = β1 + β2δ̃1,
which implies the bias in β̃1 is
Bias(β̃1) = E(β̃1) − β1 = β2δ̃1.
 If β2 = 0 – so that x2 does not appear in the true model – then β̃1 is unbiased.
 If δ̃1 = 0 – so that x1 and x2 are uncorrelated in the sample – then β̃1 is unbiased for β1, even if β2 ≠ 0.
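A simulation sketch of the algebraic relationship above (all data and coefficients are hypothetical): the slope from the short regression equals β̂1 + β̂2δ̃1 exactly in the sample.

```python
import numpy as np

# Simulated data: x1 and x2 are correlated, and x2 matters for y.
rng = np.random.default_rng(3)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def ols(y, X):
    return np.linalg.solve(X.T @ X, X.T @ y)

ones = np.ones(n)
b_long = ols(y, np.column_stack([ones, x1, x2]))    # beta0_hat, beta1_hat, beta2_hat
b_short = ols(y, np.column_stack([ones, x1]))       # beta0_tilde, beta1_tilde
delta = ols(x2, np.column_stack([ones, x1]))        # delta0_tilde, delta1_tilde

print(b_short[1])                          # biased slope, roughly 2 + 3 * 0.8
print(b_long[1] + b_long[2] * delta[1])    # identical by the algebraic identity
```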
The Expected Value of the OLS Estimators
 Table. Summary of Bias in β̃1 When x2 Is Omitted in Estimating Equation

            δ̃1 > 0           δ̃1 < 0
 β2 > 0     positive bias     negative bias
 β2 < 0     negative bias     positive bias

 Example. Omitting ability in a wage equation
 If ability has a positive partial effect on wage (β2 > 0) and ability and education are positively
correlated in the sample (δ̃1 > 0), the simple regression of wage on educ overstates the return to
education (positive bias).
The Variance of the OLS Estimators

 Assumption MLR.5. Homoskedasticity
 The error u has the same variance given any value of the explanatory variables. In other
words, Var(u | x1, x2, ..., xk) = σ².
 Theorem 2. Sampling Variances of the OLS Slope Estimators
 Under Assumptions MLR.1, MLR.2, MLR.3, MLR.4 and MLR.5, conditional on the sample
values of the independent variables,
Var(β̂j) = σ² / [SSTj(1 − Rj²)]
for j = 1, 2, ..., k, where SSTj = ∑(xij − x̄j)² is the total sample variation in xj, and Rj² is the R-squared from regressing xj on all
other independent variables (including an intercept).
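A sketch verifying Theorem 2 numerically under an assumed error variance: the formula σ²/[SSTj(1 − Rj²)] matches the corresponding diagonal entry of σ²(X'X)⁻¹. Data are simulated for illustration only.

```python
import numpy as np

# Simulated regressors with some correlation; sigma2 is an assumed value.
rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
ones = np.ones(n)
X = np.column_stack([ones, x1, x2])
sigma2 = 1.5                                   # assumed error variance

# Auxiliary regression of x1 on the other regressors gives R_1^2 and SST_1.
Z = np.column_stack([ones, x2])
x1_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
sst_1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = 1 - np.sum((x1 - x1_hat) ** 2) / sst_1

print(sigma2 / (sst_1 * (1 - r2_1)))           # formula from Theorem 2
print(sigma2 * np.linalg.inv(X.T @ X)[1, 1])   # matrix formula, same value
```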
The Variance of the OLS Estimators

 The Error Variance, σ².
 A high error variance increases the sampling variance because there is more “noise” in the
equation.
 The error variance does not decrease with sample size.
 The Total Sample Variation in xj, SSTj.
 The larger the total variation in xj is, the smaller is Var(β̂j).
 Total sample variation automatically increases with the sample size.
 Increasing the sample size is thus a way to get more precise estimates.
The Variance of the OLS Estimators

 The Linear Relationships among the Independent Variables, Rj².
 Rj² will be higher when xj can be better explained by the other independent variables.
 The sampling variance of the slope estimator for xj will be larger when Rj² gets larger:
Var(β̂j) → ∞ as Rj² → 1.
 High (but not perfect) correlation between two or more independent variables is called
multicollinearity.
 Everything else being equal, for estimating βj, it is better to have less correlation between xj
and the other independent variables.
 Only the sampling variance of the variables involved in multicollinearity will be inflated;
the estimates of other effects may be very precise.
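A sketch of how multicollinearity inflates Var(β̂1) on simulated data: as the correlation between x1 and x2 rises, R1² rises and the inflation factor 1/(1 − R1²) (the variance inflation factor) grows.

```python
import numpy as np

# Simulated regressors with increasing correlation rho; numbers are illustrative.
rng = np.random.default_rng(5)
n = 1000
for rho in (0.0, 0.5, 0.9, 0.99):
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho ** 2) * rng.normal(size=n)
    ones = np.ones(n)
    Z = np.column_stack([ones, x2])
    x1_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
    r2_1 = 1 - np.sum((x1 - x1_hat) ** 2) / np.sum((x1 - x1.mean()) ** 2)
    print(rho, r2_1, 1 / (1 - r2_1))           # correlation, R_1^2, VIF
```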
The Variance of the OLS Estimators

 True population model:
y = β0 + β1x1 + β2x2 + u
Estimated model 1: ŷ = β̂0 + β̂1x1 + β̂2x2
Estimated model 2: ỹ = β̃0 + β̃1x1
 Case 1. x1 and x2 are uncorrelated in the sample.
 Then R1² = 0 and Var(β̂1) = σ²/SST1 = Var(β̃1): including x2 does not affect the precision of the
estimator of β1.
The Variance of the OLS Estimators

 Case 2. x1 and x2 are correlated in the sample, and β2 = 0.
 Both β̃1 and β̂1 are unbiased, but Var(β̃1) < Var(β̂1), so β̃1 is preferred.
 Do not include irrelevant regressors!
 Case 3. x1 and x2 are correlated in the sample, and β2 ≠ 0.
 β̃1 is biased while β̂1 is unbiased, but Var(β̃1) < Var(β̂1).
 Trade off bias and variance!
 In large samples, we would prefer β̂1: the bias in β̃1 does not shrink as n grows, while both
sampling variances shrink to zero.
The Variance of the OLS Estimators

 Given the OLS residuals
ûi = yi − β̂0 − β̂1xi1 − ... − β̂kxik,
the unbiased estimator of σ² in the general multiple regression case is
σ̂² = (∑ûi²) / (n − k − 1) = SSR / (n − k − 1),
where degrees of freedom df = n − (k + 1), the number of observations minus the number of
estimated parameters.
 Theorem 3. Unbiased Estimation of σ²
 Under the Gauss-Markov assumptions MLR.1, MLR.2, MLR.3, MLR.4 and MLR.5,
E(σ̂²) = σ².
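A sketch of this estimator on simulated data (the true σ is assumed only to generate the data): σ̂² = SSR/(n − k − 1), and its square root is the SER discussed next.

```python
import numpy as np

# Simulated data with a known error standard deviation of 2 (illustrative).
rng = np.random.default_rng(6)
n, k = 400, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=2.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

# Unbiased error-variance estimator: SSR divided by n - k - 1 degrees of freedom.
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)
print(sigma2_hat, np.sqrt(sigma2_hat))   # estimate of sigma^2 and the SER
```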
The Variance of the OLS Estimators

 σ̂ – the standard error of the regression (SER) – is an estimator of the standard
deviation of the error term.
 It can either decrease or increase when another independent variable is added to a regression:
SSR never rises, but the degrees of freedom n − k − 1 falls.
Efficiency of OLS

 Theorem 4. Gauss-Markov Theorem
 Under Assumptions MLR.1, MLR.2, MLR.3, MLR.4 and MLR.5, β̂0, β̂1, ..., β̂k are the best linear unbiased
estimators (BLUEs) of β0, β1, ..., βk, respectively.
 Estimator: It is a rule that can be applied to any sample of data to produce an estimate.
 Unbiased: In the current context, an estimator β̃j of βj is an unbiased estimator of βj if E(β̃j) = βj
for any values of β0, β1, ..., βk.
 Linear: In the current context, an estimator β̃j of βj is linear if, and only if, it can be expressed as a
linear function of the data on the dependent variable:
β̃j = ∑ wij yi,
where each wij can be a function of the sample values of all the independent variables.
 Best: For the current theorem, best is defined as having the smallest variance: For any estimator β̃j
that is linear and unbiased, Var(β̂j) ≤ Var(β̃j), and the inequality is usually strict.
Summary

 Multiple Regression Analysis: Estimation


 Motivation for Multiple Regression
 Mechanics and Interpretation of Ordinary Least Squares
 The Expected Value of the OLS Estimators
 Omitted Variable Bias
 The Variance of the OLS Estimators
 Multicollinearity
 Variances in Mis-specified Models
 Estimating σ²: Standard Errors of the OLS Estimators
 Efficiency of OLS: The Gauss-Markov Theorem
