Lecture 2 & 3: Simple Linear Regression: Gumilang Aryo Sahadewo
Department of Economics
Universitas Gadjah Mada
October 9, 2017
Gumilang Aryo Sahadewo (MEP UGM) Applied ECM: Lecture 2 & 3 October 9, 2017 1 / 52
Logistics
Textbook:
JW: Introductory Econometrics: A Modern Approach by Jeffrey M.
Wooldridge, sixth edition, required.
Lecture notes
Class notes
Review
Motivation
Simple Regression Model
$$Y = \beta_0 + \beta_1 X + u$$
The term u in a regression model
$$Y = \beta_0 + \beta_1 X + u$$
Interpreting Coefficients
$\Delta Y = \beta_1 \Delta X$, if $\Delta u = 0$
An Example (A Simple Wage Equation)
$$\text{Wage} = \beta_0 + \beta_1\,\text{Education} + u$$
Example: Test Score and Student-Teacher Ratio
Two Questions
Assumptions in Linear Regression Model
$$E[u \mid X] = 0$$
Zero conditional mean assumption: an example
In a wage equation
$$\text{Wage} = \beta_0 + \beta_1\,\text{Education} + u$$
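The slide's discussion of this example did not survive extraction. As a hedged illustration of why the assumption matters, the sketch below simulates a wage equation in which an unobserved factor (here called `ability`) raises both education and wages, so that $E[u \mid \text{Education}] \neq 0$. The data-generating process, the `ability` variable, and all parameter values are hypothetical and are not taken from the lecture or from wage.dta.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

n = 5000
beta0, beta1 = 1.0, 0.5  # hypothetical true parameters

# Hypothetical unobserved ability that raises both education and wages.
ability = [random.gauss(0, 1) for _ in range(n)]
educ = [12 + 2 * a + random.gauss(0, 1) for a in ability]

# The error term u contains ability, so E[u | educ] is not zero.
u = [a + random.gauss(0, 1) for a in ability]
wage = [beta0 + beta1 * e + ui for e, ui in zip(educ, u)]


def ols_slope(x, y):
    """OLS slope: sample covariance of (x, y) over sample variance of x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


b1_hat = ols_slope(educ, wage)
print(f"true beta1 = {beta1}, OLS estimate = {b1_hat:.3f}")
```

In this setup the estimate settles well above the true slope because education picks up part of the effect of ability; a larger sample does not remove the gap.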
Assumptions in Linear Regression Model
$E[Y \mid X]$ is a linear function of $X$; for any given value of $X$, the
distribution of $Y$ is centered about $E[Y \mid X]$.
Linear regression model
$$Y = \beta_0 + \beta_1 X + u$$
How to estimate the model: data
Data: Test Score and Student-Teacher Ratio
How to estimate the model: ordinary least squares
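The most popular approach to estimating $\beta_0, \beta_1$ is ordinary least
squares.
We choose $\hat\beta_0, \hat\beta_1$ to minimize the sum of squared residuals, i.e., we
solve
$$\min_{\hat\beta_0,\, \hat\beta_1} \sum_{i=1}^{n} \left( Y_i - \hat\beta_0 - \hat\beta_1 X_i \right)^2$$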
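The minimization has the well-known closed-form solution $\hat\beta_1 = \sum_i (X_i - \bar X)(Y_i - \bar Y) / \sum_i (X_i - \bar X)^2$ and $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$. The following is a minimal Python sketch of that solution, using only the standard library. The function names `ols_simple` and `sum_sq_resid` and the toy data are illustrative assumptions, not part of the lecture materials.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (b0, b1) that minimize sum((y_i - b0 - b1 * x_i) ** 2)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary in the sample")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_sq_resid(x, y, b0, b1):
    """Objective function: sum of squared residuals at (b0, b1)."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical toy data (not one of the lecture datasets).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = ols_simple(x, y)
best = sum_sq_resid(x, y, b0_hat, b1_hat)
# Any other candidate line gives a larger sum of squared residuals.
worse = sum_sq_resid(x, y, b0_hat + 0.1, b1_hat - 0.05)
print(b0_hat, b1_hat, best, worse)
```

Moving the intercept or slope away from the OLS values can only increase the objective, which is what the last two lines check on the toy data.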
Ordinary least squares
Ordinary least squares
Some comments on OLS estimators
Example: CEO Salary and Return on Equity
Let y be annual salary (in thousands of dollars) and x be the average
return on equity (roe, in percent) for the CEO's firm.
$$\text{Salary} = \beta_0 + \beta_1\,\text{ROE} + u$$
Example: CEO Salary and Return on Equity
Estimated Regression Line
Example: Monthly Wage and Years of Education
Dataset: wage.dta
Command: graph twoway (scatter wage educ) (lfit wage educ)
Estimated Regression Line
Example: Test Score and Student-Teacher Ratio
Dataset: caschool.dta
Command: graph twoway (scatter testscr str) (lfit testscr str)
Some important quantities
Total sum of squares (SST):
$$SST \equiv \sum_{i=1}^{n} \left( Y_i - \bar{Y} \right)^2$$
Goodness of fit: $R^2$
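The body of this slide was lost in extraction. As a hedged sketch, the code below computes $R^2 = 1 - SSR/SST$ and checks the decomposition $SST = SSE + SSR$, where SSE is the explained sum of squares and SSR the residual sum of squares. These names follow Wooldridge's convention, which is an assumption about the notation used on the slide. The function name `fit_and_r2` and the toy data are hypothetical.

```python
from statistics import mean


def fit_and_r2(x, y):
    """Fit y on x by OLS and return (sst, sse, ssr, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((fi - y_bar) ** 2 for fi in fitted)          # explained
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual
    return sst, sse, ssr, 1 - ssr / sst


# Hypothetical toy data (not one of the lecture datasets).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

sst, sse, ssr, r2 = fit_and_r2(x, y)
print(f"SST={sst:.3f} SSE={sse:.3f} SSR={ssr:.3f} R^2={r2:.4f}")
```

Because the regression includes an intercept, $R^2$ lies between 0 and 1 and equals the share of the sample variation in Y that is explained by X.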
Example
$$\widehat{\text{salary}} = 963.191 + 18.501\,\text{ROE}$$
$n = 209$, $R^2 = 0.0132$
$$\widehat{\text{wage}} = -0.90 + 0.54\,\text{educ}$$
$n = 526$, $R^2 = 0.163$
Incorporating Nonlinearities
Incorporating nonlinearities in simple regression
$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + u$$
Log-Linear Model
$$\log(y) = \beta_0 + \beta_1 x + u$$
Example
$$\widehat{\log(\text{wage})} = 0.584 + 0.083\,\text{educ}$$
$n = 526$, $R^2 = 0.186$
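In a log-linear model, $100 \cdot \beta_1$ approximates the percentage change in y for a one-unit increase in x, and the exact percentage change is $100 \cdot (e^{\beta_1} - 1)$. The short sketch below applies both formulas to the slope of 0.083 reported above; the variable names are illustrative.

```python
import math

b1 = 0.083  # estimated coefficient on educ from the example above

# Approximate return to one more year of education, in percent.
approx_pct = 100 * b1

# Exact percentage change implied by the log-linear form.
exact_pct = 100 * (math.exp(b1) - 1)

print(f"approximate: {approx_pct:.2f}%  exact: {exact_pct:.2f}%")
```

The two figures are close because the coefficient is small; the approximation gets worse as the coefficient grows in absolute value.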
Log-Log Model
Interpretation of Coefficients: Elasticity
$$\log(y) = \beta_0 + \beta_1 \log(x) + u$$
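In the log-log model the slope is the elasticity of y with respect to x: a 1% increase in x is associated with roughly a $\beta_1$% change in y. As a minimal sketch, the code below builds noise-free data from a constant-elasticity relationship $y = A x^{\beta_1}$ and recovers the elasticity by regressing $\log(y)$ on $\log(x)$. The values of `scale` and `elasticity` and the data points are hypothetical.

```python
import math
from statistics import mean

# Hypothetical constant-elasticity relationship y = scale * x ** elasticity.
scale, elasticity = 3.0, 0.5
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [scale * xi ** elasticity for xi in x]

# Take logs of both variables, then run simple OLS on the logged data.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

lx_bar, ly_bar = mean(log_x), mean(log_y)
sxx = sum((a - lx_bar) ** 2 for a in log_x)
b1 = sum((a - lx_bar) * (b - ly_bar) for a, b in zip(log_x, log_y)) / sxx
b0 = ly_bar - b1 * lx_bar

print(f"slope (elasticity) = {b1:.4f}, intercept = {b0:.4f}")
```

With no error term the slope equals the elasticity and the intercept equals $\log(A)$; with real data both are estimated with sampling error.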
Linear-Log Model
Advantages of linear regression
Easy.
Computationally simple/fast.
Speed insensitive to dimension of X.
Relatively easy to interpret.
Recognizable & cross-disciplinary.
Somewhat flexible.
Standardized.
Disadvantages of linear regression
How to Run Regressions in Stata?
reg Y X
How to Run Regressions in Stata?
Standard assumptions for the simple linear regression model
$$Y = \beta_0 + \beta_1 X + u$$
Standard assumptions for the simple linear regression model
OLS estimator is unbiased
Theorem (Unbiasedness)
Under Assumptions SLR.1-SLR.4:
$$E(\hat\beta_0) = \beta_0, \quad \text{and} \quad E(\hat\beta_1) = \beta_1$$
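Unbiasedness is a statement about repeated sampling: averaged over many samples, the OLS estimates equal the true parameters. The Monte Carlo sketch below illustrates this by drawing many samples from a model that satisfies the assumptions and averaging the estimates. The true parameter values, sample size, number of replications, and normal errors are all hypothetical choices made for this illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible


def ols_simple(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1 * x + u."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


beta0, beta1, sigma = 1.0, 2.0, 1.0  # hypothetical true values
n, reps = 50, 2000

# Regressor values are drawn once and held fixed across samples.
x = [random.uniform(0, 10) for _ in range(n)]

b0_draws, b1_draws = [], []
for _ in range(reps):
    # Errors are independent of x with mean zero, so E[u | X] = 0 holds.
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    b0, b1 = ols_simple(x, y)
    b0_draws.append(b0)
    b1_draws.append(b1)

mean_b0, mean_b1 = mean(b0_draws), mean(b1_draws)
print(f"average b0 = {mean_b0:.3f}, average b1 = {mean_b1:.3f}")
```

Any single estimate can be far from the truth; it is the average across samples that is centered on the true values.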
Interpretation of unbiasedness
Some Comments
Variances of the OLS Estimators
$$\mathrm{Var}(u_i \mid X_i) = \sigma^2, \quad \forall X_i$$
Variances of the OLS Estimators
$$E(Y \mid X) = \beta_0 + \beta_1 X$$
$$\mathrm{Var}(Y \mid X) = \mathrm{Var}(\beta_0 + \beta_1 X + u \mid X) = \mathrm{Var}(u \mid X) = \sigma^2$$
Graphical illustration of homoskedasticity
The variability of the unobserved influences does not depend on
the value of the explanatory variable.
Graphical illustration of heteroskedasticity
The variability of the unobserved influences depends on the value of
the explanatory variable (e.g., wage vs. educ).
Homoskedasticity
Discussion
Variances of the OLS Estimators
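Under SLR.1-SLR.5:
$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (X_i - \bar X)^2} = \frac{\sigma^2}{SST_x}$$
$$\mathrm{Var}(\hat\beta_0) = \frac{\sigma^2 \, n^{-1} \sum_{i=1}^{n} X_i^2}{\sum_{i=1}^{n} (X_i - \bar X)^2}$$
The standard deviation of $\hat\beta_1$ is $\mathrm{sd}(\hat\beta_1) = \sqrt{\mathrm{Var}(\hat\beta_1)} = \sigma / \sqrt{SST_x}$.
The sampling variability of the estimated regression coefficients is
higher the larger the variability of the unobserved factors, and
lower the larger the variation in the explanatory variable.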
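The formula for $\mathrm{Var}(\hat\beta_1)$ can be checked by simulation. The sketch below holds the regressor values fixed, draws many samples with homoskedastic errors, and compares the variance of the simulated slope estimates with $\sigma^2 / SST_x$. All parameter values, the sample size, and the normal error distribution are hypothetical choices for this illustration.

```python
import random
from statistics import mean, pvariance

random.seed(1)  # fixed seed so the sketch is reproducible

beta0, beta1, sigma = 1.0, 0.5, 2.0  # hypothetical true values
n, reps = 40, 5000

# Regressor values are held fixed across samples.
x = [random.uniform(0, 10) for _ in range(n)]
x_bar = mean(x)
sst_x = sum((xi - x_bar) ** 2 for xi in x)

# Theoretical variance of the slope estimator under SLR.1-SLR.5.
var_b1_theory = sigma ** 2 / sst_x


def slope(y):
    """OLS slope of y on the fixed regressor x."""
    y_bar = mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x


# Homoskedastic errors: the same sigma for every value of x.
draws = [
    slope([beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x])
    for _ in range(reps)
]
var_b1_sim = pvariance(draws)

print(f"theory: {var_b1_theory:.5f}  simulation: {var_b1_sim:.5f}")
```

Raising `sigma` increases both numbers, while spreading the `x` values further apart increases `sst_x` and lowers both, which matches the verbal statement above.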
Estimating the Error Variance
Calculation of standard errors for regression coefficients
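$\hat\sigma = \sqrt{\hat\sigma^2}$ is called the standard error of the regression.
Standard error of $\hat\beta_1$:
$$\mathrm{se}(\hat\beta_1) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_1)} = \sqrt{\frac{\hat\sigma^2}{SST_x}} = \frac{\hat\sigma}{\sqrt{\sum_{i=1}^{n} (X_i - \bar X)^2}}$$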
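A minimal sketch of the calculation follows. It uses the standard unbiased estimator of the error variance, $\hat\sigma^2 = SSR / (n - 2)$, where SSR is the sum of squared OLS residuals; this formula is assumed here because the content of the preceding slide on estimating the error variance was lost in extraction. The toy data and variable names are hypothetical.

```python
import math
from statistics import mean

# Hypothetical toy data (not one of the lecture datasets).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# OLS estimates from the closed-form solution.
x_bar, y_bar = mean(x), mean(y)
sst_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
b0 = y_bar - b1 * x_bar

# Residuals and the estimated error variance with n - 2 degrees of freedom.
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
ssr = sum(r ** 2 for r in residuals)
sigma2_hat = ssr / (n - 2)

sigma_hat = math.sqrt(sigma2_hat)      # standard error of the regression
se_b1 = sigma_hat / math.sqrt(sst_x)   # standard error of the slope

print(f"sigma_hat = {sigma_hat:.4f}, se(b1) = {se_b1:.4f}")
```

The divisor is n - 2 rather than n because two parameters, the intercept and the slope, are estimated before the residuals are computed.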
Standard errors in Stata regression output