Lecture 6. Linear Regression


ECMT5001: Principles of Econometrics
Lecture 6: Linear regression

Instructor: Simon Kwok
School of Economics, The University of Sydney

Based on lecture notes by Nicolas de Roos.

Outline

1 The simple regression model
2 Ordinary least squares (OLS)
3 Algebraic properties of OLS
4 Functional form
5 Assumptions of OLS
6 Statistical properties of OLS
Simon Kwok ECMT5001 L6 1 / 46 Simon Kwok ECMT5001 L6 2 / 46

Econometric modeling

Econometric models can be used for
- estimating and testing economic relationships
- evaluating policies from government or business
- forecasting economic variables

Types of data
- cross sections
  - each observation is an individual at a point in time
- time series
  - separate observations for each time period
- pooled cross sections
  - cross section data over multiple time periods
- panel data
  - the same random sample is followed over multiple periods

1 The simple regression model
The simple regression model

Explains the variable y in terms of the variable x

$$y = \beta_0 + \beta_1 x + u$$

where y is the dependent variable, $\beta_0$ the intercept, $\beta_1$ the slope, x the independent variable, and u the random error.

Terminology
- y: dependent variable, explained variable, response variable
- x: independent variable, explanatory variable, regressor
- u: error term, disturbance, unobservable, residual

The simple regression model

Interpretation: studies how y varies with x

$$y = \beta_0 + \beta_1 x + u$$

$$\frac{\partial y}{\partial x} = \beta_1 \quad \text{as long as} \quad \frac{\partial u}{\partial x} = 0$$

- $\partial y / \partial x$: by how much does the dependent variable change if the independent variable increases by 1 unit?
- the interpretation is correct only if all other things remain equal when the independent variable is changed

Simple regression examples

Soybean yield and fertilizer

$$\text{yield} = \beta_0 + \beta_1\, \text{fertilizer} + u$$

- $\beta_1$ measures the effect of fertilizer on yield, holding all other factors fixed
- u: rainfall, land quality, presence of parasites

A (simple) wage equation

$$\text{wage} = \beta_0 + \beta_1\, \text{educ} + u$$

- $\beta_1$ measures the change in hourly wage for another year of education, holding all other factors fixed
- u: labour force experience, job tenure, work ethic, intelligence

Assumptions (introduction)

The population average of the error term is zero
- normalise the unobserved factors in the population to zero

$$E(u) = 0$$

Conditional mean independence
- the explanatory variable must not contain information about the mean of the unobserved factors

$$E(u|x) = 0$$

Example: wage equation
- the conditional mean independence assumption is unlikely to hold: people with more education will also be more intelligent on average

$$\text{wage} = \beta_0 + \beta_1\, \text{educ} + u, \quad \text{where } u \text{ contains e.g. intelligence}$$
The population regression function

The conditional mean independence assumption implies

$$E(y|x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u|x) = \beta_0 + \beta_1 x$$

i.e. the average value of the dependent variable can be expressed as a linear function of the explanatory variable

[Figure: the population regression function]

2 Ordinary least squares (OLS)

Regression data

To estimate the regression model we need data
- a random sample of n observations

$$\{(x_i, y_i) : i = 1, \ldots, n\}$$

- $x_i$: value of the explanatory variable of the ith observation
- $y_i$: value of the dependent variable of the ith observation

$(x_1, y_1)$: first observation
$(x_2, y_2)$: second observation
$(x_3, y_3)$: third observation
...
$(x_n, y_n)$: nth observation
The regression objective

Fit a regression line as well as possible through the data:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

Ordinary least squares

Regression residuals

$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$$

Minimise the sum of squared residuals

$$\min \sum_{i=1}^{n} \hat{u}_i^2 \;\rightarrow\; \hat{\beta}_0, \hat{\beta}_1$$

Ordinary least squares (OLS) estimates

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
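The closed-form OLS estimates can be computed directly from the two sums in the formula. The following is a minimal sketch, not code from the lecture: the function name `ols_fit` and the four data points are made up for illustration, and the data are chosen to lie exactly on the line y = 1 + 2x so the result is easy to check by hand.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator of the slope: sum of (x_i - x_bar) * (y_i - y_bar)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator of the slope: sum of (x_i - x_bar)^2, which must be positive
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary in the sample")
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


# Hypothetical data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
beta0_hat, beta1_hat = ols_fit(x, y)
print(beta0_hat, beta1_hat)  # intercept 1.0, slope 2.0
```

Because the data have no noise, every residual is zero and the estimates recover the intercept and slope exactly; with real data the fitted line only minimises the sum of squared residuals.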

Ordinary least squares

Example (CEO salaries)

CEO salary and return on equity

$$\text{salary} = \beta_0 + \beta_1\, \text{roe} + u$$

- salary is in thousands of dollars
- roe is the return on equity of the CEO's firm

Fitted regression

$$\widehat{\text{salary}} = 963.191 + 18.501\, \text{roe}$$

- if the return on equity increases by 1 percentage point then salary is predicted to increase by $18,501
- is there a causal interpretation?

OLS regression line

The fitted regression line depends on the sample and will differ from the (unknown) population regression line

Ordinary least squares

Example (Wage equation)

Wages and education

$$\text{wage} = \beta_0 + \beta_1\, \text{educ} + u$$

- wage: hourly wage in dollars
- educ: years of education

Fitted regression

$$\widehat{\text{wage}} = -0.90 + 0.54\, \text{educ}$$

- one more year of education is associated with an increase in the hourly wage of $0.54
- is there a causal interpretation?

Ordinary least squares

Example (Voting outcomes)

Voting outcomes and campaign expenditures (two parties)

$$\text{voteA} = \beta_0 + \beta_1\, \text{shareA} + u$$

- voteA: percentage of vote for candidate A
- shareA: percentage of campaign expenditure by candidate A

Fitted regression

$$\widehat{\text{voteA}} = 26.81 + 0.464\, \text{shareA}$$

- if candidate A's share of spending increases by one percentage point, he or she receives 0.464 percentage points more of the total vote
- is there a causal interpretation?

3 Algebraic properties of OLS
Algebraic properties of OLS

Fitted values and residuals

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \quad \text{(fitted or predicted values)}, \qquad \hat{u}_i = y_i - \hat{y}_i \quad \text{(residuals)}$$

Algebraic properties of OLS regression
- $\sum_{i=1}^{n} \hat{u}_i = 0$: residuals sum to zero
- $\sum_{i=1}^{n} x_i \hat{u}_i = 0$: zero correlation between residuals and regressors
- $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$: sample averages of y and x lie on the regression line

OLS example: CEO salaries

For example, CEO number 12's salary was $526,023 lower than predicted using information on the firm's return on equity
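The three algebraic properties hold in every sample by construction, which can be checked numerically. The sketch below uses five made-up data points (not the CEO data) and a hypothetical helper `ols_fit` that implements the closed-form OLS formulas.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical sample
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.0, 3.5, 4.0, 7.0, 9.5]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Property 1: residuals sum to zero
sum_resid = sum(resid)
# Property 2: residuals are uncorrelated with the regressor in the sample
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))
# Property 3: the point (x_bar, y_bar) lies on the fitted line
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
gap_at_means = y_bar - (b0 + b1 * x_bar)

print(sum_resid, sum_x_resid, gap_at_means)  # all zero up to rounding error
```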

Goodness of fit

Goodness of fit
- how well does the explanatory variable explain the dependent variable?

Decomposing the variation in y

$$SST = SSE + SSR$$

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad SSR = \sum_{i=1}^{n} \hat{u}_i^2$$

- SST, total sum of squares: total variation in y
- SSE, explained sum of squares: variation explained by regression
- SSR, residual sum of squares: variation not explained by regression

Goodness of fit

R-squared (or the coefficient of determination)
- measures the fraction of the variation in y that is explained by the regression

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}, \qquad 0 \le R^2 \le 1$$

- $R^2 = 1$ indicates a perfect fit
- $R^2 = 0$ indicates no linear relationship between x and y
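The decomposition SST = SSE + SSR and the two equivalent expressions for R-squared can be verified on a small sample. This is an illustrative sketch with made-up data; `ols_fit` is a hypothetical helper implementing the closed-form OLS formulas.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical sample
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.0, 3.5, 4.0, 7.0, 9.5]

b0, b1 = ols_fit(x, y)
y_bar = sum(y) / len(y)
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation in y
sse = sum((fi - y_bar) ** 2 for fi in fitted)           # explained variation
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation

r2 = sse / sst
r2_alt = 1 - ssr / sst  # same number, computed from the residuals

print(sst, sse + ssr)
print(r2, r2_alt)
```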
OLS examples

CEO salary and return on equity
- the regression explains only 1.3% of the total variation in salaries

$$\widehat{\text{salary}} = 963.191 + 18.501\, \text{roe}, \qquad n = 209, \quad R^2 = 0.0132$$

Voting outcomes and campaign expenditures
- the regression explains 85.6% of the total variation in election outcomes

$$\widehat{\text{voteA}} = 26.81 + 0.464\, \text{shareA}, \qquad n = 173, \quad R^2 = 0.856$$

Caution: a high $R^2$ does not necessarily mean the regression has a causal interpretation

4 Functional form

Units of measurement

Units of measurement are important for the interpretation of regression results
- e.g. consider the relationship

$$\text{expenditure} = \beta_0 + \beta_1\, \text{income} + u$$

- i.e. as household income rises, expenditure on food rises at a rate determined by $\beta_1$

Suppose both income and expenditure are measured in dollars
- how would we interpret $\beta_1$ and $\beta_0$?

Suppose instead that income is measured in thousands of dollars
- how would we interpret $\beta_1$ and $\beta_0$?

Functional form

We can use natural logarithms to examine non-linear relationships

1. Semi-logarithmic form
- regression of log wages on years of education

$$\log(\text{wage}) = \beta_0 + \beta_1\, \text{educ} + u$$

- log(wage) is the natural logarithm of wages
- this changes the interpretation of the regression coefficient

$$\beta_1 = \frac{\partial \log(\text{wage})}{\partial\, \text{educ}} = \frac{1}{\text{wage}} \cdot \frac{\partial\, \text{wage}}{\partial\, \text{educ}} = \frac{\partial\, \text{wage} / \text{wage}}{\partial\, \text{educ}}$$

- i.e. $100 \cdot \beta_1$ is (approximately) the percentage change in the wage if education is increased by 1 year
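One way to explore the units-of-measurement questions is to fit the same regression twice, once with income in dollars and once in thousands of dollars. The sketch below uses made-up household data and a hypothetical helper `ols_fit`; it illustrates that rescaling the regressor by 1/1000 multiplies the slope by 1000 and leaves the intercept unchanged.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data: annual income and food expenditure, both in dollars
income_dollars = [20000.0, 35000.0, 50000.0, 80000.0]
food_dollars = [4000.0, 6000.0, 7500.0, 11000.0]

# Same incomes expressed in thousands of dollars
income_thousands = [v / 1000.0 for v in income_dollars]

b0_d, b1_d = ols_fit(income_dollars, food_dollars)    # slope: dollars per dollar
b0_k, b1_k = ols_fit(income_thousands, food_dollars)  # slope: dollars per $1000

print(b0_d, b1_d)
print(b0_k, b1_k)
```

With income in dollars the slope is the change in food expenditure (in dollars) per extra dollar of income; with income in thousands it is the change per extra thousand dollars. In both cases the intercept is predicted expenditure at zero income.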
Example: wage equation

Fitted regression

$$\widehat{\log(\text{wage})} = 0.584 + 0.083\, \text{educ}$$

The wage increases by 8.3% for each extra year of education

[Figure: wage plotted against educ, showing wage growth of 8.3% per year of education]

Functional form

2. Log-logarithmic form
- CEO salary and firm sales

$$\log(\text{salary}) = \beta_0 + \beta_1 \log(\text{sales}) + u$$

- we take natural logs of both salary and sales
- this again changes the interpretation of the regression coefficient

$$\beta_1 = \frac{\partial \log(\text{salary})}{\partial \log(\text{sales})} = \frac{\partial\, \text{salary} / \text{salary}}{\partial\, \text{sales} / \text{sales}}$$

- i.e. the percentage change in salary if sales increase by 1%
- or the elasticity of salary with respect to sales
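The 8.3% reading of the semi-log wage equation is the usual first-order approximation, 100 times the coefficient. As a small sketch (the comparison is an addition here, not part of the slides), the exact percentage change implied by a one-unit increase in educ is 100(exp(beta1) - 1), which is somewhat larger when the coefficient is not close to zero.

```python
import math

# Coefficient on educ from the fitted semi-log wage equation in the lecture
beta1_hat = 0.083

# Approximate percentage change in wage per extra year of education
approx_pct = 100 * beta1_hat

# Exact percentage change implied by the log specification:
# wage(educ + 1) / wage(educ) = exp(beta1_hat)
exact_pct = 100 * (math.exp(beta1_hat) - 1)

print(round(approx_pct, 2), round(exact_pct, 2))
```

The gap between the two numbers grows with the size of the coefficient, so the approximation is best kept for small values.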

Example: CEO salaries

Fitted regression

$$\widehat{\log(\text{salary})} = 4.822 + 0.257 \log(\text{sales})$$

- i.e. a 1% increase in sales is associated with a 0.257% increase in salary

Note that the log-log form suggests a constant elasticity

5 Assumptions of OLS
OLS estimates

The estimated regression coefficients are random variables because they are calculated from a random sample
- the data $x_i, y_i, \bar{x}, \bar{y}$ are random and depend on the sample

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Interpretation
- $\hat{\beta}_1$ is the sample covariance between x and y divided by the sample variance of x

Key questions
- are the estimators unbiased?
- what are the variances of the estimators?

Standard assumptions for linear regression

Assumption SLR.1 (Linear in parameters)

$$y = \beta_0 + \beta_1 x + u$$

- the population relationship between y and x is linear

Assumption SLR.2 (Random sampling)

$$\{(x_i, y_i) : i = 1, \ldots, n\}$$

- the data is a random sample drawn from the population

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

- therefore each data point follows the population equation
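The interpretation of the slope estimate as sample covariance divided by sample variance can be checked numerically: the (n - 1) divisors cancel, leaving the ratio of sums in the OLS formula. The data below are made up for illustration.

```python
import statistics

# Hypothetical sample
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.0, 3.5, 4.0, 7.0, 9.5]

n = len(x)
x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Sample covariance of x and y (divisor n - 1)
sample_cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
# Sample variance of x (statistics.variance also uses divisor n - 1)
sample_var = statistics.variance(x)

slope_from_moments = sample_cov / sample_var

# Slope from the OLS formula written as a ratio of sums
slope_from_formula = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)

print(slope_from_moments, slope_from_formula)
```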

Random sampling

Consider the following hypothetical relationship between wages and education
- the population consists of all workers in country A
- there is a linear relationship between wages and years of education in the population
- randomly draw a worker from the population
- the wage and education level of the worker are random because we do not know in advance which worker will be drawn
- put the worker back into the population and repeat the random draw n times
- the wages and years of education of the sampled workers are used to estimate the linear relationship between wages and education

Random sampling

[Figure: values $(x_i, y_i)$ drawn for the ith worker, plotted against the population regression function (PRF) $E(y|x) = \beta_0 + \beta_1 x$, with axes x and y]

Deviation from the population relationship for the ith worker:

$$u_i = y_i - \beta_0 - \beta_1 x_i$$
Standard assumptions for linear regression

Assumption SLR.3 (Sample variation in explanatory variable)

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$$

- the values of the explanatory variable are not all the same

Assumption SLR.4 (Zero conditional mean)

$$E(u|x) = 0$$

- the explanatory variable contains no information about the mean of the unobserved factors

6 Statistical properties of OLS

Statistical properties of OLS

Theorem (Unbiasedness of OLS)
If assumptions SLR.1 - SLR.4 hold then

$$E(\hat{\beta}_0) = \beta_0, \qquad E(\hat{\beta}_1) = \beta_1$$

Interpretation
- in any random sample, the estimated coefficients may be larger or smaller
- on average in repeated samples, they will be equal to the values determined by the population relationship between x and y

Standard assumptions for linear regression

Variance of the OLS estimators
- how far can we expect our estimates to be from the population values on average?
- sampling variability is measured by the variances of the estimators

$$Var(\hat{\beta}_0), \quad Var(\hat{\beta}_1)$$

Assumption SLR.5 (Homoskedasticity)

$$Var(u|x) = \sigma^2$$

- the explanatory variable contains no information about the variability of the unobserved factors
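Unbiasedness is a statement about repeated samples, so a small Monte Carlo experiment is a natural way to see it. The sketch below is illustrative only: the population values (intercept 1, slope 0.5, error standard deviation 1), the grid of x values, the seed, and the number of replications are all assumptions chosen for the example, and the errors are drawn to satisfy SLR.1 to SLR.5.

```python
import random


def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


random.seed(0)  # fixed seed so the experiment is reproducible

# Hypothetical population parameters
beta0, beta1, sigma = 1.0, 0.5, 1.0
x = [0.2 * i for i in range(50)]  # regressor values held fixed across samples
reps = 2000

b0_draws, b1_draws = [], []
for _ in range(reps):
    # Errors have mean zero and constant variance, independent of x
    u = [random.gauss(0.0, sigma) for _ in x]
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    b0_hat, b1_hat = ols_fit(x, y)
    b0_draws.append(b0_hat)
    b1_draws.append(b1_hat)

# Individual estimates vary, but their averages sit close to the true values
mean_b0 = sum(b0_draws) / reps
mean_b1 = sum(b1_draws) / reps
print(mean_b0, mean_b1)
print(min(b1_draws), max(b1_draws))
```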
[Figures: homoskedasticity; heteroskedasticity]

Statistical properties of OLS

Theorem (Variance of OLS estimators)
If assumptions SLR.1 - SLR.5 hold then

$$Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{SST_x}$$

$$Var(\hat{\beta}_0) = \frac{\sigma^2 n^{-1} \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2 n^{-1} \sum_{i=1}^{n} x_i^2}{SST_x}$$

The sampling variability of the estimated coefficients will be
- larger if the variance of the unobserved factors is higher
- larger if there is less variability in the explanatory variable

Estimating the error variance

$$Var(u_i|x_i) = \sigma^2 = Var(u_i)$$

- the variance of u does not depend on x; it is equal to the unconditional variance

$$\tilde{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (\hat{u}_i - \bar{\hat{u}})^2 = \frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2$$

- this intuitive estimator is biased!

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2$$

- this estimator is unbiased; n - 2 is the number of degrees of freedom
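The difference between the two error-variance estimators shows up clearly in a small-sample simulation. The sketch below is an illustration under assumed values (n = 10, true error variance 1, an arbitrary intercept and slope, a fixed seed); none of these numbers come from the lecture. Dividing the sum of squared residuals by n tends to understate the error variance, while dividing by n - 2 does not.

```python
import random


def ols_fit(x, y):
    """Closed-form OLS estimates (beta0_hat, beta1_hat) for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


random.seed(1)  # fixed seed for reproducibility

# Hypothetical design: small sample, true error variance equal to 1
x = [float(i) for i in range(1, 11)]
n = len(x)
reps = 5000

tilde_draws, hat_draws = [], []
for _ in range(reps):
    y = [2.0 + 0.5 * xi + random.gauss(0.0, 1.0) for xi in x]
    b0, b1 = ols_fit(x, y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    tilde_draws.append(ssr / n)      # divides by n: biased downwards
    hat_draws.append(ssr / (n - 2))  # divides by n - 2: unbiased

mean_tilde = sum(tilde_draws) / reps
mean_hat = sum(hat_draws) / reps
print(mean_tilde, mean_hat)
```

In this design the average of the first estimator should be near (n - 2)/n = 0.8 and the average of the second near 1.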
Statistical properties of OLS

Theorem (Unbiasedness of the error variance)
If assumptions SLR.1 - SLR.5 hold then

$$E(\hat{\sigma}^2) = \sigma^2$$

Estimation of standard errors for regression coefficients

$$se(\hat{\beta}_1) = \sqrt{\widehat{Var}(\hat{\beta}_1)} = \sqrt{\hat{\sigma}^2 / SST_x}$$

$$se(\hat{\beta}_0) = \sqrt{\widehat{Var}(\hat{\beta}_0)} = \sqrt{\hat{\sigma}^2 n^{-1} \sum_{i=1}^{n} x_i^2 / SST_x}$$

The estimated standard deviations of the regression coefficients are called standard errors
- they measure how precisely the regression coefficients are estimated

OLS standard errors

Calculation of standard errors
1. estimate the regression $\rightarrow \hat{\beta}_0, \hat{\beta}_1$
2. calculate the regression residuals: $\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$
3. calculate an estimate of the error variance: $\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2$
4. use this information to calculate standard errors $\rightarrow se(\hat{\beta}_1), se(\hat{\beta}_0)$
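The four calculation steps above can be sketched in a few lines of Python. The four data points are made up and small enough to verify by hand; the variable names are illustrative, not from the lecture.

```python
import math

# Hypothetical sample
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 4.0]
n = len(x)

# Step 1: estimate the regression
x_bar, y_bar = sum(x) / n, sum(y) / n
sst_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
b0 = y_bar - b1 * x_bar

# Step 2: calculate the regression residuals
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Step 3: estimate the error variance with n - 2 degrees of freedom
sigma2_hat = sum(ui ** 2 for ui in resid) / (n - 2)

# Step 4: calculate the standard errors
se_b1 = math.sqrt(sigma2_hat / sst_x)
se_b0 = math.sqrt(sigma2_hat * (sum(xi ** 2 for xi in x) / n) / sst_x)

print(b0, b1)
print(sigma2_hat, se_b1, se_b0)
```

For this sample the slope is 0.8 and the intercept 1.5, the sum of squared residuals is 1.8, and so the error variance estimate is 0.9; the standard errors are the square roots of 0.18 and 1.35.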

OLS Summary

1. We seek an explanation for y in terms of x
2. We look for a linear relationship: $y = \beta_0 + \beta_1 x + u$
3. Given a random sample $\{(x_i, y_i)\}_{i=1}^{n}$, we choose the coefficients $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimise the sum of squared residuals $\sum_{i=1}^{n} \hat{u}_i^2$
4. Under assumptions SLR.1-4, $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased
5. Under assumptions SLR.1-5, $\hat{\beta}_0$ and $\hat{\beta}_1$ are BLUE (best linear unbiased estimators)
