Topic 3: Simple Linear Regression Models
21205-ECONOMETRICS
Basic concepts:
Regression Models
▶ The linear correlation coefficient r:
$$r = \frac{S_{xy}}{S_x S_y}$$
▶ It indicates whether there is a linear association between the two variables, but it does not indicate the type of causal relationship (if there is one):
$$X \Rightarrow Y?\qquad Y \Rightarrow X?\qquad X \Leftrightarrow Y?\qquad Z \Rightarrow X \text{ and } Z \Rightarrow Y?$$
Regression Models
Examples:
▶ Does advertising expenditure (X) affect a company's sales (Y)?
▶ Does consumption (Y) increase when income (X) rises?
▶ How does the Gross Domestic Product (X) affect the unemployment rate (Y)?
Regression Models
▶ Correlation is not causation: see the collection of spurious correlations at https://www.tylervigen.com/spurious-correlations
Practical uses of regression models:
▶ Make predictions.
  ▶ Example: If a company's advertising expenditure increases to a certain amount, by how much are sales expected to increase?
▶ Test economic theories or hypotheses.
  ▶ Example: Is the income elasticity of demand for tourism by British visitors to the Balearic Islands greater than one?
▶ Simulate economic policies.
  ▶ Example: If VAT increases by 3 percentage points, by how much is tax revenue expected to change? How will this increase affect unemployment?
Stages of Econometric Modeling
1. Specification
2. Estimation
3. Validation
4. Use
Specification
▶ The starting point must be a theory that links two variables in such a way that one, x, causes the other, y.
▶ Dependent or explained variable (y): the variable to be predicted.
▶ Independent or explanatory variable (x): the variable used to predict the value of the dependent variable.
▶ The relationship between these two variables could be exact, y = f(x), but in economics it generally is not: y = f(x) + u.
Specification
▶ The term u is a random variable that captures the variability of y that cannot be explained by the relationship between x and y.
▶ The simple linear regression model is:
$$y_i = \beta_0 + \beta_1 x_i + u_i$$
▶ In the Student Population and Sales example:
$$\text{sales}_i = \beta_0 + \beta_1\, \text{students}_i + u_i$$
Simple linear regression model
$$y_i = \beta_0 + \beta_1 x_i + u_i$$
$$\frac{\partial y}{\partial x} = \beta_1 \;\Rightarrow\; \frac{\Delta y}{\Delta x} \approx \beta_1 \;\Rightarrow\; \Delta y \approx \beta_1 \Delta x$$
▶ If $\beta_1 = 0$, the regression model becomes $y = \beta_0 + u$.
▶ In the example: the student population x has no effect on quarterly sales y. This is a hypothesis that should always be tested.
Estimation
▶ The least squares (LS) method uses the sample data to compute the estimated regression equation, that is, to obtain $\hat{\beta}_0$ and $\hat{\beta}_1$.
▶ Estimated equation of the simple linear regression: it is obtained by replacing $\beta_0$ and $\beta_1$ with their sample statistics (or estimators) $\hat{\beta}_0$ and $\hat{\beta}_1$:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
Estimation
▶ When we apply the estimated equation to each value of $x_i$ in the sample, we get the fitted value of y:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
Estimation
Student Population and Sales example
[Data table for a sample of 10 restaurants: student population x and quarterly sales y.]
Estimation
Student Population and Sales scatter plot
[Figure: scatter plot of sales y against student population x.]
Estimation
▶ There is a positive linear relationship between x and y (students and sales).
▶ For restaurant i, the estimated simple regression equation is
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
Estimation
▶ If we replace $\hat{y}_i$ by its expression, the least squares problem is:
$$\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} \left(y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\right)^2$$
▶ The first-order conditions give the solutions (see the numerical sketch below):
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
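As a numerical check, here is a minimal Python sketch of these closed-form formulas. The data values below are an assumption: they are the classic ten-restaurant pizza-parlor sample, chosen because it reproduces the figures quoted on these slides (Σyᵢ = 1300, x̄ = 14, Σ(xᵢ − x̄)² = 568, and the estimated line ŷ = 60 + 5x).

```python
import numpy as np

# Assumed data: classic pizza-parlor sample (10 restaurants), consistent
# with the summary figures quoted in these slides.
x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26], dtype=float)              # student population
y = np.array([58, 105, 88, 118, 117, 137, 157, 169, 149, 202], dtype=float)  # quarterly sales

# Closed-form least squares estimators from the first-order conditions
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
beta0_hat = y.mean() - beta1_hat * x.mean()

print(beta0_hat, beta1_hat)   # 60.0 5.0  ->  y_hat = 60 + 5x
```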
Estimation
Student Population and Sales example
[Worked computation of the least squares estimates for this sample.]
Estimation
▶ Therefore, the estimated regression equation is:
$$\hat{y} = 60 + 5x$$
▶ For example, at $x = 16$:
$$\hat{y} = 60 + 5 \times 16 = 140$$
Estimation
Example: annual sales ($1000s) and years of experience.
▶ Estimated regression equation:
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = 4, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 80$$
▶ Therefore,
$$\hat{y} = 80 + 4x$$
▶ When experience increases by 1 year, annual sales increase by 4 thousand dollars, on average.
Estimation
Example: Voting and Campaign Spending (1988)
▶ Estimated regression equation:
$$\widehat{\text{votesA}}_i = 26.81 + 0.464\, \text{spendingA}_i$$
where
▶ $\text{votesA}_i$: % of votes obtained by candidate A.
▶ $\text{spendingA}_i$: % of total campaign spending attributed to candidate A.
▶ How should we interpret the slope and the intercept? A worked reading follows.
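A quick worked reading (our illustration, not from the slides): the slope means that one additional percentage point of the spending share is associated with about 0.464 additional percentage points of the vote share, and the intercept is the predicted vote share of a candidate with a 0% spending share. For instance, a candidate accounting for half of total spending is predicted to obtain
$$\widehat{\text{votesA}} = 26.81 + 0.464 \times 50 \approx 50.0\%$$
of the votes.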
Goodness of fit: Coefficient of determination R²
▶ How well does the estimated regression equation $\hat{y} = 60 + 5x$ fit the data?
▶ In general, can we say that the fitted values $\hat{y}_i$ are close to the observed values $y_i$?
▶ The Sum of Squared Errors (SSE):
$$SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$$
where $e_i = y_i - \hat{y}_i$ are the residuals or estimated errors from the LS estimation.
Coefficient of determination R²
Student Population and Sales example
[Residuals table for this sample: SSE = 1530.]
Coefficient of determination R²
▶ Without knowledge of any other variable, the sample mean $\bar{y}$ would be used as the estimate of sales:
$$\bar{y} = \sum_i y_i / n = 1300/10 = 130$$
Coefficient of determination R²
▶ The Total Sum of Squares (TSS) measures the variability of the $y_i$ around their sample mean:
$$TSS = \sum_{i=1}^{n}(y_i - \bar{y})^2$$
Coefficient of determination R²
▶ Since TSS = 15730 and SSE = 1530, the Regression Sum of Squares is:
$$RSS = TSS - SSE = 14200$$
▶ The ratio RSS/TSS evaluates the goodness of fit of the estimated regression equation.
▶ We define this ratio as the Coefficient of Determination ($R^2$).
Coefficient of determination R²
$$R^2 = \frac{RSS}{TSS}$$
▶ In the example: $R^2 = 14200/15730 = 0.9027$. The estimated regression explains 90.27% of the variability in sales.
Coefficient of determination R²
$$R^2 = \frac{RSS}{TSS} = 1 - \frac{SSE}{TSS}$$
▶ $R^2$ is a value between 0 and 1: $0 \leq R^2 \leq 1$.
▶ If $R^2 = 0 \Rightarrow RSS = 0$: the model explains nothing of y from the variable x.
▶ If $R^2 = 1 \Rightarrow RSS = TSS$: perfect fit, y depends functionally on x.
A numerical sketch of this decomposition follows.
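A short Python sketch of the variance decomposition, again under the assumed pizza-parlor data from the estimation sketch above:

```python
import numpy as np

# Assumed data (see the estimation sketch above)
x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26], dtype=float)
y = np.array([58, 105, 88, 118, 117, 137, 157, 169, 149, 202], dtype=float)

y_hat = 60 + 5 * x                  # fitted values from the estimated equation
e = y - y_hat                       # residuals

SSE = np.sum(e**2)                  # 1530.0
TSS = np.sum((y - y.mean())**2)     # 15730.0
RSS = TSS - SSE                     # 14200.0
R2 = 1 - SSE / TSS                  # 0.9027

print(SSE, TSS, RSS, R2)
```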
Coefficient of determination R²
▶ When the model includes an intercept, $R^2$ equals the squared linear correlation coefficient:
$$R^2 = r^2 = \text{Corr}(x, y)^2$$
(If there is NO intercept in the model, $y = \beta_1 x + u$, this equality need not hold.)
Proof:
$$RSS = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 = \sum_{i=1}^{n}(\hat{\beta}_0 + \hat{\beta}_1 x_i - \bar{y})^2 = \sum_{i=1}^{n}\left(\hat{\beta}_1 (x_i - \bar{x})\right)^2 = \hat{\beta}_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2$$
using $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
Coefficient of determination R²
Then,
$$R^2 = \frac{RSS}{TSS} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2/(n-1)}{\sum_{i=1}^{n}(y_i - \bar{y})^2/(n-1)} = \frac{\hat{\beta}_1^2 S_x^2}{S_y^2}$$
We now apply $\hat{\beta}_1 = \frac{S_{xy}}{S_x^2}$:
$$R^2 = \frac{\left(\frac{S_{xy}}{S_x^2}\right)^2 S_x^2}{S_y^2} = \frac{(S_{xy})^2}{S_x^2 S_y^2} = \left(\frac{S_{xy}}{S_x S_y}\right)^2 = r^2$$
Coefficient of determination R²
▶ $R^2 = r^2 = \text{Corr}(x, y)^2$
▶ Then,
$$r = \text{sign}(\hat{\beta}_1)\sqrt{R^2}$$
▶ Student Population and Sales example: since $\hat{\beta}_1 = 5 > 0$,
$$r = +\sqrt{0.9027} = 0.9501$$
Properties of the Least Squares estimator
Assumptions
A1. We have a random sample of size n, $\{(x_i, y_i): i = 1, 2, \dots, n\}$, which follows the population model:
$$y_i = \beta_0 + \beta_1 x_i + u_i, \quad i = 1, 2, \dots, n$$
A3. Zero conditional mean: $E(u_i|x_i) = 0$.
A4. Homoskedasticity: $Var(u_i|x_i) = \sigma^2$ is constant.
Simple linear regression model
Graphically, A3 is satisfied when
$$E(u_i|x_i) = 0 \;\Rightarrow\; E(y_i|x_i) = \beta_0 + \beta_1 x_i$$
Simple linear regression model
Graphically, A4 is satisfied when $Var(u_i|x_i)$ is constant $\Rightarrow Var(y_i|x_i) = \sigma^2$.
Simple linear regression model
Graphically, if A4 is not satisfied, $Var(u_i|x_i)$ is not constant $\Rightarrow Var(y_i|x_i) = \sigma_i^2$ (heteroskedasticity).
Assumptions on the Error Term
Although it is not strictly necessary, we can make an additional assumption about the error term:
A5. u has a normal distribution for any value of the explanatory variable:
$$u_i \sim \text{iid}\; N(0, \sigma^2) \;\Rightarrow\; y_i|x_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$$
Properties of the Least Squares estimator
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
Under assumptions A1–A4:
1. Linearity: $\hat{\beta}_0$ and $\hat{\beta}_1$ can be written as linear functions of the $y_i$.
2. Unbiasedness: $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.
3. Consistency.
4. Efficiency (within certain classes of estimators).
A simulation sketch illustrating unbiasedness follows.
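A small simulation can illustrate the unbiasedness property. This is a minimal sketch under assumed parameter values (β₀ = 60, β₁ = 5, σ = 14, and the fixed x values from the pizza-parlor sketch): averaging the LS estimates over many simulated samples should recover β₁, and their dispersion should match the variance formula given further below.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 60.0, 5.0, 14.0                            # assumed "true" values
x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26], dtype=float)  # fixed regressor values

b1_draws = np.empty(10_000)
for r in range(b1_draws.size):
    u = rng.normal(0.0, sigma, size=x.size)   # errors satisfying A3-A5
    y = beta0 + beta1 * x + u
    b1_draws[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean())**2))

print(b1_draws.mean())    # close to 5.0: E(beta1_hat) = beta1 (unbiasedness)
print(b1_draws.var(), sigma**2 / np.sum((x - x.mean())**2))  # both close to 0.345
```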
Unbiasedness of $\hat{\beta}_1$
Proof that $\hat{\beta}_1$ is unbiased:
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
Let $TSS_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$, and simplify the numerator:
$$\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n}(x_i - \bar{x})y_i - \sum_{i=1}^{n}(x_i - \bar{x})\bar{y}$$
Now,
$$\sum_{i=1}^{n}(x_i - \bar{x})\bar{y} = \bar{y}\sum_{i=1}^{n}(x_i - \bar{x}) = 0$$
Then
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,y_i}{TSS_x}$$
Unbiasedness of $\hat{\beta}_1$
Substituting $y_i = \beta_0 + \beta_1 x_i + u_i$ and using $\sum_{i}(x_i - \bar{x}) = 0$ and $\sum_{i}(x_i - \bar{x})x_i = TSS_x$:
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})u_i}{TSS_x}$$
Unbiasedness of $\hat{\beta}_1$
Proof that $\hat{\beta}_1$ is unbiased (continued):
We compute the expected value of $\hat{\beta}_1$ conditional on x, $E(\hat{\beta}_1|x)$:
$$E(\hat{\beta}_1|x) = E\left(\beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})u_i}{TSS_x}\,\Big|\,x\right) = \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})E(u_i|x)}{TSS_x}$$
Finally, using A3, $E(u_i|x) = 0$:
$$E(\hat{\beta}_1|x) = \beta_1 + \frac{\sum_{i=1}^{n}(x_i - \bar{x}) \cdot 0}{TSS_x} = \beta_1$$
Unbiasedness of $\hat{\beta}_0$
▶ Under A1–A3: $E(\hat{\beta}_0) = \beta_0$.
▶ Proof: averaging the model over the sample gives $\bar{y} = \beta_0 + \beta_1\bar{x} + \bar{u}$, so
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} = \beta_0 + (\beta_1 - \hat{\beta}_1)\bar{x} + \bar{u}$$
Then,
$$E(\hat{\beta}_0|x) = \beta_0 + \left(\beta_1 - E(\hat{\beta}_1|x)\right)\bar{x} + E(\bar{u}|x) = \beta_0$$
Variances of $\hat{\beta}_1$ and $\hat{\beta}_0$
Using assumptions A1–A4, $Var(u_i|x_i) = \sigma^2$ (homoskedasticity):
▶ Variance of $\hat{\beta}_1$:
$$Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sigma^2}{(n-1)s_x^2}$$
▶ Variance of $\hat{\beta}_0$:
$$Var(\hat{\beta}_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right)$$
Properties of the Least Squares estimator
▶ Consistency: as the sample size n grows, $\hat{\beta}_1$ converges in probability to $\beta_1$; note that $Var(\hat{\beta}_1) \to 0$ as $\sum_{i=1}^{n}(x_i - \bar{x})^2$ grows.
Estimator of σ²
▶ The variance of the error term, σ², is an unknown parameter that has to be estimated. In the example:
$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n-2} = \frac{1530}{8} = 191.25$$
Estimator of σ²
▶ $s^2$ is an unbiased estimator of $\sigma^2$.
Estimator of σ²
▶ When we substitute σ² by s² in $Var(\hat{\beta}_1)$, we obtain the estimated variance:
$$s^2_{\hat{\beta}_1} = \frac{s^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
Estimator of σ²
▶ In the Pizza Parlors example, $s = \sqrt{191.25} = 13.829$. Given $\sum_{i=1}^{n}(x_i - \bar{x})^2 = 568$, it follows that
$$s_{\hat{\beta}_1} = \frac{s}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}} = \frac{13.829}{\sqrt{568}} = 0.5803$$
▶ Under A1–A5, the LS estimators are normally distributed:
$$\hat{\beta}_1 \sim N\!\left(\beta_1,\; \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right), \qquad \hat{\beta}_0 \sim N\!\left(\beta_0,\; \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right)\right)$$
▶ We can construct interval estimates or test hypotheses about the parameters.
Significance test
▶ The significance test:
$$y = \beta_0 + \beta_1 x + u$$
$$H_0: \beta_1 = 0 \qquad H_A: \beta_1 \neq 0$$
Significance test
▶ If we reject $H_0$, we conclude that $\beta_1 \neq 0$ and that there is a statistically significant relationship between the two variables.
▶ Test statistic (general form, for $H_0: \beta_1 = \beta_1^0$):
$$t = \frac{\hat{\beta}_1 - \beta_1^0}{s_{\hat{\beta}_1}} \sim t_{n-2}$$
▶ For the significance test ($\beta_1^0 = 0$):
$$t = \frac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}} \sim t_{n-2} \text{ under } H_0$$
▶ Decision rules (a worked sketch follows this list):
  ▶ p-value method: reject $H_0$ if p-value ≤ α.
  ▶ Critical value method: reject $H_0$ if $t \leq -t_{\alpha/2}$ or $t \geq t_{\alpha/2}$.
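A minimal sketch of the significance test in Python with scipy (our illustration; β̂₁ = 5, s_β̂₁ = 0.5803 and n = 10 are the pizza-parlor values from above):

```python
from scipy import stats

beta1_hat, se_beta1, n = 5.0, 0.5803, 10
df = n - 2

t_stat = (beta1_hat - 0) / se_beta1         # 8.62: test statistic under H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value, about 2.5e-05
t_crit = stats.t.ppf(1 - 0.05 / 2, df)      # critical value t_{0.025, 8} = 2.306

print(t_stat, p_value, t_crit)
# |t| = 8.62 > 2.306 and p-value < 0.05: reject H0, the relationship is significant.
```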
The Upper-Tailed or Lower-Tailed t-test
$$H_0: \beta_1 \leq \beta_1^0 \qquad\qquad H_0: \beta_1 \geq \beta_1^0$$
$$H_A: \beta_1 > \beta_1^0 \qquad\qquad H_A: \beta_1 < \beta_1^0$$
▶ Test statistic:
$$t = \frac{\hat{\beta}_1 - \beta_1^0}{s_{\hat{\beta}_1}} \sim t_{n-2} \text{ under } H_0$$
▶ Decision rules:
  ▶ p-value method: reject $H_0$ if p-value ≤ α.
  ▶ Critical value method: reject $H_0$ if $t \geq t_{\alpha}$ for the upper-tailed test, and if $t \leq -t_{\alpha}$ for the lower-tailed test.
Confidence interval for β₁
▶ A $(1-\alpha)$ confidence interval for $\beta_1$ is
$$\hat{\beta}_1 \pm t_{\alpha/2,\,n-2}\; s_{\hat{\beta}_1}$$
▶ In the example, the 95% CI is $5 \pm 2.306 \times 0.5803 = 5 \pm 1.338$, i.e., $(3.66,\ 6.34)$; a sketch of this computation follows.
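A short sketch of this interval computation (our illustration with the example's numbers):

```python
from scipy import stats

beta1_hat, se_beta1, df = 5.0, 0.5803, 8
t_crit = stats.t.ppf(1 - 0.05 / 2, df)   # 2.306 for a 95% confidence level

lower = beta1_hat - t_crit * se_beta1    # 3.66
upper = beta1_hat + t_crit * se_beta1    # 6.34
print(lower, upper)                      # 0 is outside the interval: beta1 is significant
```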
Forecast
▶ One of the main applications of the SLRM is forecasting the dependent variable ($\hat{y}_0$), given a value of the regressor ($x_0$).
▶ Point forecast: given a value of $x_0$, we forecast a specific value of y:
$$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$$
▶ Interval forecast: point forecast ($\hat{y}_0$) ± margin of error.
Point Forecast
▶ The value of y at $x_0$ is
$$y_0 = \beta_0 + \beta_1 x_0 + u_0$$
and the point forecast is $\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$.
Point Forecast
▶ The forecast error is $e_0 = y_0 - \hat{y}_0$, and $E(e_0) = 0$. Why? Because $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased and $E(u_0) = 0$.
Point Forecast
In the Pizza Parlor example:
▶ Estimated regression line: $\hat{y} = 60 + 5x$.
▶ For $x_0 = 10$, the point forecast is $\hat{y}_0 = 60 + 5 \times 10 = 110$.
Interval Forecast of E(y0)
▶ To compute an interval forecast, we need the variance of the estimator $\widehat{E(y_0)}$:
$$\widehat{Var}\big(\widehat{E(y_0)}\big) = s^2\left(\frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right)$$
Interval Forecast of E(y0)
▶ The expression for the variance of $\widehat{E(y_0)}$ is obtained as follows:
▶ We want to forecast
$$E(y|x = x_0) = E(y_0) = \beta_0 + \beta_1 x_0 \;\Rightarrow\; \beta_0 = E(y_0) - \beta_1 x_0$$
▶ Replacing $\beta_0$ in $y_i$:
$$y_i = (E(y_0) - \beta_1 x_0) + \beta_1 x_i + u_i = \underbrace{E(y_0)}_{\beta_0^*} + \beta_1 \underbrace{(x_i - x_0)}_{x_i^*} + u_i$$
▶ So $\widehat{E(y_0)}$ is the intercept of the regression of $y_i$ on $x_i^* = x_i - x_0$, with standard error
$$SE(\hat{y}_0) = s_{\hat{\beta}_0^*} = s\sqrt{\frac{1}{n} + \frac{\bar{x}^{*2}}{\sum_{i=1}^{n}(x_i^* - \bar{x}^*)^2}}$$
Interval Forecast of E(y0)
▶ The CI for $E(y_0)$, given a value of $x_0$, is:
$$\hat{y}_0 \pm t_{\alpha/2,\,n-2}\, SE(\hat{y}_0)$$
where
$$SE(\hat{y}_0) = s\sqrt{\frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$
Interval Forecast of y0
▶ The confidence interval for $y_0$, given a particular value $x_0$, is:
$$\hat{y}_0 \pm t_{\alpha/2,\,n-2}\, s_{\hat{y}_0}$$
where
$$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0, \qquad s_{\hat{y}_0} = s\sqrt{1 + \frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$
Interval Forecast of y0
▶ The prediction error of the point forecast is
$$e_0 = y_0 - \hat{y}_0 = \underbrace{\beta_0 + \beta_1 x_0}_{E(y_0)} + u_0 - \hat{y}_0 = E(y_0) - \hat{y}_0 + u_0$$
▶ Then, in the example (with $x_0 = 10$):
$$s_{\hat{y}_0} = 13.829\sqrt{1 + \frac{1}{10} + \frac{(14 - 10)^2}{568}} = 14.69$$
▶ And the 95% confidence interval for $y_0$ is:
$$110 \pm t_{0.025,\,8} \times 14.69 = 110 \pm 2.306 \times 14.69 = 110 \pm 33.88, \;\text{ i.e., } (76.12,\ 143.88)$$
A sketch of both interval computations follows.
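A minimal Python sketch of both the mean-response interval and the prediction interval at x₀ = 10 (our illustration; the data values are the assumed pizza-parlor sample from the estimation sketch):

```python
import numpy as np
from scipy import stats

# Assumed data (see the estimation sketch above)
x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26], dtype=float)
y = np.array([58, 105, 88, 118, 117, 137, 157, 169, 149, 202], dtype=float)

n, x0 = x.size, 10.0
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
y0_hat = b0 + b1 * x0                              # point forecast: 110.0

s2 = np.sum((y - (b0 + b1 * x))**2) / (n - 2)      # s^2 = 191.25
Sxx = np.sum((x - x.mean())**2)                    # 568.0
t_crit = stats.t.ppf(0.975, n - 2)                 # 2.306

se_mean = np.sqrt(s2 * (1/n + (x.mean() - x0)**2 / Sxx))       # 4.95, SE for E(y0)
se_pred = np.sqrt(s2 * (1 + 1/n + (x.mean() - x0)**2 / Sxx))   # 14.69, SE for y0

print(y0_hat - t_crit*se_mean, y0_hat + t_crit*se_mean)  # CI for E(y0): (98.58, 121.42)
print(y0_hat - t_crit*se_pred, y0_hat + t_crit*se_pred)  # PI for y0:    (76.12, 143.88)
```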
Conclusions
▶ The SLRM is a simple model of the causal relationship between two variables, with many applications in business and economics.
▶ It allows us to forecast the dependent variable.
▶ Understanding the SLRM will help in understanding more general models, with many regressors and with binary variables.