
Topic 3. The Simple Linear Regression Model (SLRM)

Rocío Álvarez Aranda
rocio.alvarez@uib.es

21205-ECONOMETRICS

Basic concepts:

▶ Explained variable
▶ Explanatory variable
▶ Error term
▶ LS estimator
▶ Sum of squared errors (SSE)
▶ Coefficient of determination
▶ Significance test

Regression Models

▶ Quantification of relationships between variables, where changes in some variables explain or cause changes in other variables.

Regression Models
▶ The linear correlation coefficient r:

  r = Sxy / (Sx Sy)

▶ It indicates whether there is a linear association between the two variables, but it does not indicate the type of causal relationship (if there is one).

  X ⇒ Y?   Y ⇒ X?   X ⇔ Y?   Z ⇒ X and Z ⇒ Y?

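As a side note (not in the original slides), this coefficient is easy to compute directly from its definition. A minimal Python sketch, with data invented purely for illustration:

```python
import numpy as np

# Illustrative data, invented for this sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# r = Sxy / (Sx * Sy), built from the sample covariance and standard deviations
s_xy = np.cov(x, y, ddof=1)[0, 1]
r = s_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(r)                        # close to +1: strong positive linear association
print(np.corrcoef(x, y)[0, 1])  # same value from numpy's built-in
```
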
Regression Models

Examples:
▶ Does advertising expenditure (X) affect a company’s sales (Y)?
▶ Does consumption (Y) increase when income (X) rises?
▶ How does the Gross Domestic Product (X) affect the unemployment rate (Y)?

Regression Models

▶ Correlation does not imply causation.
▶ Spurious relationships are non-causal relationships between two variables that are nevertheless correlated.

https://www.tylervigen.com/spurious-correlations

Regression Models

▶ To carry out causal analysis, it is necessary to rely on a theoretical model of the relationship between the two variables.
▶ We must base it on Economic Theory (and common sense, e.g., the future cannot cause the present; in small economies, external variables affect internal ones).

Practical uses of regression models:
▶ Make predictions
  ▶ Example: If a company’s advertising expenditure increases to a certain amount, by how much are sales expected to increase?
▶ Test economic theories or hypotheses
  ▶ Example: Is the income elasticity of demand for tourism by British visitors to the Balearic Islands greater than one?
▶ Simulate economic policies
  ▶ Example: If VAT increases by 3 percentage points, by how much is tax revenue expected to change? How will this increase affect unemployment?

Stages of Econometric Modeling

The research process that ultimately leads to a regression model valid for use consists of four stages:
1. Specification
2. Estimation
3. Validation
4. Use

Specification
▶ The starting point must be a theory that links two variables in such a way that one, x, causes the other, y.
  ▶ Dependent or explained variable y: the variable to be predicted.
  ▶ Independent or explanatory variable x: the variable used to predict the value of the dependent variable.
▶ The relationship between these two variables could be exact, y = f(x), but in Economics it generally is not: y = f(x) + u.

Specification
▶ The term u is a random variable that captures the variability of y that cannot be explained by the relationship between x and y.
▶ Why include u in the model?
  1. Randomness in human behavior and responses: for example, two families with the same income (x) do not consume the same amount (y).
  2. u can capture the joint effect of a large number of unobserved variables.
  3. Measurement errors in y (but not in x).

Simple linear regression model
▶ Simple linear regression model: the regression analysis between one independent variable and one dependent variable, where the relationship between these variables is approximated by a straight line:

  yi = β0 + β1 xi + ui

  where β0 + β1 xi is the systematic part and ui the random part.

Simple linear regression model
▶ Pizza Parlors is a chain of Italian food restaurants located near university campuses.
▶ Managers believe that the quarterly sales of these restaurants (denoted by y) are directly related to the size of the student population (denoted by x):

  yi = β0 + β1 xi + ui
  salesi = β0 + β1 studentsi + ui

Simple linear regression model

  yi = β0 + β1 xi + ui

▶ β0 and β1 are unknown coefficients; they are the parameters of the model.
▶ β0 is called the constant, intercept, or independent term. It is the expected value of y when x = 0. Thus, it is the point where the regression line crosses the vertical axis.

Simple linear regression model

  yi = β0 + β1 xi + ui

▶ β1 is the slope of the regression line and represents the marginal effect of a one-unit change in x on y:

  ∂y/∂x = β1 ⇒ Δy/Δx ≈ β1 ⇒ Δy ≈ β1 Δx

▶ If β1 = 0, the regression model becomes y = β0 + u.
  ▶ In the example: the student population x has no effect on quarterly sales y. This is a hypothesis that should always be tested.

Simple linear regression model
[figure]

Simple linear regression model

▶ The values of the parameters β0 and β1 are unknown, and it is necessary to estimate them using sample data.
▶ We denote by β̂0 the estimator of β0 and by β̂1 the estimator of β1. How do we obtain the formulas for these estimators?

Estimation
▶ The least squares method is a method in which the sample data are used to calculate the estimated regression equation, that is, to compute β̂0 and β̂1.
▶ Estimated Equation of the Simple Linear Regression: it is obtained by replacing β0 and β1 with their sample statistics (or estimators) β̂0 and β̂1:

  ŷ = β̂0 + β̂1 x

Estimation
[figure]

Estimation
▶ When we apply the estimated equation to each value of x in the sample, we get the adjusted value of y:

  ŷi = β̂0 + β̂1 xi

▶ For each observation of x in the sample we will have two values of y: yi, the observed value, and ŷi, the value adjusted or predicted by the estimated regression line.
▶ We call the difference between the observed value and the adjusted value the residual or estimated error:

  ei = yi − ŷi

Estimation
[figure]

Estimation
Student Population and Sales example
[figure]

Estimation
Student Population and Sales scatter plot
[figure]

Estimation
▶ There is a positive linear relationship between x and y (sales and students).
▶ For restaurant i, the estimated simple regression equation is

  ŷi = β̂0 + β̂1 studentsi

▶ For restaurant i, yi denotes observed (real) sales, and ŷi denotes the sales estimated by applying the equation above.

Estimation

▶ The Least Squares Method obtains the values of β̂0 and β̂1 that minimize the sum of the squares of the deviations (differences) between the observed values yi of the dependent variable and the estimated values ŷi:

  min_{β̂0,β̂1} Σ_{i=1}^n (yi − ŷi)² = min_{β̂0,β̂1} Σ_{i=1}^n ei²

Estimation
▶ If we replace ŷi by its expression:

  min_{β̂0,β̂1} Σ_{i=1}^n (yi − (β̂0 + β̂1 xi))²

  where (xi, yi) for i = 1, ..., n is the sample.

▶ The solution to this mathematical problem is

  β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)² = Sxy / Sx²   (Sx² > 0 required)

  β̂0 = ȳ − β̂1 x̄

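These two formulas translate directly into code. A minimal Python sketch (not part of the original slides; the function name ols_fit is my own):

```python
import numpy as np

def ols_fit(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Least squares estimates for the model y = b0 + b1*x + u.

    Implements beta1_hat = Sxy / Sx^2 and beta0_hat = ybar - beta1_hat * xbar.
    Requires some variation in x (Sx^2 > 0).
    """
    x_bar, y_bar = x.mean(), y.mean()
    sxx = ((x - x_bar) ** 2).sum()           # sum of (xi - xbar)^2, must be > 0
    sxy = ((x - x_bar) * (y - y_bar)).sum()  # sum of (xi - xbar)(yi - ybar)
    beta1_hat = sxy / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat
```
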
Estimation
Student Population and Sales example
[figure]

Estimation
▶ Then, we calculate the estimated slope coefficient β̂1:

  β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)² = 2840/568 = 5

▶ We calculate the intercept β̂0 as follows:

  β̂0 = ȳ − β̂1 x̄ = 130 − 5 · 14 = 60

Estimation
▶ Therefore, the estimated regression equation is:

  ŷ = 60 + 5x

▶ β̂1 = 5 is positive, so sales increase as the size of the student population increases.
▶ When the student population increases by 1 unit, sales increase by 5 units on average.

Estimation
▶ For example, to forecast the quarterly sales of a restaurant located near a campus of 16,000 students, you should apply the following equation:

  ŷ = 60 + 5 · 16 = 140

  The prediction is expected sales of 140 thousand dollars.

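The worked example can be reproduced with the ols_fit sketch above. The slides show the data only as a table image, so the arrays below are an assumption: they are the standard values for this textbook example, and they match every summary statistic quoted here (x̄ = 14, ȳ = 130, Σ(xi − x̄)(yi − ȳ) = 2840, Σ(xi − x̄)² = 568):

```python
import numpy as np

# Assumed data: the slides show this table only as an image; these are the
# standard values for the example, consistent with all quoted statistics.
x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26], dtype=float)              # students (000s)
y = np.array([58, 105, 88, 118, 117, 137, 157, 169, 149, 202], dtype=float)  # sales (000s $)

assert ((x - x.mean()) * (y - y.mean())).sum() == 2840  # matches the slides
assert ((x - x.mean()) ** 2).sum() == 568               # matches the slides

b0, b1 = ols_fit(x, y)   # ols_fit from the previous sketch
print(b0, b1)            # 60.0, 5.0  ->  y_hat = 60 + 5x
print(b0 + b1 * 16)      # 140.0: expected sales near a 16,000-student campus
```
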
Estimation
Student Population and Sales example
[figure]

Estimation

A sales manager collected the following data:
[table]

Estimation

Which pattern does the scatter plot suggest?
[figure]

Estimation
▶ Estimated regression equation:

  β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)² = 4

  β̂0 = ȳ − β̂1 x̄ = 80

▶ Therefore,

  ŷ = 80 + 4x

▶ When experience increases by 1 year, annual sales increase by 4 thousand dollars on average.

Estimation
Example. Voting and Campaign Spending (1988)
▶ Estimated regression equation:

  v̂otesAi = 26.81 + 0.464 spendingAi

  where
  ▶ votesAi: % of votes obtained by candidate A.
  ▶ spendingAi: % of total campaign spending attributed to candidate A.
▶ How do we interpret the slope and the intercept?

Goodness of fit: Coefficient of determination R²
▶ How good is the fit of the estimated regression equation ŷ = 60 + 5x to the data?
▶ In general, can we say that the values ŷi are close to the observed values yi?
▶ The Sum of Squared Errors (SSE):

  SSE = Σ_{i=1}^n (yi − ŷi)² = Σ_{i=1}^n ei²

  where ei = yi − ŷi are the residuals or estimated errors using the LS estimator.

Coefficient of determination R²
Student Population and Sales example
[figure]

Coefficient of determination R²

▶ Therefore, SSE = 1530 measures the error made when using the estimated regression equation ŷ = 60 + 5x to predict sales.
▶ Now suppose you want to estimate the sales y without knowing the student population x.

Coefficient of determination R²
▶ Without knowledge of any other variable, the sample mean ȳ would be used as the sales estimate:

  ȳ = Σ yi / n = 1300/10 = 130

▶ The Total Sum of Squares (TSS) is the sum of the squared deviations (yi − ȳ)², which measures the error from using ȳ to estimate sales:

  TSS = Σ_{i=1}^n (yi − ȳ)²

Coefficient of determination R²
Student Population and Sales example
[figure]

Coefficient of determination R²

▶ The Regression Sum of Squares (RSS) measures how much the estimated values ŷi on the regression line deviate from the sample mean ȳ:

  RSS := Σ_{i=1}^n (ŷi − ȳ)²

Coefficient of determination R²
Relationship between the Sums of Squares
[figure]

Coefficient of determination R²
Student Population and Sales example
[figure]

Coefficient of determination R²
▶ Since TSS = 15730 and SSE = 1530, the Regression Sum of Squares is:

  RSS = TSS − SSE = 14200

▶ The quotient RSS/TSS evaluates the goodness of fit of the estimated regression equation.
▶ We define this quotient as the Coefficient of Determination (R²).

Coefficient of determination R²

  R² = RSS / TSS

▶ In the Pizza Parlors example, the value of R² is

  R² = RSS/TSS = 14200/15730 = 0.9027

Coefficient of determination R²

▶ In the Pizza Parlors example, we conclude that 90.27% of the variability in sales is explained by the linear relationship between the student population (x) and sales (y).

Coefficient of determination R²

▶ Another expression to compute R²:

  R² = RSS/TSS = 1 − SSE/TSS

▶ R² is a value between 0 and 1: 0 ≤ R² ≤ 1.
  ▶ If R² = 0 ⇒ RSS = 0: the model explains nothing of y from the variable x.
  ▶ If R² = 1 ⇒ RSS = TSS: perfect fit, y depends functionally on x.

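A quick numerical check of these identities in Python (a sketch, using the sums of squares quoted in this example rather than raw data):

```python
sse, tss = 1530.0, 15730.0   # SSE and TSS from the Pizza Parlors example
rss = tss - sse              # 14200.0, since TSS = RSS + SSE (model with intercept)

r2_a = rss / tss             # 0.9027...
r2_b = 1 - sse / tss         # same value from the alternative expression
print(r2_a, r2_b)
```
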
Coefficient of determination R²

▶ If the variable x has no explanatory power, then β̂1 ≈ 0 ⇒ RSS ≈ 0 ⇒ R² ≈ 0.
▶ A value of R² close to 0 ⇒ low explanatory power of the regression.
▶ A value of R² close to 1 ⇒ high explanatory power of the regression.

Coefficient of determination R²
▶ If there is NO intercept in the model:

  y = β1 x + u

  the relation between TSS, RSS and SSE is not satisfied:

  TSS ≠ RSS + SSE

  and R² can take negative values or values greater than 1.

Coefficient of determination R²
▶ Another expression to compute R²:

  R² = r² = Corr(x, y)²

Proof:

  RSS = Σ_{i=1}^n (ŷi − ȳ)² = Σ_{i=1}^n (β̂0 + β̂1 xi − ȳ)²

Now we use that β̂0 = ȳ − β̂1 x̄:

  RSS = Σ_{i=1}^n (ȳ − β̂1 x̄ + β̂1 xi − ȳ)² = β̂1² Σ_{i=1}^n (xi − x̄)²

Coefficient of determination R²
Then,

  R² = RSS/TSS = β̂1² Σ_{i=1}^n (xi − x̄)² / Σ_{i=1}^n (yi − ȳ)²
     = β̂1² [Σ_{i=1}^n (xi − x̄)²/(n − 1)] / [Σ_{i=1}^n (yi − ȳ)²/(n − 1)]
     = β̂1² Sx² / Sy²

We now apply that β̂1 = Sxy/Sx²:

  R² = (Sxy/Sx²)² Sx² / Sy² = (Sxy)² / (Sx² Sy²) = (Sxy / (Sx Sy))² = r²

Coefficient of determination R²

▶ R² = r² = Corr(x, y)²
▶ Then,

  r = sign(β̂1) √R²

▶ Student Population and Sales example:

  R² = 0.9027 and β̂1 = 5 > 0
  ⇒ r = Corr(x, y) = +√0.9027 = 0.9501

Properties of the Least Squares estimator

▶ To ensure that the estimators β̂0 and β̂1 have good properties, certain assumptions about the sample and the error term must be met.

Assumptions
A1. We have a random sample of size n, {(xi, yi): i = 1, 2, ..., n}, which follows the population model of the equation:

  yi = β0 + β1 xi + ui,  i = 1, 2, ..., n

A2. Not all sample values of xi, namely {xi: i = 1, 2, ..., n}, are equal. That is, not all have the same value.
  If Sx² ≠ 0 ⇒ condition A2 is satisfied.

Assumptions on the Error Term
A3. u is a random variable whose expected value conditional on x is zero: E(ui|xi) = 0.

  E(ui|xi) = 0 ⇒ E(ui) = 0 and Cov(ui, xi) = 0.

A4. u has the same variance for any value of the explanatory variable: Var(ui|xi) = σ², for all i. Homoskedasticity assumption.

  Var(ui|xi) = σ² ⇒ Var(ui) = σ² (constant variance)

Simple linear regression model
Graphically, A3 is satisfied when E(ui|xi) = 0 ⇒ E(yi|xi) = β0 + β1 xi
[figure]

Simple linear regression model
Graphically, A4 is satisfied when Var(ui|xi) is constant ⇒ Var(yi|xi) = σ²
[figure]

Simple linear regression model
Graphically, if A4 is not satisfied, Var(ui|xi) is not constant ⇒ Var(yi|xi) = σi²
[figure]

Assumptions on the Error Term
Although it is not strictly necessary, we can make an additional assumption about the error term:
A5. u has a normal distribution for any value of the explanatory variable.

▶ A1+A2+A3+A4+A5 imply that the ui are independent and identically distributed N(0, σ²):

  ui ∼ iid N(0, σ²)

▶ This implies that E(yi|xi) = β0 + β1 xi and

  yi|xi ∼ N(β0 + β1 xi, σ²)

Properties of the Least Squares estimator

  β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)²
  β̂0 = ȳ − β̂1 x̄

Under assumptions A1-A4 about the error term:
1. Linearity: β̂0 and β̂1 can be written as linear functions of yi.
2. Unbiased: E(β̂0) = β0 and E(β̂1) = β1.
3. Consistent.
4. Efficient (within certain classes).

Unbiasedness of β̂1
Proof that β̂1 is unbiased:

  β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)²

Let TSSx = Σ_{i=1}^n (xi − x̄)², and simplify the numerator:

  Σ_{i=1}^n (xi − x̄)(yi − ȳ) = Σ_{i=1}^n (xi − x̄)yi − Σ_{i=1}^n (xi − x̄)ȳ

Now,

  Σ_{i=1}^n (xi − x̄)ȳ = ȳ Σ_{i=1}^n (xi − x̄) = 0

Then

  β̂1 = Σ_{i=1}^n (xi − x̄)yi / TSSx

Unbiasedness of β̂1

Proof that β̂1 is unbiased:
We substitute the value of yi = β0 + β1 xi + ui:

  β̂1 = Σ_{i=1}^n (xi − x̄)(β0 + β1 xi + ui) / TSSx
      = β0 Σ_{i=1}^n (xi − x̄)/TSSx + β1 Σ_{i=1}^n (xi − x̄)xi/TSSx + Σ_{i=1}^n (xi − x̄)ui/TSSx
      = β0 · 0 + β1 · 1 + Σ_{i=1}^n (xi − x̄)ui/TSSx
      = β1 + Σ_{i=1}^n (xi − x̄)ui/TSSx

Unbiasedness of β̂1
Proof that β̂1 is unbiased:
We compute the expected value of β̂1 conditional on x, E(β̂1|x):

  E(β̂1|x) = E(β1 + Σ_{i=1}^n (xi − x̄)ui/TSSx | x)
           = β1 + Σ_{i=1}^n (xi − x̄)E(ui|x)/TSSx

Finally, using A3, E(ui|x) = 0:

  E(β̂1|x) = β1 + Σ_{i=1}^n (xi − x̄) · 0/TSSx = β1

Unbiasedness of β̂1

▶ As E(β̂1|x) = β1 for any value of x, then

  E(β̂1) = Ex(E(β̂1|x)) = β1

Unbiasedness of β̂0
▶ Under A1-A3: E(β̂0) = β0
▶ Proof:

  E(β̂0|x) = E(ȳ − β̂1 x̄|x) = E(ȳ|x) − E(β̂1 x̄|x)
           = β0 + β1 x̄ − β1 x̄ = β0

  Then,

  E(β̂0) = Ex(E(β̂0|x)) = Ex(β0) = β0

Variances of β̂1 and β̂0
Under assumptions A1-A4, with Var(ui|xi) = σ² (homoskedasticity):
▶ Variance of β̂1:

  Var(β̂1) = σ² / Σ_{i=1}^n (xi − x̄)² = σ² / ((n − 1)Sx²)

▶ Variance of β̂0:

  Var(β̂0) = σ² (1/n + x̄² / Σ_{i=1}^n (xi − x̄)²)

Properties of the Least Squares estimator

▶ Efficiency (the Gauss-Markov Theorem): under A1-A4, the LS estimators β̂0 and β̂1 are the best linear unbiased estimators (BLUE) among all possible linear unbiased estimators.

Properties of the Least Squares estimator

▶ Consistency:

  lim_{n→∞} Var(β̂0) = 0 and lim_{n→∞} Var(β̂1) = 0
  ⇒ β̂0 and β̂1 are consistent estimators.

Estimator of σ²
▶ The variance of the error term, σ², is an unknown parameter that has to be estimated.
▶ Using A3, E(ui) = 0, we have that

  σ² = Var(ui) = E(ui²) − E(ui)² = E(ui²)

▶ If we consider ei as an estimator of ui, we can use the variance of the residuals ei as an estimator of the variance of the error term:

  σ̂² = s² = Σ_{i=1}^n ei² / (n − 2) = SSE / (n − 2)

  We adjust the denominator to n − 2 because to compute the residuals we need to estimate two parameters, β0 and β1.

Estimator of σ²

▶ In the Pizza Parlors example, SSE = Σ_{i=1}^n ei² = 1530 and n − 2 = 8. Thus,

  σ̂² = s² = 1530/8 = 191.25

Estimator of σ²

▶ s² is an unbiased estimator of σ².
▶ The standard error of the regression is the square root of s²:

  s = √s² = √(Σ_{i=1}^n ei² / (n − 2))

Estimator of σ²
▶ When we substitute σ² by s² in Var(β̂1), we denote it the estimated variance:

  s²_β̂1 = s² / Σ_{i=1}^n (xi − x̄)²

▶ Estimated Standard Deviation (or Error) of β̂1:

  s_β̂1 = √(s² / Σ_{i=1}^n (xi − x̄)²) = s / √(Σ_{i=1}^n (xi − x̄)²)

Estimator of σ²

▶ In the Pizza Parlors example, s = √191.25 = 13.829. Given Σ_{i=1}^n (xi − x̄)² = 568, it follows that

  s_β̂1 = s / √(Σ_{i=1}^n (xi − x̄)²) = 13.829/√568 = 0.5803

  is the estimated standard deviation of β̂1.
▶ This value gives us an estimate of the average distance between the estimator and the value of β1. For this reason it is called the estimated standard error.

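These estimates follow directly from quantities already computed. A minimal Python sketch using the sums quoted in the example:

```python
import numpy as np

n, sse, sxx = 10, 1530.0, 568.0   # sample size and sums from the slides

s2 = sse / (n - 2)                # 191.25, unbiased estimator of sigma^2
s = np.sqrt(s2)                   # 13.829..., standard error of the regression
se_b1 = s / np.sqrt(sxx)          # 0.5803..., estimated standard error of beta1_hat
print(s2, s, se_b1)
```
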
Inference

▶ Under A1-A5, the LS estimators have the following distributions:

  β̂1 ∼ N(β1, σ² / Σ_{i=1}^n (xi − x̄)²)
  β̂0 ∼ N(β0, σ² (1/n + x̄² / Σ_{i=1}^n (xi − x̄)²))

▶ We can construct interval estimates or test hypotheses about the parameters.

Significance test
▶ The significance test:

  y = β0 + β1 x + u

  If x and y are linearly related, then β1 ≠ 0. The significance test determines whether β1 ≠ 0:

  H0: β1 = 0
  HA: β1 ≠ 0

Significance test
▶ If we reject H0, we conclude that β1 ≠ 0 and that there is a statistically significant relationship between the two variables.
▶ Test statistic:

  t = (β̂1 − β1⁰) / s_β̂1 ∼ t_{n−2}

  which follows a Student's t distribution with n − 2 degrees of freedom under the null hypothesis β1 = 0. Note that we do not know σ² and use its estimator s² instead.

Significance test
In the Pizza Parlors example:
▶ The value of the test statistic in the sample:

  t = (β̂1 − 0)/s_β̂1 = 5/0.5803 = 8.62

▶ Using α = 1% with 8 degrees of freedom, the critical values are t = −3.355 and t = 3.355. Then, H0 is rejected at the 1% significance level, and there is a significant relationship between the population of students and sales.

Significance test: Summary

  H0: β1 = 0
  HA: β1 ≠ 0

▶ Test statistic:

  t = (β̂1 − 0)/s_β̂1 = β̂1/s_β̂1 ∼ t_{n−2} under H0

▶ Decision rules:
  ▶ p-value method: reject H0 if p-value ≤ α
  ▶ Critical value method: reject H0 if t ≤ −t_{α/2} or if t ≥ t_{α/2}

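A sketch of this test in Python with scipy, using the values from the example; the critical value and two-sided p-value come from the t distribution with n − 2 degrees of freedom:

```python
from scipy import stats

b1_hat, se_b1, n = 5.0, 0.5803, 10                # values from the slides
t_stat = (b1_hat - 0) / se_b1                     # 8.62

p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)   # two-sided p-value
t_crit = stats.t.ppf(1 - 0.01 / 2, df=n - 2)      # 3.355 for alpha = 1%, df = 8

print(t_stat, p_value)
print(abs(t_stat) >= t_crit)                      # True: reject H0 at the 1% level
```
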
The Upper-Tailed or Lower-Tailed t-test

  H0: β1 ≤ β1⁰        H0: β1 ≥ β1⁰
  HA: β1 > β1⁰        HA: β1 < β1⁰

▶ Test statistic:

  t = (β̂1 − β1⁰)/s_β̂1 ∼ t_{n−2} under H0

▶ Decision rules:
  ▶ p-value method: reject H0 if p-value ≤ α
  ▶ Critical value method: reject H0 if t ≥ t_α for the upper-tailed test, and if t ≤ −t_α for the lower-tailed test.

Confidence interval for β1

  CI(β1, (1 − α)%) = β̂1 ± t_{n−2,α/2} s_β̂1

▶ In the Pizza Parlors example, a 99% confidence interval is desired for β1:

  CI(β1, 99%) = ...

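The slide leaves the computation as an exercise; a sketch of how the interval would be obtained with scipy, using the values from the example:

```python
from scipy import stats

b1_hat, se_b1, n = 5.0, 0.5803, 10             # values from the slides
alpha = 0.01                                   # for a 99% confidence level

t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # t_{8, 0.005} = 3.355
print(b1_hat - t_crit * se_b1, b1_hat + t_crit * se_b1)  # beta1_hat +/- t * se
```
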
Forecast
▶ One of the main applications of the SLRM is forecasting the dependent variable (ŷ0), given a value of the regressor (x0).
▶ Point forecast: given a value of x0, we forecast a specific value of y:

  ŷ0 = E(y|x0)

▶ Interval forecast: point forecast (ŷ0) ± margin of error.

Point Forecast

▶ It is the projection of the regression line:

  ŷ0 = β̂0 + β̂1 x0

▶ The real and unknown value of y is y0:

  y0 = β0 + β1 x0 + u0

▶ Prediction error: e0 = y0 − ŷ0

Point Forecast

▶ E(e0) = 0. Why?

  E(e0|x) = E(y0 − ŷ0|x)
          = E(β0 + β1 x0 + u0 − (β̂0 + β̂1 x0)|x)
          = E(β0 − β̂0|x) + E((β1 − β̂1)x0|x) + E(u0|x)
          = (β0 − E(β̂0|x)) + (β1 − E(β̂1|x))x0 = 0

  E(e0|x) = 0 ⇒ E(e0) = Ex(E(e0|x)) = Ex(0) = 0

  using A3, E(u|x) = 0, and that β̂0 and β̂1 are unbiased.

Point Forecast
In the Pizza Parlors example:
▶ Estimated regression line:

  E(yi|xi) = ŷi = 60 + 5xi

▶ What will be the predicted sales of a restaurant located near a campus with 10,000 students?
▶ Point forecast:

  E(y0|x0) = ŷ0 = ...

Interval Forecast of E(y0)
▶ Given a value of x0 we can observe different values of y0. Then, we are often more interested in predicting the expected value E(y|x0) = E(y0) rather than a specific value of y0.
▶ The point forecast remains the same as before:

  E(ŷ0) = β̂0 + β̂1 x0

▶ But the prediction error is going to be different.

Interval Forecast of E(y0)

▶ Now we want to forecast

  E(y0|x = x0) = β0 + β1 x0

▶ The point forecast is E(ŷ0) = β̂0 + β̂1 x0.
▶ Then the prediction error is

  e0 = E(y0) − E(ŷ0) = (β0 − β̂0) + (β1 − β̂1)x0

Interval Forecast of E(y0)
▶ To compute an interval forecast we need to know the variance of the estimator E(ŷ0):

  Var(E(ŷ0)) = s² (1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

▶ Then, the standard deviation is as follows:

  SE(ŷ0) = s √(1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

Interval Forecast of E(y0)

▶ In the Pizza Parlors example, the standard deviation of E(ŷ0) for x0 = 10 is:

  SE(ŷ0) = 13.829 √(1/10 + (14 − 10)²/568) = 4.95

Interval Forecast of E(y0)
▶ The expression for the variance of E(ŷ0) is obtained as follows:
▶ We want to forecast E(y|x = x0) = E(y0) = β0 + β1 x0 ⇒ β0 = E(y0) − β1 x0.
▶ Replacing β0 in yi:

  yi = (E(y0) − β1 x0) + β1 xi + ui
     = E(y0) + β1 (xi − x0) + ui

  where the intercept is β0* := E(y0) and the regressor is xi* := xi − x0.

▶ If we regress yi on xi* = xi − x0, the intercept is E(y0).

Interval Forecast of E(y0)
▶ Then

  β̂0* = E(ŷ0) and s_β̂0* = SE(ŷ0)

▶ If we apply the expression for s_β̂0*:

  SE(ŷ0) = s_β̂0* = s √(1/n + x̄*² / Σ_{i=1}^n (xi* − x̄*)²)

▶ Now, since x̄* = x̄ − x0 and Σ_{i=1}^n (xi* − x̄*)² = Σ_{i=1}^n (xi − x̄)², the expression simplifies to

  SE(ŷ0) = s √(1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

Interval Forecast of E(y0)
▶ The CI for E(y0), given a value of x0, is:

  CI(E(y0), (1 − α)%) = E(ŷ0) ± t_{n−2,α/2} SE(ŷ0)

  where

  SE(ŷ0) = s √(1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

▶ In the Pizza Parlors example where x0 = 10:

  CI(E(y0), 95%) = ...

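Again left as an exercise on the slide; a Python sketch of the computation, using the example's values:

```python
import numpy as np
from scipy import stats

b0_hat, b1_hat = 60.0, 5.0
s, n, x_bar, sxx = 13.829, 10, 14.0, 568.0          # values from the slides
x0 = 10

ey0_hat = b0_hat + b1_hat * x0                      # point forecast of E(y0)
se_mean = s * np.sqrt(1/n + (x_bar - x0)**2 / sxx)  # 4.95 (as computed above)
t_crit = stats.t.ppf(0.975, df=n - 2)               # 95% confidence level

print(ey0_hat - t_crit * se_mean, ey0_hat + t_crit * se_mean)
```
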
Interval Forecast of E(y0)
Pizza Parlors example: CI(E(y0), 95%) for different values of x0
[figure]

Interval Forecast of y0
▶ The confidence interval for y0, given a particular value x0, is as follows:

  CI(y0, (1 − α)%) = ŷ0 ± t_{n−2,α/2} s_ŷ0

  where

  ŷ0 = β̂0 + β̂1 x0
  s_ŷ0 = s √(1 + 1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

Interval Forecast of y0
▶ The prediction error in point estimation is

  e0 = y0 − ŷ0 = β0 + β1 x0 + u0 − ŷ0 = E(y0) − ŷ0 + u0

  using that E(y0) = β0 + β1 x0.

▶ In point estimation, we have two sources of prediction error: the LS estimators and the model's error term u. Then,

  Var(e0) = Var(E(ŷ0)) + Var(u0) = Var(E(ŷ0)) + σ²

▶ Thus, the estimated variance of ŷ0 is:

  s²_ŷ0 = s² + s² (1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²) = s² (1 + 1/n + (x̄ − x0)² / Σ_{i=1}^n (xi − x̄)²)

  where the first term, s², estimates Var(u0) and the second is SE(ŷ0)².

Interval Forecast of y0
In the Pizza Parlors example:

  s = 13.829, x0 = 10, x̄ = 14, and Σ_{i=1}^n (xi − x̄)² = 568

▶ Then,

  s_ŷ0 = 13.829 √(1 + 1/10 + (14 − 10)²/568) = 14.69

▶ And the 95% confidence interval for y0 is:

  CI(y0, 95%) = [76.13, 143.88]

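A sketch of the same prediction interval in Python; it reproduces the slide's [76.13, 143.88] up to rounding:

```python
import numpy as np
from scipy import stats

b0_hat, b1_hat = 60.0, 5.0
s, n, x_bar, sxx = 13.829, 10, 14.0, 568.0              # values from the slides
x0 = 10

y0_hat = b0_hat + b1_hat * x0                           # 110.0
se_pred = s * np.sqrt(1 + 1/n + (x_bar - x0)**2 / sxx)  # 14.69
t_crit = stats.t.ppf(0.975, df=n - 2)                   # 2.306 for df = 8

print(y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)  # ~[76.13, 143.88]
```
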
Interval Forecast of y0
Pizza Parlors example: CI(y0, 95%) for different values of x0
[figure]

Conclusions
▶ The SLRM is a simple model of a causal relationship between two variables that has many applications in business and economics.
▶ It allows us to forecast the dependent variable.
▶ Understanding the SLRM will help in understanding more general models with many regressors and with binary variables.
