Week 2 - Simple Linear Regression

The document defines the objectives of simple linear regression as estimating regression parameters and introducing OLS estimators. It then provides definitions of key terms like dependent and independent variables. The document derives the OLS estimates by minimizing the sum of squared residuals to obtain the normal equations and estimates beta_0 and beta_1. An example calculates the OLS estimates for a dataset on corn yield and fertilizer amount over 11 years.

Uploaded by

Brave Kamuzeri

Week 2: Lectures 3-4

Objectives:
To define a simple linear regression

To introduce the OLS estimators

To estimate the regression parameters

To use an example to estimate the OLS parameters

Definitions:
We are interested in explaining y in terms of x or how y varies with changes
in x.
In writing a model that explains this there are three issues:

First, since there is never an exact relationship, how do we allow for other factors to
affect y?

Second, what is the functional form between y and x?

Third, how can we be sure that we are capturing a ceteris paribus relationship between y
and x?

Simple Linear Regression
Some of these issues can be addressed by writing the equation y = β0 + β1x + μ,
which is known as the simple linear regression (SLR) model, the two-variable regression
model, or the bivariate regression model.

Simple Linear Regression …..
The variables y and x have several different names that are used interchangeably:

-The y variable is called the dependent variable, the explained variable, the response
variable, the predicted variable, and the regressand
-The x variable is called the independent variable, the explanatory variable, the
control variable, the predictor variable, and the regressor
-The variable, μ, represents all factors other than x that affect y.
It is known as the error term, the disturbance term, the stochastic term, and the
random term

Linear Regression….
• Therefore, if the other factors in μ are held fixed, so that the change in μ is 0, then the change in y equals β1 times the change in x: Δy = β1Δx.
• β0 is the intercept parameter and β1 is the slope parameter in an SLR.

• Using the assumption that E(μ|x) = 0, and noting that β0 and β1 are fixed (non-random) parameters, we can obtain what is
known as the ‘population regression function’ (PRF):
• E(y|x) = E(β0 + β1x + μ | x)
• E(y|x) = β0 + β1x + E(μ|x)
• Therefore: E(y|x) = β0 + β1x.
• This means the PRF is linear in x: a one-unit increase in x changes the
expected value of y by the slope, β1.
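To make the PRF concrete, the following sketch simulates the model y = β0 + β1x + μ with E(μ|x) = 0 and compares the average of the simulated y values at a fixed x with β0 + β1x. The parameter values (β0 = 2, β1 = 0.5), the standard normal error and the function names are illustrative assumptions, not values from the lecture.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0 = 2.0
BETA1 = 0.5


def prf(x):
    """Population regression function E(y|x) = beta0 + beta1 * x."""
    return BETA0 + BETA1 * x


def simulated_mean_y(x, draws=20000, seed=0):
    """Average of simulated y values at a fixed x, where E(mu|x) = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        mu = rng.gauss(0.0, 1.0)  # assumed error: mean 0, standard deviation 1
        total += BETA0 + BETA1 * x + mu
    return total / draws


if __name__ == "__main__":
    for x in (0, 1, 2):
        print(x, prf(x), round(simulated_mean_y(x), 3))
    # A one-unit increase in x moves E(y|x) by the slope.
    print(prf(3) - prf(2))
```

Individual simulated y values scatter around the line, but their average at each x should settle close to the PRF as the number of draws grows.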

Deriving the OLS estimates
The line of best fit drawn through the sample data is called the sample regression function (SRF), that is

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$

• For any xi value, the difference between the actual value of Yi and the value given
by the sample regression function is called the residual, $\hat{\mu}_i$, where:

$$\hat{\mu}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$$

Deriving the OLS estimates…..
For a correctly specified model the residual is the sample estimate of the error
term.

Like the error term (μ), it represents that part of the value of the variable y that
the estimated linear model is unable to explain.

The OLS method states that we should choose the SRF line that makes the
residuals as small as possible.

That is, the line that minimises the amount of variation in y that cannot be explained by
the model.

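Deriving the OLS estimates…..
To measure the unexplained variation we could sum the residuals.
However, positive and negative residuals cancel out, so for the fitted line
the sum equals 0.
To avoid this we sum their squared values:

$$\sum_{i=1}^{n} \hat{\mu}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$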

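A small sketch of why the residuals are squared before summing, using the corn and fertilizer figures from the lecture's example. It compares two candidate lines: a flat line through the mean of Y (an illustrative choice, not from the lecture) and the line with the rounded coefficients reported in the lecture's calculations. The function names are hypothetical.

```python
# Corn yield (Y) and fertilizer (X) from the lecture's example table.
Y = [40, 44, 46, 48, 52, 58, 60, 68, 74, 80, 85]
X = [6, 10, 12, 14, 16, 18, 22, 24, 26, 32, 38]


def residuals(b0, b1, xs=X, ys=Y):
    """Residuals y_i - (b0 + b1 * x_i) for a candidate line."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_of_squares(values):
    """Sum of the squared values."""
    return sum(v * v for v in values)


y_bar = sum(Y) / len(Y)

# Candidate 1: a flat line through the mean of Y (slope 0).
flat = residuals(y_bar, 0.0)
# Candidate 2: the line reported in the lecture (rounded coefficients).
fitted = residuals(28.6447, 1.559)

if __name__ == "__main__":
    print("flat line:   sum =", round(sum(flat), 6),
          " SSR =", round(sum_of_squares(flat), 2))
    print("fitted line: sum =", round(sum(fitted), 6),
          " SSR =", round(sum_of_squares(fitted), 2))
```

The flat line ignores fertilizer entirely, yet its residuals still sum to zero, so the plain sum cannot tell a poor line from a good one. The sum of squared residuals does separate them.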
Deriving the OLS estimates…..
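The OLS procedure minimises the sum of squared residuals with respect to the two unknowns $\hat{\beta}_0$ and
$\hat{\beta}_1$.

Using calculus, the necessary (first-order) conditions for a local extremum are:

$$\frac{\partial \sum_{i=1}^{n} \hat{\mu}_i^2}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$

$$\frac{\partial \sum_{i=1}^{n} \hat{\mu}_i^2}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)\, x_i = 0$$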

Deriving the OLS estimates…..
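Dividing by −2 and expanding the sums gives the “normal equations”:

$$\sum_{i=1}^{n} y_i - n \hat{\beta}_0 - \hat{\beta}_1 \sum_{i=1}^{n} x_i = 0$$

$$\sum_{i=1}^{n} x_i y_i - \hat{\beta}_0 \sum_{i=1}^{n} x_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0$$

These are then solved for $\hat{\beta}_0$ and $\hat{\beta}_1$.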

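One way to carry out the solution step is sketched below. It is a standard substitution argument, written with $\bar{x} = \frac{1}{n}\sum x_i$ and $\bar{y} = \frac{1}{n}\sum y_i$ for the sample means; all sums run over i = 1, …, n.

```latex
\begin{align*}
% Divide the first normal equation by n:
\bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x} = 0
  \quad &\Longrightarrow \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \\
% Substitute this into the second normal equation:
\sum x_i y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) \sum x_i - \hat{\beta}_1 \sum x_i^2 &= 0 \\
\hat{\beta}_1 \Big( \sum x_i^2 - \bar{x} \sum x_i \Big) &= \sum x_i y_i - \bar{y} \sum x_i \\
% Use \sum x_i = n\bar{x}, then multiply numerator and denominator by n:
\hat{\beta}_1 = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2}
  &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \big( \sum x_i \big)^2}
\end{align*}
```

The slope is defined only when the x values are not all identical, since otherwise the denominator is zero.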
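Deriving the OLS estimates…..
$$\hat{\beta}_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$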

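The raw-sum formulas for the slope and intercept can be applied directly. The sketch below does so for the corn and fertilizer figures from the lecture's example; the function name `ols_from_sums` and the error handling are illustrative choices, not part of the lecture.

```python
def ols_from_sums(xs, ys):
    """OLS intercept and slope computed from the raw-sum formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


# Corn yield (Y) and fertilizer (X) from the lecture's example table.
Y = [40, 44, 46, 48, 52, 58, 60, 68, 74, 80, 85]
X = [6, 10, 12, 14, 16, 18, 22, 24, 26, 32, 38]

b0, b1 = ols_from_sums(X, Y)

if __name__ == "__main__":
    print(round(b0, 4), round(b1, 4))  # prints 28.6447 1.5592
```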
Examples: Corn produced with fertilizer used
Year   n    Yi (corn, kg/ha)   Xi (fertilizer, kg/ha)
2001   1    40                 6
2002   2    44                 10
2003   3    46                 12
2004   4    48                 14
2005   5    52                 16
2006   6    58                 18
2007   7    60                 22
2008   8    68                 24
2009   9    74                 26
2010   10   80                 32
2011   11   85                 38

Examples: Corn produced with fertilizer
used
• The table gives the kilograms of corn per hectare, Y, resulting from the use of
various amounts of fertilizer in kg per hectare, X, on a farm over the 11 years
from 2001 to 2011.

• These are plotted in a scatter diagram.

• The relationship between X and Y is approximately linear (i.e., the points fall
on or near a straight line).

Examples: Corn produced with fertilizer
used: Scatter plot
[Figure: scatter plot of corn yield, Y (vertical axis), against fertilizer, X (horizontal axis)]

The Ordinary Least Squares Methods
OLS is a technique for fitting the “best” straight line to the sample of XY
observations.

It involves minimizing the sum of squared (vertical) deviations of points from the
line:

$$\min \sum (Y_i - \hat{Y}_i)^2$$

where Yi refers to the actual observations and $\hat{Y}_i$ refers to the corresponding
fitted values.
Remember: $Y_i - \hat{Y}_i = \hat{\mu}_i$, the residual.

The Ordinary Least Squares Methods…
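• $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

• It is often useful to use an equivalent formula for estimating $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\mathrm{cov}(X, Y)}{\sigma_X^2}$$

• where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from the sample means.
• The estimated least squares regression (OLS) equation is then:
• $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$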

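The equivalence of the deviation form and the covariance-over-variance form can be checked numerically. This is a minimal sketch on the corn and fertilizer figures from the lecture's example; the function names are hypothetical, and the sample (n − 1) divisor is an assumption about which covariance and variance are meant. Any common divisor cancels in the ratio.

```python
from statistics import mean

# Corn yield (Y) and fertilizer (X) from the lecture's example table.
Y = [40, 44, 46, 48, 52, 58, 60, 68, 74, 80, 85]
X = [6, 10, 12, 14, 16, 18, 22, 24, 26, 32, 38]


def slope_from_deviations(xs, ys):
    """Slope as sum(x_i * y_i) / sum(x_i^2), with x_i, y_i in deviation form."""
    x_bar, y_bar = mean(xs), mean(ys)
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


def slope_from_cov_var(xs, ys):
    """Slope as cov(X, Y) / var(X), both using the (n - 1) divisor."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return cov_xy / var_x


if __name__ == "__main__":
    print(slope_from_deviations(X, Y))
    print(slope_from_cov_var(X, Y))
```

Both functions should return the same slope up to floating-point rounding.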
The calculations to estimate the regression for the corn-fertilizer problem
Year    n    Yi         Xi         yi          xi          xi·yi      xi²
2001    1    40         6          -19.5455    -13.8182    270.0826   190.9421
2002    2    44         10         -15.5455    -9.81818    152.6281   96.39669
2003    3    46         12         -13.5455    -7.81818    105.9008   61.12397
2004    4    48         14         -11.5455    -5.81818    67.17355   33.85124
2005    5    52         16         -7.54545    -3.81818    28.80992   14.57851
2006    6    58         18         -1.54545    -1.81818    2.809917   3.305785
2007    7    60         22         0.454545    2.181818    0.991736   4.760331
2008    8    68         24         8.454545    4.181818    35.35537   17.4876
2009    9    74         26         14.45455    6.181818    89.35537   38.21488
2010    10   80         32         20.45455    12.18182    249.1736   148.3967
2011    11   85         38         25.45455    18.18182    462.8099   330.5785
n = 11  Sum  655        218        0.00        0.00        1465.09    939.64
        Mean 59.54545   19.81818
The yi and xi columns (labelled “Residuals”) are the deviations of Yi and Xi from their sample means.
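The columns of the calculation table can be reproduced with a few lines of code. The sketch below rebuilds the deviation columns and their totals from the Yi and Xi values; the function name `deviation_table` and the tuple layout are illustrative choices.

```python
# Corn yield (Y) and fertilizer (X) from the lecture's example table.
YEARS = list(range(2001, 2012))
Y = [40, 44, 46, 48, 52, 58, 60, 68, 74, 80, 85]
X = [6, 10, 12, 14, 16, 18, 22, 24, 26, 32, 38]


def deviation_table(xs, ys):
    """Rows of (y_i, x_i, x_i * y_i, x_i^2), all in deviation-from-mean form."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    rows = []
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        rows.append((dy, dx, dx * dy, dx * dx))
    return rows


rows = deviation_table(X, Y)
# Column totals: sum of y_i, sum of x_i, sum of x_i * y_i, sum of x_i^2.
totals = [sum(column) for column in zip(*rows)]

if __name__ == "__main__":
    for year, row in zip(YEARS, rows):
        print(year, *(round(value, 4) for value in row))
    print("Sum ", *(round(value, 2) for value in totals))
```

The two deviation columns should total zero up to floating-point rounding, which is a quick check that the means were computed correctly.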

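Calculations
$$\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{1465.09}{939.64} \approx 1.559 \quad \text{(slope of the estimated regression line)}$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \approx 59.55 - 1.559(19.82) \approx 28.6447 \quad \text{(Y intercept)}$$

The value 28.6447 is obtained from the unrounded means and slope; the rounded figures shown give about 28.65.

The estimated regression equation:

$$\hat{Y}_i = 28.6447 + 1.559 X_i$$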

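The estimated equation can be used to compute a fitted yield for a given amount of fertilizer. This is a minimal sketch using the rounded coefficients from the lecture; the function names and the input value of 20 kg per hectare are hypothetical. Values of X outside the sample range (6 to 38) are flagged, since the line was fitted only on that range.

```python
# Rounded coefficients reported in the lecture's calculation.
B0 = 28.6447
B1 = 1.559

# Range of fertilizer values in the sample (kg per hectare).
X_MIN, X_MAX = 6, 38


def predict_yield(fertilizer):
    """Fitted corn yield (kg per hectare) from the estimated line."""
    return B0 + B1 * fertilizer


def is_within_sample(fertilizer):
    """True when the input lies inside the range used to fit the line."""
    return X_MIN <= fertilizer <= X_MAX


if __name__ == "__main__":
    for amount in (20, 50):  # hypothetical fertilizer amounts
        note = "" if is_within_sample(amount) else " (extrapolation)"
        print(amount, round(predict_yield(amount), 2), note)
```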
Calculations….
• Thus, when Xi = 0, $\hat{Y}_i = 28.6447 = \hat{\beta}_0$.

• When Xi = 19.82 = $\bar{x}$, $\hat{Y}_i$ = 28.6447 + 1.559(19.82) ≈ 59.5454 = $\bar{y}$ (up to rounding).

• As a result, the regression line passes through the point of means, $(\bar{x}, \bar{y})$.
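These properties follow from the two normal equations and can be checked numerically. The sketch below refits the line on the corn and fertilizer data with unrounded coefficients and evaluates three quantities: the fitted value at the mean of X, the sum of the residuals, and the sum of the residuals weighted by X. The function name `fit_ols` is a hypothetical helper.

```python
# Corn yield (Y) and fertilizer (X) from the lecture's example table.
Y = [40, 44, 46, 48, 52, 58, 60, 68, 74, 80, 85]
X = [6, 10, 12, 14, 16, 18, 22, 24, 26, 32, 38]


def fit_ols(xs, ys):
    """Return (intercept, slope) using the deviation-from-mean formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


b0, b1 = fit_ols(X, Y)
x_bar = sum(X) / len(X)
y_bar = sum(Y) / len(Y)
resid = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

value_at_mean = b0 + b1 * x_bar                     # should match y_bar
sum_resid = sum(resid)                              # first normal equation
sum_x_resid = sum(x * e for x, e in zip(X, resid))  # second normal equation

if __name__ == "__main__":
    print(value_at_mean, y_bar)
    print(sum_resid, sum_x_resid)
```

Both sums should be zero apart from floating-point rounding, and the fitted value at the mean of X should coincide with the mean of Y.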

“Best line-of-fit”
[Figure: fitted regression line through the scatter of corn yield, Y, against fertilizer, X]

Summary
Simple regression is used for testing hypotheses about the relationship between a
dependent variable Y and an independent or explanatory variable X, and for
prediction.

Linear regression analysis assumes that there is an approximate linear relationship
between X and Y:

• The set of random sample values of X and Y fall on or near a straight line.

The error term (disturbance or stochastic term) measures the deviation of each
observed Y value from the true (but unobserved) regression line.

Summary….
The error term arises because of:

• Numerous explanatory variables with only slight and irregular effects on Y that are
omitted from the exact linear relationship given by the equation

• Possible errors of measurement in Y

• Random human behavior

Review questions
• The following table reports aggregate consumption (in millions of N$)
and disposable income (in millions of N$) for the Namibian economy over the 20
years from 2002 to 2021. Populate the data for 2013 to 2021 by making use of a
trend equation.
1. Draw a scatter diagram for the data and determine by inspection if there exists
an approximate linear relationship between Y and X.
2. State the general relationship between consumption and disposable income in
a) exact linear form,
b) stochastic form.
c) Why would you expect most observed values of Y not to fall exactly on a straight
line?

Review Questions….data
Year   n    Xi    Yi
2002   1    114   102
2003   2    118   106
2004   3    126   108
2005   4    130   110
2006   5    136   122
2007   6    140   124
2008   7    148   128
2009   8    156   130
2010   9    160   142
2011   10   164   148
2012   11   170   150
2013   12   178   154
2014   13   188   153
