Simple Linear Regression
Learning Objectives
Types of
Probabilistic Models
[Diagram: Probabilistic Models branch into Regression Models and Correlation Models.]
Regression Models
• Answers ‘What is the relationship between the variables?’
• Equation used
– One numerical dependent (response) variable: what is to be predicted
– One or more numerical or categorical independent (explanatory) variables
• Used mainly for prediction and estimation
Regression Modeling
Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random
error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
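As a rough illustration of how the five steps fit together, the sketch below runs them in order on a small data set. The five (x, y) pairs are the advertising and sales figures from the worked example in these slides. The use of r² as the evaluation measure in step 4 and the prediction at x = 4 are assumptions made here for illustration, not something the slides prescribe.

```python
import math

# Data: advertising (x) and sales (y) from the worked example in these slides
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]
n = len(xs)

# Step 1: hypothesize the deterministic component E(y) = b0 + b1 * x

# Step 2: estimate the unknown parameters by least squares
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

# Step 3: estimate the standard deviation of the random error,
# s = sqrt(SSE / (n - 2)) for a straight-line model
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))

# Step 4: evaluate the model (r^2 is one possible measure, chosen here as an example)
sst = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - sse / sst

# Step 5: use the model for prediction and estimation
y_hat_at_4 = b0 + b1 * 4

print(f"b0={b0:.2f}  b1={b1:.2f}  s={s:.3f}  r^2={r_squared:.3f}  y_hat(4)={y_hat_at_4:.2f}")
```

Each step reuses the quantities from the step before it, which is why the order of the list matters.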
Model Specification
1. Define variables
• Conceptual (e.g., Advertising, price)
• Empirical (e.g., List price, regular price)
• Measurement (e.g., $, Units)
2. Hypothesize nature of relationship
• Expected effects (i.e., Coefficients’ signs)
• Functional form (linear or non-linear)
• Interactions
Model Specification
Is Based on Theory
• Theory of field (e.g., Sociology)
• Mathematical theory
• Previous research
• ‘Common sense’
Thinking Challenge:
Which Is More Logical?
[Figure: four candidate plots of Sales against Advertising.]
Types of Relationships
(continued)
[Figure: scatterplots of Y against X in two groups, labeled "Strong relationships" and "Weak relationships".]
Types of Relationships
(continued)
[Figure: scatterplot against X labeled "No relationship".]
Types of
Regression Models
[Diagram: Regression Models divide into Simple (1 explanatory variable) and Multiple (2+ explanatory variables); each of these divides into Linear and Non-Linear.]
Linear Regression Model
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where y is the dependent (response) variable and x is the independent (explanatory) variable.
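To make the role of each term in the model concrete, here is a minimal sketch that generates observations as a straight-line mean plus a random error. The parameter values (intercept 4, slope 2), the normal distribution for the error, and its standard deviation of 1 are all illustrative assumptions, not values given in the slides.

```python
import random


def simulate_y(x, beta0, beta1, sigma, rng):
    """Return one observation y = beta0 + beta1 * x + epsilon."""
    epsilon = rng.gauss(0.0, sigma)  # random error term (assumed normal here)
    return beta0 + beta1 * x + epsilon


rng = random.Random(0)                 # fixed seed so the run is repeatable
beta0, beta1, sigma = 4.0, 2.0, 1.0    # hypothetical population values
xs = [1, 2, 3, 4, 5]
ys = [simulate_y(x, beta0, beta1, sigma, rng) for x in xs]

for x, y in zip(xs, ys):
    mean_y = beta0 + beta1 * x         # deterministic part of the model
    print(f"x={x}  mean={mean_y:.1f}  observed y={y:.2f}  error={y - mean_y:+.2f}")
```

The observed values scatter around the straight line rather than falling on it, which is what makes the model probabilistic rather than deterministic.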
Line of Means
[Figure: the line of means, \( E(y) = \beta_0 + \beta_1 x \), plotted against x.]
\( \beta_1 \) = Slope = (Change in y) / (Change in x)
\( \beta_0 \) = y-intercept
Population & Sample
Regression Models
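[Diagram: the population has an unknown relationship, \( y = \beta_0 + \beta_1 x + \varepsilon \); a random sample drawn from it gives the estimated relationship \( y = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\varepsilon} \).]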
Population Linear
Regression Model
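\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]
\( \varepsilon_i \) = Random error
\[ E(y) = \beta_0 + \beta_1 x \]
[Figure: observed values of y plotted against x around the line \( E(y) \), with \( \varepsilon_i \) marked as the random error of one observed value.]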
Sample Linear Regression
Model
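\[ y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\varepsilon}_i \]
\( \hat{\varepsilon}_i \) = Random error
\[ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \]
[Figure: observed values and one unsampled observation plotted against x around the fitted line \( \hat{y}_i \).]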
Estimating Parameters:
Least Squares Method
[Figure: scattergram of y against x, both axes running from 0 to 60.]
Thinking Challenge
• How would you draw a line through the points?
• How do you determine which line ‘fits best’?
[Figure: scattergram of y against x, both axes running from 0 to 60.]
Least Squares
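• ‘Best fit’ means the differences between actual y values and predicted y values are a minimum
– But positive differences offset negative ones, so the differences are squared before they are summed:
\[ \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 \]
• Least squares chooses the line that minimizes this sum of squared differences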
xi      yi      xi²      yi²      xiyi
x1      y1      x1²      y1²      x1y1
x2      y2      x2²      y2²      x2y2
:       :       :        :        :
xn      yn      xn²      yn²      xnyn
Σxi     Σyi     Σxi²     Σyi²     Σxiyi
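For reference, the totals in the last row of the table are the inputs to the standard least squares formulas for the slope and intercept. They are written below in the same notation; \(\bar{x}\) and \(\bar{y}\) denote the sample means of x and y, symbols introduced here for convenience.

```latex
\[
\hat{\beta}_1
  = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}
         {\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\]
```

The Σyi² column is not needed for the line itself; only the other four totals and n enter these two formulas.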
Interpretation of Coefficients
1. Slope (\( \hat{\beta}_1 \))
• Estimated y changes by \( \hat{\beta}_1 \) for each 1 unit increase in x
– If \( \hat{\beta}_1 = 2 \), then Sales (y) is expected to increase by 2 for each 1 unit increase in Advertising (x)
2. Y-Intercept (\( \hat{\beta}_0 \))
• Average value of y when x = 0
– If \( \hat{\beta}_0 = 4 \), then Average Sales (y) is expected to be 4 when Advertising (x) is 0
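The two interpretations can be checked numerically. The sketch below uses the slide's illustrative values (intercept 4, slope 2); the function name `predict_sales` is a hypothetical helper introduced only for this example.

```python
def predict_sales(advertising, b0=4.0, b1=2.0):
    """Estimated mean sales for a given advertising level: b0 + b1 * x."""
    return b0 + b1 * advertising


at_zero = predict_sales(0)                    # value of the line at x = 0
step = predict_sales(3) - predict_sales(2)    # change for a 1 unit increase in x

print(at_zero, step)
```

The prediction at x = 0 reproduces the y-intercept, and the difference between predictions one unit apart reproduces the slope, whichever pair of adjacent x values is used.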
Least Squares Example
You’re a marketing analyst for Hasbro Toys.
You gather the following data:
Ad $ Sales (Units)
1 1
2 1
3 2
4 2
5 4
Find the least squares line relating
sales and advertising.
Scattergram
Sales vs. Advertising
[Figure: scattergram of Sales (axis 0 to 4) against Advertising (axis 0 to 5).]
Parameter Estimation
Solution Table
xi      yi      xi²      yi²      xiyi
1       1       1        1        1
2       1       4        1        2
3       2       9        4        6
4       2       16       4        8
5       4       25       16       20
15      10      55       26       37     (column totals)
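The table can be rebuilt mechanically from the raw data, which is a useful check on hand arithmetic. This is a minimal sketch; the variable names are chosen here for illustration.

```python
ads = [1, 2, 3, 4, 5]      # x: advertising
sales = [1, 1, 2, 2, 4]    # y: sales in units

# One row per observation: x, y, x squared, y squared, x times y
rows = [(x, y, x * x, y * y, x * y) for x, y in zip(ads, sales)]

# Column totals, in the same column order as the rows
totals = tuple(sum(column) for column in zip(*rows))

for row in rows:
    print("{:>4} {:>4} {:>5} {:>5} {:>5}".format(*row))
print("{:>4} {:>4} {:>5} {:>5} {:>5}".format(*totals))
```

The final printed line holds the five column totals that the slope and intercept calculation uses.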
Parameter Estimation
Solution
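\[
\hat{\beta}_1
  = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}
         {\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}
  = \frac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}}
  = .70
\]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 2 - (.70)(3) = -.10 \]
\[ \hat{y} = -.1 + .7x \]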
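The same calculation can be sketched in code using the sums-based formula. The function name `least_squares_line` is a hypothetical helper, not part of any library.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Slope: [sum(xy) - sum(x) * sum(y) / n] / [sum(x^2) - sum(x)^2 / n]
    slope = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


b0, b1 = least_squares_line([1, 2, 3, 4, 5], [1, 1, 2, 2, 4])
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```

Because the function takes any two equal-length lists, it can also be used to check other hand-worked data sets, provided the x values are not all identical (otherwise the denominator is zero).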
Parameter Estimation
Computer Output
[Computer output: Parameter Estimates table, with the estimate \( \hat{\beta}_1 \) labeled.]
\[ \hat{y} = -.1 + .7x \]
Coefficient Interpretation
Solution
1. Slope (\( \hat{\beta}_1 \))
• Sales Volume (y) is expected to increase by .7 units for each $1 increase in Advertising (x)
2. Y-Intercept (\( \hat{\beta}_0 \))
• Average value of Sales Volume (y) is -.10 units when Advertising (x) is 0
– Difficult to explain to marketing manager
– Expect some sales without advertising
Regression Line Fitted
to the Data
[Figure: the fitted line \( \hat{y} = -.1 + .7x \) drawn through the scattergram of Sales (axis 0 to 4) against Advertising (axis 0 to 5).]
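One way to see what "fitted" means here is to compute the differences between actual and predicted sales and compare the sum of squared differences with that of some other line. In the sketch below the comparison line (intercept 0, slope .7) is an arbitrary alternative chosen for illustration, and `sse` is a hypothetical helper name.

```python
ads = [1, 2, 3, 4, 5]      # x: advertising
sales = [1, 1, 2, 2, 4]    # y: sales in units


def sse(b0, b1, xs, ys):
    """Sum of squared differences between actual y and predicted b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Differences between actual and predicted sales for the fitted line
residuals = [y - (-0.1 + 0.7 * x) for x, y in zip(ads, sales)]

fitted_sse = sse(-0.1, 0.7, ads, sales)   # the least squares line
other_sse = sse(0.0, 0.7, ads, sales)     # arbitrary alternative line

print("residuals:", [round(r, 2) for r in residuals])
print(f"SSE fitted = {fitted_sse:.2f}, SSE alternative = {other_sse:.2f}")
```

The differences for the fitted line sum to zero, and its sum of squared differences (1.10) is smaller than that of the alternative line (1.15), consistent with the least squares criterion.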
Least Squares
Thinking Challenge
You’re an economist for the county cooperative.
You gather the following data:
Fertilizer (lb.) Yield (lb.)
4 3.0
6 5.5
10 6.5
12 9.0
Find the least squares line relating
crop yield and fertilizer.