Simple Linear Regression Analysis
Regression Analysis
Regression analysis is used to:
1) understand the relationship between two variables
2) predict the value of one variable from the value of another variable.
A regression model consists of a dependent (response) variable and an independent (predictor) variable.
Regression Analysis
[Figure: scatter plot of all (Xi, Yi) pairs, with X on the horizontal axis (0 to 60) and Y on the vertical axis (0 to 100).]
Types of Regression Models
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$Y_i$ = dependent (response) variable
$\beta_0$ = Y intercept
$\beta_1$ = slope
$X_i$ = independent (predictor/explanatory) variable
$\varepsilon_i$ = random error
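Population Linear Regression Model
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad \text{(observed value)}$$
$$\mu_{Y|X} = \beta_0 + \beta_1 X_i$$
$\varepsilon_i$ = random error
[Figure: the line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ plotted against X, with one observed value $Y_i$ marked; $\varepsilon_i$ is the vertical distance between the observed value and the line.]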
Sample Linear Regression Model
$$\hat{y}_i = b_0 + b_1 x_i$$
$\hat{y}_i$ = predicted value of Y for observation i
The difference between the actual value of Y and the predicted value (using sample data) is known as the error.
Error = actual value - predicted value, that is, $e_i = y_i - \hat{y}_i$
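Sample Linear Regression Model
$$\hat{y}_i = b_0 + b_1 x_i$$
$$b_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
$$b_0 = \bar{y} - b_1 \bar{x}$$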
Table 3.1. Intelligence Test Scores and Freshmen Chemistry Grades
Student   Test Score (x)   Chemistry Grade (y)
1         65               85
2         50               74
3         55               76
4         65               90
5         55               85
6         70               87
7         65               94
8         70               98
9         55               81
10        70               91
11        50               76
12        55               74
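As a minimal sketch, the slope and intercept formulas for b1 and b0 can be applied to the Table 3.1 data in plain Python. The function name `fit_simple_linear` is a hypothetical helper introduced here for illustration; it is not part of the original slides.

```python
# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]


def fit_simple_linear(x, y):
    """Return (b0, b1) using the computational least-squares formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # b1 = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # b0 = y_bar - b1 * x_bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = fit_simple_linear(x, y)
print(f"b1 = {b1:.3f}, b0 = {b0:.3f}")  # b1 = 0.897, b0 = 30.043
```

Carrying the slope at full precision gives an intercept of about 30.043; rounding the slope to 0.897 before computing the intercept gives about 30.056 instead.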
Figure 3.1. Scatter Diagram with regression line
[Figure: scatter plot of Chemistry Grade (vertical axis, 70 to 100) against Intelligence Test Score (horizontal axis, 40 to 75), with the fitted line $\hat{y}_i = b_0 + b_1 x_i$.]
The point estimates of b0 and b1 are determined using the Method of Least Squares.
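Measures of Variation: The Sum of Squares
$$SST = \sum (Y_i - \bar{Y})^2$$
$$SSR = \sum (\hat{Y}_i - \bar{Y})^2$$
$$SSE = \sum (Y_i - \hat{Y}_i)^2$$
where $\hat{Y}_i = b_0 + b_1 X_i$.
[Figure: the fitted line plotted against X, showing for one point $X_i$ the three deviations: $Y_i - \bar{Y}$ (SST), $\hat{Y}_i - \bar{Y}$ (SSR) and $Y_i - \hat{Y}_i$ (SSE).]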
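The three sums of squares can be computed directly from the Table 3.1 data. The sketch below uses plain Python and variable names chosen here for illustration; it also checks the standard identity SST = SSR + SSE, which holds for a least-squares fit with an intercept.

```python
# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates (deviation form).
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by the line
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained (error)

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

The values agree with the Sum of Squares column of the SPSS ANOVA table (728.250, 541.693 and 186.557).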
Method of Least Squares
$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$
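The slides do not show how the estimates follow from SSE. A standard derivation, sketched here as a supplement, sets the partial derivatives of SSE with respect to b0 and b1 to zero, which yields the two normal equations:

```latex
\begin{aligned}
\frac{\partial\, SSE}{\partial b_0} &= -2\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0
  &&\Rightarrow\quad \sum_{i=1}^{n} y_i = n\, b_0 + b_1 \sum_{i=1}^{n} x_i \\
\frac{\partial\, SSE}{\partial b_1} &= -2\sum_{i=1}^{n} x_i\,(y_i - b_0 - b_1 x_i) = 0
  &&\Rightarrow\quad \sum_{i=1}^{n} x_i y_i = b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2
\end{aligned}
```

Dividing the first equation by n gives $b_0 = \bar{y} - b_1 \bar{x}$. Substituting this into the second equation and solving for $b_1$ gives

```latex
b_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
           {n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}
```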
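Method of Least Squares
$$b_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$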
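The two expressions for b1 are algebraically equivalent. As a quick numerical check (a sketch with illustrative variable names), both can be evaluated on the Table 3.1 data:

```python
# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Computational form: uses raw sums only.
b1_computational = (
    n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
) / (n * sum(xi * xi for xi in x) - sum(x) ** 2)

# Deviation form: uses deviations from the sample means.
b1_deviation = sum(
    (xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)
) / sum((xi - x_bar) ** 2 for xi in x)

print(f"computational form: {b1_computational:.6f}")
print(f"deviation form:     {b1_deviation:.6f}")
```

The deviation form is usually the numerically safer choice when the x values are large relative to their spread, because it avoids subtracting two large, nearly equal sums.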
Applying the least squares formulas to the data in Table 3.1:
$b_1 = 0.897$
$b_0 = 30.056$ (computed from the rounded slope 0.897; carrying the slope at full precision gives 30.043, the value reported in the SPSS output)
$$\hat{y}_i = b_0 + b_1 x_i$$
$$\hat{y}_i = 30.056 + 0.897 x_i$$
Figure 3.1. Scatter Diagram with regression line
[Figure: scatter plot of Chemistry Grade against Test Score with the fitted regression (prediction) line $\hat{y}_i = 30.056 + 0.897 x_i$; Rsq = 0.7438. SPSS chart menu path shown on the slide: Chart, Options, Fit Line.]
Using SPSS
Analyze > Regression > Linear
Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model              B         Std. Error          Beta                        t       Sig.
1   (Constant)     30.043    10.137                                          2.964   .014
    Test Score     .897      .167                .862                        5.389   .000
a. Dependent Variable: Chemistry Grade
$$\hat{y}_i = 30.043 + 0.897 x_i$$
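The Std. Error and t columns of the Coefficients table can be reproduced with the usual simple-regression formulas. The sketch below assumes the standard textbook expressions for the standard errors (they are not spelled out in the slides) and leaves out the Sig. column, which needs the t distribution.

```python
import math

# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Residual variance with n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)

# Standard errors of the slope and the intercept.
se_b1 = math.sqrt(s2 / sxx)
se_b0 = math.sqrt(s2 * (1 / n + x_bar ** 2 / sxx))

# t statistics for H0: coefficient = 0.
t_b1 = b1 / se_b1
t_b0 = b0 / se_b0

print(f"(Constant)  B = {b0:.3f}  SE = {se_b0:.3f}  t = {t_b0:.3f}")
print(f"Test Score  B = {b1:.3f}  SE = {se_b1:.3f}  t = {t_b1:.3f}")
```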
Using SPSS
Analyze > Regression > Linear
Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .862(a)   .744       .718                4.319
a. Predictors: (Constant), Test Score
b. Dependent Variable: Chemistry Grade
Measures of Variation:
R = correlation
R Square = coefficient of determination
Std. Error of the Estimate = standard deviation around the regression line
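The Model Summary quantities follow from the sums of squares. The sketch below uses the standard definitions (R Square = SSR/SST, the adjusted R Square for one predictor, and the standard error of the estimate with n - 2 degrees of freedom); the variable names are illustrative.

```python
import math

# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]

n = len(x)
k = 1  # number of predictors
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ssr = sst - sse

r2 = ssr / sst                                  # coefficient of determination
r = math.copysign(math.sqrt(r2), b1)            # correlation, sign of the slope
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # adjusted R Square
s = math.sqrt(sse / (n - k - 1))                # std. error of the estimate

print(f"R = {r:.3f}, R Square = {r2:.3f}, "
      f"Adjusted R Square = {adj_r2:.3f}, Std. Error = {s:.3f}")
```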
ANOVA(b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   541.693          1    541.693       29.036   .000(a)
    Residual     186.557          10   18.656
    Total        728.250          11
a. Predictors: (Constant), Test Score
b. Dependent Variable: Chemistry Grade
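The Mean Square and F columns of the ANOVA table are simple ratios of the sums of squares and their degrees of freedom. A minimal sketch, taking the sums of squares from the table as given (the Sig. value is not recomputed because it needs the F distribution):

```python
# Sums of squares and sample size as reported in the ANOVA table.
ss_regression = 541.693
ss_residual = 186.557
n = 12   # number of students
k = 1    # number of predictors

df_regression = k
df_residual = n - k - 1
df_total = n - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

print(f"df: {df_regression}, {df_residual}, {df_total}")
print(f"Mean Square: {ms_regression:.3f}, {ms_residual:.3f}")
print(f"F = {f_stat:.3f}")
```

With a single predictor, F equals the square of the slope's t statistic (5.389 squared is about 29.04), so the F test and the t test for the slope lead to the same conclusion.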
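Testing the Significance of b
$$r^2 = \frac{SS_{regression}}{SS_{total}} = \frac{SS_{\hat{Y}}}{SS_{Y}} = \frac{\sum (\hat{Y} - \bar{Y})^2}{\sum (Y - \bar{Y})^2}$$
[Figure: regression line plotted with Y on the vertical axis and X on the horizontal axis; points X1 and X2 are marked on the X axis.]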
Residual Analysis
Purposes
Examine linearity
Evaluate violations of assumptions
Studentized residuals:
Each residual is divided by its estimated standard error, so the magnitudes of the residuals can be compared on a common scale
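As a sketch of what scaling the residuals involves, the code below computes raw, standardized (internally studentized) and studentized (externally studentized) residuals for the Table 3.1 data. The formulas are the standard ones based on leverage; mapping the two scaled versions onto the names "standardized" and "studentized" follows common software usage and is an assumption here.

```python
import math

# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Raw residuals and the standard error of the estimate.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e * e for e in residuals) / (n - 2))

# Leverage of each observation in simple linear regression.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Standardized: residual divided by its estimated standard error.
standardized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

# Studentized: same idea, with the error variance re-estimated
# without observation i (closed form, two estimated coefficients).
studentized = [r * math.sqrt((n - 3) / (n - 2 - r * r)) for r in standardized]

for i, (e, r, t) in enumerate(zip(residuals, standardized, studentized), start=1):
    print(f"{i:2d}  e = {e:6.2f}  standardized = {r:5.2f}  studentized = {t:5.2f}")
```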
Residual Analysis for Linearity
[Figure: two plots of the residuals (e) against X. Panel labels: Not Linear, Linear.]
Residual Analysis for Homoscedasticity
[Figure: two plots of the studentized residuals (SR) against X. Panel labels: Heteroscedasticity, Homoscedasticity.]
[Figure: density plot.]
scatter r X, yline(0)
[Figure: residuals plotted against Test Score (50 to 70) with a reference line at 0. Panel label: Linear.]
Residual Analysis for Homoscedasticity
scatter r1 X, yline(0)
scatter sr X, yline(0)
[Figure: two panels, standardized residuals and studentized residuals, each plotted against Test Score (50 to 70) with a reference line at 0; vertical axes run from -2 to 2.]
hettest
Result: Homoscedasticity
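For readers without Stata, the idea behind a heteroscedasticity test can be sketched in plain Python. The code below implements the original Breusch-Pagan score test, using the fitted values as the explanatory variable in the auxiliary regression. Treating this as the computation behind `hettest` is an assumption; the helper `ols` is hypothetical and introduced only for this example.

```python
import math

# Table 3.1 data: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]


def ols(u, v):
    """Least-squares intercept and slope for regressing v on u."""
    m = len(u)
    u_bar, v_bar = sum(u) / m, sum(v) / m
    slope = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / sum(
        (ui - u_bar) ** 2 for ui in u
    )
    return v_bar - slope * u_bar, slope


n = len(x)
b0, b1 = ols(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Squared residuals scaled by the maximum-likelihood variance estimate.
sigma2 = sum(e * e for e in resid) / n
g = [e * e / sigma2 for e in resid]

# Auxiliary regression of the scaled squared residuals on the fitted values.
a0, a1 = ols(fitted, g)
g_bar = sum(g) / n
explained_ss = sum((a0 + a1 * fi - g_bar) ** 2 for fi in fitted)

# Score statistic: half the explained sum of squares, chi-square with 1 df.
bp_stat = explained_ss / 2
# Upper-tail probability of a chi-square(1) variable: erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(bp_stat / 2))

print(f"Breusch-Pagan statistic = {bp_stat:.3f}, p-value = {p_value:.3f}")
```

A large p-value means the hypothesis of constant error variance is not rejected, which matches the homoscedasticity conclusion on the slide.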
Residual Analysis for Independence
[Figure: residuals plotted against observation number (obs, 0 to 15).]
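Residual Analysis for Independence
Durbin-Watson Statistic.
The D-W statistic is defined as:
$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
where $e_i$ is the residual for observation i, taken in observation order. D lies between 0 and 4, and values near 2 indicate no first-order autocorrelation.
Result: Independent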
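The Durbin-Watson statistic is straightforward to compute from the residuals. The sketch below assumes the observations in Table 3.1 are in their natural order (student 1 to 12), since the statistic depends on the ordering; the function name `durbin_watson` is chosen here for illustration.

```python
# Table 3.1 data in observation order (students 1 to 12).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

d = durbin_watson(residuals)
print(f"Durbin-Watson D = {d:.3f}")
```

For these data D comes out close to 2, consistent with independent residuals. A formal decision compares D with tabulated lower and upper critical values for the given n and number of predictors.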