Simple Linear Regression
ANOVA table → find out the total variability of the regression model
Deviation of each observation from the mean of the data (yi − ȳ); squared and summed, this gives SST
SST=RSS+SSE (total sum of squares = regression sum of squares + sum of squared errors)
The least squares method finds the line that minimises SSE
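A minimal sketch of the decomposition above, using made-up data (the x and y values are hypothetical, not from the course). It fits the least squares line by hand, checks that SST = RSS + SSE, and shows that nudging the slope away from the fitted value makes SSE larger. RSS here means the regression sum of squares, as in these notes.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least squares estimates of the slope (b1) and intercept (b0).
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_mean - b1 * x_mean
y_hat = [b0 + b1 * xi for xi in x]


def sse_for(b0_try, b1_try):
    """Sum of squared errors for any candidate line y = b0_try + b1_try * x."""
    return sum((yi - (b0_try + b1_try * xi)) ** 2 for xi, yi in zip(x, y))


sst = sum((yi - y_mean) ** 2 for yi in y)      # total variability
rss = sum((fi - y_mean) ** 2 for fi in y_hat)  # explained by the regression
sse = sse_for(b0, b1)                          # left unexplained (errors)

print("SST       =", sst)
print("RSS + SSE =", rss + sse)
print("SSE at the fitted line  :", sse)
print("SSE with the slope nudged:", sse_for(b0, b1 + 0.1))
```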
ANOVA table → how well the regression model fits our observed data
Measures of fit of our model:
Approach 1: standard error of estimate (SEE, the SD of the errors, s)
s = SEE = sqrt(MSE) = sqrt(SSE/(n-2))
F stat = MSR/MSE, where MSR = RSS/1 (one predictor) and MSE = SSE/(n-2)
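A short sketch of how SEE and the F stat come out of the sums of squares, assuming simple linear regression (1 df for the regression, n-2 df for the errors). The function name fit_summary and the data are hypothetical, for illustration only.

```python
import math


def fit_summary(x, y):
    """Return SSE, RSS, MSE, MSR, SEE and F for a simple linear regression."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    y_hat = [b0 + b1 * xi for xi in x]

    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    rss = sum((fi - y_mean) ** 2 for fi in y_hat)  # regression sum of squares

    mse = sse / (n - 2)  # error mean square, n - 2 degrees of freedom
    msr = rss / 1        # regression mean square, 1 degree of freedom
    return {
        "SSE": sse,
        "RSS": rss,
        "MSE": mse,
        "MSR": msr,
        "SEE": math.sqrt(mse),
        "F": msr / mse,
    }


# Hypothetical data.
summary = fit_summary([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.1, 5.9, 8.2, 9.8])
for name, value in summary.items():
    print(name, "=", value)
```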
Indications
Observed data close to the regression line → SSE is low, so SEE is low and R^2 is high. High R^2 together with low SEE is a good indication that the model is a good fit (high confidence in the estimate)
Observed data far from the regression line → SEE is high and R^2 is low; the regression is a poor fit
When R^2 = 1, SSE must be equal to 0, i.e. all the points fall on a straight line
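A small check of the indications above, assuming the usual definition R^2 = 1 - SSE/SST (equivalently RSS/SST), which these notes do not spell out. Both data sets are made up: one lies exactly on a line, the other is scattered. The helper name r_squared is hypothetical.

```python
def r_squared(x, y):
    """Return (R^2, SSE) for a simple linear regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - sse / sst, sse


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_on_line = [3 + 2 * xi for xi in x]     # every point exactly on y = 3 + 2x
y_scattered = [5.0, 9.5, 6.0, 13.0, 11.5]  # hypothetical noisy observations

r2_line, sse_line = r_squared(x, y_on_line)
r2_scat, sse_scat = r_squared(x, y_scattered)

print("points on a line: R^2 =", r2_line, " SSE =", sse_line)
print("scattered points: R^2 =", r2_scat, " SSE =", sse_scat)
```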
Test for population coefficient of correlation
Residuals
Syy = ∑(yi − ȳ)^2 = sy^2 (n-1)
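A quick numerical check of the identity above, on hypothetical data. It relies on statistics.variance from the Python standard library, which returns the sample variance (divisor n-1), so multiplying it by n-1 should recover Syy.

```python
import statistics

# Hypothetical observations.
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(y)
y_mean = sum(y) / n

# Syy computed directly as the sum of squared deviations from the mean.
syy = sum((yi - y_mean) ** 2 for yi in y)

# Sample variance sy^2 uses the n - 1 divisor.
sy2 = statistics.variance(y)

print("Syy          =", syy)
print("sy^2 (n - 1) =", sy2 * (n - 1))
```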
CH 5 Confidence interval
t distribution (mound-shaped, symmetric about 0, fatter tails than the standard normal; larger df → closer to the standard normal)
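A sketch illustrating the shape properties listed above, using only the standard library. The density formula is the standard Student t pdf written out by hand; the helper names t_pdf and normal_pdf are hypothetical. It compares the density out in the tail (at 3) for several df against the standard normal.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # Work on the log scale so that large df does not overflow the gamma function.
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


# Tail density at 3: larger for small df, approaching the normal as df grows.
for df in (2, 10, 30, 1000):
    print("t, df =", df, ":", t_pdf(3.0, df))
print("standard normal :", normal_pdf(3.0))

# Symmetry about 0.
print(t_pdf(-1.5, 5), t_pdf(1.5, 5))
```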