Linear Regression
Linear Regression
James H. Steiger
Regression – The General Setup
You have a set of data on two variables, X
and Y, represented in a scatter plot.
You wish to find a simple, convenient
mathematical function that comes close to
most of the points, thereby describing
succinctly the relationship between X and Y.
Linear Regression
The straight line is a particularly simple
function.
When we fit a straight line to data, we are
performing linear regression analysis.
Linear Regression
The Goal
Find the “best fitting” straight line for a set of data.
Since every straight line fits the equation
Y = bX + c
with slope b and Y-intercept c, it follows that our task is
to find a b and c that produce the best fit.
Ei
Yˆi
The Least Squares Criterion
The least squares criterion operationalizes
the notion of best fitting line as that choice
of b and c that minimizes the sum of
squared “residuals,” or errors. Solving this
problem is an elementary problem in
differential calculus. The method of
solution need not concern us here, but the
result is famous, important, and should be
memorized.
The Least Squares Solution
Sy
b = rYX
SX
c = Y• − bX •
The Standardized Least Squares
Solution