Regression Analysis: Basic Statistics
Basic Statistics
Lecture 13
Regression Analysis
Simple Regression
$y = f(x_1, x_2, x_3)$
Linear Regression
Fits data to a straight line
Other: Curvilinear Regression (curved line)
From Geometry:
Any line can be described by an equation
For any point on a line for X, there will be a corresponding Y
The equation for this is y = mx + c
m is the slope, c is the Y-intercept (the value of Y when X = 0)
Slope = change in Y per unit change in X
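As a small illustration of slope and intercept, the sketch below recovers m and c from two points on a line. The function name and the two points are made up for this example.

```python
def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    m = (y2 - y1) / (x2 - x1)   # slope: change in Y per unit change in X
    c = y1 - m * x1             # Y-intercept: value of Y when X = 0
    return m, c

# Hypothetical points, for illustration only
m, c = line_through((1, 5), (3, 9))
print(m, c)  # 2.0 3.0
```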
Prediction
Correlation doesn’t help in making predictions
Regression enables us to make predictions using the
regression line
Symmetric
Correlation coefficients are symmetrical, i.e. $r_{xy} = r_{yx}$.
Regression coefficients are not symmetrical, i.e. $b_{xy} \neq b_{yx}$.
Origin & Scale
Correlation is independent of the change of origin and scale
Regression coefficient is independent of change of origin
but not of scale
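The symmetry and origin/scale points above can be checked numerically. The sketch below uses a small hypothetical data set and helper functions written for this example, with population (divide by N) standard deviations.

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def sd(v):
    return sqrt(cov(v, v))

def corr(x, y):
    return cov(x, y) / (sd(x) * sd(y))

def b_yx(x, y):
    """Regression coefficient of Y on X: r * sd_y / sd_x."""
    return corr(x, y) * sd(y) / sd(x)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_shift = [v + 10 for v in x]   # change of origin
x_scale = [v * 2 for v in x]    # change of scale

print(corr(x, y), corr(y, x))              # equal: correlation is symmetric
print(b_yx(x, y), b_yx(y, x))              # differ: b_yx is not b_xy
print(corr(x_shift, y), corr(x_scale, y))  # both equal corr(x, y)
print(b_yx(x_shift, y), b_yx(x_scale, y))  # first unchanged, second halved
```

Doubling X halves the coefficient of Y on X, while shifting X leaves it alone; the correlation coefficient is unaffected by either change.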
TYPES OF REGRESSION ANALYSIS
Simple Linear Regression
Regression Equation of X on Y
X = a + bY
$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$
$X - \bar{X} = r\,\dfrac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$
REGRESSION COEFFICIENTS
Regression coefficient measures the average
change in the value of one variable for a unit
change in the value of another variable.
These represent the slopes of the regression lines
Regression coefficient of Y on X: $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$
Regression coefficient of X on Y: $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$
PROPERTIES OF REGRESSION COEFFICIENTS
Coefficient of correlation is the geometric mean of the
regression coefficients, i.e. $r = \pm\sqrt{b_{xy} \cdot b_{yx}}$
Both the regression coefficients must have the same
algebraic sign.
Coefficient of correlation must have the same sign
as that of the regression coefficients.
The two regression coefficients cannot both be greater than
unity: if one exceeds 1, the other must be less than 1.
Arithmetic mean of two regression coefficients is
equal to or greater than the correlation
coefficient, i.e. $\dfrac{b_{xy} + b_{yx}}{2} \geq r$
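The properties listed above can be verified for a concrete case. The sketch below borrows σx = 4.4, σy = 2.2 and r = 0.8 from the practice problem on standard error; the function and variable names are assumptions made for this example.

```python
from math import sqrt, copysign

def regression_coefficients(r, sd_x, sd_y):
    """Return (b_yx, b_xy) from r and the two standard deviations."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    return b_yx, b_xy

r, sd_x, sd_y = 0.8, 4.4, 2.2
b_yx, b_xy = regression_coefficients(r, sd_x, sd_y)   # about 0.4 and 1.6

# r is the geometric mean of the coefficients, carrying their common sign
r_back = copysign(sqrt(b_yx * b_xy), b_yx)

# both coefficients and r share the same algebraic sign
same_sign = (b_yx > 0) == (b_xy > 0) and (b_yx > 0) == (r > 0)

# the two coefficients cannot both exceed unity in magnitude
not_both_above_one = not (abs(b_yx) > 1 and abs(b_xy) > 1)

# arithmetic mean >= r (checked here for positive coefficients)
mean_ge_r = (b_yx + b_xy) / 2 >= r

print(round(b_yx, 3), round(b_xy, 3), round(r_back, 3))
print(same_sign, not_both_above_one, mean_ge_r)
```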
Regression Equations
Using Normal Equations
Using Regression Coefficients
REGRESSION EQUATIONS IN INDIVIDUAL SERIES USING
NORMAL EQUATIONS
Another Method
Y = – 1.025 + 0.581X
X = 2.432 + 0.386Y
o Value of Y when X = 10: Y = 4.785
o Value of X when Y = 2.5: X = 3.397
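The data table behind these equations did not survive in the slides, so the sketch below does two separate things. First it solves the normal equations for Y = a + bX, assumed here in their standard form ΣY = Na + bΣX and ΣXY = aΣX + bΣX², on a hypothetical data set. Then it evaluates the two fitted equations quoted above at X = 10 and Y = 2.5. All function names are made up for this example.

```python
def fit_normal_equations(x, y):
    """Solve sum(y) = n*a + b*sum(x) and sum(xy) = a*sum(x) + b*sum(x^2) for (a, b)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(p * q for p, q in zip(x, y))
    sxx = sum(p * p for p in x)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x (not the lecture's data)
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_normal_equations(xs, ys)
print(a, b)  # 1.0 2.0

# Swapping the roles of the variables fits X on Y instead
a_xy, b_xy = fit_normal_equations(ys, xs)

# The two fitted equations quoted in the slides
def y_on_x(x):
    return -1.025 + 0.581 * x

def x_on_y(y):
    return 2.432 + 0.386 * y

print(round(y_on_x(10), 3))   # 4.785
print(round(x_on_y(2.5), 3))  # 3.397
```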
REGRESSION EQUATIONS USING REGRESSION COEFFICIENTS
(USING STANDARD DEVIATIONS)
Regression Equation of Y on X
$Y - \bar{Y} = b_{yx}\,(X - \bar{X})$, where $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$
Regression Equation of X on Y
$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$, where $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$
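A minimal sketch of this method: given the two means, the two standard deviations and r, each regression line can be written in intercept form. The summary values below are hypothetical and the function names are made up for this example.

```python
def line_y_on_x(mean_x, mean_y, sd_x, sd_y, r):
    """Return (a, b) with Y = a + b*X, from Y - mean_y = b_yx * (X - mean_x)."""
    b_yx = r * sd_y / sd_x
    return mean_y - b_yx * mean_x, b_yx

def line_x_on_y(mean_x, mean_y, sd_x, sd_y, r):
    """Return (a, b) with X = a + b*Y, from X - mean_x = b_xy * (Y - mean_y)."""
    b_xy = r * sd_x / sd_y
    return mean_x - b_xy * mean_y, b_xy

# Hypothetical summary statistics, for illustration only
mean_x, mean_y, sd_x, sd_y, r = 10, 20, 2, 4, 0.5

a1, b1 = line_y_on_x(mean_x, mean_y, sd_x, sd_y, r)   # Y = 10 + 1.0 X
a2, b2 = line_x_on_y(mean_x, mean_y, sd_x, sd_y, r)   # X = 5 + 0.25 Y
print(a1, b1)
print(a2, b2)
```

Both lines pass through the point of means: substituting X = 10 into the first gives Y = 20, and substituting Y = 20 into the second gives X = 10.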
Q8. Two regression lines involving the two variables X and Y are
Y = 5.6 + 1.2X and X = 12.5 + 0.6Y. Find the means of X and Y and their
correlation coefficient.
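The slides give no answer for Q8, so the following is only one possible working. It relies on two facts from earlier in the lecture: both regression lines pass through the point of means, and r is the geometric mean of the two regression coefficients. The variable names are made up for this example.

```python
from math import sqrt, copysign

# Y on X:  Y = a1 + b_yx * X      X on Y:  X = a2 + b_xy * Y
a1, b_yx = 5.6, 1.2
a2, b_xy = 12.5, 0.6

# Both lines pass through (mean_x, mean_y); substitute the first into the second
mean_x = (a2 + b_xy * a1) / (1 - b_xy * b_yx)
mean_y = a1 + b_yx * mean_x

# r is the geometric mean of the coefficients, with their common sign
r = copysign(sqrt(b_yx * b_xy), b_yx)

print(round(mean_x, 2), round(mean_y, 2), round(r, 3))  # 56.64 73.57 0.849
```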
STANDARD ERROR OF ESTIMATE
The standard error of estimate tells us how accurate
the estimates are.
It shows how close the values estimated by the
regression line are to the actual values.
For the two regression lines, there are two standard
errors of estimate:
Standard error of estimate of Y on X (Syx)
Standard error of estimate of X on Y (Sxy)
FORMULAE FOR SE (Y ON X)
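$S_{yx} = \sqrt{\dfrac{\sum (Y - Y_c)^2}{N}}$, where Y = actual values and Yc = estimated values
$S_{yx} = \sqrt{\dfrac{\sum Y^2 - a\sum Y - b\sum XY}{N}}$, where a and b are the constants of the regression equation Y = a + bX
$S_{yx} = \sigma_y \sqrt{1 - r^2}$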
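The three formulae give the same value when a and b come from the least-squares line of Y on X. The sketch below checks this on a small hypothetical data set, using population (divide by N) variances; the data and variable names are assumptions made for this example.

```python
from math import sqrt

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
var_x = sum((v - mean_x) ** 2 for v in x) / n
var_y = sum((v - mean_y) ** 2 for v in y) / n
cov_xy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y)) / n

# Least-squares constants of Y = a + bX, and the correlation coefficient
b = cov_xy / var_x
a = mean_y - b * mean_x
r = cov_xy / sqrt(var_x * var_y)

# Formula 1: from actual values Y and estimated values Yc
y_c = [a + b * v for v in x]
se1 = sqrt(sum((yi - yc) ** 2 for yi, yc in zip(y, y_c)) / n)

# Formula 2: from the sums and the constants a, b
sum_y2 = sum(v * v for v in y)
sum_xy = sum(p * q for p, q in zip(x, y))
se2 = sqrt((sum_y2 - a * sum(y) - b * sum_xy) / n)

# Formula 3: from sigma_y and r
se3 = sqrt(var_y) * sqrt(1 - r * r)

print(round(se1, 4), round(se2, 4), round(se3, 4))  # 0.6928 0.6928 0.6928
```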
PRACTICE PROBLEMS – SE
Q7: Find the standard errors of estimate if
σx = 4.4, σy = 2.2 and r = 0.8
Ans: Syx = 1.32, Sxy = 2.64
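The stated answers can be reproduced with the third formula, Syx = σy √(1 − r²). The matching formula for X on Y, Sxy = σx √(1 − r²), is assumed here by symmetry. Variable names are made up for this example.

```python
from math import sqrt

sd_x, sd_y, r = 4.4, 2.2, 0.8

factor = sqrt(1 - r ** 2)   # sqrt(1 - 0.64) = 0.6
s_yx = sd_y * factor        # standard error of estimate of Y on X
s_xy = sd_x * factor        # standard error of estimate of X on Y

print(round(s_yx, 2), round(s_xy, 2))  # 1.32 2.64
```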