Regression Analysis: Basic Statistics

Regression analysis is a statistical tool used to predict the value of a dependent variable based on the value of one or more independent variables. It can be used to describe data and assess the strength of relationships between variables. Simple linear regression involves two variables (X and Y) where the value of Y is estimated using the regression line equation Y = a + bX. Multiple regression expands on this idea to use several independent variables to predict the dependent variable. The regression coefficients, equations, and standard error of estimate are key outputs of regression analysis used to understand relationships in data and make predictions.
REGRESSION ANALYSIS

Basic Statistics

Lecture 13
Regression Analysis

Regression Analysis is a very powerful tool in the field of statistical analysis for predicting the value of one variable, given the value of another variable, when those variables are related to each other.
What is regression?

 Fitting a line to the data using an equation in order to describe and predict data
 Simple Regression
 Uses just two variables (X and Y): y = f(x)
 Other: Multiple Regression (one Y and many X's): y = f(x1, x2, x3)
Linear Regression
 Fits data to a straight line
 Other: Curvilinear Regression (curved line)
From Geometry:
 Any line can be described by an equation
 For any point X on a line, there will be a corresponding Y
 The equation for this is y = mx + c
 m is the slope, c is the Y-intercept (the value of Y when X = 0)
 Slope = change in Y per unit change in X
 Y-intercept = where the line crosses the Y axis (when X = 0)

REGRESSION
 Regression Analysis measures the nature and extent of the relationship between two or more variables, thus enabling us to make predictions.
 Regression is the measure of the average relationship between two or more variables.
UTILITY OF REGRESSION
 Degree & Nature of relationship
 Estimation of relationship
 Prediction
 Useful in Economic & Business Research

DIFFERENCE BETWEEN CORRELATION & REGRESSION

 Degree & Nature of Relationship
 Correlation is a measure of the degree of relationship between X & Y.
 Regression studies the nature of the relationship between the variables, so that one may predict the value of one variable on the basis of the other.
 Cause & Effect Relationship
 Correlation does not necessarily assume a cause-and-effect relationship between two variables.
 Regression expresses a cause-and-effect relationship between two variables: the independent variable is the cause and the dependent variable is the effect.
DIFFERENCE BETWEEN CORRELATION & REGRESSION

 Prediction
 Correlation does not help in making predictions.
 Regression enables us to make predictions using the regression line.
 Symmetry
 Correlation coefficients are symmetrical, i.e. rxy = ryx.
 Regression coefficients are not symmetrical, i.e. bxy ≠ byx.
 Origin & Scale
 Correlation is independent of changes of origin and scale.
 Regression coefficients are independent of a change of origin but not of scale.
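The symmetry contrast can be checked numerically. A minimal sketch in Python (the `slope` helper and the five-point dataset are illustrative assumptions, not from the lecture):

```python
# Illustrative check: the correlation is symmetric in X and Y, but the
# two least-squares slopes byx and bxy generally differ.
def slope(pred, resp):
    """Least-squares slope of resp regressed on pred."""
    n = len(pred)
    mp = sum(pred) / n
    mr = sum(resp) / n
    num = sum((p - mp) * (q - mr) for p, q in zip(pred, resp))
    den = sum((p - mp) ** 2 for p in pred)
    return num / den

x = [1, 2, 3, 4, 5]
y = [2, 5, 3, 8, 7]
byx = slope(x, y)       # slope of Y on X
bxy = slope(y, x)       # slope of X on Y -- not the same number
r_squared = byx * bxy   # the product recovers r**2, which is symmetric
```

Swapping the arguments changes the slope, but the product byx·bxy equals r² either way, which is why r itself is symmetric.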
TYPES OF REGRESSION ANALYSIS
 Simple & Multiple Regression
 Linear & Non-Linear Regression
 Partial & Total Regression

SIMPLE LINEAR REGRESSION

Simple linear regression is studied under three heads:
 Regression Lines
 Regression Equations
 Regression Coefficients
REGRESSION LINES
 The regression line shows the average relationship between two variables. It is also called the Line of Best Fit.
 If two variables X & Y are given, then there are two regression lines:
 Regression Line of X on Y
 Regression Line of Y on X
 Nature of Regression Lines
 If r = ±1, the two regression lines are coincident.
 If r = 0, the two regression lines intersect each other at 90°.
 The nearer the regression lines are to each other, the greater the degree of correlation.
 If the regression lines rise from lower left to upper right, the correlation is positive.
REGRESSION EQUATIONS
 Regression Equations are the algebraic formulation of regression lines.
 There are two regression equations:
 Regression Equation of Y on X
 Y = a + bX
 Y − Ȳ = byx (X − X̄)
 Y − Ȳ = r·(σy/σx)·(X − X̄)
 Regression Equation of X on Y
 X = a + bY
 X − X̄ = bxy (Y − Ȳ)
 X − X̄ = r·(σx/σy)·(Y − Ȳ)
REGRESSION COEFFICIENTS
 A regression coefficient measures the average change in the value of one variable for a unit change in the value of the other variable.
 The regression coefficients are the slopes of the regression lines.
 There are two regression coefficients:
 Regression coefficient of Y on X: byx = r·(σy/σx)
 Regression coefficient of X on Y: bxy = r·(σx/σy)
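These formulas translate directly to code. A quick sketch (the function name is mine; the numbers are borrowed from Q4 later in the deck):

```python
# Regression coefficients from r and the standard deviations,
# following byx = r*(sy/sx) and bxy = r*(sx/sy).
def regression_coefficients(r, sx, sy):
    byx = r * sy / sx   # average change in Y per unit change in X
    bxy = r * sx / sy   # average change in X per unit change in Y
    return byx, bxy

# sx = 2.6, sy = 3.6, r = 0.7 (the Q4 values)
byx, bxy = regression_coefficients(0.7, 2.6, 3.6)
# Note byx * bxy == r**2, consistent with r being their geometric mean.
```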
PROPERTIES OF REGRESSION COEFFICIENTS
 The coefficient of correlation is the geometric mean of the regression coefficients, i.e. r = √(bxy · byx).
 Both regression coefficients must have the same algebraic sign.
 The coefficient of correlation must have the same sign as the regression coefficients.
 Both regression coefficients cannot be greater than unity (their product cannot exceed 1).
 The arithmetic mean of the two regression coefficients is equal to or greater than the correlation coefficient, i.e. (bxy + byx)/2 ≥ r.
 A regression coefficient is independent of a change of origin but not of scale.
OBTAINING REGRESSION EQUATIONS

Regression equations can be obtained in two ways:
 Using Normal Equations
 Using Regression Coefficients
REGRESSION EQUATIONS IN INDIVIDUAL SERIES USING NORMAL EQUATIONS

 This method is also called the Least Squares Method.
 Under this method, regression equations can be calculated by solving two normal equations:
 For regression equation Y on X: Y = a + bX
 ΣY = Na + bΣX
 ΣXY = aΣX + bΣX²
 Another method:
 byx = (N·ΣXY − ΣX·ΣY) / (N·ΣX² − (ΣX)²) and a = Ȳ − bX̄
 Here a is the Y-intercept; it indicates the value of Y for X = 0.
 b is the slope of the line; it indicates the absolute increase in Y for a unit increase in X.
 Regression Equation of Y on X
 Y − Ȳ = byx (X − X̄) where byx = (N·ΣXY − ΣX·ΣY) / (N·ΣX² − (ΣX)²)
 Regression Equation of X on Y
 X − X̄ = bxy (Y − Ȳ) where bxy = (N·ΣXY − ΣX·ΣY) / (N·ΣY² − (ΣY)²)
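The "another method" formulas can be sketched in Python (the function name is mine; the data are taken from Q1 on the next slide):

```python
# Fit Y = a + bX by least squares using
# b = (N*ΣXY - ΣX*ΣY) / (N*ΣX² - (ΣX)²) and a = Ȳ - b*X̄.
def fit_y_on_x(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n
    return a, b

a, b = fit_y_on_x([1, 2, 3, 4, 5], [2, 5, 3, 8, 7])
# a ≈ 1.1, b = 1.3, i.e. Y = 1.1 + 1.3X
```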
PRACTICE PROBLEMS
Q1: Calculate the regression equations of Y on X & X on Y using the method of least squares:
 X: 1 2 3 4 5
 Y: 2 5 3 8 7
 Ans: Y = 1.1 + 1.3X, X = 0.5 + 0.5Y

Q2: Given: N = 8, ΣX = 21, ΣX² = 99, ΣY = 4, ΣY² = 68, ΣXY = 36. Using these values, find:
 o Regression Equation of Y on X: Y = −1.025 + 0.581X
 o Regression Equation of X on Y: X = 2.432 + 0.386Y
 o Value of Y when X = 10: Y = 4.785
 o Value of X when Y = 2.5: X = 3.397
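The Q2 answers can be reproduced from the given sums alone. A sketch (the variable names are mine):

```python
# Q2: both regression equations from the given sums, then the
# two predictions.
N, SX, SXX, SY, SYY, SXY = 8, 21, 99, 4, 68, 36
byx = (N * SXY - SX * SY) / (N * SXX - SX ** 2)   # ≈ 0.581
a_yx = SY / N - byx * SX / N                      # ≈ -1.026
bxy = (N * SXY - SX * SY) / (N * SYY - SY ** 2)   # ≈ 0.386
a_xy = SX / N - bxy * SY / N                      # ≈ 2.432
y_at_10 = a_yx + byx * 10                         # ≈ 4.786
x_at_2_5 = a_xy + bxy * 2.5                       # ≈ 3.398
# The tiny differences from the slide answers come from the slides
# rounding the coefficients before substituting.
```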
REGRESSION EQUATIONS USING REGRESSION COEFFICIENTS
(USING STANDARD DEVIATIONS)

 Regression Equation of Y on X
 Y − Ȳ = byx (X − X̄) where byx = r·(σy/σx)
 Regression Equation of X on Y
 X − X̄ = bxy (Y − Ȳ) where bxy = r·(σx/σy)

Q4: Estimate Y when X = 9 as per the following information:
                          X     Y
 Arithmetic Mean          5     12
 Standard Deviation       2.6   3.6
 Correlation Coefficient  0.7
 Ans: Y = 15.88
PRACTICE PROBLEMS
Q5: If X̄ = 25, Ȳ = 120, bxy = 2, estimate the value of X when Y = 130.
 Ans: X = 45

Q6: Given two regression equations:
 3X + 4Y = 44
 5X + 8Y = 80
and variance of X = 30, find the means of X & Y, r, and σy.
 Ans: X̄ = 8, Ȳ = 5, r = −0.91, σy = 3.7
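Q6 can be verified step by step. A sketch (the names are mine): the means come from solving the two lines simultaneously, the shortcut rule on the next slide identifies which line is which, and r and σy then follow from the coefficient formulas.

```python
import math

# Q6: lines 3x + 4y = 44 and 5x + 8y = 80, variance of X = 30.
a1, b1, c1 = 3, 4, 44
a2, b2, c2 = 5, 8, 80
# Means: the regression lines intersect at (x̄, ȳ); solve by Cramer's rule.
det = a1 * b2 - a2 * b1
x_mean = (c1 * b2 - c2 * b1) / det   # 8.0
y_mean = (a1 * c2 - a2 * c1) / det   # 5.0
# |a1*b2| = 24 > |a2*b1| = 20, so the first line is X on Y and the
# second is Y on X (the other assignment would make byx*bxy > 1).
bxy = -b1 / a1                       # -4/3, from x = (44 - 4y)/3
byx = -a2 / b2                       # -5/8, from y = (80 - 5x)/8
r = -math.sqrt(bxy * byx)            # negative, matching the slopes' sign
sigma_y = byx * math.sqrt(30) / r    # from byx = r * sigma_y / sigma_x
# r ≈ -0.91 and sigma_y = 3.75 (the slide rounds to 3.7)
```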
SHORTCUT METHOD OF CHECKING REGRESSION EQUATIONS

 Suppose the two regression equations are:
 a1x + b1y + c1 = 0
 a2x + b2y + c2 = 0
 Case 1: If |a1b2| ≤ |a2b1| (comparing magnitudes, ignoring signs), then
 a1x + b1y + c1 = 0 is the regression of Y on X
 a2x + b2y + c2 = 0 is the regression of X on Y
 Case 2: If |a1b2| > |a2b1|, then
 a1x + b1y + c1 = 0 is the regression of X on Y
 a2x + b2y + c2 = 0 is the regression of Y on X
Q7: The two regression lines involving the variables x & y are x + 4y + 3 = 0 and 4x + 9y + 5 = 0. Find the means of x & y and their correlation coefficient.

Q8: The two regression lines involving the variables x & y are Y = 5.6 + 1.2X and X = 12.5 + 0.6Y. Find the means of x & y and their correlation coefficient.
STANDARD ERROR OF ESTIMATE
 The standard error of estimate tells us to what extent the estimates are accurate.
 It shows to what extent the values estimated by the regression line are close to the actual values.
 For the two regression lines, there are two standard errors of estimate:
 Standard error of estimate of Y on X (Syx)
 Standard error of estimate of X on Y (Sxy)
FORMULAE FOR SE (Y ON X)
 Syx = √( Σ(Y − Yc)² / N ), where Y = actual values, Yc = estimated values
 Syx = √( (ΣY² − aΣY − bΣXY) / N ), where a & b are obtained from the normal equations
 Syx = σy √(1 − r²)
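The residual formula and the shortcut formula should agree. A sketch (the variable names are mine; the data are from Q12 on the next slide):

```python
import math

# Syx two ways: sqrt(Σ(Y - Yc)²/N) from residuals, and σy*sqrt(1 - r²).
xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
b = sxy / sxx                  # -0.65
a = my - b * mx                # 11.9, so Y = 11.9 - 0.65X
resid = [y - (a + b * x) for x, y in zip(xs, ys)]
syx_resid = math.sqrt(sum(e * e for e in resid) / n)
r = sxy / math.sqrt(sxx * syy)
sigma_y = math.sqrt(syy / n)
syx_short = sigma_y * math.sqrt(1 - r ** 2)
# Both routes give Syx ≈ 0.79, matching the Q12 answer.
```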
PRACTICE PROBLEMS – SE
Q7: Find the standard errors of estimate if σx = 4.4, σy = 2.2 & r = 0.8.
 Ans: Syx = 1.32, Sxy = 2.64

Q8: Given: ΣX = 15, ΣY = 110, ΣXY = 400, ΣX² = 250, ΣY² = 3200, N = 10. Calculate Syx.
 Ans: 13.21

Q12: Compute the regression equation of Y on X, and hence find Syx:
 X: 6 2 10 4 8
 Y: 9 11 5 8 7
 Ans: Y = 11.9 − 0.65X, Syx = 0.79
ADVANTAGES OF REGRESSION ANALYSIS

 Regression analysis provides estimates of values of the dependent variable from the values of the independent variables.
 Regression analysis also helps to obtain a measure of the error involved in using the regression line as a basis for estimation.
 Regression analysis helps in obtaining a measure of the degree of association or correlation that exists between the two variables.
