
Bi- Variate Data Analysis - II Regression Analysis

This document provides an overview of regression analysis, a statistical method used to study the relationship between two or more variables and to estimate unknown values of a dependent variable based on known values of independent variables. It discusses the definitions, characteristics, uses, limitations, and differences between correlation and regression analysis, as well as types of regression such as simple, multiple, linear, and non-linear regression. Additionally, it covers regression lines, equations, coefficients, and the standard error of estimate to assess the reliability of predictions.

Uploaded by

Ayisha
Copyright
© © All Rights Reserved
MODULE 2: BI-VARIATE DATA ANALYSIS - II
REGRESSION ANALYSIS

INTRODUCTION

The study of regression has great importance in statistical analysis. It is a statistical tool for measuring the average relationship between two or more closely related variables, in terms of their original units. Statisticians use this method to estimate the unknown values of a dependent variable from the known values of an independent variable.

This method is invariably used for studying the relationship between two or more related variables such as price and demand, production and consumption, advertisement expenditure and volume of sales, sales and profit, etc. The study of regression analysis is very important and useful for making the best estimates and forecasts.

MEANING OF REGRESSION

According to the Oxford English Dictionary, the word "regression" means "stepping back" or "returning to the average value". The term was first used by Sir Francis Galton in 1877, in a study of hereditary characteristics. While studying the relationship [...]

Regression is the study of the nature of the relationship between two variables, so that one may be able to predict the unknown value of one variable for a known value of another variable. In regression, one variable is considered as an independent variable and the other as a dependent variable. For example, from the relationship between demand and price, probable values of demand can be estimated with the help of the known values of price.

DEFINITION OF REGRESSION

The technique of regression analysis has been defined differently by various authors. Some important definitions of regression are as follows.

1. "Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data." M.M. Blair.
2. "Regression analysis measures the nature and extent of the relation between two or more variables, thus enables us to make predictions." Hirsch.
3. "Regression analysis attempts to establish the nature of relationship between variables, that is, to study the functional relationship between the variables and thereby to provide a mechanism for predicting or forecasting." Ya Lun Chou.
4. "The term regression analysis refers to the methods by which estimates are made of the values of a variable from a knowledge of the values of one or more other variables and to the measurement of the errors involved in this estimation process." Morris Hamburg.

It is clear from the above definitions that regression analysis is a statistical method of studying the nature of relationship between two variables, with the help of which we are in a position to estimate the unknown values of one variable from the known values of another variable.

Characteristics

From the above definitions we can derive the following essential characteristics of regression analysis:

1. It consists of a mathematical device that is used to measure the average relationship between two or more closely related variables.
2. It is used for estimating the unknown value of a dependent variable with reference to the known value of its related independent variable.
3. It consists of two lines of equation, namely, (i) Regression equation of X on Y, and (ii) Regression equation of Y on X.

Utilities of Regression Analysis

The study of regression is very useful and important in statistical analysis. It has a number of utilities, for which it is widely used in various fields relating to almost all the natural, physical and social sciences.

The following are some of the uses of regression analysis:

1. Regression analysis explains the nature of relationship between two variables.
2. By regression analysis, the value of a dependent variable can be predicted on the basis of the value of an independent variable. For example, if the price of a commodity is known,
then its probable demand can be predicted by regression.
3. It is a valuable tool for solving many problems of economic and business research. With the help of regression, business and economic policies can be formulated.
4. It provides a measure of the coefficient of correlation between two variables, which can be calculated by taking the square root of the product of the two regression coefficients: $r = \sqrt{b_{xy} \times b_{yx}}$.
5. This technique is widely used in our day-to-day life and in sociological studies as well, to estimate various factors such as birth rate, death rate, yield rate, tax rate, etc.
6. It is an important tool for statistical analysis and it is widely used in market research.
7. Last but not the least, it is used to measure the errors involved in the process of estimation.

The uses of regression analysis are not confined to economic and business fields only. It is also used in natural, physical and social sciences.

Limitations

In spite of the above utilities, the technique of regression analysis suffers from the following limitations:

1. It involves a complicated procedure of mathematical calculation and conclusion.
2. It cannot be used in case of qualitative phenomena like beauty, intelligence, honesty, etc.
3. It is assumed that the cause and effect relationship between the variables remains unchanged. This assumption may not always hold good. Hence, estimates of the values of a variable made on the basis of a regression equation may lead to misleading conclusions.
4. This method is difficult for a layman to operate and understand.

Difference between Correlation and Regression Analysis

The correlation coefficient measures the degree of covariability between two variables, while regression establishes a functional relationship between dependent and independent variables. The two techniques are directed towards a common purpose of establishing the degree and direction of relationship between two or more variables.
But there are some basic differences between correlation and regression analysis. The major differences between correlation and regression are listed below:

1. Correlation: Correlation means the relationship between two or more variables, which may vary in sympathy so that the movement in one tends to be accompanied by a corresponding movement in the other.
   Regression: Regression means going back, and it is a mathematical measure showing the average relationship between two variables.
2. Correlation: It finds out the degree and nature of relationship between two variables and is not the cause and effect relationship.
   Regression: It indicates the cause and effect relationship between the variables and establishes a functional relationship.
3. Correlation: Correlation analysis consists of only one coefficient ($r$).
   Regression: Regression analysis consists of two coefficients, i.e., $b_{xy}$ and $b_{yx}$.
4. Correlation: Correlation analysis is a relative measure of the linear relationship between two variables.
   Regression: Regression analysis is an absolute measure of both linear and non-linear relationship between two or more related variables.
5. Correlation: Correlation does not help in making prediction.
   Regression: Regression enables us to make prediction.
6. Correlation: Sometimes, there may exist spurious or non-sense correlation between two variables by chance.
   Regression: In regression analysis, there is nothing like non-sense regression.
7. Correlation: Correlation coefficient is independent of the change of origin and scale.
   Regression: Regression coefficient is independent of the change of origin but not of scale.
8. Correlation: It is not very useful for further mathematical treatment.
   Regression: It is widely used for further mathematical treatment.
9. Correlation: It is immaterial whether X depends on Y or Y depends on X.
   Regression: There is a functional relationship between the two variables, so that we can identify the dependent and independent variables.
10. Correlation: Coefficient of correlation can never exceed unity (it cannot be greater than ±1).
    Regression: Any one of the regression coefficients can very well exceed unity. However, both the regression coefficients cannot exceed unity.

Actually, correlation and regression are the two wings of the same bird. The former establishes some sort of relationship between two or more variables, but the latter measures the average relationship between two or more closely related variables.

Types of Regression Analysis

The main types of regression analysis are as follows:

1. Simple and Multiple Regression
2. Linear and Non-Linear Regression
3. Partial and Total Regression

(1) Simple and Multiple Regression

In simple regression analysis, we study only two variables at a time, in which one variable is dependent and another is independent. The functional relationship between agricultural production and rainfall is an example of simple regression. Similarly, the dependence of sales on advertisement is a case of simple regression.

The regression analysis for studying more than two variables at a time is known as multiple regression. In multiple regression analysis, one variable is dependent and the others are independent variables. The study of the effect of rainfall, fertilizer, etc. on the yield of a crop is an example of multiple regression. Similarly, the turnover depends on advertising, income of the people, etc.

(2) Linear and Non-Linear Regression

When the dependent variable changes with the independent variable in some constant ratio, it is called linear regression. Such a relationship is depicted on a graph by means of a straight line.

When one variable varies with the other variable in a changing ratio, it is referred to as non-linear or curvilinear regression. In this case the relationship between the variables can be represented by a curve other than a straight line.

(3) Partial and Total Regression

If the study is based on more than two variables taken together, it is called total regression. Usually, such studies take the form of a multiple relationship.
For example, when the effect of advertising expenditure, price of the goods, and income of the people on the volume of sales is measured, it is a case of total regression analysis. When two or more variables are studied for functional relationship, but at a time the relationship between only two variables is studied and the other variables are held constant, it is known as partial regression.

REGRESSION LINES

The lines of best fit expressing the mutual average relationship between two variables are known as regression lines. For two variables, there are two regression lines.

Let us consider two variables X and Y. If we have no justification to assume any one as the dependent variable and the other as the independent variable, either of the two may be taken as independent. On the basis of a regression line, we can predict the value of a dependent variable from a given value of the independent variable. If two variables X and Y are given, then there are two regression lines related to them:

(1) Regression Line of X on Y
The regression line of X on Y gives the best estimate for the value of X for any given value of Y.

(2) Regression Line of Y on X
The regression line of Y on X gives the best estimate for the value of Y for any given value of X.

Functions or Uses of Regression Lines

The following are the utilities or functions of regression lines:

1. They indicate the relationship between two variables.
2. They indicate the nature and degree of correlation.
3. They give the best estimate of one variable based on the other variable.
4. They are used to measure the change in one variable with a unit change in the other variable.

Algebraic Method of Studying Regression

Regression equations are algebraic formulations of regression lines. Just as there are two regression lines, there are two regression equations.

1. Regression Equation of X on Y

This equation is used to estimate the probable value of X on the basis of the given value of Y. This equation is expressed in the following way:

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

where $b_{xy}$ = regression coefficient of X on Y.

The value of $b_{xy}$ can be determined by using any one of the following formulae:

(i) $b_{xy} = r \dfrac{\sigma_x}{\sigma_y}$

(ii) $b_{xy} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2}$ (based on actual data, or square of value method)

(iii) $b_{xy} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dy^2 - (\sum dy)^2}$ (here, deviations are taken from assumed mean)

(iv) $b_{xy} = \dfrac{\sum xy}{\sum y^2}$ (here, deviations are taken from actual mean)

2. Regression Equation of Y on X

This equation is used to estimate the probable value of Y on the basis of the given value of X. This equation is expressed in the following way:

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

where $b_{yx}$ = regression coefficient of Y on X.

The value of $b_{yx}$ can be determined by using any one of the following formulae:

(i) $b_{yx} = r \dfrac{\sigma_y}{\sigma_x}$

(ii) $b_{yx} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}$ (based on actual data, or square of value method)

(iii) $b_{yx} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dx^2 - (\sum dx)^2}$ (here, deviations are taken from assumed mean)

(iv) $b_{yx} = \dfrac{\sum xy}{\sum x^2}$ (here, deviations are taken from actual mean)

Regression Coefficients

A regression coefficient is a vital factor that measures the change in the value of one variable with respect to a unit change in the value of another variable. There are two regression coefficients:

(i) Regression coefficient of X on Y
This shows the average change in the value of the X variable for a unit change in the value of the Y variable. It is represented by $b_{xy}$.

(ii) Regression coefficient of Y on X
This shows the average change in the value of the Y variable for a unit change in the value of the X variable. It is represented by $b_{yx}$.
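The actual-data formulae (ii) for the two regression coefficients can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `regression_coefficients` and `regression_equations` and the sample data are hypothetical, chosen only for demonstration.

```python
def regression_coefficients(xs, ys):
    """Return (bxy, byx) from actual values, using formula (ii) for each."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    # Both coefficients share the numerator N*ΣXY - ΣX*ΣY.
    numerator = n * sum_xy - sum_x * sum_y
    bxy = numerator / (n * sum_y2 - sum_y ** 2)  # X on Y
    byx = numerator / (n * sum_x2 - sum_x ** 2)  # Y on X
    return bxy, byx


def regression_equations(xs, ys):
    """Return the two lines as (slope, intercept) pairs.

    X on Y:  X = bxy * Y + (mean_x - bxy * mean_y)
    Y on X:  Y = byx * X + (mean_y - byx * mean_x)
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    bxy, byx = regression_coefficients(xs, ys)
    return (bxy, mean_x - bxy * mean_y), (byx, mean_y - byx * mean_x)


# Hypothetical sample data, used only to exercise the functions.
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 5, 3, 8, 7]
(bxy, a_x), (byx, a_y) = regression_equations(sample_x, sample_y)
print(f"X = {bxy:.2f} Y + {a_x:.2f}")
print(f"Y = {byx:.2f} X + {a_y:.2f}")
```

The intercepts come from expanding the two equations, so each line passes through the point of means. The two slopes share a numerator and differ only in the denominator, which is why the two lines generally do not coincide.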
N.B. The values of $b_{xy}$ and $b_{yx}$ can also be determined by the other formulae mentioned earlier.

Properties of Regression Coefficients

The main properties of regression coefficients are as follows:

(1) The coefficient of correlation is the geometric mean of the regression coefficients: $r = \sqrt{b_{xy} \times b_{yx}}$.
(2) Both the regression coefficients must have the same algebraic sign. This means that when one regression coefficient is positive, the other is also positive. Similarly, when one regression coefficient is negative, the other is also negative. It is never possible that one regression coefficient is positive and the other is negative.
(3) The coefficient of correlation has the same sign as the regression coefficients. That means, if both regression coefficients are positive, then the correlation coefficient is also positive, and vice versa.
(4) Both the regression coefficients cannot be greater than unity. That means, if one regression coefficient is greater than unity, then the other must be less than unity.
(5) Regression coefficients are independent of origin but not of scale.
(6) The product of the regression coefficients is the square of the correlation coefficient: $b_{xy} \times b_{yx} = r^2$.

Standard Error of Estimate

Regression equations provide us with a method of obtaining the most likely values only. That means, in regression, given the value of the independent variable, we estimate the value of the dependent variable by using regression equations. An estimate cannot be 100% accurate. The difference between the actual value and the predicted value is the error in prediction. The standard error of estimate is the square root of the mean of the squares of these errors.

By using the standard error of estimate, we can check the reliability of our estimates. It shows to what extent the values estimated by the regression lines are close to the actual values.
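The properties of the regression coefficients and the definition of the standard error of estimate can be sketched as below. This is an illustrative sketch, not a definitive implementation: the function names, the small rounding tolerance, and the sample figures are assumptions introduced here.

```python
from math import copysign, sqrt


def correlation_from_coefficients(bxy, byx):
    """r is the geometric mean of bxy and byx and carries their common sign."""
    product = bxy * byx
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    if product > 1 + 1e-12:  # small tolerance for rounding (an assumption)
        raise ValueError("product of regression coefficients cannot exceed 1")
    return copysign(sqrt(min(product, 1.0)), bxy)


def standard_error_of_estimate(actual, estimated):
    """Square root of the mean of the squared prediction errors."""
    n = len(actual)
    squared_errors = ((a - e) ** 2 for a, e in zip(actual, estimated))
    return sqrt(sum(squared_errors) / n)


# Hypothetical example: a line Y = 1.3 X + 1.1 fitted to five observations.
xs = [1, 2, 3, 4, 5]
actual_y = [2, 5, 3, 8, 7]
estimated_y = [1.3 * x + 1.1 for x in xs]

r = correlation_from_coefficients(0.5, 1.3)
s_yx = standard_error_of_estimate(actual_y, estimated_y)
print(f"r = {r:.3f}, standard error of Y on X = {s_yx:.3f}")
```

The sign check and the upper bound mirror properties (2), (4) and (6): a pair of coefficients that fails either check cannot come from the same data set. A smaller standard error means the estimated values lie closer to the actual values.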
For two regression lines, there are two standard errors of estimate:

(i) Standard Error of Estimate of X on Y:

$$S_{xy} = \sqrt{\frac{\sum (X - X_c)^2}{N}}$$

where X = actual values and $X_c$ = estimated values.

(ii) Standard Error of Estimate of Y on X:

$$S_{yx} = \sqrt{\frac{\sum (Y - Y_c)^2}{N}}$$

where Y = actual values and $Y_c$ = estimated values.

Illustration 1 (Using the Actual Values of X and Y series)

Calculate the regression equations of X on Y and Y on X from the following data:

X | 1 | 2 | 3 | 4 | 5
Y | 2 | 5 | 3 | 8 | 7

(i) Estimate the value of X, when Y = 10.
(ii) Estimate the value of Y, when X = 7.
(iii) Also calculate the correlation coefficient.

Solution

In this method, actual values of X and Y are used to determine the regression equations. This method is used when the given values are small in magnitude.

Calculation of Regression Equations using the Actual Values of X and Y series

X       | X²       | Y       | Y²        | XY
1       | 1        | 2       | 4         | 2
2       | 4        | 5       | 25        | 10
3       | 9        | 3       | 9         | 9
4       | 16       | 8       | 64        | 32
5       | 25       | 7       | 49        | 35
ΣX = 15 | ΣX² = 55 | ΣY = 25 | ΣY² = 151 | ΣXY = 88

1. Regression Equation of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

(1) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{15}{5} = 3$

(2) $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{25}{5} = 5$

(3) $b_{xy} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2} = \dfrac{5 \times 88 - 15 \times 25}{5 \times 151 - (25)^2} = \dfrac{440 - 375}{755 - 625} = \dfrac{65}{130} = 0.5$

(4) Substituting:

$$
\begin{aligned}
X - 3 &= 0.5\,(Y - 5) \\
X - 3 &= 0.5\,Y - 2.5 \\
X &= 0.5\,Y + 0.5 \quad \text{(i)}
\end{aligned}
$$

2. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(1) $\bar{X} = 3$

(2) $\bar{Y} = 5$

(3) $b_{yx} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} = \dfrac{5 \times 88 - 15 \times 25}{5 \times 55 - (15)^2} = \dfrac{440 - 375}{275 - 225} = \dfrac{65}{50} = 1.3$

(4) Substituting:

$$
\begin{aligned}
Y - 5 &= 1.3\,(X - 3) \\
Y - 5 &= 1.3\,X - 3.9 \\
Y &= 1.3\,X + 1.1 \quad \text{(ii)}
\end{aligned}
$$

3. Estimating the value of X, when Y = 10

$$X = 0.5\,Y + 0.5 = (0.5 \times 10) + 0.5 = 5.5$$

4. Estimating the value of Y, when X = 7

$$Y = 1.3\,X + 1.1 = (1.3 \times 7) + 1.1 = 9.1 + 1.1 = 10.2$$

5. Calculation of correlation coefficient

$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.5 \times 1.3} = \sqrt{0.65} = 0.806$$

Illustration 2

The following data shows the maximum temperature and minimum temperature on a certain day at 10 important cities located in different parts of India:

Max. Temp. (X) | 29 | 23 | 25 | 15 | 27 | 29 | 24 | 31 | 32 | 35
Min. Temp. (Y) | 8  | 3  | 7  | 5  | 8  | 19 | 10 | 7  | 5  | 8
ry I tii reget ne Of eat ms eee q ind Y on X- The following data shows the Fit a regression line of X on Y a Estimate the Maximum Temperature when the Minimum Temperature is 12 ' 3. Estimate the Minimum Temperature when the Maximum Temperature is 40, 4. Also calculate Karl Pearson's coefficient of correlation. Solution / Calculation of Regression Equations x cae [tm va XY | 27 841 8 4 232 33 529 3 9 69 | | 25 625 7 49 is | 5 225 5 25 3 | | 27 729 8 64 216 «| | 29 841 19 361 551 | 24 576 10 100 240 | #2 ae ul 49 217 32 1024 5 5 160 | 35 1225 8 64 || Ex=270 | Ex=7576 | sy=80 | sy=810 | EXY=2215 1. Regression Equation of X on Y x -X=bxy(Y-Y) x -2% 220 _ 1 a) x 0 27 - @ ¥ === _NEXY-(EX)(TY) G) bay = Ny? (zy)? _ 10x 2215 - (270)x(80) ~"10x810—(80)? _ 22150-21600 ~~ 8100-6400 550 = —— = 0.324 1700 on Laman ROM AMON Reardon Ana cee“) “ay XX bei Y- VY) X- 27 0.324(Y — 8) X ~27= 0.324 ¥- 2.592 X = 0.324Y 2.502 + 27 X = 0.324 ¥ + 24.408 > (1) Regression Equation of Y on X Y ~Y = byx(X —X) c xX _ 270 n =fse ts MX N90 77? = _LY,80_ @Y =N 1078 _NEXY -(Ex)(LY) NIX? -(2x}? - — 102215 -(270)x(80) 10x 7576 -(270) _ 22,150 - 21,600 75,760 — 72,900 (3) byx (4) yx(X —X) 192 (X -27) 192 X — 5.184 192 X - 5.184 +8 Y =0.192 X + 2.816 > (2) Estimation of the maximum temperature when the minimum temperature is 12 Maximum Temperature = X =? Minimum Temperature = Y = 12 X = 0.324 Y + 24.408 > (1) X = (0.324 12) + 24.408 X = 3.888 + 24.408 = 28.296 Maximum Temperature (X) = 28.3°C Estimation of the minimum temperature when the maximum temperature is 40 Minimum Temperature = Y = ? Maximum Temperature = X = 40 1. Se ANT TIVE FECTIIOUES FOR Messin VY O192N 42816 9 OD = V © (0 102 = 40) 4 2816 6 10.496 Minimum Temperature (Y) = 10.8% lation & — Cateutation of Karl Pearson's Coe! 
efficient of Correlados : re yfbyy © byn 1 V0.324« 0.192 = ¥0.0622 = 0.25 en from Actual Mean) Xon Yand Y on X from the following data Mlustration 3 (Using Deviations tak Calculate the regression equations of [X10 J 20730 _T 40] 50 by ft 20 fF so [30 [so [70 Also calculate the coefficient of Correlation and Estimate the value of ¥ when X = 7 Solution method is used when the arithmetic ns ure taken from Actual Mean, an integer and not a fractional value. Computation of Regression Coefficients - (X-30) (Y-50) 3 x x x2 Y y y? xy [10 20 400 20 30 900 +600 | 20 -10 100 50 0 0 0 | 30 0 0 30 -20 400 0 40 +10 100 80 +30 900 500 50 +20 400 70 +20 400 +400 Ex=150 | Ex=0 | Ex*=1000] Fy=250 | Zy=0 | By%=2600 Exy=1300 Regression Equation of X on Y X-X = bay(¥-¥) (ly X 2 2X _150_ NS @) ¥ = EY. 250 5g (3) bxy = Z2¥ = 1300 _ xy? ~ 2600 (4) X-X =bxyy-¥y X~30= 0.5 (Y ~50) pe variate Date Analysis Regression ANAS, eee f] X- WHOS Y~ 25 X =0.SY~ 25430 X =05Y+5> (i) 2, Regression Equation of Y on X Y ~Y = byx(X ~X) (4) Y-¥ =byx(x-X) Y -$0 = 1.3 (X30) Y-50=1.3X-39 Y =13X-394+50 Y =13X+11> (ii) 3, Calculation of Correlation Coefficient r= oxy x byx = V0.5x13 = 10.65 = 0.806 4. Calculation of ¥ when X = 70 Y=13X+11 Y=(1.3*70)+11 Y=91 +11 = 102 Illustration 4 (Using Deviations taken from Assumed Mean) The following data gives the age and blood pressure of ten persons: Age: (K) 36 | 42 | 36 | 47 | 49 [ 42 | 60 | 72 | 63 | 55 | Blood PressuredY)47 | 125 | 118 | 128 | 145 | 140 | 155 | 160 | 149 | 150 (i) Determine regression gguation of X on Y and Y on x (ii) Determine the blood pressure ga person whose age is 45. (iii) Determine the age when the B.P. is 170. (iv) Determine the correlation coefficient between X and Y. QUANTA! 
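The assumed-mean method named in the heading of Illustration 4 can be sketched as follows. This is a hedged illustration: the function name `coefficients_from_assumed_means` is hypothetical, and the first age (56) and first blood pressure (147) are taken as assumptions consistent with the totals ΣX = 522 and ΣY = 1417 used in the solution.

```python
def coefficients_from_assumed_means(xs, ys, assumed_x, assumed_y):
    """Return (mean_x, mean_y, bxy, byx) using deviations from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    # Formula (iii): the correction terms Σdx and Σdy undo the shift of origin.
    numerator = n * sum_dxdy - sum_dx * sum_dy
    bxy = numerator / (n * sum_dy2 - sum_dy ** 2)
    byx = numerator / (n * sum_dx2 - sum_dx ** 2)
    # Actual means are recovered from the assumed means.
    mean_x = assumed_x + sum_dx / n
    mean_y = assumed_y + sum_dy / n
    return mean_x, mean_y, bxy, byx


# Data of Illustration 4 (first pair assumed as noted above).
ages = [56, 42, 36, 47, 49, 42, 60, 72, 63, 55]
pressures = [147, 125, 118, 128, 145, 140, 155, 160, 149, 150]

mean_x, mean_y, bxy, byx = coefficients_from_assumed_means(ages, pressures, 50, 140)
print(f"means: {mean_x:.1f}, {mean_y:.1f}; bxy = {bxy:.3f}, byx = {byx:.3f}")
```

Any other pair of assumed means gives the same coefficients, since regression coefficients are independent of the origin. Round assumed means near the data are chosen only to keep the hand arithmetic small.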
Solution

Calculation of Regression Coefficients

X        | dx = X − 50 | dx²         | Y         | dy = Y − 140 | dy²         | dxdy
56       | +6          | 36          | 147       | +7           | 49          | +42
42       | −8          | 64          | 125       | −15          | 225         | +120
36       | −14         | 196         | 118       | −22          | 484         | +308
47       | −3          | 9           | 128       | −12          | 144         | +36
49       | −1          | 1           | 145       | +5           | 25          | −5
42       | −8          | 64          | 140       | 0            | 0           | 0
60       | +10         | 100         | 155       | +15          | 225         | +150
72       | +22         | 484         | 160       | +20          | 400         | +440
63       | +13         | 169         | 149       | +9           | 81          | +117
55       | +5          | 25          | 150       | +10          | 100         | +50
ΣX = 522 | Σdx = 22    | Σdx² = 1148 | ΣY = 1417 | Σdy = 17     | Σdy² = 1733 | Σdxdy = 1258

Deviations are taken from the assumed means 50 and 140.

1. Regression Equation of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

(a) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{522}{10} = 52.2$

(b) $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{1417}{10} = 141.7$

(c) $b_{xy} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dy^2 - (\sum dy)^2} = \dfrac{10 \times 1258 - 22 \times 17}{10 \times 1733 - (17)^2} = \dfrac{12580 - 374}{17330 - 289} = \dfrac{12206}{17041} = 0.716$

(d) Substituting:

$$
\begin{aligned}
X - 52.2 &= 0.716\,(Y - 141.7) \\
X - 52.2 &= 0.716\,Y - 101.457 \\
X &= 0.716\,Y - 49.257 \quad (1)
\end{aligned}
$$

2. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(a) $\bar{X} = 52.2$

(b) $\bar{Y} = 141.7$

(c) $b_{yx} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dx^2 - (\sum dx)^2} = \dfrac{10 \times 1258 - 22 \times 17}{10 \times 1148 - (22)^2} = \dfrac{12580 - 374}{11480 - 484} = \dfrac{12206}{10996} = 1.11$

(d) Substituting:

$$
\begin{aligned}
Y - 141.7 &= 1.11\,(X - 52.2) \\
Y - 141.7 &= 1.11\,X - 57.942 \\
Y &= 1.11\,X + 83.758 \quad (2)
\end{aligned}
$$

3. Calculation of the blood pressure of a person whose age is 45

Age (X) = 45; BP (Y) = ?

$$Y = 1.11\,X + 83.758 = (1.11 \times 45) + 83.758 = 49.95 + 83.758 = 133.708$$

BP (Y) at the age of 45 = 133.708

4. Estimation of the age when the B.P. is 170

BP (Y) = 170; Age (X) = ?

$$X = 0.716\,Y - 49.257 = (0.716 \times 170) - 49.257 = 121.72 - 49.257 = 72.463$$

Age (X) = 72

5. Calculation of Karl Pearson's Correlation Coefficient

$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.716 \times 1.11} = \sqrt{0.79476} = 0.891$$

Comment: Blood pressure and age are positively correlated. That means, BP depends on age.

This method (deviations from assumed mean) is used when the given values are large in magnitude or the actual means turn out to be fractions.

Illustration 5

From the data given below find:

1. The two regression equations.
2. The most likely marks in Statistics when the marks in Q.T. are 30.
3. The most likely marks in Q.T. when the marks in Statistics are 50.

Marks in Q.T. (X)       | 25 | 28 | 35 | 32 | 31 | 36 | 29 | 38 | 34 | 32
Marks in Statistics (Y) | 43 | 46 | 49 | 41 | 36 | 32 | 31 | 30 | 33 | 39

Solution

Computation of Regression Coefficients

X        | dx = X − 30 | dx²        | Y        | dy = Y − 40 | dy²        | dxdy
25       | −5          | 25         | 43       | +3          | 9          | −15
28       | −2          | 4          | 46       | +6          | 36         | −12
35       | +5          | 25         | 49       | +9          | 81         | +45
32       | +2          | 4          | 41       | +1          | 1          | +2
31       | +1          | 1          | 36       | −4          | 16         | −4
36       | +6          | 36         | 32       | −8          | 64         | −48
29       | −1          | 1          | 31       | −9          | 81         | +9
38       | +8          | 64         | 30       | −10         | 100        | −80
34       | +4          | 16         | 33       | −7          | 49         | −28
32       | +2          | 4          | 39       | −1          | 1          | −2
ΣX = 320 | Σdx = 20    | Σdx² = 180 | ΣY = 380 | Σdy = −20   | Σdy² = 438 | Σdxdy = −133

1. Regression Equation of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

(a) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{320}{10} = 32$

(b) $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{380}{10} = 38$

(c) $b_{xy} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dy^2 - (\sum dy)^2} = \dfrac{10 \times (-133) - 20 \times (-20)}{10 \times 438 - (-20)^2} = \dfrac{-1330 + 400}{4380 - 400} = \dfrac{-930}{3980} = -0.234$

(d) Substituting:

$$
\begin{aligned}
X - 32 &= -0.234\,(Y - 38) \\
X - 32 &= -0.234\,Y + 8.892 \\
X &= -0.234\,Y + 40.892 \quad (1)
\end{aligned}
$$

2. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(a) $\bar{X} = 32$

(b) $\bar{Y} = 38$

(c) $b_{yx} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dx^2 - (\sum dx)^2} = \dfrac{10 \times (-133) - 20 \times (-20)}{10 \times 180 - (20)^2} = \dfrac{-1330 + 400}{1800 - 400} = \dfrac{-930}{1400} = -0.664$

(d) Substituting:

$$
\begin{aligned}
Y - 38 &= -0.664\,(X - 32) \\
Y - 38 &= -0.664\,X + 21.248 \\
Y &= -0.664\,X + 59.248 \quad (2)
\end{aligned}
$$

3. Estimation of the most likely marks in Statistics when the marks in Q.T. are 30

$$Y = -0.664\,X + 59.248 = (-0.664 \times 30) + 59.248 = -19.92 + 59.248 = 39.328$$

Marks in Statistics = 39

4. Estimation of the most likely marks in Q.T. when the marks in Statistics are 50

$$X = -0.234\,Y + 40.892 = (-0.234 \times 50) + 40.892 = -11.7 + 40.892 = 29.192$$

Marks in Q.T. = 29

IMPORTANT TYPICAL ILLUSTRATIONS

From the examination point of view, the regression equation using the actual values of X and Y (square of actual value method) is more convenient and safe.

Illustration 6

A panel of Judges A and B graded seven debators independently and awarded the following marks:

Debator    | 1  | 2  | 3  | 4  | 5  | 6  | 7
Marks by A | 40 | 34 | 28 | 30 | 44 | 38 | 31
Marks by B | 32 | 39 | 26 | 30 | 38 | 34 | 28

An eighth debator was awarded 36 marks by Judge A, while Judge B was not present. If Judge B had also been present, how many marks would you expect him to award to the eighth debator, assuming that the same degree of relationship exists in their judgement?

Solution

Let the marks awarded by Judge A be X, while marks awarded by Judge B be Y.
The marks expected to be awarded by Judge B can be determined by fitting the regression equation of Y on X.

Calculation of Regression Equation

X        | X²         | Y        | Y²         | XY
40       | 1600       | 32       | 1024       | 1280
34       | 1156       | 39       | 1521       | 1326
28       | 784        | 26       | 676        | 728
30       | 900        | 30       | 900        | 900
44       | 1936       | 38       | 1444       | 1672
38       | 1444       | 34       | 1156       | 1292
31       | 961        | 28       | 784        | 868
ΣX = 245 | ΣX² = 8781 | ΣY = 227 | ΣY² = 7505 | ΣXY = 8066

1. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(1) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{245}{7} = 35$

(2) $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{227}{7} = 32.43$

(3) $b_{yx} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} = \dfrac{7 \times 8066 - 245 \times 227}{7 \times 8781 - (245)^2} = \dfrac{56462 - 55615}{61467 - 60025} = \dfrac{847}{1442} = 0.587$

(4) Substituting:

$$
\begin{aligned}
Y - 32.43 &= 0.587\,(X - 35) \\
Y - 32.43 &= 0.587\,X - 20.545 \\
Y &= 0.587\,X + 11.885 \quad \text{(i)}
\end{aligned}
$$

2. Estimation of Y when X = 36

$$Y = 0.587\,X + 11.885 = (0.587 \times 36) + 11.885 = 21.132 + 11.885 = 33.017$$

Thus, if Judge B had also been present, he would have awarded 33 marks to the eighth debator.

Illustration 7

Obtain the two regression equations from the following data:

Age of Husband (X) | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
Age of Wife (Y)    | 18 | 18 | 20 | 21 | 22 | 23 | 25 | 27 | 28 | 28

Also find the coefficient of correlation from the regression coefficients. Estimate the age of the wife when the husband's age is 32.

Solution

Calculation of Regression Coefficients

X        | X²         | Y        | Y²         | XY
21       | 441        | 18       | 324        | 378
22       | 484        | 18       | 324        | 396
23       | 529        | 20       | 400        | 460
24       | 576        | 21       | 441        | 504
25       | 625        | 22       | 484        | 550
26       | 676        | 23       | 529        | 598
27       | 729        | 25       | 625        | 675
28       | 784        | 27       | 729        | 756
29       | 841        | 28       | 784        | 812
30       | 900        | 28       | 784        | 840
ΣX = 255 | ΣX² = 6585 | ΣY = 230 | ΣY² = 5424 | ΣXY = 5969

1. Regression Equation of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

(1) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{255}{10} = 25.5$

(2) $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{230}{10} = 23$

(3) $b_{xy} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2} = \dfrac{10 \times 5969 - 255 \times 230}{10 \times 5424 - (230)^2} = \dfrac{59690 - 58650}{54240 - 52900} = \dfrac{1040}{1340} = 0.776$

(4) Substituting:

$$
\begin{aligned}
X - 25.5 &= 0.776\,(Y - 23) \\
X - 25.5 &= 0.776\,Y - 17.848 \\
X &= 0.776\,Y + 7.652 \quad \text{(i)}
\end{aligned}
$$

2. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(1) $\bar{X} = 25.5$

(2) $\bar{Y} = 23$

(3) $b_{yx} = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} = \dfrac{10 \times 5969 - 255 \times 230}{10 \times 6585 - (255)^2} = \dfrac{59690 - 58650}{65850 - 65025} = \dfrac{1040}{825} = 1.261$

(4) Substituting:

$$
\begin{aligned}
Y - 23 &= 1.261\,(X - 25.5) \\
Y - 23 &= 1.261\,X - 32.156 \\
Y &= 1.261\,X - 9.156 \quad \text{(ii)}
\end{aligned}
$$

3. Calculation of Karl Pearson's Correlation Coefficient

$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.776 \times 1.261} = \sqrt{0.9785} = 0.989$$

4. Estimation of the age of wife (Y), when husband's age is 32

$$Y = 1.261\,X - 9.156 = (1.261 \times 32) - 9.156 = 40.352 - 9.156 = 31.196$$

Wife's Age (Y) = 31

Illustration 8 (Time Lag and Lead in Correlation and Regression)

The following are the monthly figures of advertising expenditure and sales of a firm. It is generally found that advertising expenditure has its impact on sales after two months. Allowing for this time lag, calculate the coefficient of correlation and estimate the sales when the company spends ₹ 250 on advertisement.

[...]

(ii) Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

where

$$b_{yx} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} = \frac{12 \times 440 - 29 \times 15}{12 \times 649 - (29)^2} = \frac{5280 - 435}{7788 - 841} = \frac{4845}{6947} = 0.697$$

$$
\begin{aligned}
Y - 1.25 &= 0.697\,(X - 2.417) \\
Y - 1.25 &= 0.697\,X - 1.685 \\
Y &= 0.697\,X - 0.435 \quad \text{(ii)}
\end{aligned}
$$

(iii) Calculation of Correlation Coefficient

$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.905 \times 0.697} = \sqrt{0.6308} = 0.794$$

Illustration 11

The following are the intermediate results of the two series X and Y:

$\bar{X} = 90$; $\bar{Y} = 70$; $N = 10$; $\sum x^2 = 6360$; $\sum y^2 = 2860$; $\sum xy = 3900$

(where x and y are deviations from their respective means)

Find the two regression coefficients, the regression equations, and the coefficient of correlation.

Solution

1. Regression Coefficient of X on Y

$$b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{3900}{2860} = 1.3636$$

2. Regression Coefficient of Y on X

$$b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{3900}{6360} = 0.6132$$

3. Regression Equation of X on Y

$$
\begin{aligned}
X - \bar{X} &= b_{xy}(Y - \bar{Y}) \\
X - 90 &= 1.3636\,(Y - 70) \\
X - 90 &= 1.3636\,Y - 95.452 \\
X &= 1.3636\,Y - 5.452 \\
X &= 1.364\,Y - 5.45 \quad \text{(i)}
\end{aligned}
$$

4. Regression Equation of Y on X

$$
\begin{aligned}
Y - \bar{Y} &= b_{yx}(X - \bar{X}) \\
Y - 70 &= 0.6132\,(X - 90) \\
Y - 70 &= 0.6132\,X - 55.188 \\
Y &= 0.6132\,X + 14.812 \\
Y &= 0.613\,X + 14.81 \quad \text{(ii)}
\end{aligned}
$$

5. Coefficient of Correlation

$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{1.364 \times 0.613} = \sqrt{0.8361} = 0.914$$

Illustration 12

For a bivariate data, you are given the following information:

$\sum (X - 58) = 46$; $\sum (X - 58)^2 = 3086$; $\sum (Y - 58) = 9$; $\sum (Y - 58)^2 = 483$; $\sum (X - 58)(Y - 58) = 1095$; $N = 7$

(Assumed means of the X and Y series are both 58.)

You are required:
(1) To determine the two regression equations.
(2) To calculate the coefficient of correlation.

Solution

Since deviations are taken from the assumed mean, the following notation is used:

$\sum (X - 58) = \sum dx = 46$
$\sum (Y - 58) = \sum dy = 9$
$\sum (X - 58)^2 = \sum dx^2 = 3086$
$\sum (Y - 58)^2 = \sum dy^2 = 483$
$\sum (X - 58)(Y - 58) = \sum dx\,dy = 1095$
$N = 7$

1. Regression Equation of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

(i) $\bar{X} = A + \dfrac{\sum dx}{N} = 58 + \dfrac{46}{7} = 58 + 6.57 = 64.57$

(ii) $\bar{Y} = A + \dfrac{\sum dy}{N} = 58 + \dfrac{9}{7} = 58 + 1.29 = 59.29$

(iii) $b_{xy} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dy^2 - (\sum dy)^2} = \dfrac{7 \times 1095 - 46 \times 9}{7 \times 483 - (9)^2} = \dfrac{7665 - 414}{3381 - 81} = \dfrac{7251}{3300} = 2.20$

(iv) Substituting:

$$
\begin{aligned}
X - 64.57 &= 2.2\,(Y - 59.29) \\
X - 64.57 &= 2.2\,Y - 130.438 \\
X &= 2.2\,Y - 65.868 \quad \text{(i)}
\end{aligned}
$$

2. Regression Equation of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

(i) $\bar{X} = 64.57$

(ii) $\bar{Y} = 59.29$

(iii) $b_{yx} = \dfrac{N\sum dx\,dy - (\sum dx)(\sum dy)}{N\sum dx^2 - (\sum dx)^2} = \dfrac{7 \times 1095 - 46 \times 9}{7 \times 3086 - (46)^2}$
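When only intermediate results are available, as in Illustrations 11 and 12, the coefficients can be evaluated directly from the given sums. The sketch below is a minimal illustration under stated assumptions: the function name `coefficients_from_sums` and its argument names are hypothetical. It relies on the fact that formula (iii) reduces to formula (iv) when the deviations are taken from the actual means, because Σdx and Σdy are then zero.

```python
def coefficients_from_sums(n, sum_dx, sum_dy, sum_dx2, sum_dy2, sum_dxdy):
    """Return (bxy, byx) from sums of deviations.

    With deviations from actual means, sum_dx = sum_dy = 0 and the
    expressions reduce to Σxy/Σy² and Σxy/Σx².
    """
    numerator = n * sum_dxdy - sum_dx * sum_dy
    bxy = numerator / (n * sum_dy2 - sum_dy ** 2)
    byx = numerator / (n * sum_dx2 - sum_dx ** 2)
    return bxy, byx


# Illustration 11: deviations from actual means, so Σx = Σy = 0.
bxy_11, byx_11 = coefficients_from_sums(10, 0, 0, 6360, 2860, 3900)
print(f"Illustration 11: bxy = {bxy_11:.4f}, byx = {byx_11:.4f}")

# Illustration 12: deviations from the assumed mean 58 for both series.
bxy_12, byx_12 = coefficients_from_sums(7, 46, 9, 3086, 483, 1095)
print(f"Illustration 12: bxy = {bxy_12:.3f}, byx = {byx_12:.3f}")
```

In both cases one coefficient exceeds unity while the product of the two stays below unity, which agrees with the property that both coefficients cannot be greater than one.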
