Multiple Regression
Contents
2.1 Introduction .......................................... 20
2.4 The Residual Standard Error and the Coefficient of Determination .......................................... 24
2.6 Confidence Interval for the Mean Response and Prediction Interval for the Individual Response .......................................... 30
Bibliography .......................................... 52
Abstract

In this chapter, we first discuss the basic assumptions of multiple regression. We then specify the regression model and show how its coefficients can be estimated. Appendix 1 derives the sampling variance of the least squares slope estimates, and Appendix 2 shows how multiple regression can be used to investigate the cross-sectional relationship among price per share, dividend per share, and retained earnings per share.

2.1 Introduction
The simple regression model is examined with one independent variable (such as the amount of fertilizer) and one dependent variable (such as the yield of corn).* In many cases, however, more than one factor can affect the outcome under study. In addition to fertilizer, rainfall and temperature certainly influence the yield of corn. In business, the rate of return for the stock market at large is not the only factor that affects the return on General Motors or Ford stock; other variables, such as the leverage ratio, the payout ratio, and the dividend yield, also contribute. Therefore, regression analysis with more than one independent variable is an important analytical tool.

The model that extends simple regression to use with two or more independent variables is called multiple linear regression. Simple linear regression analysis helps us determine the relationship between two variables or predict the value of one variable from our knowledge of another. Multiple regression analysis, in contrast, is a technique for determining the relationship between a dependent variable and more than one independent variable. In addition, it can be used to employ several independent variables to predict the value of a dependent variable.

In Sect. 2.2, we discuss the assumptions of the multiple regression model. In Sect. 2.3, we discuss the estimated parameters of the multiple regression model. Section 2.4 discusses the residual standard error and the coefficient of determination.

*For basic concepts and models of simple regression, readers can refer to Chap. 15 of Statistics for Business and Financial Economics by Lee et al. (2013).

2.2 The Model and Its Assumptions

In this section, we first review the simple regression model and extend it to a multiple regression model. Then we define and analyze the regression plane for two independent variables. Finally, the important assumptions we must make to use the multiple regression model are explored in some detail.

The Multiple Regression Model
In multiple regression, simple regression is extended by introducing more than one independent variable. It is well known that a simple linear regression model can be defined as Yi = α + βxi + εi and its estimate as yi = a + bxi + ei. The sample intercept a and the sample slope b are estimates of α and β, respectively. The normal equations used to estimate the unknown parameters a and b are

na + b Σxi = Σyi
a Σxi + b Σxi² = Σxi yi

where the sums run over i = 1, …, n.
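As a numerical sketch (ours, not the book's), the two normal equations can be solved directly for a and b. Here we borrow the x1 and y columns of Table 2.1 below and fit y on x1 alone:

```python
# Sketch (not from the text): solve the two normal equations for a and b.
# Data: the x1 and y columns of Table 2.1, fitting y on x1 alone.
x = [5, 10, 9, 13, 15]
y = [15, 17, 26, 24, 27]
n = len(x)

Sx = sum(x)                                  # sum of x_i
Sy = sum(y)                                  # sum of y_i
Sxx = sum(xi * xi for xi in x)               # sum of x_i^2
Sxy = sum(xi * yi for xi, yi in zip(x, y))   # sum of x_i * y_i

# Normal equations:
#   n*a  + Sx*b  = Sy
#   Sx*a + Sxx*b = Sxy
# solved by Cramer's rule.
det = n * Sxx - Sx * Sx
a = (Sy * Sxx - Sx * Sxy) / det
b = (n * Sxy - Sx * Sy) / det
```

A quick consistency check on the solution: the fitted line passes through the point of means, so a + b·x̄ must equal ȳ.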
Yi = α + β1 X1i + β2 X2i + εi  (2.1)

and its estimate is

yi = a + b1 x1i + b2 x2i + ei  (2.2)

where Eqs. (2.1) and (2.2) represent the multiple population regression line and the multiple sample regression line, respectively. In Eq. (2.1), α is the intercept of the regression; β1 is the slope that represents the conditional relationship between Y and X1, assuming X2 is fixed; and β2 is the slope that represents the conditional relationship between Y and X2, assuming X1 is fixed. If the model defined in Eq. (2.1) is linear, then the relationship between Y and each of the independent variables can be described by a straight line. In other words, the conditional mean of the dependent variable is given by the following population regression equation:

E(Yi | X1 = x1, X2 = x2) = α + β1 x1 + β2 x2

The coefficients β1 and β2 are called partial regression coefficients. They indicate only the partial influence of each independent variable when the influence of all other independent variables is held constant. Just as in simple regression, the multiple sample regression line of Eq. (2.2) can be used to estimate the multiple population regression line of Eq. (2.1).

The Regression Plane for Two Explanatory Variables
Let us say that the stock price per share (y) can be modeled as a function of both dividend per share (x1) and retained earnings per share (x2).¹ (Retained earnings per share equals earnings per share minus dividend per share.) The first goal of the analysis is to obtain the estimated multiple regression model

ŷi = a + b1 x1i + b2 x2i  (2.2a)

The value of b1 indicates that after the influence of retained earnings per share is taken into account, a $1 increase in dividend per share (Di) will increase the mean value of the price per share (Pi) by b1, other things being equal. Similarly, a $1 increase in retained earnings per share will increase the mean price per share by b2. If there is only one explanatory variable, the estimated regression equation generates a straight line. There are two explanatory variables in Eq. (2.2a), so it represents a regression plane (a three-dimensional regression graph). On this three-variable regression plane, a combination of three observations (one for the value of y, one for x1, and one for x2) represents a single point. This point can be depicted on a three-dimensional scatter diagram. In Fig. 2.1, the best-fitted regression plane would pass near the actual sample observation points, some falling above the plane and some below, in such a way as to minimize L in

L = Σ (yi − ŷi)²  (2.3)

where yi and ŷi are as defined in Eqs. (2.2) and (2.2a), respectively.²

If there are k independent variables, then Eq. (2.1) can be generalized to

Yi = α + β1 X1i + β2 X2i + ⋯ + βk Xki + εi  (2.4)

¹ Practical examples based on Eq. (2.2) will be explored in the Applications section of this chapter.
² Using Eq. (2.3) to estimate regression parameters will be discussed in Sect. 2.3.
22 2 Multiple Linear Regression
Fig. 2.1 Regression plane with yi (Pi) as the dependent variable and x1i (Di) and x2i (REi) as the independent variables
Assumptions for the Multiple Regression Model
As in simple regression analysis, we need the following assumptions to perform a regression analysis of the model defined in Eq. (2.4).

1. The error term εi is distributed with conditional mean zero and variance σ² for i = 1, 2, …, n.
2. The error term εi is independent of each of the k independent variables X1, X2, …, Xk. In other words, there are no measurement errors associated with any independent variable. If there exists a measurement error in any independent variable, then the slope estimate will be biased. In Chap. 7 of this book, we will discuss alternative methods to deal with this problem.
3. Any two errors εi and εj are not correlated with one another; that is, their covariance is zero: Cov(εi, εj) = 0 for i ≠ j. This assumption means that there is no autocorrelation (serial correlation) among the residual terms.
4. The independent variables are not perfectly related to each other in a linear function. In other words, it is not possible to find a set of numbers d0, d1, d2, …, dk, not all zero, such that

d0 + d1 X1i + d2 X2i + ⋯ + dk Xki = 0,  i = 1, 2, …, n

In practice, the linear relationship among independent variables is usually not perfect. When a perfect linear relationship occurs, a condition known as perfect collinearity exists. Multicollinearity is the condition in which two or more independent variables are highly correlated.
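Assumption 4 can be illustrated numerically (a sketch of ours, not from the text): when one regressor is an exact linear function of the others, the matrix X′X formed from the intercept column and the regressors is singular, so the least squares normal equations have no unique solution.

```python
def xtx(cols):
    # Gram matrix X'X for a list of columns.
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x1 = [5, 10, 9, 13, 15]
ones = [1] * 5                     # intercept column

x2_ok = [7, 5, 14, 8, 6]           # Table 2.1: not a linear function of x1
x2_bad = [2 * v + 3 for v in x1]   # exact linear function: d0 = 3, d1 = 2

det_ok = det3(xtx([ones, x1, x2_ok]))    # nonzero: unique LS solution exists
det_bad = det3(xtx([ones, x1, x2_bad]))  # exactly zero: perfect collinearity
```

With the collinear column, det_bad is exactly zero, so b1 and b2 cannot be separated; this is the formal content of assumption 4.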
2.3 Estimating Multiple Regression Parameters

To estimate the best-fitted regression plane, we use the least squares method to estimate the regression parameters. The principle of the least squares method leads to a system of normal equations; there are two equations and two unknowns, b1 and b2, associated with this equation system. Hence, we can solve for b1 and b2 uniquely by substitution:

b1 = [Σx1i′yi′ · Σx2i′² − Σx2i′yi′ · Σx1i′x2i′] / [Σx1i′² · Σx2i′² − (Σx1i′x2i′)²]

where the primes denote deviations from the sample means (e.g., x1i′ = x1i − x̄1) and all sums run over i = 1, …, n.
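This slope formula, together with its mirror image for b2 (interchange the roles of x1 and x2) and the intercept a = ȳ − b1x̄1 − b2x̄2, can be verified on the Example 2.1 data of Table 2.1 below. A sketch of ours that reproduces the coefficients of Eq. (2.12):

```python
# Example 2.1 data (Table 2.1).
x1 = [5, 10, 9, 13, 15]
x2 = [7, 5, 14, 8, 6]
y = [15, 17, 26, 24, 27]
n = len(y)

m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
d1 = [v - m1 for v in x1]   # x1' deviations from the mean
d2 = [v - m2 for v in x2]   # x2' deviations
dy = [v - my for v in y]    # y'  deviations

S11 = sum(v * v for v in d1)
S22 = sum(v * v for v in d2)
S12 = sum(p * q for p, q in zip(d1, d2))
S1y = sum(p * q for p, q in zip(d1, dy))
S2y = sum(p * q for p, q in zip(d2, dy))

den = S11 * S22 - S12 ** 2
b1 = (S1y * S22 - S2y * S12) / den   # slope formula above
b2 = (S2y * S11 - S1y * S12) / den   # same formula with x1 and x2 swapped
a = my - b1 * m1 - b2 * m2           # intercept from the point of means
```

Rounded to four decimals, these reproduce a = 0.980, b1 = 1.2385, and b2 = 0.9925 of Eq. (2.12).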
Table 2.1 Data for Example 2.1

x1i | x2i | yi
5 | 7 | 15
10 | 5 | 17
9 | 14 | 26
13 | 8 | 24
15 | 6 | 27
Total | 52 | 40 | 109.0
Mean | 10.4 | 8 | 21.8

(… to show how computers calculate the mean, variance, and covariance. You do not need to remember the procedure.) Substituting the information from Table 2.2 into Eqs. (2.7), (2.8), and (2.10), we obtain the values of b1, b2, and a. Hence, the regression line of Eq. (2.11) becomes

ŷi = 0.980 + 1.2385 x1i + 0.9925 x2i  (2.12)

The next section shows how to compute the standard errors of the estimates and the coefficient of determination.

2.4 The Residual Standard Error and the Coefficient of Determination

As in the case of simple regression, the standard error of estimate can be used as an absolute measure, and the coefficient of determination as a relative measure, of how well the multiple regression equation fits the observed data.
Σ(yi − ȳ)² = Σ(yi − ŷi)² + Σ(ŷi − ȳ)²  (2.13)
Sum of Squares Total (SST) = Sum of Squares Error (SSE) + Sum of Squares due to Regression (SSR)

The estimated dependent variable (ŷi) of a multiple regression is determined by two or more independent variables. SSR and SSE are the explained and unexplained sums of squares, respectively.

Using the definition of the sum of squares error, we can define the estimate of the standard deviation of the error terms, sometimes called the residual standard error, as

se = √[Σ(yi − ŷi)² / (n − 3)]  (2.14)

Because there are three parameters (a, b1, and b2) for Eq. (2.2) that we must estimate before calculating the residuals, the number of degrees of freedom is (n − 3). In other words, (n − 3) sample values are "free" to vary. More generally, the number of degrees of freedom for estimating the residual standard error for Eq. (2.4) is [n − (k + 1)].

Example 2.2 (Computing ŷi, ei, and ei²)
Using the data presented in Example 2.1, we can estimate ŷi, ei, and ei², as shown in Table 2.3. Here ŷi is obtained by substituting x1i and x2i into Eq. (2.12). For example, 14.1198 = 0.980 + 1.2385(5) + 0.9925(7); ei = yi − ŷi.

Σ(yi − ŷi)² = (15 − 14.1198)² + (17 − 18.3272)² + (26 − 26.0209)² + (24 − 25.0200)² + (27 − 25.5120)² = 5.7912

Table 2.3 Actual values, predicted values, and residuals for the annual salary regression

Actual value, yi | Predicted value, ŷi | Residual, ei | ei²
15 | 14.1198 | 0.8802 | 0.7748
17 | 18.3272 | −1.3272 | 1.7615
26 | 26.0209 | −0.0209 | 0.0004
24 | 25.0200 | −1.0200 | 1.0404
27 | 25.5120 | 1.4880 | 2.2141
Total 109 | − | − | 5.7912

Hence,

se = √[5.7912 / (5 − 3)] = 1.7016

se is one of the important components in determining the distributions of the estimated a, b1, and b2 and of the fitted dependent variable (ŷ).

The Coefficient of Determination
We can use Eq. (2.13) to calculate a relative measure of the goodness of fit for a multiple regression:

R² = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = explained variation of y (SSR) / total variation of y (SST) = 1 − SSE/SST  (2.15)

The coefficient of determination R² is the proportion of the total variation in y (SST) that is explained by the intercept and the independent variables x1 and x2. Note that both R² and se can be used to measure the goodness of fit for a regression. However, R² is a relative measure and se an absolute measure. Now we use the ANOVA table given in Table 2.4 to calculate the relationship between R² and se for the general multiple regression model in Eq. (2.4).

There are four columns in Table 2.4. Column (1) represents the sources of variation, column (2) the sums of squares, which are identical to those discussed in Eq. (2.13), column (3) the degrees of freedom associated with each source of variation, and column (4) the mean squares.
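The computations of Table 2.3 and Eqs. (2.14) and (2.15) can be reproduced in a few lines (a sketch of ours, using the rounded coefficients of Eq. (2.12), so the fitted values differ from Table 2.3 only in the fourth decimal place):

```python
# Reproduce Table 2.3 and Eqs. (2.14)-(2.15) from the fitted Eq. (2.12).
x1 = [5, 10, 9, 13, 15]
x2 = [7, 5, 14, 8, 6]
y = [15, 17, 26, 24, 27]
n, k = len(y), 2

# Fitted values from the rounded coefficients of Eq. (2.12).
yhat = [0.980 + 1.2385 * u + 0.9925 * v for u, v in zip(x1, x2)]

ybar = sum(y) / n
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))   # sum of squares error
sst = sum((yi - ybar) ** 2 for yi in y)                # total sum of squares
ssr = sst - sse                                        # due to regression
se = (sse / (n - k - 1)) ** 0.5                        # Eq. (2.14), residual standard error
r2 = 1 - sse / sst                                     # Eq. (2.15)
```

This recovers SSE ≈ 5.7912, SST = 118.8, se ≈ 1.7016, and R² ≈ 0.9513, which are the values used in the ANOVA table below.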
Table 2.4 Notation of the analysis of variance table

(1) Source of variation | (2) Sum of squares | (3) Degrees of freedom | (4) Mean square
Due to regression | SSR = Σ(ŷi − ȳ)² | k | SSR/k
Residual | SSE = Σ(yi − ŷi)² | n − k − 1 | SSE/(n − k − 1)
Total | SST = Σ(yi − ȳ)² | n − 1 | SST/(n − 1)

Table 2.5 Analysis of variance results

Source of variation | Sum of squares | Degrees of freedom | Mean square
Due to regression | 113.0088 | k = 2 | 56.5044
Residual | 5.7912 | 5 − 2 − 1 = 2 | 2.8956
Total | 118.8 | 5 − 1 = 4 | 29.7
2.5 Tests on Sets and Individual Regression Coefficients

… for the individual tests, and we normally abandon or modify the model. If the joint test is rejected, we must find out which regression coefficients are significant, so we perform individual tests.

Test on Sets of Regression Coefficients
Until now, our discussion has been limited to point estimation of the multiple regression coefficients, the coefficient of determination, and the standard error of estimate. Now we will discuss how to use the F-statistic to test whether all true population regression (slope) coefficients equal zero. The F-test rather than the t-test is used. The hypotheses for our case are

H0: β1 = β2 = ⋯ = βk = 0
H1: At least one β is not zero.  (2.18)

If the null hypothesis is not true, then each ŷi will differ from ȳ substantially, and the explained variation Σ(ŷi − ȳ)² will be large relative to the unexplained residual variation Σ(yi − ŷi)². In other words, the R² indicated in Eq. (2.15) is relatively large. Thus, we can construct the F ratio as indicated in Eq. (2.19) to test whether the null hypothesis can be rejected:

F(k, n−k−1) = [Σ(ŷi − ȳ)² / k] / [Σ(yi − ŷi)² / (n − k − 1)]  (2.19)

The F ratio we have constructed is the ratio of two mean squares, as we noted in the last section, and they are two unbiased estimates of variances. Following the definition of the F distribution, we know that the F ratio has an F distribution with k and (n − k − 1) degrees of freedom. This F ratio enables us to test whether at least one of the regression coefficients is significantly different from zero.

Consider the case k = 2. If there is no regression relationship (i.e., if β1 = β2 = 0), then the ŷi will be close or equal to ȳ, so the F-value will be small or close to zero. Thus, we cannot reject the null hypothesis that all regression coefficients are insignificantly different from zero.

Substituting the related data from Table 2.5 into Eq. (2.19), we obtain

F = (113.0088/2) / (5.7912/2) = 56.5044/2.8956 = 19.514

From the F-distribution critical value table, we find that the critical value for a significance level of α = 0.05 is F(0.05, 2, 2) = 19.0, which is smaller than 19.514. Therefore, we can conclude that at least one of the regression coefficients is significantly different from zero. Thus, there is a regression relationship in the population, and the improvement of explanatory power achieved by fitting a regression plane is not due to chance. In other words, the null hypothesis that years of education and years of work experience contribute nothing to an individual's annual salary is rejected at a 5 percent level of significance.

Finally, the relationship between the R² indicated in Eq. (2.15) and the F-statistic in Eq. (2.19) can be shown to be⁶

F(k, n−k−1) = [(n − k − 1)/k] · [R² / (1 − R²)]

Hypothesis Tests for Individual Regression Coefficients
In the last section, we used the F-statistic to do a joint test about a regression relationship. Now we want to use the t-statistic to test whether the multiple regression coefficients are individually significantly different from zero.

⁶ Because R² = 1 − SSE/SST = SSR/SST, dividing the numerator and denominator of Eq. (2.19) by SST yields this expression.
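Both routes to the F-statistic, the ANOVA ratio of Eq. (2.19) and the R² form of footnote 6, can be checked against Table 2.5 (a sketch of ours):

```python
# ANOVA quantities from Table 2.5 (annual salary example, n = 5, k = 2).
ssr, sse, sst = 113.0088, 5.7912, 118.8
n, k = 5, 2

msr = ssr / k              # mean square due to regression
mse = sse / (n - k - 1)    # mean square error
f = msr / mse              # Eq. (2.19)

# Equivalent route through R-squared (footnote 6).
r2 = ssr / sst
f_from_r2 = ((n - k - 1) / k) * r2 / (1 - r2)
```

Both expressions give F ≈ 19.514, which exceeds the critical value F(0.05, 2, 2) = 19.0 used in the text.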
The t-statistic for the jth slope coefficient is the ratio

t = bj / Sbj

where tn−k−1 represents a t-statistic with (n − k − 1) degrees of freedom, k is the number of independent variables, and Sbj represents the standard error associated with bj. The concepts and procedure used to calculate Sbj are similar to those used for simple regression. However, Sbj is quite tedious to calculate by hand; fortunately, its value is readily available in the computer output of any standard regression analysis program. Thus, in practice, we find t simply by taking the ratio of the coefficient to its estimated standard error. When the calculated value of t exceeds the critical value tα,n−k−1 indicated in the t-distribution table, the null hypothesis of no significance can be rejected. We conclude that the jth independent variable xj does have an important influence on the dependent variable yi after the influences of the other independent variables are taken into account.

For our example, the sample variances of the two slope estimates are⁷

Sb1² = (2.8956)(50) / [(59.2)(50) − (−11)²] = (2.8956)(50) / 2839 = 0.05100

and

Sb2² = (2.8956)(59.2) / 2839 = 0.06038

⁷ Derivations of Eqs. (2.23) and (2.24) can be found in Appendix 1. Note that these two equations are generally estimated by computer packages (see Sect. 2.8). Manual approaches are presented here to show how the sample variances of multiple regression slopes are actually calculated.
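These two variance computations, and the t-values derived from them in the next step, can be sketched as follows (our code; the deviation sums of squares are those of the Example 2.1 data):

```python
# Sketch of the slope-variance computations (Eqs. (2.23) and (2.24)):
# each slope variance is s_e^2 times the OTHER regressor's deviation sum
# of squares, divided by S11*S22 - S12^2.
S11, S22, S12 = 59.2, 50.0, -11.0   # deviation sums of squares and cross-product
se2 = 2.8956                        # s_e^2 = SSE / (n - k - 1)

den = S11 * S22 - S12 ** 2          # 2960 - 121 = 2839
var_b1 = se2 * S22 / den            # = 0.05100
var_b2 = se2 * S11 / den            # = 0.06038

sb1 = var_b1 ** 0.5
sb2 = var_b2 ** 0.5
t_b1 = 1.2385 / sb1                 # t-value for b1
t_b2 = 0.9925 / sb2                 # t-value for b2
```

The resulting t-values match the hand calculation below up to rounding of the standard errors.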
Then Sb1 = 0.2258 and Sb2 = 0.2457. Dividing b1 and b2 by Sb1 and Sb2, we obtain the t-values for b1 and b2:

t_b1 = 1.2385 / 0.2258 = 5.4849
t_b2 = 0.9925 / 0.2457 = 4.0395

Because n = 5 and k = 2, from the t-distribution critical value table, the critical value for a one-tailed test on either coefficient (at a significance level of α = 0.05) is

tα,n−k−1 = t0.05,2 = 2.920

We choose a one-tailed test because our a priori theoretical propositions were that both x1 and x2 are positively related to y. Comparing 5.4849 and 4.0395 with 2.920, we conclude that both years of education and years of work experience are significantly related to an individual's annual salary.

Figure 2.2 presents all the estimates and hypothesis-testing information we have discussed in the last three sections. This example certainly
Fig. 2.2 MINITAB output of multiple regression in terms of data given in Table 2.1
proves that multiple regression analysis can be performed more efficiently by using the MINITAB computer program.

2.6 Confidence Interval for the Mean Response and Prediction Interval for the Individual Response

Point Estimates of the Mean and the Individual Responses
One of the important uses of the multiple regression line is to obtain predictions and forecasts for the dependent variable, given an assumed set of values of the independent variables. This kind of prediction is called conditional prediction (forecasting), just as in simple regression. Suppose the independent variables are equal to some specified values x1,n+1 and x2,n+1, and that the linear relationship among yn, x1,n, and x2,n continues to hold.⁸ Then the corresponding value of the dependent variable Yn+1 is

Yn+1 = α + β1 x1,n+1 + β2 x2,n+1 + εn+1  (2.25)

which, given x1,n+1 and x2,n+1, has expectation

E(Yn+1 | x1,n+1, x2,n+1) = α + β1 x1,n+1 + β2 x2,n+1  (2.26)

Equation (2.26) yields the mean response E(Yn+1 | x1,n+1, x2,n+1) that we want to estimate when the independent variables are fixed at x1,n+1 and x2,n+1. Equation (2.25) yields the actual value (or individual response) that we want to predict.

To obtain the best point estimate, we first estimate the sample regression line as defined in Eq. (2.2). Then we substitute the given values x1,n+1 and x2,n+1 into the estimated Eq. (2.12), obtaining

ŷn+1 = a + b1 x1,n+1 + b2 x2,n+1  (2.27)

This is the best point estimate for both the conditional-expectation and the actual-value forecast. In other words, the forecast of the conditional expectation value is equal to the forecast of the actual value. However, the forecasts are interpreted differently. The importance of these different interpretations will emerge when we investigate the process of making interval estimates.

Interval Estimates of Forecasts
To construct a confidence interval for forecasts, it is necessary to know the distribution, mean, and variance of ŷn+1. The distribution of ŷn+1 is a t-distribution with (n − 3) degrees of freedom. The variance associated with ŷn+1 may be classified into three cases. First, we deal with the case in which the conditional mean (ŷn+1) is equal to the unconditional mean (ȳ). In the second and third cases, we deal with the conditional mean; however, case 2 involves the mean response and case 3 the individual response.

Case 2.1 [Conditional Expectation (Mean Response) with x1,n+1 = x̄1 and x2,n+1 = x̄2]
From the definitions of the intercept of a regression and the sample regression line, we have

ŷn+1 = (ȳ − b1x̄1 − b2x̄2) + b1 x1,n+1 + b2 x2,n+1 = ȳ + b1(x1,n+1 − x̄1) + b2(x2,n+1 − x̄2)

If x1,n+1 = x̄1 and x2,n+1 = x̄2, then ŷn+1 = ȳ. We can obtain the estimate of the variance for ŷn+1 as

s²(ŷn+1) = s²(ȳ) = se²/n  (2.28)

Case 2.2 [Conditional Expectation (Mean Response) with x1,n+1 ≠ x̄1 and x2,n+1 ≠ x̄2]
In this case, the forecast value can be defined as

ŷn+1 = ȳ + b1(x1,n+1 − x̄1) + b2(x2,n+1 − x̄2)  (2.29)

⁸ x1,n+1 and x2,n+1 can be either given values or forecasted values. When a regression is used to describe a time-series relationship, they are forecasted values.
We therefore obtain the estimate of the variance for ŷn+1 in terms of the sample variance of the estimates, se², as

s1² = s²(ŷn+1) = se² [ 1/n + (x1,n+1 − x̄1)² / ((1 − r²)C1²) + (x2,n+1 − x̄2)² / ((1 − r²)C2²) − 2(x1,n+1 − x̄1)(x2,n+1 − x̄2) r / ((1 − r²)C1C2) ]  (2.30)

where C1 = √[Σ(x1i − x̄1)²], C2 = √[Σ(x2i − x̄2)²], and r is the correlation coefficient between x1i and x2i.

Case 2.3 [Actual Value (Individual Response) of yn+1]
After we have derived the sample variance for ŷn+1, we derive the sample variance for the individual response (observation) yn+1, which deviates from ŷn+1 by a random error e:

yn+1 = ŷn+1 + e

The variance of an individual observation yn+1 includes the variance of the observation about the regression line, se², as well as s²(ŷn+1). Because ŷn+1 and e are independent, s²(yn+1) = s²(ŷn+1) + se². More explicitly,

s2² = s1² + se²  (2.31)

where s1² is defined in Eq. (2.30).

Using Eqs. (2.28), (2.30), and (2.31), we can obtain confidence intervals for prediction as follows. For the mean response, the confidence interval is

ŷn+1 ± (tn−3,α/2) s1  (2.33)

where s1 is defined in Eq. (2.30). For prediction of the actual value yn+1, the prediction interval is

ŷn+1 ± (tn−3,α/2) s2  (2.34)

where s2 is defined in Eq. (2.31).

To show how Eq. (2.34) is applied in constructing the confidence interval for forecasting the actual value of yn+1, let us use the annual salary example (Table 2.2) to find the 95% prediction interval for annual salary, yn+1, when a person has 6 years of education and 5 years of work experience. The predicted annual salary can be computed from Eq. (2.12):

ŷn+1 = 0.980 + (1.2385)(6) + (0.9925)(5) = 13.3735 (in thousands of dollars)

From Table 2.2, we have

Σ(x1i − x̄1)² = 59.2,  Σ(x2i − x̄2)² = 50,  Σ(x1i − x̄1)(x2i − x̄2) = −11,  n = 5,  x̄1 = 10.4,  x̄2 = 8

Using this information, we calculate

C1 = √59.2 = 7.6942,  C2 = √50 = 7.0711
r = Σ(x1i − x̄1)(x2i − x̄2) / √[Σ(x1i − x̄1)² · Σ(x2i − x̄2)²] = −11/54.406 = −0.2022
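The remaining steps (s1² from Eq. (2.30), s2 from Eq. (2.31), and the interval from Eq. (2.34)) can be sketched as follows (our code; the critical value t0.025,2 = 4.303 is taken from a t-distribution table):

```python
# Inputs from the annual salary example (n = 5, k = 2).
se2 = 2.8956                          # s_e^2
n = 5
x1bar, x2bar = 10.4, 8.0
C1, C2 = 59.2 ** 0.5, 50.0 ** 0.5     # 7.6942, 7.0711
r = -11.0 / (C1 * C2)                 # correlation between x1 and x2, about -0.2022

x1new, x2new = 6.0, 5.0               # 6 years of education, 5 years of experience
yhat = 0.980 + 1.2385 * x1new + 0.9925 * x2new   # point forecast, 13.3735

d1, d2 = x1new - x1bar, x2new - x2bar
s1_sq = se2 * (1.0 / n
               + d1 ** 2 / ((1 - r ** 2) * C1 ** 2)
               + d2 ** 2 / ((1 - r ** 2) * C2 ** 2)
               - 2 * d1 * d2 * r / ((1 - r ** 2) * C1 * C2))   # Eq. (2.30)
s2 = (s1_sq + se2) ** 0.5                                      # Eq. (2.31)

t_crit = 4.303                        # t_{0.025, n-3} from the t-distribution table
lower = yhat - t_crit * s2            # Eq. (2.34)
upper = yhat + t_crit * s2
```

The resulting interval is approximately (3.466, 23.281), agreeing with the MINITAB output quoted just below.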
When n is large, we can modify this expression by replacing t with the appropriate normal deviate z.

MINITAB output showing the prediction results for x1,n+1 = 6 and x2,n+1 = 5 is presented in Fig. 2.3. The prediction interval shown in the last row of Fig. 2.3 is (3.466, 23.280), which is similar to what we calculated before.

In the next two sections, we will explore applications of multiple regression in business and economics. Section 2.8 explicitly treats the use of the SAS and MINITAB computer programs to do multiple regression analyses.

Bobko and Donnelly estimated this multiple regression model using data obtained from interviews. Their regression results are presented in Table 2.6. As would be expected, performance level was the single best predictor of the 95 estimates of judgments of overall worth. The other job-level correlates were combat probability, enlistment bonus, reenlistment bonus, aptitude, cost of error, and task variety. The first six of these predictors had statistically significant regression weights (coefficients), p-value < 0.05, indicating their unique contribution to the prediction of the overall worth estimates. However, task variety was not statistically significant.
… population coefficient for the market is not equal to zero. The t-statistic for the payout ratio, which is calculated by dividing the parameter value (−0.2133) by the standard error (0.5613), is −0.3800. Its p-value is 0.7061; thus, the null hypothesis cannot be rejected.

R² for the regression is 0.106. In other words, the independent variables explain about 10.6% of the variation in the rate of return on JNJ stock. The adjusted R-square, R̄², which takes into account overfitting in the sample, is equal to 0.057.

The F-value, which tests the hypothesis that the population coefficients of the independent variables are both zero against the alternative that they are not, is equal to 2.184. The degrees of freedom associated with this F-value are v1 = 2 and v2 = 37. From the F-distribution critical value table, we find that the critical values for the F-test are F(0.01, 2, 30) = 5.39 and F(0.01, 2, 40) = 5.18. Because the F-value for the regression is less than the critical value 5.39, the null hypothesis cannot be rejected.

Application 2.3 Analyzing the Determination of Price per Share
To further demonstrate multiple regression techniques, let us say that a cross-sectional regression is run. In a cross-sectional regression, all data come from a single period. The dependent variable in this regression is the price per share (Pj) of the 30 firms used to compile the Dow Jones Industrial Average for the year 2009. The independent variables are the dividend per share (DPSj) and the retained earnings per share (EPSj) for the 30 firms. (Retained earnings per share is defined as earnings per share minus dividend per share.
2.7 Business and Economic Applications 35
Price per share is the close price at the end of year 2009; dividend per share and retained earnings per share are based on the 2009 annual balance sheet and income statement.) The sample regression relationship is

Pj = a + b1 DPSj + b2 EPSj + ej  (j = 1, 2, …, 30)

Empirical results are presented in Table 2.8.

Table 2.8 Pj = a + b1 DPSj + b2 EPSj + ej

Variable | Coefficient | Standard error | t-value | p-value
Constant | 12.800 | 5.084 | 2.518 | 0.018
DPS | 12.836 | 4.657 | 2.756 | 0.010
EPS | 0.978 | 0.218 | 4.478 | 0.000
R² = 0.724;  R̄² = 0.703;  F-value = 35.35;  Observations = 30

The constant term is significant, with a t-value of 2.518. This result means that the intercept term is statistically different from zero, and the null hypothesis can be rejected at both the 10 and the 5% levels. The retained earnings per share variable is highly significant, with a t-value of 4.478 and a p-value of 0.000. Thus, we can reject the null hypothesis that the coefficient is equal to zero and accept the alternative hypothesis that it differs from zero and makes a contribution to price per share. The coefficient for this variable is 0.978; mean price per share increases by $0.978 when retained earnings per share increase by $1.00, given the dividend.

The coefficient for the dividend per share variable has a t-value of 2.756 and a p-value of 0.010. This is the lowest level of significance at which the null hypothesis can be rejected; thus, the null hypothesis is rejected at both the 10 and the 5% levels. The coefficient for dividend per share is 12.836. When the dividend increases by $1.00, the price per share tends to rise by $12.836.

The value of R² is 0.724, which means that the model explains 72.4% of the observed fluctuations in the price per share. The adjusted R-square, R̄², is 0.703. The F-value for the regression is 35.35. The numbers of degrees of freedom for the regression and the residual are 2 and 27, respectively. The critical value for F at a 1% level of significance is 5.49. Because the regression F-value is greater than the critical value, the null hypothesis that the coefficients are equal to zero is rejected.

The relationship among price per share, dividend per share, and retained earnings per share will be discussed in Appendix 2.

Application 2.4 Multiple Regression Approach to Evaluating Real Estate Property
To show how the multiple regression technique can be used by real estate appraisers, Andrews and Ferguson (1986) used the data in Table 2.9 to do the multiple regression analysis

yi = b0 + b1 x1i + b2 x2i + ei

where

yi = sale price of the ith house
x1i = home size of the ith house
x2i = condition rating of the ith house

Table 2.9 Sale price, house size, and condition rating

Sale price, y (thousands of dollars) | Home size, x1 (hundreds of sq. ft.) | Condition rating, x2 (1–10)
60.0 | 23 | 5
32.7 | 11 | 2
57.7 | 20 | 9
45.5 | 17 | 3
47.0 | 15 | 8
55.3 | 21 | 4
64.5 | 24 | 7
42.6 | 13 | 6
54.5 | 19 | 7
57.5 | 25 | 2
Source: R. L. Andrews and J. T. Ferguson, "Integrating Judgment with a Regression Appraisal," The Real Estate Appraiser and Analyst, Vol. 52, No. 2, Spring 1986 (Table 1)
Table 2.10 Hours of labor and related factors that cause costs to be incurred

Week | Hours of labor, y | Thousands of pounds shipped, x1 | Percentage of units shipped by truck, x2 | Average number of pounds per shipment, x3
1 | 100 | 5.1 | 90 | 20
2 | 85 | 3.8 | 99 | 22
3 | 108 | 5.3 | 58 | 19
4 | 116 | 7.5 | 16 | 15
5 | 92 | 4.5 | 54 | 20
6 | 63 | 3.3 | 42 | 26
7 | 79 | 5.3 | 12 | 25
8 | 101 | 5.9 | 32 | 21
9 | 88 | 4.0 | 56 | 24
10 | 71 | 4.2 | 64 | 29
11 | 122 | 6.8 | 78 | 10
12 | 85 | 3.9 | 90 | 30
13 | 50 | 3.8 | 74 | 28
14 | 114 | 7.5 | 89 | 14
15 | 104 | 4.5 | 90 | 21
16 | 111 | 6.0 | 40 | 20
17 | 110 | 8.1 | 55 | 16
18 | 100 | 2.9 | 64 | 19
19 | 82 | 4.0 | 35 | 23
20 | 85 | 4.8 | 58 | 25
Source: G. J. Benston (1966), "Multiple Regression Analysis of Cost Behavior," Accounting Review, Vol. 41, No. 4, 657–672. Reprinted by permission of the publisher
x2t = percentage of units shipped by truck in tth week
x3t = average number of pounds per shipment in tth week

MINITAB regression output is presented in Fig. 2.5. From the p-values indicated in Fig. 2.5, we find that b0 and b3 are significantly different from 0 at α = 0.01. Hence, we can conclude that the only important variable in determining the hours of labor required in the shipping department is the average number of pounds per shipment.

2.8 Using Computer Programs to Do Multiple Regression Analyses

2.8.1 SAS Program for Multiple Regression Analysis

In an example taken from Churchill's Marketing Research, data for the sales of click ballpoint pens (y), advertising (x1, measured in TV spots per month), number of sales representatives (x2), and a wholesaler efficiency index (x3) were presented in Table 2.11. We investigated not only the relationship between two variables (y and x1, y and x2, and y and x3); now we will expand that analysis by using the following three regression models:

y_i = a + b_1 x_{1i} + e_i    (a)
y_i = a + b_1 x_{1i} + b_2 x_{2i} + e_i    (b)
y_i = a + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i} + e_i    (c)

(In these regressions, we hold the price of a ballpoint pen and the income of a consumer constant, because this is a set of cross-sectional data.)

Equation (a) can be used to investigate the relationship between y and x1. Equation (b) can be used to analyze whether the second explanatory variable, x2, improves the equation's power to explain the variation of sales. Equation (c) can be used to analyze whether the third explanatory variable, x3, further improves that explanatory power. Part of the output of the SAS program for Eqs. (a–c) is presented in Fig. 2.6a–c. Figure 2.6a shows the regression results of Eq. (a), Fig. 2.6b the regression results of Eq. (b), and Fig. 2.6c the regression results of Eq. (c). Using these results, we will review and summarize simple regression and multiple regression results.

Computer outputs of Fig. 2.6a–c present the following results of simple and multiple regression:

1. Estimated intercept and slopes;
2. F-values for the whole regression;
3. t-values for individual regression coefficients;
4. ANOVA of regression;
5. R² and adjusted R² (R̄²);
6. p-values;
7. Durbin–Watson D and first-order autocorrelation;
8. Standard error of residual estimate (mean square of error);
9. Root MSE = √MSE. For example, for Eq. (b), Root MSE = √2039.85310 = 45.16473. The root MSE estimate can be used to measure the performance of prediction.

These SAS regression outputs give us almost all the sample statistics we have examined so far. Now let us consider the practical implications of Eqs. (a–c).

Equation (b) specifies a regression model in which sales is the dependent variable and the independent variables are number of TV spots x1 and number of sales representatives x2. The fitted regression equation is

ŷ = 69.3 + 14.2x1 + 37.5x2,   F = 128.141
    (2.994)  (5.315)  (5.393)

Here t-values are indicated in parentheses.

This regression indicates that when the number of TV spots increases by 1 unit, sales increase by $14,200 on average; and when the number of sales representatives increases by 1 person, sales increase by $37,500 on average. The F-value for the regression of Eq. (b) is 128.141. There are 40 observations and 2 independent variables, so the number of degrees of freedom in the model is 40 − 2 − 1 = 37. By interpolation in the F-distribution critical value table, it can be shown that the critical value F0.05,2,37 is 3.25.
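The three nested models (a)–(c) can be fitted with ordinary least squares in a few lines. Table 2.11 itself is not reproduced in this excerpt, so the sketch below generates synthetic data whose variable names merely mirror the example; the coefficients, ranges, and seed are illustrative assumptions, not Churchill's figures.

```python
import numpy as np

def ols_r2(y, *xs):
    """Regress y on an intercept plus the given regressors; return R-squared."""
    X = np.column_stack([np.ones_like(y)] + list(xs))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

rng = np.random.default_rng(0)
n = 40
x1 = rng.uniform(5, 15, n)    # TV spots per month (synthetic)
x2 = rng.uniform(1, 10, n)    # number of sales representatives (synthetic)
x3 = rng.uniform(0, 1, n)     # wholesaler efficiency index (synthetic)
y = 60 + 14 * x1 + 37 * x2 + rng.normal(0, 40, n)   # x3 plays no real role

r2_a = ols_r2(y, x1)            # Eq. (a)
r2_b = ols_r2(y, x1, x2)        # Eq. (b): explains clearly more than (a)
r2_c = ols_r2(y, x1, x2, x3)    # Eq. (c): R-squared can only tick up slightly
print(round(r2_a, 3), round(r2_b, 3), round(r2_c, 3))
```

Because the models are nested, R² is non-decreasing from (a) to (c); whether an increase is worth the extra variable is exactly the question the stepwise analysis later in this section answers.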
Because the F-value for the regression is greater than the critical value, the hypothesis that the coefficients are equal to zero is rejected. From the t-values associated with the estimated regression coefficients, we find that the estimated intercept and slopes are significant at α = 0.01.

Because the t-value of b2 is significantly different from zero, we conclude that adding the number of sales representatives improves the equation's power to explain sales. This conclusion can also be drawn from the fact that R̄² has increased from 0.7687 to 0.8670.

The fitted regression of Eq. (c) is

ŷ = 31.1504 + 12.9682x1 + 41.2456x2 + 11.5243x3,   F = 89.051
     (0.911)   (4.738)    (5.666)    (1.498)

Again, t-values are indicated in parentheses.

Following Sect. 2.5, we first test the whole set of regression coefficients in terms of the F-statistic. From the F-distribution critical value table, by interpolation, we find that F0.05,3,36 = 2.88. F = 89.051 is much larger than 2.88. This implies that we reject the following null hypothesis of our joint test:

H0: b1 = b2 = b3 = 0

Now we can use t-statistics to test which individual coefficient is significantly different from zero. From the t-distribution critical value table, by interpolation, we find that the critical value of the t-statistic is t0.005,36 = 2.72. By comparing this critical value with 4.738, 5.666, and 1.498, we conclude that b1 and b2 are significantly different from zero and that b3 is not significantly different from zero at α = 0.01. In other words, the wholesaler efficiency index does not increase the explanatory power of Eq. (c).

MINITAB Program for Multiple Regression Prediction

MINITAB is used to run the regression defined in Fig. 2.6c and presented in Fig. 2.7. Besides regression parameters, we also predict y by assuming x1 = 13, x2 = 9, and x3 = 5.
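The critical values quoted above were read from tables by interpolation; they, together with the interval arithmetic reported for the Fig. 2.7 prediction, can be recomputed directly. A minimal sketch, assuming SciPy is available (the exact F0.05,3,36 comes out 2.87, a shade under the interpolated 2.88; the conclusions are unchanged):

```python
from scipy import stats

# Critical values used in the text
f_b = stats.f.ppf(0.95, 2, 37)    # F(0.05; 2, 37), overall test for Eq. (b): ~3.25
f_c = stats.f.ppf(0.95, 3, 36)    # F(0.05; 3, 36), joint test for Eq. (c): ~2.87
t_01 = stats.t.ppf(0.995, 36)     # two-sided t at alpha = 0.01, 36 df: ~2.72

# 95% intervals for the prediction at x1 = 13, x2 = 9, x3 = 5 (Fig. 2.7)
y_hat, s_mean, mse = 628.57, 34.92, 1973      # from the printed output
s_pred = (s_mean**2 + mse) ** 0.5             # ~56.50
t_05 = stats.t.ppf(0.975, 36)                 # df = 40 - 3 - 1
ci = (y_hat - t_05 * s_mean, y_hat + t_05 * s_mean)   # ~(557.7, 699.4)
pi = (y_hat - t_05 * s_pred, y_hat + t_05 * s_pred)   # ~(514.0, 743.2)
print(round(f_b, 2), round(f_c, 2), round(t_01, 2), round(s_pred, 2))
```

The recomputed confidence and prediction intervals agree with the MINITAB figures up to rounding in the printed standard errors.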
Fig. 2.6 a SAS output for regression results of y_i = a + b_1x_{1i} + e_i. b SAS output for regression results of y_i = a + b_1x_{1i} + b_2x_{2i} + e_i. c SAS output for regression results of y_i = a + b_1x_{1i} + b_2x_{2i} + b_3x_{3i} + e_i
The results are listed in the last row of Fig. 2.7. They are

1. ŷ_{n+1,i} = 628.57;
2. s(ŷ_{n+1}) = 34.92;
3. s(ŷ_{n+1,i}) = √(s²(ŷ_{n+1}) + s_e²) = √((34.92)² + 1973) = 56.50;
4. 95% confidence interval: (557.73, 699.40);
5. 95% prediction interval: (513.94, 743.19).

Stepwise Regression Analysis

In this example, we want to use stepwise regression to establish a statistical model to predict the sales of click ballpoint pens (y). We are considering three possible explanatory variables: advertising (x1), measured in TV spots per month; the number of sales representatives (x2); and a wholesaler efficiency index (x3). The question is what variables should be included in the statistical model to explain the sales. The stepwise regression method suggests the following steps.

Step 1:
Run a simple regression on each explanatory variable, and choose the model that explains the highest amount of variation in y. The R²-value in each computer report is used to determine which variable enters the model first. Upon comparing the R²-values for the three models, we conclude that x2, which has the highest R²-value (0.7775), should enter the model first.

Independent variable    R²        F-value
x1                      0.7747    130.644
x2                      0.7775    132.811
x3                      0.0000      0.000
Step 2:
The second variable to enter should be the variable that, in conjunction with the first variable, explains the greatest amount of variation in y.

Independent variables    R²        F-value
x2 x1                    0.8738    128.141
x2 x3                    0.807      77.46

The R²-values and F-values in the foregoing table are obtained from Fig. 2.6b and Fig. 2.8. The table shows the results when x1 and x3 are combined with x2 to explain the variation in y. The combination of x1 and x2 clearly yields a higher R² (0.8738). This suggests that x1 should be the second variable to enter.

Step 3:
In this step, we want to decide whether another variable should enter the model to explain y. Note that every time an additional variable is included in a model, R² increases. The question is whether the increase in R² justifies inclusion of the variable. We apply an F-test to answer this question:

F = \frac{(R_f^2 - R_R^2)/(k_f - k_R)}{(1 - R_f^2)/(N - k_f - 1)}

where

R_f^2 = R² of the model with the new variable;
R_R^2 = R² of the model without the new variable;
k_f = number of variables in the model with the new variable;
k_R = number of variables in the model without the new variable.

To determine whether x3 should be included in the model, we need to compare the R² of the model with x3 and the R² of the model without x3.

Independent variables    R²
x2 x1                    0.8738
x2 x1 x3                 0.8812

Using the foregoing formula, we compute

F = \frac{(0.8812 - 0.8738)/(3 - 2)}{(1 - 0.8812)/(40 - 3 - 1)} = 2.24 < F_{0.05,1,36} = 4.11

Because including x3 does not increase R² significantly, the null hypothesis that x3 should not be included is not rejected in this case. Our conclusion from the stepwise regression analysis is that the best model should include only x1 and x2 as explanatory variables.

Some computer packages are programmed to perform the whole complicated stepwise regression in response to one simple command. Figure 2.9 shows the output of a stepwise regression analysis using MINITAB.

2.9 Conclusion

In this chapter, we examined multiple regression analysis, which describes the relationship between a dependent variable and two or more independent variables. Methods of estimating multiple regression (slope) coefficients and their standard errors were discussed in depth. The residual standard error and the coefficient of determination were also explored in some detail. Both t-tests and F-tests for testing regression relationships were discussed in this chapter. We investigated the confidence interval for the mean response and the prediction interval for the individual response. And finally, we saw how multiple regression analyses can be used in business and economics decision making. In the next chapter, we will discuss other topics in applied regression analysis. Examples of applied multiple regression analysis in both business and finance will be discussed in some detail.
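The stepwise decision rule discussed in Sect. 2.8 condenses into one small helper: compute the partial F-statistic for the candidate variable and compare it with the tabulated critical value. A sketch using the figures quoted in the stepwise example:

```python
def partial_f(r2_full, r2_reduced, k_full, k_reduced, n):
    """Partial F-statistic for testing whether the added regressors are jointly zero."""
    num = (r2_full - r2_reduced) / (k_full - k_reduced)
    den = (1.0 - r2_full) / (n - k_full - 1)
    return num / den

# Does x3 add explanatory power beyond x1 and x2? (R-squared values from the text)
f_x3 = partial_f(0.8812, 0.8738, 3, 2, 40)
print(round(f_x3, 2))   # 2.24, below the critical value F(0.05; 1, 36) = 4.11
```

Since 2.24 < 4.11, x3 is not admitted, reproducing the Step 3 conclusion.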
Appendix 1: Derivation of the Sampling Variance of the Least Squares Slope Estimations

We can obtain the correlation coefficient between x1 and x2 as

r = \frac{\sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2)}{C_1 C_2}    (2.35)

where

C_1 = \sqrt{\sum_{i=1}^{n} (x_{1i} - \bar{x}_1)^2}    (2.36a)

C_2 = \sqrt{\sum_{i=1}^{n} (x_{2i} - \bar{x}_2)^2}    (2.36b)

Substituting (2.35), (2.36a), and (2.36b) into Eq. (2.7) yields

b_1 = \frac{C_2^2 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i' - r C_1 C_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i'}{C_1^2 C_2^2 - r^2 C_1^2 C_2^2}
    = \sum_{i=1}^{n} \frac{(x_{1i} - \bar{x}_1) - (r C_1 / C_2)(x_{2i} - \bar{x}_2)}{(1 - r^2) C_1^2} \, y_i'    (2.37)

Substituting

\sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i' = \sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i

and

\sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i' = \sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i

into Eq. (2.37), and letting B_{1i} denote the coefficient of y_i, we obtain

b_1 = \sum_{i=1}^{n} B_{1i} y_i    (2.38)

Similarly,

b_2 = \sum_{i=1}^{n} \frac{(x_{2i} - \bar{x}_2) - (r C_2 / C_1)(x_{1i} - \bar{x}_1)}{(1 - r^2) C_2^2} \, y_i = \sum_{i=1}^{n} B_{2i} y_i    (2.39)

Substituting Eq. (2.2) into Eqs. (2.38) and (2.39), we get

b_1 = a \sum_{i=1}^{n} B_{1i} + \beta_1 \sum_{i=1}^{n} B_{1i} x_{1i} + \beta_2 \sum_{i=1}^{n} B_{1i} x_{2i} + \sum_{i=1}^{n} B_{1i} e_i

and

b_2 = a \sum_{i=1}^{n} B_{2i} + \beta_1 \sum_{i=1}^{n} B_{2i} x_{1i} + \beta_2 \sum_{i=1}^{n} B_{2i} x_{2i} + \sum_{i=1}^{n} B_{2i} e_i

It can easily be seen that \sum_{i=1}^{n} B_{1i} = 0, \sum_{i=1}^{n} B_{2i} = 0, \sum_{i=1}^{n} B_{1i} x_{1i} = 1, \sum_{i=1}^{n} B_{2i} x_{2i} = 1, \sum_{i=1}^{n} B_{2i} x_{1i} = 0, and \sum_{i=1}^{n} B_{1i} x_{2i} = 0. Therefore, these two equations imply that

b_1 - E(b_1) = b_1 - \beta_1 = \sum_{i=1}^{n} B_{1i} e_i    (2.40)

and

b_2 - E(b_2) = b_2 - \beta_2 = \sum_{i=1}^{n} B_{2i} e_i    (2.41)

From Eq. (2.40), we obtain

Var(b_1) = E\left[\left(\sum_{i=1}^{n} B_{1i} e_i\right)^2\right] - \left\{E\left[\sum_{i=1}^{n} B_{1i} e_i\right]\right\}^2 = \sigma_e^2 \sum_{i=1}^{n} B_{1i}^2    (2.42)
In Eq. (2.42), the last equality holds because E(e_i) = 0 and E(e_i e_j) = 0 when i ≠ j. From the definition of B_{1i} in Eq. (2.38), we have

B_{1i}^2 = \frac{(x_{1i} - \bar{x}_1)^2 + (r^2 C_1^2 / C_2^2)(x_{2i} - \bar{x}_2)^2 - 2r(C_1/C_2)(x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2)}{(1 - r^2)^2 C_1^4}    (2.43)

Summing over i and using (2.35), (2.36a), and (2.36b), the numerator reduces to (1 - r^2) C_1^2, so that

\sum_{i=1}^{n} B_{1i}^2 = \frac{1}{(1 - r^2) C_1^2} \quad \text{and} \quad Var(b_1) = \frac{\sigma_e^2}{(1 - r^2) C_1^2}

When r = 0, this implies that the sample variance of the multiple regression slope reduces to the simple regression case.
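The identities ΣB1i = 0, ΣB1ix1i = 1, ΣB1ix2i = 0, and ΣB1i² = 1/[(1 − r²)C1²] used in this derivation can be checked numerically; the simulated regressors below are arbitrary, chosen only to be correlated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)   # deliberately correlated with x1

d1, d2 = x1 - x1.mean(), x2 - x2.mean()
C1, C2 = np.sqrt(d1 @ d1), np.sqrt(d2 @ d2)
r = (d1 @ d2) / (C1 * C2)

# B1i as defined via Eqs. (2.37)-(2.38)
B1 = (d1 - (r * C1 / C2) * d2) / ((1 - r**2) * C1**2)

print(np.allclose(B1.sum(), 0))    # sum B1i     = 0
print(np.allclose(B1 @ x1, 1))     # sum B1i x1i = 1
print(np.allclose(B1 @ x2, 0))     # sum B1i x2i = 0
# so Var(b1) = sigma_e^2 / ((1 - r^2) C1^2):
print(np.allclose(B1 @ B1, 1 / ((1 - r**2) * C1**2)))
```

All four checks print True; the last one is the ΣB1i² identity behind the variance formula above.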
Appendix 2: Cross-Sectional Relationship Among Price per Share, Dividend per Share, and Retained Earnings per Share

The dividends-and-earnings model, as shown in Eq. (2.47) below, shows the price of a share of stock to be linearly dependent upon dividends, earnings, and some constant term:

P = a_0 + a_1 D + a_2 Y.    (2.47)

The a1 and a2 terms are the point of interest; if a1 is greater than a2, then dividends have a relatively larger impact in the valuation procedure than do earnings, and vice versa. The multiple correlations for these eight regressions were quite high, averaging 0.905, with a range from 0.86 to 0.94. Of the dividend coefficients, seven were statistically significant at a 5% level (the only negative value being the insignificant one), while two of the earnings coefficients were not statistically significant. In each case, except the one where the dividend coefficient is insignificant, the dividend coefficient is greater in magnitude than the earnings coefficient. This evidence is shaky proof for dividend preference if one recognizes the multicollinearity problem and the apparent lack of consistency of the coefficients across industries, or stability over time periods, even in any relative sense.

The dividend-plus-investment-opportunities approach, which, as we said earlier, uses the retained-earnings figure as a growth proxy, hinting again at an all internally financed firm, is symbolized by Eq. (2.48) below:

P = a_0 + a_1 D + a_2 (Y − D).    (2.48)

The multiple correlation figures and the constant terms are the same as in the previous test; this might have been expected, since no new variables entered and none left the formulation, but the dividend and retained-earnings coefficients do change in relative values. Actually, the retained-earnings coefficients were the same as the earnings coefficients in the earlier trial, but it is really the difference, or the ratio, of the two that we are interested in. The dividend coefficients also appear to be relatively more consistent from industry to industry and stable in relative terms through time. The new-found stability and consistency is appealing because we would like to believe stocks are priced in some logical and consistent manner, which, given this evidence, indicates that dividends are three to four times as important in the valuation procedure as are growth opportunities. Once again, multicollinearity is a problem that is left unquestioned.

Gordon dismisses the pure-earnings approach with the realization that investors would be indifferent between dividends and increases in the market value of their holdings if they valued only earnings, dismissing market imperfections for now. If this were so, then the two relevant coefficients would tend to be equal, and since they were found to be drastically different in the second set of regressions, he discounts this as a viable valuation format.

In refining his model, it appears that Gordon was primarily interested in not allowing short-term aberrations to influence the parameter estimates, although he also adjusted all variables in accordance with a size factor, which, in some respects, could be interpreted as the inclusion of a previously omitted variable that could affect the stability of some of the coefficients. The new model appears as:

P = B_0 + B_1 d̄ + B_2 (d − d̄) + B_3 ḡ + B_4 (g − ḡ),    (2.49)

where

P = price per share/book value;
d̄ = five-year average dividend/book value;
d = current year's dividend/book value;
ḡ = five-year average retained earnings/book value;
g = current year's retained earnings/book value.

The multiple correlations from the use of this model are slightly lower than those in the dividend-plus-growth case presented earlier, but most noticeably the constant terms are now all quite small, whereas, with the previous models, they were at times quite large, although statistical significance was not discussed.
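The claim that moving from Eq. (2.47) to Eq. (2.48) leaves the constant term, the fit, and the retained-earnings coefficient unchanged, while the dividend coefficient absorbs the earnings coefficient, is pure reparameterization, since a1·D + a2·Y = (a1 + a2)·D + a2·(Y − D). A sketch on synthetic data (the values are illustrative, not Gordon's):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
D = rng.uniform(0.5, 3.0, n)                  # dividends per share (synthetic)
Y = D + rng.uniform(0.5, 4.0, n)              # earnings per share (synthetic)
P = 2 + 6 * D + 2 * Y + rng.normal(0, 1, n)   # price per share (synthetic)

def fit(*cols):
    """OLS coefficients of P on an intercept plus the given columns."""
    X = np.column_stack([np.ones(n)] + list(cols))
    return np.linalg.lstsq(X, P, rcond=None)[0]

a0, a1, a2 = fit(D, Y)        # Eq. (2.47): P = a0 + a1*D + a2*Y
b0, b1, b2 = fit(D, Y - D)    # Eq. (2.48): P = b0 + b1*D + b2*(Y - D)

print(np.allclose(a0, b0))       # same constant term
print(np.allclose(a2, b2))       # retained-earnings coeff equals earnings coeff
print(np.allclose(b1, a1 + a2))  # dividend coeff absorbs the earnings coeff
```

All three checks print True, which is why Gordon's comparison must rest on the difference, or ratio, of the two coefficients rather than on their raw values.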
Only the five-year dividend is significant at a 5% level for all eight regressions, with six of the current-less-the-long-run-average coefficients being significant and five of each of the growth coefficients being significant also, although not as much so as the dividend coefficients.

All things considered, the dividend factors appear to be the predominant influences on share values, although there are certain individual exceptions among the industries surveyed. The evidence presented here in the revised model and in the models presented earlier must be interpreted as supporting Gordon's contention that dividends do matter; in fact, given the methods used here, they are detected as being the most important variable that could be included in the valuation formula.

Friend and Puckett (1964) were concerned with unveiling the limitations of the type of analysis performed by Gordon and others. They pointed out a number of potential problems, the first being the accounting earnings and retained-earnings figures and the high degree of measurement error incurred when using these values as proxies for economic, or true, earnings. Again, it is assumed that investors value economic earnings and not accounting earnings per se. Risk measures are also missing from the analysis, as is any awareness of the dynamic nature of corporate dividend policy. In short, we must recognize not only a multicollinearity problem and potential specification errors, but also that omitted variables must be accounted for if the analysis is to be complete.

In a general sense, a composite firm-specific variable, F, could be included in the standard model, which contains dividends and retained earnings, yielding:

P = a_0 + a_1 D + a_2 R + F.    (2.50)

But to have any true economic meaning, this variable should be specified, so multiple statistical trials were run to see which economic variable would be best to include in the analysis. Attempts were also made to alter or adjust the retained-earnings figure so as to more accurately reflect its "true" value, as discussed in a preceding paragraph.

In running the statistical tests, five industry groups were analyzed at two points in time, along the same lines of reasoning as in Gordon's paper. The first trial was run using Eq. (2.48) as the model, without the F-coefficient. For each industry, the coefficients are relatively stable across time periods, and in two instances the retained-earnings coefficient is greater than the dividend coefficient. In particular, the electric utility industry is seen to be such a case, and the work was then redone using logarithms, to see if the linearity assumption was responsible for this result. Unsurprisingly, the coefficients change in relative magnitudes; this is due to a combination of the utility industry's high dividend yield and the nature of the logarithmic transformation. Friend and Puckett concede there is no reason to prefer one method over the other, and leave the functional form issue at that.

Returning to the omitted-variable issue, Friend and Puckett first included the previous period's price as the firm-specific variable. As a result, the retained-earnings coefficients are greater than the dividend coefficients, the latter being negative in six of ten instances. The multiple R-squared statistics are quite high but, by and large, the significance levels of the retained-earnings coefficients are not. Besides there being very little economic rationale for the inclusion of a lagged price variable, the statistics leave sizable uncertainties as to the validity of this approach.

Recognizing that accounting numbers over the short run are subject to sizable measurement error, the authors normalized earnings over ten years and then ran the regressions again without a firm-specific variable. This rendered all coefficients statistically significant at a 5% level, and the dividend coefficients were consistently higher than the retained-earnings coefficients. This suggests the same conclusions and interpretation as were offered by Gordon. Adding the normalized earnings/price ratio lagged one period, it was found that the intercept or constant terms get very large, as do the dividend coefficients relative to the retained-earnings coefficients, and the R-squared term for each regression is exceptionally high. The earnings/price coefficient was found to be highly negative; all these findings yield questionable results. In a separate analysis of the chemical industry using dividends and normalized earnings,
and then further adding the normalized earnings/price ratios, the retained-earnings coefficient was seen to dominate the dividend coefficient. Again, though, the other coefficients were given, and certainly appear to merit some attention.

In summation, Friend and Puckett argue for longer-term measures for variables of an accounting nature, and for earnings-to-price ratios as a sort of risk or firm-specific variable, in that it shows the rate at which the market capitalizes the firm's earnings. In using this variable, it should be recognized that the dependent variable is being used to explain itself. This tends to show the retained-earnings figure as important as, or more important in, valuation than dividends, thus supposedly invalidating previous results to the contrary. The real question appears to be which approach possesses the best economic grounding. We would like to think theory precedes the empiricism; and to pursue this, one must have an appropriate theory, a question to which Friend and Puckett do not address themselves.

Lee (1976) cited the Friend and Puckett evidence in attacking the issue of functional form, concentrating on the electric utility industry, where the risk differential between firms is often seen to be negligible, thereby eliminating a large portion of the firm-specific effects. Using the generalized functional form developed by Box and Cox (1964), the linear and log linear forms can be tested on a purely statistical basis, and the log linear form can be tested against a nonlinear form.

Quite interestingly, Lee found that the log linear form was statistically superior to the linear-form model in explaining the dividend effect. The results of this comparison were essentially the same as in Friend and Puckett's study. Using the linear form, nine of the ten years of data examined showed a stronger retained-earnings effect than that for dividends, while, in the log linear trial, all ten years showed stronger dividend effects. At this point, the only question remaining is whether either of these models accurately depicts the true functional form.

Using the true functional form from the generalized functional form method to compare the dividend and retained-earnings effects, it was found that in only four years was there any difference between the two effects; this leads to the conclusion that all models developed in linear or log linear form, and used to test this particular industry (and possibly other industries as well), are probably misspecified. This is a serious problem, in that the importance of dividends is muddied because the model is not correct. The true value of dividends can be inferred only if the model is correctly specified.

Part of the problem here is due to the nature of the industry, where high payouts are common and external financing is great and unaccounted for. These two factors serve to bias the dividend effect downward in the linear-form models.

The logarithmic-form models reduce the problem of weighting of regression coefficients due to size disparities among firms, a problem noticeably existent in the electric utility industry. However, the logarithmic form does have the disadvantage of being unable to cope with negative retained-earnings figures, a phenomenon that is a reality in the current environment.

These caveats to both methods of analysis should evoke more concern from empirical researchers in the future, since misspecification can have drastic effects on the conclusions reached.

Bibliography

Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211–243.
Fogler, H. R., & Ganapathy, S. (1982). Financial econometrics. Englewood Cliffs, NJ: Prentice-Hall.
Friend, I., & Puckett, M. E. (1964). Dividends and stock prices. American Economic Review, 656–682.
Ghosh, S. K. (1991). Econometrics: Theory and applications. Englewood Cliffs, NJ: Prentice Hall.
Gordon, M. J. (1959). Dividends, earnings and stock prices. Review of Economics and Statistics, 99–105.
Greene, W. H. (2017). Econometric analysis (8th ed.). New Jersey: Prentice Hall.
Gujarati, D., & Porter, D. (2011). Basic econometrics (5th ed.). New York: McGraw-Hill Education.
Johnston, J., & DiNardo, J. (1996). Econometric methods (4th ed.). New York: McGraw-Hill.
Lee, A. C., Lee, J. C., & Lee, C. F. (2017). Financial analysis, planning and forecasting: Theory and application (3rd ed.). Singapore: World Scientific.
Lee, C. F. (1976). Functional form and the dividend effect in the electric utility industry. Journal of Finance, 1481–1486.
Lee, C. F., Finnerty, J., Lee, J., Lee, A. C., & Wort, D. (2013a). Security analysis, portfolio management, and financial derivatives. World Scientific.
Lee, C. F., & Lee, A. C. (2013). Encyclopedia of finance (2nd ed.). New York, NY: Springer.
Lee, C. F., & Lee, J. C. (2015). Handbook of financial econometrics and statistics. New York, NY: Springer.
Lee, C.-F., Lee, J., & Lee, A. C. (2013b). Statistics for business and financial economics (3rd ed.). Berlin: Springer.
Theil, H. (1971). Principles of econometrics. New York: Wiley.
Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge, MA: The MIT Press.