Multiple Regression
Contents
2.1 Introduction .......................................... 20
2.4 The Residual Standard Error and the Coefficient of Determination .......................................... 24
2.6 Confidence Interval for the Mean Response and Prediction Interval for the Individual Response .......................................... 30
Bibliography .......................................... 52
Abstract

In this chapter, we first discuss the basic assumptions of multiple regression. We then specify the regression model and show how its coefficients can be estimated. Appendix 1 derives the sampling variance of the least squares slope estimates, and Appendix 2 shows how multiple regression can be used to investigate the cross-sectional relationship among price per share, dividend per share, and retained earnings per share.

2.1 Introduction
The simple regression model is examined with one independent variable (such as the amount of fertilizer) and one dependent variable (such as the yield of corn).* In many cases, however, more than one factor can affect the outcome under study. In addition to fertilizer, rainfall and temperature certainly influence the yield of corn. In business, the rate of return for the stock market at large is not the only factor that affects the return on General Motors or Ford stock; other variables, such as the leverage ratio, the payout ratio, and the dividend yield, also contribute. Therefore, regression analysis with more than one independent variable is an important analytical tool.

The model that extends simple regression to use with two or more independent variables is called multiple linear regression. Simple linear regression analysis helps us determine the relationship between two variables or predict the value of one variable from our knowledge of another. Multiple regression analysis, in contrast, is a technique for determining the relationship between a dependent variable and more than one independent variable. In addition, it can be used to employ several independent variables to predict the value of a dependent variable.

In Sect. 2.2, we discuss the assumptions of the multiple regression model. In Sect. 2.3, we discuss the estimated parameters of the multiple regression model. Section 2.4 discusses the residual standard error and the coefficient of determination.

*For basic concepts and models of simple regression, readers can refer to Chap. 15 of Statistics for Business and Financial Economics by Lee et al. (2013).

2.2 The Model and Its Assumptions

In this section, we first review the simple regression model and extend it to a multiple regression model. Then we define and analyze the regression plane for two independent variables. Finally, the important assumptions we must make to use the multiple regression model are explored in some detail.

The Multiple Regression Model
In multiple regression, simple regression is extended by introducing more than one independent variable. It is well known that a simple linear regression model can be defined as Yi = α + βxi + εi and its estimate as yi = a + bxi + ei. The sample intercept a and the sample slope b are estimates of α and β, respectively. The normal equations used to estimate the unknown parameters a and b are

na + b Σxi = Σyi
a Σxi + b Σxi² = Σxi yi

where the sums run over i = 1, …, n.
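As a numerical sketch (ours, not the book's), the two normal equations can be solved directly for a and b. Here we borrow the x1 and y columns of Table 2.1 below and fit y on x1 alone:

```python
# Sketch (not from the text): solve the two normal equations for a and b.
# Data: the x1 and y columns of Table 2.1, fitting y on x1 alone.
x = [5, 10, 9, 13, 15]
y = [15, 17, 26, 24, 27]
n = len(x)

Sx = sum(x)                                  # sum of x_i
Sy = sum(y)                                  # sum of y_i
Sxx = sum(xi * xi for xi in x)               # sum of x_i^2
Sxy = sum(xi * yi for xi, yi in zip(x, y))   # sum of x_i * y_i

# Normal equations:
#   n*a  + Sx*b  = Sy
#   Sx*a + Sxx*b = Sxy
# solved by Cramer's rule.
det = n * Sxx - Sx * Sx
a = (Sy * Sxx - Sx * Sxy) / det
b = (n * Sxy - Sx * Sy) / det
```

A quick consistency check on the solution: the fitted line passes through the point of means, so a + b·x̄ must equal ȳ.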
Yi = α + β1 X1i + β2 X2i + εi  (2.1)

and its estimate is

yi = a + b1 x1i + b2 x2i + ei  (2.2)

where Eqs. (2.1) and (2.2) represent the multiple population regression line and the multiple sample regression line, respectively. In Eq. (2.1), α is the intercept of the regression; β1 is the slope that represents the conditional relationship between Y and X1, assuming X2 is fixed; and β2 is the slope that represents the conditional relationship between Y and X2, assuming X1 is fixed. If the model defined in Eq. (2.1) is linear, then the relationship between Y and each of the independent variables can be described by a straight line. In other words, the conditional mean of the dependent variable is given by the following population regression equation:

E(Yi | X1 = x1, X2 = x2) = α + β1 x1 + β2 x2

The coefficients β1 and β2 are called partial regression coefficients. They indicate only the partial influence of each independent variable when the influence of all other independent variables is held constant. Just as in simple regression, the multiple sample regression line of Eq. (2.2) can be used to estimate the multiple population regression line of Eq. (2.1).

The Regression Plane for Two Explanatory Variables
Let us say that the stock price per share (y) can be modeled as a function of both dividend per share (x1) and retained earnings per share (x2).¹ (Retained earnings per share equals earnings per share minus dividend per share.) The first goal of the analysis is to obtain the estimated multiple regression model

ŷi = a + b1 x1i + b2 x2i  (2.2a)

The value of b1 indicates that after the influence of retained earnings per share is taken into account, a $1 increase in dividend per share (Di) will increase the mean value of the price per share (Pi) by b1, other things being equal. Similarly, a $1 increase in retained earnings per share will increase the mean price per share by b2. If there is only one explanatory variable, the estimated regression equation generates a straight line. There are two explanatory variables in Eq. (2.2a), so it represents a regression plane (a three-dimensional regression graph). On this three-variable regression plane, a combination of three observations (one for the value of y, one for x1, and one for x2) represents a single point. This point can be depicted on a three-dimensional scatter diagram. In Fig. 2.1, the best-fitted regression plane would pass near the actual sample observation points, some falling above the plane and some below, in such a way as to minimize L in

L = Σ (yi − ŷi)²  (2.3)

where yi and ŷi are as defined in Eqs. (2.2) and (2.2a), respectively.²

If there are k independent variables, then Eq. (2.1) can be generalized to

Yi = α + β1 X1i + β2 X2i + ⋯ + βk Xki + εi  (2.4)

¹ Practical examples based on Eq. (2.2) will be explored in the Applications section of this chapter.
² Using Eq. (2.3) to estimate regression parameters will be discussed in Sect. 2.3.
22 2 Multiple Linear Regression
Fig. 2.1 Regression plane with yi (Pi) as the dependent variable and x1i (Di) and x2i (REi) as the independent variables
Assumptions for the Multiple Regression Model
As in simple regression analysis, we need the following assumptions to perform a regression analysis of the model defined in Eq. (2.4).

1. The error term εi is distributed with conditional mean zero and variance σ² for i = 1, 2, …, n.
2. The error term εi is independent of each of the k independent variables X1, X2, …, Xk. In other words, there are no measurement errors associated with any independent variable. If there exists a measurement error in any independent variable, then the slope estimate will be biased. In Chap. 7 of this book, we will discuss alternative methods to deal with this problem.
3. Any two errors εi and εj are not correlated with one another; that is, their covariance is zero: Cov(εi, εj) = 0 for i ≠ j. This assumption means that there is no autocorrelation (serial correlation) among the residual terms.
4. The independent variables are not perfectly related to each other in a linear function. In other words, it is not possible to find a set of numbers d0, d1, d2, …, dk, not all zero, such that

d0 + d1 X1i + d2 X2i + ⋯ + dk Xki = 0,  i = 1, 2, …, n

In practice, the linear relationship among independent variables is usually not perfect. When a perfect linear relationship occurs, a condition known as perfect collinearity exists. Multicollinearity is the condition in which two or more independent variables are highly correlated.
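Assumption 4 can be illustrated numerically (a sketch of ours, not from the text): when one regressor is an exact linear function of the others, the matrix X′X formed from the intercept column and the regressors is singular, so the least squares normal equations have no unique solution.

```python
def xtx(cols):
    # Gram matrix X'X for a list of columns.
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x1 = [5, 10, 9, 13, 15]
ones = [1] * 5                     # intercept column

x2_ok = [7, 5, 14, 8, 6]           # Table 2.1: not a linear function of x1
x2_bad = [2 * v + 3 for v in x1]   # exact linear function: d0 = 3, d1 = 2

det_ok = det3(xtx([ones, x1, x2_ok]))    # nonzero: unique LS solution exists
det_bad = det3(xtx([ones, x1, x2_bad]))  # exactly zero: perfect collinearity
```

With the collinear column, det_bad is exactly zero, so b1 and b2 cannot be separated; this is the formal content of assumption 4.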
2.3 Estimating Multiple Regression Parameters

To estimate the best-fitted regression plane, we use the least squares method to estimate the regression parameters. The principle of the least squares method leads to a system of normal equations; there are two equations and two unknowns, b1 and b2, associated with this equation system. Hence, we can solve for b1 and b2 uniquely by substitution:

b1 = [Σx1i′yi′ · Σx2i′² − Σx2i′yi′ · Σx1i′x2i′] / [Σx1i′² · Σx2i′² − (Σx1i′x2i′)²]

where the primes denote deviations from the sample means (e.g., x1i′ = x1i − x̄1) and all sums run over i = 1, …, n.
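This slope formula, together with its mirror image for b2 (interchange the roles of x1 and x2) and the intercept a = ȳ − b1x̄1 − b2x̄2, can be verified on the Example 2.1 data of Table 2.1 below. A sketch of ours that reproduces the coefficients of Eq. (2.12):

```python
# Example 2.1 data (Table 2.1).
x1 = [5, 10, 9, 13, 15]
x2 = [7, 5, 14, 8, 6]
y = [15, 17, 26, 24, 27]
n = len(y)

m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
d1 = [v - m1 for v in x1]   # x1' deviations from the mean
d2 = [v - m2 for v in x2]   # x2' deviations
dy = [v - my for v in y]    # y'  deviations

S11 = sum(v * v for v in d1)
S22 = sum(v * v for v in d2)
S12 = sum(p * q for p, q in zip(d1, d2))
S1y = sum(p * q for p, q in zip(d1, dy))
S2y = sum(p * q for p, q in zip(d2, dy))

den = S11 * S22 - S12 ** 2
b1 = (S1y * S22 - S2y * S12) / den   # slope formula above
b2 = (S2y * S11 - S1y * S12) / den   # same formula with x1 and x2 swapped
a = my - b1 * m1 - b2 * m2           # intercept from the point of means
```

Rounded to four decimals, these reproduce a = 0.980, b1 = 1.2385, and b2 = 0.9925 of Eq. (2.12).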
Table 2.1 Data for Example 2.1

x1i | x2i | yi
5 | 7 | 15
10 | 5 | 17
9 | 14 | 26
13 | 8 | 24
15 | 6 | 27
Total | 52 | 40 | 109.0
Mean | 10.4 | 8 | 21.8

(… to show how computers calculate the mean, variance, and covariance. You do not need to remember the procedure.) Substituting the information from Table 2.2 into Eqs. (2.7), (2.8), and (2.10), we obtain the values of b1, b2, and a. Hence, the regression line of Eq. (2.11) becomes

ŷi = 0.980 + 1.2385 x1i + 0.9925 x2i  (2.12)

The next section shows how to compute the standard errors of the estimates and the coefficient of determination.

2.4 The Residual Standard Error and the Coefficient of Determination

As in the case of simple regression, the standard error of estimate can be used as an absolute measure, and the coefficient of determination as a relative measure, of how well the multiple regression equation fits the observed data.
Σ(yi − ȳ)² = Σ(yi − ŷi)² + Σ(ŷi − ȳ)²  (2.13)
Sum of Squares Total (SST) = Sum of Squares Error (SSE) + Sum of Squares due to Regression (SSR)

The estimated dependent variable (ŷi) of a multiple regression is determined by two or more independent variables. SSR and SSE are the explained and unexplained sums of squares, respectively.

Using the definition of the sum of squares error, we can define the estimate of the standard deviation of the error terms, sometimes called the residual standard error, as

se = √[Σ(yi − ŷi)² / (n − 3)]  (2.14)

Because there are three parameters (a, b1, and b2) for Eq. (2.2) that we must estimate before calculating the residuals, the number of degrees of freedom is (n − 3). In other words, (n − 3) sample values are "free" to vary. More generally, the number of degrees of freedom for estimating the residual standard error for Eq. (2.4) is [n − (k + 1)].

Example 2.2 (Computing ŷi, ei, and ei²)
Using the data presented in Example 2.1, we can estimate ŷi, ei, and ei², as shown in Table 2.3. Here ŷi is obtained by substituting x1i and x2i into Eq. (2.12). For example, 14.1198 = 0.980 + 1.2385(5) + 0.9925(7); ei = yi − ŷi.

Σ(yi − ŷi)² = (15 − 14.1198)² + (17 − 18.3272)² + (26 − 26.0209)² + (24 − 25.0200)² + (27 − 25.5120)² = 5.7912

Table 2.3 Actual values, predicted values, and residuals for the annual salary regression

Actual value, yi | Predicted value, ŷi | Residual, ei | ei²
15 | 14.1198 | 0.8802 | 0.7748
17 | 18.3272 | −1.3272 | 1.7615
26 | 26.0209 | −0.0209 | 0.0004
24 | 25.0200 | −1.0200 | 1.0404
27 | 25.5120 | 1.4880 | 2.2141
Total 109 | − | − | 5.7912

Hence,

se = √[5.7912 / (5 − 3)] = 1.7016

se is one of the important components in determining the distributions of the estimated a, b1, and b2 and of the fitted dependent variable (ŷ).

The Coefficient of Determination
We can use Eq. (2.13) to calculate a relative measure of the goodness of fit for a multiple regression:

R² = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = explained variation of y (SSR) / total variation of y (SST) = 1 − SSE/SST  (2.15)

The coefficient of determination R² is the proportion of the total variation in y (SST) that is explained by the intercept and the independent variables x1 and x2. Note that both R² and se can be used to measure the goodness of fit for a regression. However, R² is a relative measure and se an absolute measure. Now we use the ANOVA table given in Table 2.4 to calculate the relationship between R² and se for the general multiple regression model in Eq. (2.4).

There are four columns in Table 2.4. Column (1) represents the sources of variation, column (2) the sums of squares, which are identical to those discussed in Eq. (2.13), column (3) the degrees of freedom associated with each source of variation, and column (4) the mean squares.
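The computations of Table 2.3 and Eqs. (2.14) and (2.15) can be reproduced in a few lines (a sketch of ours, using the rounded coefficients of Eq. (2.12), so the fitted values differ from Table 2.3 only in the fourth decimal place):

```python
# Reproduce Table 2.3 and Eqs. (2.14)-(2.15) from the fitted Eq. (2.12).
x1 = [5, 10, 9, 13, 15]
x2 = [7, 5, 14, 8, 6]
y = [15, 17, 26, 24, 27]
n, k = len(y), 2

# Fitted values from the rounded coefficients of Eq. (2.12).
yhat = [0.980 + 1.2385 * u + 0.9925 * v for u, v in zip(x1, x2)]

ybar = sum(y) / n
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))   # sum of squares error
sst = sum((yi - ybar) ** 2 for yi in y)                # total sum of squares
ssr = sst - sse                                        # due to regression
se = (sse / (n - k - 1)) ** 0.5                        # Eq. (2.14), residual standard error
r2 = 1 - sse / sst                                     # Eq. (2.15)
```

This recovers SSE ≈ 5.7912, SST = 118.8, se ≈ 1.7016, and R² ≈ 0.9513, which are the values used in the ANOVA table below.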
Table 2.4 Notation of the analysis of variance table

(1) Source of variation | (2) Sum of squares | (3) Degrees of freedom | (4) Mean square
Due to regression | SSR = Σ(ŷi − ȳ)² | k | SSR/k
Residual | SSE = Σ(yi − ŷi)² | n − k − 1 | SSE/(n − k − 1)
Total | SST = Σ(yi − ȳ)² | n − 1 | SST/(n − 1)

Table 2.5 Analysis of variance results

Source of variation | Sum of squares | Degrees of freedom | Mean square
Due to regression | 113.0088 | k = 2 | 56.5044
Residual | 5.7912 | 5 − 2 − 1 = 2 | 2.8956
Total | 118.8 | 5 − 1 = 4 | 29.7
2.5 Tests on Sets and Individual Regression Coefficients

… for the individual tests, and we normally abandon or modify the model. If the joint test is rejected, we must find out which regression coefficients are significant, so we perform individual tests.

Test on Sets of Regression Coefficients
Until now, our discussion has been limited to point estimation of the multiple regression coefficients, the coefficient of determination, and the standard error of estimate. Now we will discuss how to use the F-statistic to test whether all true population regression (slope) coefficients equal zero. The F-test rather than the t-test is used. The hypotheses for our case are

H0: β1 = β2 = ⋯ = βk = 0
H1: At least one β is not zero.  (2.18)

If the null hypothesis is not true, then each ŷi will differ from ȳ substantially, and the explained variation Σ(ŷi − ȳ)² will be large relative to the unexplained residual variation Σ(yi − ŷi)². In other words, the R² indicated in Eq. (2.15) is relatively large. Thus, we can construct the F ratio as indicated in Eq. (2.19) to test whether the null hypothesis can be rejected:

F(k, n−k−1) = [Σ(ŷi − ȳ)² / k] / [Σ(yi − ŷi)² / (n − k − 1)]  (2.19)

The F ratio we have constructed is the ratio of two mean squares, as we noted in the last section, and they are two unbiased estimates of variances. Following the definition of the F distribution, we know that the F ratio has an F distribution with k and (n − k − 1) degrees of freedom. This F ratio enables us to test whether at least one of the regression coefficients is significantly different from zero.

Consider the case k = 2. If there is no regression relationship (i.e., if β1 = β2 = 0), then the ŷi will be close or equal to ȳ, so the F-value will be small or close to zero. Thus, we cannot reject the null hypothesis that all regression coefficients are insignificantly different from zero.

Substituting the related data from Table 2.5 into Eq. (2.19), we obtain

F = (113.0088/2) / (5.7912/2) = 56.5044/2.8956 = 19.514

From the F-distribution critical value table, we find that the critical value for a significance level of α = 0.05 is F(0.05, 2, 2) = 19.0, which is smaller than 19.514. Therefore, we can conclude that at least one of the regression coefficients is significantly different from zero. Thus, there is a regression relationship in the population, and the improvement of explanatory power achieved by fitting a regression plane is not due to chance. In other words, the null hypothesis that years of education and years of work experience contribute nothing to an individual's annual salary is rejected at a 5 percent level of significance.

Finally, the relationship between the R² indicated in Eq. (2.15) and the F-statistic in Eq. (2.19) can be shown to be⁶

F(k, n−k−1) = [(n − k − 1)/k] · [R² / (1 − R²)]

Hypothesis Tests for Individual Regression Coefficients
In the last section, we used the F-statistic to do a joint test about a regression relationship. Now we want to use the t-statistic to test whether the multiple regression coefficients are individually significantly different from zero.

⁶ Because R² = 1 − SSE/SST = SSR/SST, dividing the numerator and denominator of Eq. (2.19) by SST yields this expression.
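Both routes to the F-statistic, the ANOVA ratio of Eq. (2.19) and the R² form of footnote 6, can be checked against Table 2.5 (a sketch of ours):

```python
# ANOVA quantities from Table 2.5 (annual salary example, n = 5, k = 2).
ssr, sse, sst = 113.0088, 5.7912, 118.8
n, k = 5, 2

msr = ssr / k              # mean square due to regression
mse = sse / (n - k - 1)    # mean square error
f = msr / mse              # Eq. (2.19)

# Equivalent route through R-squared (footnote 6).
r2 = ssr / sst
f_from_r2 = ((n - k - 1) / k) * r2 / (1 - r2)
```

Both expressions give F ≈ 19.514, which exceeds the critical value F(0.05, 2, 2) = 19.0 used in the text.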
The t-statistic for the jth slope coefficient is the ratio

t = bj / Sbj

where tn−k−1 represents a t-statistic with (n − k − 1) degrees of freedom, k is the number of independent variables, and Sbj represents the standard error associated with bj. The concepts and procedure used to calculate Sbj are similar to those used for simple regression. However, Sbj is quite tedious to calculate by hand; fortunately, its value is readily available in the computer output of any standard regression analysis program. Thus, in practice, we find t simply by taking the ratio of the coefficient to its estimated standard error. When the calculated value of t exceeds the critical value tα,n−k−1 indicated in the t-distribution table, the null hypothesis of no significance can be rejected. We conclude that the jth independent variable xj does have an important influence on the dependent variable yi after the influences of the other independent variables are taken into account.

For our example, the sample variances of the two slope estimates are⁷

Sb1² = (2.8956)(50) / [(59.2)(50) − (−11)²] = (2.8956)(50) / 2839 = 0.05100

and

Sb2² = (2.8956)(59.2) / 2839 = 0.06038

⁷ Derivations of Eqs. (2.23) and (2.24) can be found in Appendix 1. Note that these two equations are generally estimated by computer packages (see Sect. 2.8). Manual approaches are presented here to show how the sample variances of multiple regression slopes are actually calculated.
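These two variance computations, and the t-values derived from them in the next step, can be sketched as follows (our code; the deviation sums of squares are those of the Example 2.1 data):

```python
# Sketch of the slope-variance computations (Eqs. (2.23) and (2.24)):
# each slope variance is s_e^2 times the OTHER regressor's deviation sum
# of squares, divided by S11*S22 - S12^2.
S11, S22, S12 = 59.2, 50.0, -11.0   # deviation sums of squares and cross-product
se2 = 2.8956                        # s_e^2 = SSE / (n - k - 1)

den = S11 * S22 - S12 ** 2          # 2960 - 121 = 2839
var_b1 = se2 * S22 / den            # = 0.05100
var_b2 = se2 * S11 / den            # = 0.06038

sb1 = var_b1 ** 0.5
sb2 = var_b2 ** 0.5
t_b1 = 1.2385 / sb1                 # t-value for b1
t_b2 = 0.9925 / sb2                 # t-value for b2
```

The resulting t-values match the hand calculation below up to rounding of the standard errors.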
Then Sb1 = 0.2258 and Sb2 = 0.2457. Dividing b1 and b2 by Sb1 and Sb2, we obtain the t-values for b1 and b2:

t_b1 = 1.2385 / 0.2258 = 5.4849
t_b2 = 0.9925 / 0.2457 = 4.0395

Because n = 5 and k = 2, from the t-distribution critical value table, the critical value for a one-tailed test on either coefficient (at a significance level of α = 0.05) is

tα,n−k−1 = t0.05,2 = 2.920

We choose a one-tailed test because our a priori theoretical propositions were that both x1 and x2 are positively related to y. Comparing 5.4849 and 4.0395 with 2.920, we conclude that both years of education and years of work experience are significantly related to an individual's annual salary.

Figure 2.2 presents all the estimates and hypothesis-testing information we have discussed in the last three sections. This example certainly
Fig. 2.2 MINITAB output of multiple regression in terms of data given in Table 2.1
proves that multiple regression analysis can be performed more efficiently by using the MINITAB computer program.

2.6 Confidence Interval for the Mean Response and Prediction Interval for the Individual Response

Point Estimates of the Mean and the Individual Responses
One of the important uses of the multiple regression line is to obtain predictions and forecasts for the dependent variable, given an assumed set of values of the independent variables. This kind of prediction is called conditional prediction (forecasting), just as in simple regression. Suppose the independent variables are equal to some specified values x1,n+1 and x2,n+1, and that the linear relationship among yn, x1,n, and x2,n continues to hold.⁸ Then the corresponding value of the dependent variable Yn+1 is

Yn+1 = α + β1 x1,n+1 + β2 x2,n+1 + εn+1  (2.25)

which, given x1,n+1 and x2,n+1, has expectation

E(Yn+1 | x1,n+1, x2,n+1) = α + β1 x1,n+1 + β2 x2,n+1  (2.26)

Equation (2.26) yields the mean response E(Yn+1 | x1,n+1, x2,n+1) that we want to estimate when the independent variables are fixed at x1,n+1 and x2,n+1. Equation (2.25) yields the actual value (or individual response) that we want to predict.

To obtain the best point estimate, we first estimate the sample regression line as defined in Eq. (2.2). Then we substitute the given values x1,n+1 and x2,n+1 into the estimated Eq. (2.12), obtaining

ŷn+1 = a + b1 x1,n+1 + b2 x2,n+1  (2.27)

This is the best point estimate for both the conditional-expectation and the actual-value forecast. In other words, the forecast of the conditional expectation value is equal to the forecast of the actual value. However, the forecasts are interpreted differently. The importance of these different interpretations will emerge when we investigate the process of making interval estimates.

Interval Estimates of Forecasts
To construct a confidence interval for forecasts, it is necessary to know the distribution, mean, and variance of ŷn+1. The distribution of ŷn+1 is a t-distribution with (n − 3) degrees of freedom. The variance associated with ŷn+1 may be classified into three cases. First, we deal with the case in which the conditional mean (ŷn+1) is equal to the unconditional mean (ȳ). In the second and third cases, we deal with the conditional mean; however, case 2 involves the mean response and case 3 the individual response.

Case 2.1 [Conditional Expectation (Mean Response) with x1,n+1 = x̄1 and x2,n+1 = x̄2]
From the definitions of the intercept of a regression and the sample regression line, we have

ŷn+1 = (ȳ − b1x̄1 − b2x̄2) + b1 x1,n+1 + b2 x2,n+1 = ȳ + b1(x1,n+1 − x̄1) + b2(x2,n+1 − x̄2)

If x1,n+1 = x̄1 and x2,n+1 = x̄2, then ŷn+1 = ȳ. We can obtain the estimate of the variance for ŷn+1 as

s²(ŷn+1) = s²(ȳ) = se²/n  (2.28)

Case 2.2 [Conditional Expectation (Mean Response) with x1,n+1 ≠ x̄1 and x2,n+1 ≠ x̄2]
In this case, the forecast value can be defined as

ŷn+1 = ȳ + b1(x1,n+1 − x̄1) + b2(x2,n+1 − x̄2)  (2.29)

⁸ x1,n+1 and x2,n+1 can be either given values or forecasted values. When a regression is used to describe a time-series relationship, they are forecasted values.
We therefore obtain the estimate of the variance for ŷn+1 in terms of the sample variance of the estimates, se², as

s1² = s²(ŷn+1) = se² [ 1/n + (x1,n+1 − x̄1)² / ((1 − r²)C1²) + (x2,n+1 − x̄2)² / ((1 − r²)C2²) − 2(x1,n+1 − x̄1)(x2,n+1 − x̄2) r / ((1 − r²)C1C2) ]  (2.30)

where C1 = √[Σ(x1i − x̄1)²], C2 = √[Σ(x2i − x̄2)²], and r is the correlation coefficient between x1i and x2i.

Case 2.3 [Actual Value (Individual Response) of yn+1]
After we have derived the sample variance for ŷn+1, we derive the sample variance for the individual response (observation) yn+1, which deviates from ŷn+1 by a random error e:

yn+1 = ŷn+1 + e

The variance of an individual observation yn+1 includes the variance of the observation about the regression line, se², as well as s²(ŷn+1). Because ŷn+1 and e are independent, s²(yn+1) = s²(ŷn+1) + se². More explicitly,

s2² = s1² + se²  (2.31)

where s1² is defined in Eq. (2.30).

Using Eqs. (2.28), (2.30), and (2.31), we can obtain confidence intervals for prediction as follows. For the mean response, the confidence interval is

ŷn+1 ± (tn−3,α/2) s1  (2.33)

where s1 is defined in Eq. (2.30). For prediction of the actual value yn+1, the prediction interval is

ŷn+1 ± (tn−3,α/2) s2  (2.34)

where s2 is defined in Eq. (2.31).

To show how Eq. (2.34) is applied in constructing the confidence interval for forecasting the actual value of yn+1, let us use the annual salary example (Table 2.2) to find the 95% prediction interval for annual salary, yn+1, when a person has 6 years of education and 5 years of work experience. The predicted annual salary can be computed from Eq. (2.12):

ŷn+1 = 0.980 + (1.2385)(6) + (0.9925)(5) = 13.3735 (in thousands of dollars)

From Table 2.2, we have

Σ(x1i − x̄1)² = 59.2,  Σ(x2i − x̄2)² = 50,  Σ(x1i − x̄1)(x2i − x̄2) = −11,  n = 5,  x̄1 = 10.4,  x̄2 = 8

Using this information, we calculate

C1 = √59.2 = 7.6942,  C2 = √50 = 7.0711
r = Σ(x1i − x̄1)(x2i − x̄2) / √[Σ(x1i − x̄1)² · Σ(x2i − x̄2)²] = −11/54.406 = −0.2022
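The remaining steps (s1² from Eq. (2.30), s2 from Eq. (2.31), and the interval from Eq. (2.34)) can be sketched as follows (our code; the critical value t0.025,2 = 4.303 is taken from a t-distribution table):

```python
# Inputs from the annual salary example (n = 5, k = 2).
se2 = 2.8956                          # s_e^2
n = 5
x1bar, x2bar = 10.4, 8.0
C1, C2 = 59.2 ** 0.5, 50.0 ** 0.5     # 7.6942, 7.0711
r = -11.0 / (C1 * C2)                 # correlation between x1 and x2, about -0.2022

x1new, x2new = 6.0, 5.0               # 6 years of education, 5 years of experience
yhat = 0.980 + 1.2385 * x1new + 0.9925 * x2new   # point forecast, 13.3735

d1, d2 = x1new - x1bar, x2new - x2bar
s1_sq = se2 * (1.0 / n
               + d1 ** 2 / ((1 - r ** 2) * C1 ** 2)
               + d2 ** 2 / ((1 - r ** 2) * C2 ** 2)
               - 2 * d1 * d2 * r / ((1 - r ** 2) * C1 * C2))   # Eq. (2.30)
s2 = (s1_sq + se2) ** 0.5                                      # Eq. (2.31)

t_crit = 4.303                        # t_{0.025, n-3} from the t-distribution table
lower = yhat - t_crit * s2            # Eq. (2.34)
upper = yhat + t_crit * s2
```

The resulting interval is approximately (3.466, 23.281), agreeing with the MINITAB output quoted just below.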
When n is large, we can modify this expression by replacing t with the appropriate normal deviate z.

MINITAB output showing the prediction results for x1,n+1 = 6 and x2,n+1 = 5 is presented in Fig. 2.3. The prediction interval shown in the last row of Fig. 2.3 is (3.466, 23.280), which is similar to what we calculated before.

In the next two sections, we will explore applications of multiple regression in business and economics. Section 2.8 explicitly treats the use of the SAS and MINITAB computer programs to do multiple regression analyses.

Bobko and Donnelly estimated this multiple regression model using data obtained from interviews. Their regression results are presented in Table 2.6. As would be expected, performance level was the single best predictor of the 95 estimates of judgments of overall worth. The other job-level correlates were combat probability, enlistment bonus, reenlistment bonus, aptitude, cost of error, and task variety. The first six of these predictors had statistically significant regression weights (coefficients), p-value < 0.05, indicating their unique contribution to the prediction of the overall worth estimates. However, task variety was not statistically significant.
… population coefficient for the market is not equal to zero. The t-statistic for the payout ratio, which is calculated by dividing the parameter value (−0.2133) by the standard error (0.5613), is −0.3800. Its p-value is 0.7061; thus, the null hypothesis cannot be rejected.

R² for the regression is 0.106. In other words, the independent variables explain about 10.6% of the variation in the rate of return on JNJ stock. The adjusted R-square, R̄², which takes into account overfitting in the sample, is equal to 0.057.

The F-value, which tests the hypothesis that the population coefficients of the independent variables are both zero against the alternative that they are not, is equal to 2.184. The degrees of freedom associated with this F-value are v1 = 2 and v2 = 37. From the F-distribution critical value table, we find that the critical values for the F-test are F(0.01, 2, 30) = 5.39 and F(0.01, 2, 40) = 5.18. Because the F-value for the regression is less than the critical value 5.39, the null hypothesis cannot be rejected.

Application 2.3 Analyzing the Determination of Price per Share
To further demonstrate multiple regression techniques, let us say that a cross-sectional regression is run. In a cross-sectional regression, all data come from a single period. The dependent variable in this regression is the price per share (Pj) of the 30 firms used to compile the Dow Jones Industrial Average for the year 2009. The independent variables are the dividend per share (DPSj) and the retained earnings per share (EPSj) for the 30 firms. (Retained earnings per share is defined as earnings per share minus dividend per share.
2.7 Business and Economic Applications 35
Price per share is the close price at the end of year 2009; dividend per share and retained earnings per share are based on the 2009 annual balance sheet and income statement.) The sample regression relationship is

Pj = a + b1 DPSj + b2 EPSj + ej  (j = 1, 2, …, 30)

Empirical results are presented in Table 2.8.

Table 2.8 Pj = a + b1 DPSj + b2 EPSj + ej

Variable | Coefficient | Standard error | t-value | p-value
Constant | 12.800 | 5.084 | 2.518 | 0.018
DPS | 12.836 | 4.657 | 2.756 | 0.010
EPS | 0.978 | 0.218 | 4.478 | 0.000
R² = 0.724;  R̄² = 0.703;  F-value = 35.35;  Observations = 30

The constant term is significant, with a t-value of 2.518. This result means that the intercept term is statistically different from zero, and the null hypothesis can be rejected at both the 10 and the 5% levels. The retained earnings per share variable is highly significant, with a t-value of 4.478 and a p-value of 0.000. Thus, we can reject the null hypothesis that the coefficient is equal to zero and accept the alternative hypothesis that it differs from zero and makes a contribution to price per share. The coefficient for this variable is 0.978; mean price per share increases by $0.978 when retained earnings per share increase by $1.00, given the dividend.

The coefficient for the dividend per share variable has a t-value of 2.756 and a p-value of 0.010. This is the lowest level of significance at which the null hypothesis can be rejected; thus, the null hypothesis is rejected at both the 10 and the 5% levels. The coefficient for dividend per share is 12.836. When the dividend increases by $1.00, the price per share tends to rise by $12.836.

The value of R² is 0.724, which means that the model explains 72.4% of the observed fluctuations in the price per share. The adjusted R-square, R̄², is 0.703. The F-value for the regression is 35.35. The numbers of degrees of freedom for the regression and the residual are 2 and 27, respectively. The critical value for F at a 1% level of significance is 5.49. Because the regression F-value is greater than the critical value, the null hypothesis that the coefficients are equal to zero is rejected.

The relationship among price per share, dividend per share, and retained earnings per share will be discussed in Appendix 2.

Application 2.4 Multiple Regression Approach to Evaluating Real Estate Property
To show how the multiple regression technique can be used by real estate appraisers, Andrews and Ferguson (1986) used the data in Table 2.9 to do the multiple regression analysis

yi = b0 + b1 x1i + b2 x2i + ei

where

yi = sale price of the ith house
x1i = home size of the ith house
x2i = condition rating of the ith house

Table 2.9 Sale price, house size, and condition rating

Sale price, y (thousands of dollars) | Home size, x1 (hundreds of sq. ft.) | Condition rating, x2 (1–10)
60.0 | 23 | 5
32.7 | 11 | 2
57.7 | 20 | 9
45.5 | 17 | 3
47.0 | 15 | 8
55.3 | 21 | 4
64.5 | 24 | 7
42.6 | 13 | 6
54.5 | 19 | 7
57.5 | 25 | 2
Source: R. L. Andrews and J. T. Ferguson, "Integrating Judgment with a Regression Appraisal," The Real Estate Appraiser and Analyst, Vol. 52, No. 2, Spring 1986 (Table 1)
Table 2.10 Hours of labor and related factors that cause costs to be incurred

Week | Hours of labor, y | Thousands of pounds shipped, x1 | Percentage of units shipped by truck, x2 | Average number of pounds per shipment, x3
1 | 100 | 5.1 | 90 | 20
2 | 85 | 3.8 | 99 | 22
3 | 108 | 5.3 | 58 | 19
4 | 116 | 7.5 | 16 | 15
5 | 92 | 4.5 | 54 | 20
6 | 63 | 3.3 | 42 | 26
7 | 79 | 5.3 | 12 | 25
8 | 101 | 5.9 | 32 | 21
9 | 88 | 4.0 | 56 | 24
10 | 71 | 4.2 | 64 | 29
11 | 122 | 6.8 | 78 | 10
12 | 85 | 3.9 | 90 | 30
13 | 50 | 3.8 | 74 | 28
14 | 114 | 7.5 | 89 | 14
15 | 104 | 4.5 | 90 | 21
16 | 111 | 6.0 | 40 | 20
17 | 110 | 8.1 | 55 | 16
18 | 100 | 2.9 | 64 | 19
19 | 82 | 4.0 | 35 | 23
20 | 85 | 4.8 | 58 | 25
Source: G. J. Benston (1966), "Multiple Regression Analysis of Cost Behavior," Accounting Review, Vol. 41, No. 4, 657–672. Reprinted by permission of the publisher
x2t = percentage of units shipped by truck in tth week
x3t = average number of pounds per shipment in tth week

MINITAB regression output is presented in Fig. 2.5. From the p-values indicated in Fig. 2.5, we find that b0 and b3 are significantly different from 0 at α = 0.01. Hence, we can conclude that the only important variable in determining the hours of labor required in the shipping department is the average number of pounds per shipment.

2.8 Using Computer Programs to Do Multiple Regression Analyses

2.8.1 SAS Program for Multiple Regression Analysis

In an example taken from Churchill's Marketing Research, data for the sales of click ballpoint pens (y), advertising (x1, measured in TV spots per month), number of sales representatives (x2), and a wholesaler efficiency index (x3) were presented in Table 2.11. We investigated not only the relationship between two variables (y and x1, y and x2, and y and x3); now we will expand that analysis by using the following three regression models:

y_i = a + b_1 x_{1i} + e_i    (a)
y_i = a + b_1 x_{1i} + b_2 x_{2i} + e_i    (b)
y_i = a + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i} + e_i    (c)

(In these regressions, we hold the price of a ballpoint pen and the income of a consumer constant, because this is a set of cross-sectional data.)

Equation (a) can be used to investigate the relationship between y and x1. Equation (b) can be used to analyze whether the second explanatory variable, x2, improves the equation's power to explain the variation of sales. Equation (c) can be used to analyze whether the third explanatory variable, x3, further improves that explanatory power. Part of the output of the SAS program for Eqs. (a–c) is presented in Fig. 2.6a–c. Figure 2.6a shows the regression results of Eq. (a), Fig. 2.6b the regression results of Eq. (b), and Fig. 2.6c the regression results of Eq. (c). Using these results, we will review and summarize simple regression and multiple regression results.

Computer outputs of Fig. 2.6a–c present the following results of simple and multiple regression:

1. Estimated intercept and slopes;
2. F-values for the whole regression;
3. t-values for individual regression coefficients;
4. ANOVA of regression;
5. R² and adjusted R² (R̄²);
6. p-values;
7. Durbin–Watson D and first-order autocorrelation;
8. Standard error of residual estimate (mean square of error);
9. Root MSE = √MSE. For example, for Eq. (b), Root MSE = √2039.85310 = 45.16473. The root MSE estimate can be used to measure the performance of prediction.

These SAS regression outputs give us almost all the sample statistics we have examined so far. Now let us consider the practical implications of Eqs. (a–c).

Equation (b) specifies a regression model in which sales is the dependent variable and the independent variables are number of TV spots x1 and number of sales representatives x2. The fitted regression equation is

ŷ = 69.3 + 14.2x1 + 37.5x2,   F = 128.141
    (2.994)  (5.315)  (5.393)

Here t-values are indicated in parentheses.

This regression indicates that when the number of TV spots increases by 1 unit, sales increase by $14,200 on average; and when the number of sales representatives increases by 1 person, sales increase by $37,500 on average. The F-value for the regression of Eq. (b) is 128.141. There are 40 observations and 2 independent variables, so the number of degrees of freedom in the model is 40 − 2 − 1 = 37. By interpolation in the F-distribution critical value table, it can be shown that the critical value F0.05,2,37 is 3.25.
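The three nested models (a)–(c) can be fitted with ordinary least squares in a few lines. Table 2.11 itself is not reproduced in this excerpt, so the sketch below generates synthetic data whose variable names merely mirror the example; the coefficients, ranges, and seed are illustrative assumptions, not Churchill's figures.

```python
import numpy as np

def ols_r2(y, *xs):
    """Regress y on an intercept plus the given regressors; return R-squared."""
    X = np.column_stack([np.ones_like(y)] + list(xs))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

rng = np.random.default_rng(0)
n = 40
x1 = rng.uniform(5, 15, n)    # TV spots per month (synthetic)
x2 = rng.uniform(1, 10, n)    # number of sales representatives (synthetic)
x3 = rng.uniform(0, 1, n)     # wholesaler efficiency index (synthetic)
y = 60 + 14 * x1 + 37 * x2 + rng.normal(0, 40, n)   # x3 plays no real role

r2_a = ols_r2(y, x1)            # Eq. (a)
r2_b = ols_r2(y, x1, x2)        # Eq. (b): explains clearly more than (a)
r2_c = ols_r2(y, x1, x2, x3)    # Eq. (c): R-squared can only tick up slightly
print(round(r2_a, 3), round(r2_b, 3), round(r2_c, 3))
```

Because the models are nested, R² is non-decreasing from (a) to (c); whether an increase is worth the extra variable is exactly the question the stepwise analysis later in this section answers.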
Because the F-value for the regression is greater than the critical value, the hypothesis that the coefficients are equal to zero is rejected. From the t-values associated with the estimated regression coefficients, we find that the estimated intercept and slopes are significant at α = 0.01.

Because the t-value of b2 is significantly different from zero, we conclude that adding the number of sales representatives improves the equation's power to explain sales. This conclusion can also be drawn from the fact that R̄² has increased from 0.7687 to 0.8670.

The fitted regression of Eq. (c) is

ŷ = 31.1504 + 12.9682x1 + 41.2456x2 + 11.5243x3,   F = 89.051
     (0.911)   (4.738)    (5.666)    (1.498)

Again, t-values are indicated in parentheses.

Following Sect. 2.5, we first test the whole set of regression coefficients in terms of the F-statistic. From the F-distribution critical value table, by interpolation, we find that F0.05,3,36 = 2.88. F = 89.051 is much larger than 2.88. This implies that we reject the following null hypothesis of our joint test:

H0: b1 = b2 = b3 = 0

Now we can use t-statistics to test which individual coefficient is significantly different from zero. From the t-distribution critical value table, by interpolation, we find that the critical value of the t-statistic is t0.005,36 = 2.72. By comparing this critical value with 4.738, 5.666, and 1.498, we conclude that b1 and b2 are significantly different from zero and that b3 is not significantly different from zero at α = 0.01. In other words, the wholesaler efficiency index does not increase the explanatory power of Eq. (c).

MINITAB Program for Multiple Regression Prediction

MINITAB is used to run the regression defined in Fig. 2.6c and presented in Fig. 2.7. Besides regression parameters, we also predict y by assuming x1 = 13, x2 = 9, and x3 = 5.
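The critical values quoted above were read from tables by interpolation; they, together with the interval arithmetic reported for the Fig. 2.7 prediction, can be recomputed directly. A minimal sketch, assuming SciPy is available (the exact F0.05,3,36 comes out 2.87, a shade under the interpolated 2.88; the conclusions are unchanged):

```python
from scipy import stats

# Critical values used in the text
f_b = stats.f.ppf(0.95, 2, 37)    # F(0.05; 2, 37), overall test for Eq. (b): ~3.25
f_c = stats.f.ppf(0.95, 3, 36)    # F(0.05; 3, 36), joint test for Eq. (c): ~2.87
t_01 = stats.t.ppf(0.995, 36)     # two-sided t at alpha = 0.01, 36 df: ~2.72

# 95% intervals for the prediction at x1 = 13, x2 = 9, x3 = 5 (Fig. 2.7)
y_hat, s_mean, mse = 628.57, 34.92, 1973      # from the printed output
s_pred = (s_mean**2 + mse) ** 0.5             # ~56.50
t_05 = stats.t.ppf(0.975, 36)                 # df = 40 - 3 - 1
ci = (y_hat - t_05 * s_mean, y_hat + t_05 * s_mean)   # ~(557.7, 699.4)
pi = (y_hat - t_05 * s_pred, y_hat + t_05 * s_pred)   # ~(514.0, 743.2)
print(round(f_b, 2), round(f_c, 2), round(t_01, 2), round(s_pred, 2))
```

The recomputed confidence and prediction intervals agree with the MINITAB figures up to rounding in the printed standard errors.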
Fig. 2.6 a SAS output for regression results of y_i = a + b_1x_{1i} + e_i. b SAS output for regression results of y_i = a + b_1x_{1i} + b_2x_{2i} + e_i. c SAS output for regression results of y_i = a + b_1x_{1i} + b_2x_{2i} + b_3x_{3i} + e_i
The results are listed in the last row of Fig. 2.7. They are

1. ŷ_{n+1,i} = 628.57;
2. s(ŷ_{n+1}) = 34.92;
3. s(ŷ_{n+1,i}) = √(s²(ŷ_{n+1}) + s_e²) = √((34.92)² + 1973) = 56.50;
4. 95% confidence interval: (557.73, 699.40);
5. 95% prediction interval: (513.94, 743.19).

Stepwise Regression Analysis

In this example, we want to use stepwise regression to establish a statistical model to predict the sales of click ballpoint pens (y). We are considering three possible explanatory variables: advertising (x1), measured in TV spots per month; the number of sales representatives (x2); and a wholesaler efficiency index (x3). The question is what variables should be included in the statistical model to explain the sales. The stepwise regression method suggests the following steps.

Step 1:
Run a simple regression on each explanatory variable, and choose the model that explains the highest amount of variation in y. The R²-value in each computer report is used to determine which variable enters the model first. Upon comparing the R²-values for the three models, we conclude that x2, which has the highest R²-value (0.7775), should enter the model first.

Independent variable    R²        F-value
x1                      0.7747    130.644
x2                      0.7775    132.811
x3                      0.0000      0.000
Step 2:
The second variable to enter should be the variable that, in conjunction with the first variable, explains the greatest amount of variation in y.

Independent variables    R²        F-value
x2 x1                    0.8738    128.141
x2 x3                    0.807      77.46

The R²-values and F-values in the foregoing table are obtained from Fig. 2.6b and Fig. 2.8. The table shows the results when x1 and x3 are combined with x2 to explain the variation in y. The combination of x1 and x2 clearly yields a higher R² (0.8738). This suggests that x1 should be the second variable to enter.

Step 3:
In this step, we want to decide whether another variable should enter the model to explain y. Note that every time an additional variable is included in a model, R² increases. The question is whether the increase in R² justifies inclusion of the variable. We apply an F-test to answer this question:

F = \frac{(R_f^2 - R_R^2)/(k_f - k_R)}{(1 - R_f^2)/(N - k_f - 1)}

where

R_f^2 = R² of the model with the new variable;
R_R^2 = R² of the model without the new variable;
k_f = number of variables in the model with the new variable;
k_R = number of variables in the model without the new variable.

To determine whether x3 should be included in the model, we need to compare the R² of the model with x3 and the R² of the model without x3.

Independent variables    R²
x2 x1                    0.8738
x2 x1 x3                 0.8812

Using the foregoing formula, we compute

F = \frac{(0.8812 - 0.8738)/(3 - 2)}{(1 - 0.8812)/(40 - 3 - 1)} = 2.24 < F_{0.05,1,36} = 4.11

Because including x3 does not increase R² significantly, the null hypothesis that x3 should not be included is not rejected in this case. Our conclusion from the stepwise regression analysis is that the best model should include only x1 and x2 as explanatory variables.

Some computer packages are programmed to perform the whole complicated stepwise regression in response to one simple command. Figure 2.9 shows the output of a stepwise regression analysis using MINITAB.

2.9 Conclusion

In this chapter, we examined multiple regression analysis, which describes the relationship between a dependent variable and two or more independent variables. Methods of estimating multiple regression (slope) coefficients and their standard errors were discussed in depth. The residual standard error and the coefficient of determination were also explored in some detail. Both t-tests and F-tests for testing regression relationships were discussed in this chapter. We investigated the confidence interval for the mean response and the prediction interval for the individual response. And finally, we saw how multiple regression analyses can be used in business and economics decision making. In the next chapter, we will discuss other topics in applied regression analysis. Examples of applied multiple regression analysis in both business and finance will be discussed in some detail.
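The stepwise decision rule discussed in Sect. 2.8 condenses into one small helper: compute the partial F-statistic for the candidate variable and compare it with the tabulated critical value. A sketch using the figures quoted in the stepwise example:

```python
def partial_f(r2_full, r2_reduced, k_full, k_reduced, n):
    """Partial F-statistic for testing whether the added regressors are jointly zero."""
    num = (r2_full - r2_reduced) / (k_full - k_reduced)
    den = (1.0 - r2_full) / (n - k_full - 1)
    return num / den

# Does x3 add explanatory power beyond x1 and x2? (R-squared values from the text)
f_x3 = partial_f(0.8812, 0.8738, 3, 2, 40)
print(round(f_x3, 2))   # 2.24, below the critical value F(0.05; 1, 36) = 4.11
```

Since 2.24 < 4.11, x3 is not admitted, reproducing the Step 3 conclusion.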
Appendix 1: Derivation of the Sampling Variance of the Least Squares Slope Estimations

We can obtain the correlation coefficient between x1 and x2 as

r = \frac{\sum_{i=1}^{n} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2)}{C_1 C_2}    (2.35)

where

C_1 = \sqrt{\sum_{i=1}^{n} (x_{1i} - \bar{x}_1)^2}    (2.36a)

C_2 = \sqrt{\sum_{i=1}^{n} (x_{2i} - \bar{x}_2)^2}    (2.36b)

Substituting (2.35), (2.36a), and (2.36b) into Eq. (2.7) yields

b_1 = \frac{C_2^2 \sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i' - r C_1 C_2 \sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i'}{C_1^2 C_2^2 - r^2 C_1^2 C_2^2}
    = \sum_{i=1}^{n} \frac{(x_{1i} - \bar{x}_1) - (r C_1 / C_2)(x_{2i} - \bar{x}_2)}{(1 - r^2) C_1^2} \, y_i'    (2.37)

Substituting

\sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i' = \sum_{i=1}^{n} (x_{1i} - \bar{x}_1) y_i

and

\sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i' = \sum_{i=1}^{n} (x_{2i} - \bar{x}_2) y_i

into Eq. (2.37), and letting B_{1i} denote the coefficient of y_i, we obtain

b_1 = \sum_{i=1}^{n} B_{1i} y_i    (2.38)

Similarly,

b_2 = \sum_{i=1}^{n} \frac{(x_{2i} - \bar{x}_2) - (r C_2 / C_1)(x_{1i} - \bar{x}_1)}{(1 - r^2) C_2^2} \, y_i = \sum_{i=1}^{n} B_{2i} y_i    (2.39)

Substituting Eq. (2.2) into Eqs. (2.38) and (2.39), we get

b_1 = a \sum_{i=1}^{n} B_{1i} + \beta_1 \sum_{i=1}^{n} B_{1i} x_{1i} + \beta_2 \sum_{i=1}^{n} B_{1i} x_{2i} + \sum_{i=1}^{n} B_{1i} e_i

and

b_2 = a \sum_{i=1}^{n} B_{2i} + \beta_1 \sum_{i=1}^{n} B_{2i} x_{1i} + \beta_2 \sum_{i=1}^{n} B_{2i} x_{2i} + \sum_{i=1}^{n} B_{2i} e_i

It can easily be seen that \sum_{i=1}^{n} B_{1i} = 0, \sum_{i=1}^{n} B_{2i} = 0, \sum_{i=1}^{n} B_{1i} x_{1i} = 1, \sum_{i=1}^{n} B_{2i} x_{2i} = 1, \sum_{i=1}^{n} B_{2i} x_{1i} = 0, and \sum_{i=1}^{n} B_{1i} x_{2i} = 0. Therefore, these two equations imply that

b_1 - E(b_1) = b_1 - \beta_1 = \sum_{i=1}^{n} B_{1i} e_i    (2.40)

and

b_2 - E(b_2) = b_2 - \beta_2 = \sum_{i=1}^{n} B_{2i} e_i    (2.41)

From Eq. (2.40), we obtain

Var(b_1) = E\left[\left(\sum_{i=1}^{n} B_{1i} e_i\right)^2\right] - \left\{E\left[\sum_{i=1}^{n} B_{1i} e_i\right]\right\}^2 = \sigma_e^2 \sum_{i=1}^{n} B_{1i}^2    (2.42)
In Eq. (2.42), the last equality holds because E(e_i) = 0 and E(e_i e_j) = 0 when i ≠ j. From the definition of B_{1i} in Eq. (2.38), we have

B_{1i}^2 = \frac{(x_{1i} - \bar{x}_1)^2 + (r^2 C_1^2 / C_2^2)(x_{2i} - \bar{x}_2)^2 - 2r(C_1/C_2)(x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2)}{(1 - r^2)^2 C_1^4}    (2.43)

Summing over i and using (2.35), (2.36a), and (2.36b), the numerator reduces to (1 - r^2) C_1^2, so that

\sum_{i=1}^{n} B_{1i}^2 = \frac{1}{(1 - r^2) C_1^2} \quad \text{and} \quad Var(b_1) = \frac{\sigma_e^2}{(1 - r^2) C_1^2}

When r = 0, this implies that the sample variance of the multiple regression slope reduces to the simple regression case.
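The identities ΣB1i = 0, ΣB1ix1i = 1, ΣB1ix2i = 0, and ΣB1i² = 1/[(1 − r²)C1²] used in this derivation can be checked numerically; the simulated regressors below are arbitrary, chosen only to be correlated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)   # deliberately correlated with x1

d1, d2 = x1 - x1.mean(), x2 - x2.mean()
C1, C2 = np.sqrt(d1 @ d1), np.sqrt(d2 @ d2)
r = (d1 @ d2) / (C1 * C2)

# B1i as defined via Eqs. (2.37)-(2.38)
B1 = (d1 - (r * C1 / C2) * d2) / ((1 - r**2) * C1**2)

print(np.allclose(B1.sum(), 0))    # sum B1i     = 0
print(np.allclose(B1 @ x1, 1))     # sum B1i x1i = 1
print(np.allclose(B1 @ x2, 0))     # sum B1i x2i = 0
# so Var(b1) = sigma_e^2 / ((1 - r^2) C1^2):
print(np.allclose(B1 @ B1, 1 / ((1 - r**2) * C1**2)))
```

All four checks print True; the last one is the ΣB1i² identity behind the variance formula above.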
Appendix 2: Cross-Sectional Relationship Among Price per Share, Dividend per Share, and Retained Earnings per Share

The dividends-and-earnings model, as shown in Eq. (2.47) below, shows the price of a share of stock to be linearly dependent upon dividends, earnings, and some constant term:

P = a_0 + a_1 D + a_2 Y.    (2.47)

The a1 and a2 terms are the point of interest; if a1 is greater than a2, then dividends have a relatively larger impact in the valuation procedure than do earnings, and vice versa. The multiple correlations for these eight regressions were quite high, averaging 0.905, with a range from 0.86 to 0.94. Of the dividend coefficients, seven were statistically significant at a 5% level (the only negative value being the insignificant one), while two of the earnings coefficients were not statistically significant. In each case, except the one where the dividend coefficient is insignificant, the dividend coefficient is greater in magnitude than the earnings coefficient. This evidence is shaky proof for dividend preference if one recognizes the multicollinearity problem and the apparent lack of consistency of the coefficients across industries, or stability over time periods, even in any relative sense.

The dividend-plus-investment-opportunities approach, which, as we said earlier, uses the retained-earnings figure as a growth proxy, hinting again at an all internally financed firm, is symbolized by Eq. (2.48) below:

P = a_0 + a_1 D + a_2 (Y − D).    (2.48)

The multiple correlation figures and the constant terms are the same as in the previous test; this might have been expected, since no new variables entered and none left the formulation, but the dividend and retained-earnings coefficients do change in relative values. Actually, the retained-earnings coefficients were the same as the earnings coefficients in the earlier trial, but it is really the difference, or the ratio, of the two that we are interested in. The dividend coefficients also appear to be relatively more consistent from industry to industry and stable in relative terms through time. The new-found stability and consistency is appealing because we would like to believe stocks are priced in some logical and consistent manner, which, given this evidence, indicates that dividends are three to four times as important in the valuation procedure as are growth opportunities. Once again, multicollinearity is a problem that is left unquestioned.

Gordon dismisses the pure-earnings approach with the realization that investors would be indifferent between dividends and increases in the market value of their holdings if they valued only earnings, dismissing market imperfections for now. If this were so, then the two relevant coefficients would tend to be equal, and since they were found to be drastically different in the second set of regressions, he discounts this as a viable valuation format.

In refining his model, it appears that Gordon was primarily interested in not allowing short-term aberrations to influence the parameter estimates, although he also adjusted all variables in accordance with a size factor, which, in some respects, could be interpreted as the inclusion of a previously omitted variable that could affect the stability of some of the coefficients. The new model appears as:

P = B_0 + B_1 d̄ + B_2 (d − d̄) + B_3 ḡ + B_4 (g − ḡ),    (2.49)

where

P = price per share/book value;
d̄ = five-year average dividend/book value;
d = current year's dividend/book value;
ḡ = five-year average retained earnings/book value;
g = current year's retained earnings/book value.

The multiple correlations from the use of this model are slightly lower than those in the dividend-plus-growth case presented earlier, but most noticeably the constant terms are now all quite small, whereas, with the previous models, they were at times quite large, although statistical significance was not discussed.
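The claim that moving from Eq. (2.47) to Eq. (2.48) leaves the constant term, the fit, and the retained-earnings coefficient unchanged, while the dividend coefficient absorbs the earnings coefficient, is pure reparameterization, since a1·D + a2·Y = (a1 + a2)·D + a2·(Y − D). A sketch on synthetic data (the values are illustrative, not Gordon's):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
D = rng.uniform(0.5, 3.0, n)                  # dividends per share (synthetic)
Y = D + rng.uniform(0.5, 4.0, n)              # earnings per share (synthetic)
P = 2 + 6 * D + 2 * Y + rng.normal(0, 1, n)   # price per share (synthetic)

def fit(*cols):
    """OLS coefficients of P on an intercept plus the given columns."""
    X = np.column_stack([np.ones(n)] + list(cols))
    return np.linalg.lstsq(X, P, rcond=None)[0]

a0, a1, a2 = fit(D, Y)        # Eq. (2.47): P = a0 + a1*D + a2*Y
b0, b1, b2 = fit(D, Y - D)    # Eq. (2.48): P = b0 + b1*D + b2*(Y - D)

print(np.allclose(a0, b0))       # same constant term
print(np.allclose(a2, b2))       # retained-earnings coeff equals earnings coeff
print(np.allclose(b1, a1 + a2))  # dividend coeff absorbs the earnings coeff
```

All three checks print True, which is why Gordon's comparison must rest on the difference, or ratio, of the two coefficients rather than on their raw values.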
Only the five-year dividend is significant at a 5% level for all eight regressions, with six of the current-less-the-long-run-average coefficients being significant and five of each of the growth coefficients being significant also, although not as much so as the dividend coefficients.

All things considered, the dividend factors appear to be the predominant influences on share values, although there are certain individual exceptions among the industries surveyed. The evidence presented here in the revised model and in the models presented earlier must be interpreted as supporting Gordon's contention that dividends do matter; in fact, given the methods used here, they are detected as being the most important variable that could be included in the valuation formula.

Friend and Puckett (1964) were concerned with unveiling the limitations of the type of analysis performed by Gordon and others. They pointed out a number of potential problems, the first being the accounting earnings and retained-earnings figures and the high degree of measurement error incurred when using these values as proxies for economic, or true, earnings. Again, it is assumed that investors value economic earnings and not accounting earnings per se. Risk measures are also missing from the analysis, as is any awareness of the dynamic nature of corporate dividend policy. In short, we must recognize not only a multicollinearity problem and potential specification errors, but also that omitted variables must be accounted for if the analysis is to be complete.

In a general sense, a composite firm-specific variable, F, could be included in the standard model, which contains dividends and retained earnings, yielding:

P = a_0 + a_1 D + a_2 R + F.    (2.50)

But to have any true economic meaning, this variable should be specified, so multiple statistical trials were run to see which economic variable would be best to include in the analysis. Attempts were also made to alter or adjust the retained-earnings figure so as to more accurately reflect its "true" value, as discussed in a preceding paragraph.

In running the statistical tests, five industry groups were analyzed at two points in time, along the same lines of reasoning as in Gordon's paper. The first trial was run using Eq. (2.48) as the model, without the F-coefficient. For each industry, the coefficients are relatively stable across time periods, and in two instances the retained-earnings coefficient is greater than the dividend coefficient. In particular, the electric utility industry is seen to be such a case, and the work was then redone using logarithms, to see if the linearity assumption was responsible for this result. Unsurprisingly, the coefficients change in relative magnitudes; this is due to a combination of the utility industry's high dividend yield and the nature of the logarithmic transformation. Friend and Puckett concede there is no reason to prefer one method over the other, and leave the functional form issue at that.

Returning to the omitted-variable issue, Friend and Puckett first included the previous period's price as the firm-specific variable. As a result, the retained-earnings coefficients are greater than the dividend coefficients, the latter being negative in six of ten instances. The multiple R-squared statistics are quite high but, by and large, the significance levels of the retained-earnings coefficients are not. Besides there being very little economic rationale for the inclusion of a lagged price variable, the statistics leave sizable uncertainties as to the validity of this approach.

Recognizing that accounting numbers over the short run are subject to sizable measurement error, the authors normalized earnings over ten years and then ran the regressions again without a firm-specific variable. This rendered all coefficients statistically significant at a 5% level, and the dividend coefficients were consistently higher than the retained-earnings coefficients. This suggests the same conclusions and interpretation as were offered by Gordon. Adding the normalized earnings/price ratio lagged one period, it was found that the intercept or constant terms get very large, as do the dividend coefficients relative to the retained-earnings coefficients, and the R-squared term for each regression is exceptionally high. The earnings/price coefficient was found to be highly negative; all these findings yield questionable results. In a separate analysis of the chemical industry using dividends and normalized earnings,
and then further adding the normalized earnings/price ratios, the retained-earnings coefficient was seen to dominate the dividend coefficient. Again, though, the other coefficients were given, and certainly appear to merit some attention.

In summation, Friend and Puckett argue for longer-term measures for variables of an accounting nature, and for earnings-to-price ratios as a sort of risk or firm-specific variable, in that it shows the rate at which the market capitalizes the firm's earnings. In using this variable, it should be recognized that the dependent variable is being used to explain itself. This tends to show the retained-earnings figure as important as, or more important in, valuation than dividends, thus supposedly invalidating previous results to the contrary. The real question appears to be which approach possesses the best economic grounding. We would like to think theory precedes the empiricism; and to pursue this, one must have an appropriate theory, a question to which Friend and Puckett do not address themselves.

Lee (1976) cited the Friend and Puckett evidence in attacking the issue of functional form, concentrating on the electric utility industry, where the risk differential between firms is often seen to be negligible, thereby eliminating a large portion of the firm-specific effects. Using the generalized functional form developed by Box and Cox (1964), the linear and log linear forms can be tested on a purely statistical basis, and the log linear form can be tested against a nonlinear form.

Quite interestingly, Lee found that the log linear form was statistically superior to the linear-form model in explaining the dividend effect. The results of this comparison were essentially the same as in Friend and Puckett's study. Using the linear form, nine of the ten years of data examined showed a stronger retained-earnings effect than that for dividends, while, in the log linear trial, all ten years showed stronger dividend effects. At this point, the only question remaining is whether either of these models accurately depicts the true functional form.

Using the true functional form from the generalized functional form method to compare the dividend and retained-earnings effects, it was found that in only four years was there any difference between the two effects; this leads to the conclusion that all models developed in linear or log linear form, and used to test this particular industry (and possibly other industries as well), are probably misspecified. This is a serious problem, in that the importance of dividends is muddied because the model is not correct. The true value of dividends can be inferred only if the model is correctly specified.

Part of the problem here is due to the nature of the industry, where high payouts are common and external financing is great and unaccounted for. These two factors serve to bias the dividend effect downward in the linear-form models.

The logarithmic-form models reduce the problem of weighting of regression coefficients due to size disparities among firms, a problem noticeably existent in the electric utility industry. However, the logarithmic form does have the disadvantage of being unable to cope with negative retained-earnings figures, a phenomenon that is a reality in the current environment.

These caveats to both methods of analysis should evoke more concern from empirical researchers in the future, since misspecification can have drastic effects on the conclusions reached.

Bibliography

Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211–243.
Fogler, H. R., & Ganapathy, S. (1982). Financial econometrics. Englewood Cliffs, NJ: Prentice-Hall.
Friend, I., & Puckett, M. E. (1964). Dividends and stock prices. American Economic Review, 656–682.
Ghosh, S. K. (1991). Econometrics: Theory and applications. Englewood Cliffs, NJ: Prentice Hall.
Gordon, M. J. (1959). Dividends, earnings and stock prices. Review of Economics and Statistics, 99–105.
Greene, W. H. (2017). Econometric analysis (8th ed.). New Jersey: Prentice Hall.
Gujarati, D., & Porter, D. (2011). Basic econometrics (5th ed.). New York: McGraw-Hill Education.
Johnston, J., & DiNardo, J. (1996). Econometric methods (4th ed.). New York: McGraw-Hill.
Lee, A. C., Lee, J. C., & Lee, C. F. (2017). Financial analysis, planning and forecasting: Theory and application (3rd ed.). Singapore: World Scientific.
Lee, C. F. (1976). Functional form and the dividend effect in the electric utility industry. Journal of Finance, 1481–1486.
Lee, C. F., Finnerty, J., Lee, J., Lee, A. C., & Wort, D. (2013a). Security analysis, portfolio management, and financial derivatives. World Scientific.
Lee, C. F., & Lee, A. C. (2013). Encyclopedia of finance (2nd ed.). New York, NY: Springer.
Lee, C. F., & Lee, J. C. (2015). Handbook of financial econometrics and statistics. New York, NY: Springer.
Lee, C.-F., Lee, J., & Lee, A. C. (2013b). Statistics for business and financial economics (3rd ed.). Berlin: Springer.
Theil, H. (1971). Principles of econometrics. New York: Wiley.
Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge, MA: The MIT Press.