Cooke, T. E. (1998) - Regression Analysis in Accounting Disclosure Studies.


Publisher: Routledge
Accounting and Business Research

Regression Analysis in Accounting Disclosure Studies
T. E. Cooke, University of Exeter
Published online: 27 Feb 2012.
To cite this article: T. E. Cooke (1998) Regression Analysis in Accounting Disclosure Studies, Accounting and
Business Research, 28:3, 209-224, DOI: 10.1080/00014788.1998.9728910
To link to this article: http://dx.doi.org/10.1080/00014788.1998.9728910
Accounting and Business Research, Vol. 28, No. 3, pp. 209-224, Summer 1998
Regression Analysis in Accounting Disclosure Studies
T. E. Cooke*
Abstract-A problem that sometimes occurs in undertaking empirical research in accounting and finance is that
the theoretically correct form of the relation between the dependent and independent variables is not known,
although often thought or assumed to be monotonic. In addition, transformations of disclosure measures and
independent variables are proxies for underlying constructs and hence, while theory may specify a functional form
for the underlying theoretical construct, it is unlikely to hold for empirical proxies. In order to cope with this
problem a number of accounting disclosure studies have transformed variables so that the statistical analysis is
more meaningful. One approach that has been advocated in such circumstances is to rank the data and then apply
regression techniques, a method that has been used recently in a number of accounting disclosure studies. This
paper reviews a number of transformations including the Rank Regression procedure. Because of the inherent
properties of ranks and their use in regression analysis, an extension is proposed that provides an alternative
mapping that replaces the data with their normal scores. The normal scores approach retains the advantages of
using ranks but has other beneficial characteristics, particularly in hypothesis testing. Regressions based on untrans-
formed data, on the log odds ratio of the dependent variable, on ranks and regression using normal scores, are
applied to data on the disclosure of information in the annual reports of companies in Japan and Saudi Arabia. It
is found that regression using normal scores has some advantages over ranks that, in part, depend on the structure
of the data. However, the case studies demonstrate that no one procedure is best but that multiple approaches are
helpful to ensure the results are robust across methods.
1. Introduction

This paper addresses the problem of empirically estimating the relation between accounting variables and examines the application of rank and normal scores regression in accounting disclosure studies. The focus is on data-analytical techniques based on transformations (Draper, 1988).1 The paper argues that scholars undertaking research on disclosure issues should pay attention to the structure of the data and, where necessary, consider the appropriateness of transformations. Data should be screened to assess the impact of distribution problems of skewness and kurtosis, as well as problems of outliers and non-linearity.

* The author is professor of accounting at the University of Exeter. He wishes to acknowledge the helpful comments and/or discussions with K. Abadir, F. Oliver, B. Pearson, K. Read, C. Roberts, L. Skerratt, M. Timbrell, M. Tippett, and R. S. O. Wallace as well as two anonymous referees and the editor. Correspondence should be addressed to Professor Cooke at the Department of Accounting, School of Business and Economics, University of Exeter, Streatham Court, Exeter EX4 4PU. This paper was first submitted in November 1996 and the final version accepted in February 1998.

1 Draper (1988) identifies four basic approaches to deal with violations of the technical assumptions of classical linear regression:
a. The do-nothing approach.
b. The data-analytic approach investigates influential observations and transformations.
c. The model expansion approach focuses on departures from assumptions once found and departures are modelled directly on the raw data scale by broadening the parametric model.
d. The robust approach which uses non-classical techniques so that deviations from the classical assumptions are not crucial, e.g. M-, R-, L-estimators.

As well as data problems, another complication that sometimes emerges when undertaking empirical work in accounting and finance is that the theoretically correct form of the relation between the dependent and independent variables is not known. This problem sometimes arises in disclosure studies when a researcher attempts to explain the variability in disclosure indexes. Furthermore, a problem encountered in many disclosure studies is that the disclosure measures and independent variables are proxies for underlying constructs and, hence, while theory may specify a functional form for the underlying theoretical construct, it is unlikely to hold for empirical proxies.

A recent development in dealing with such problems is to transform the data and use Rank Regression rather than conventional OLS. The advantage of Rank Regression is that it yields distribution-free test statistics (non-parametric) and is therefore potentially useful when accounting datasets reveal non-linear monotonic relationships between independent and dependent variables. The first example of this procedure being used in disclosure studies was Lang and Lundholm (1993), followed by Wallace et al. (1994); Wallace and Naser (1995); Lang and Lundholm (1996).

This paper considers a number of transformations including Rank Regression and extends the latter by mapping observations on to the normal distribution rather than on to the positive integers.2 The transformation proposed is achieved by dividing the normal distribution into the number of observations plus one segments on the basis that each segment has equal probability (van der Waerden, 1952, 1953). In effect, the ranks of the data are substituted by scores on the normal distribution and so the normal scores approach may be considered to represent an extension of the rank method.

2 It is recognised that the mapping could be on to an alternative distribution, but in the context of explaining disclosure scores in corporate annual reports the normal distribution is required when linear regression is considered an appropriate tool of analysis.

It should be noted that the various transformations and approaches discussed in this paper are not mutually exclusive. In practice, several approaches can be undertaken to try to ensure that the results are not method-driven but are robust across methods.

The rest of the paper is divided into four sections. In the next section, consideration is given to data examination and transformations. Section 3 considers some of the transformations in the context of disclosure studies. The procedure for Rank Regression is outlined in more detail and the advantages and disadvantages of this approach are reviewed. The use of normal scores is then proposed as an alternative to Rank Regression. Section 4 provides an examination of two small databases on information disclosed in the annual reports of companies in Japan and Saudi Arabia and includes an analysis using standard OLS, Rank Regression, regression using normal scores, and regression using a log odds ratio transformation.3 The two case studies serve to illustrate some of the points made earlier in the paper. Section 5 provides a summary and conclusions.

3 I am grateful to Jawaher Al Modahki who allowed me to use some of the raw data she collected on Saudi Arabia as part of her doctoral thesis that she successfully completed in 1996. Rank Regression and use of normal scores in regression did not form part of her doctoral thesis.

2. Data examination and transformations

Once data is collected it is important to review it carefully, regardless of the type of analysis that is being proposed. The examination may be undertaken in a number of ways. For example, a histogram of the observed values is one approach in which the values are divided into intervals of equal size and each column shows the number of cases within each interval.4 Such analysis may be extended to identify outliers. The examination provides information on the distribution of observed values.5

4 A modification of this approach is the stem-and-leaf plot, providing additional information to the histogram.

5 An alternative or additional approach is the boxplot which is useful in summarising the distribution of the observed values and identifying outliers.

Many statistical tests are based on the assumption that the data come from a normal distribution or that a sufficiently large sample is available to appeal to asymptotic normality of the test statistic. For example, in regression analysis an assumption is that the error term is normally distributed. These issues are considered later when a comparison is made between OLS and Rank Regression.

One approach to assessing the normality of the data is the normal probability plot in which the observed values are matched with expected values from the normal distribution. Visual inspection of data can be supported by statistical tests such as standard tests on skewness and kurtosis (Stuart and Ord, 1983),6 the Kolmogorov-Smirnov (K-S) test and its modification by Shapiro-Wilks and Lilliefors.

6 A check of the third and fourth moments against the moments for the normal distribution is a useful approach because '...the effect of lack of normality makes itself felt most often through these measures' (Zaman, 1996: 181).

Transformation of data is useful in regression analysis when the relationship between the dependent and independent variables is inherently non-linear, when the distribution of the errors is not approximately normal, and where there are problems of heteroscedasticity or non-independence of the error terms. Where possible, the determining factor should be based on the underlying theoretical relationship. It should be appreciated that transformations of disclosure measures and independent variables are proxies for underlying constructs and hence, while theory may specify a functional form for the underlying theoretical construct, it is unlikely to hold for empirical proxies. In determining linearity, the important factor is the functional form of the relationship. The parameters must be linear so that the independent variables can be transformed to produce a linear model.

There are other circumstances when transformations may also be considered. For example, Iman and Conover (1979: 500) have argued that 'the rank transform approach has an obvious advantage when the dependent variable is a monotonic function of the independent variable(s) and this monotonic relationship is non-linear in nature'. However, when non-linear monotonic relationships constitute a problem it is possible and sometimes desirable to undertake transformations of the data other than by using ranks.

Given normality and independence of the error term, the F-statistic can be used since large F-values suggest linearity. This can be further investigated with a scatterplot of the independent and dependent variables which should provide evidence of linearity. Further checks can be made by firstly plotting the residuals against the predicted values and secondly by plotting the residuals against the values of the independent variable.

If non-linearity is suspected, transformations may be considered, particularly in terms of powers, roots and logs. Tukey (1977) refers to a curve being convex from above (below) if the middle of three points on the curve is below (above) the line joining the other two. It is important to identify the form of the non-linearity since the object is to place the middle point on to the line joining the other two points.

Monotone non-linear relationships may be straightened by moving along the ladder of powers and roots which would include, for example, $y^3$, $y^2$, $y$, $y^{1/2}$, $\ln(y)$, $-y^{-1/2}$, $-y^{-1}$, $-y^{-2}$. Tukey (1977) suggests that if the three points are hollow upward then look down the ladder for linearity and if the hollow is downward look up the ladder. The same rules apply to explanatory variables x, so that if the curve bulges towards large x move up the ladder, and move down the ladder when the curve bulges toward small x. The bulging rule is shown in Figure 1.

[Figure 1: Determining a transformation to linearity by the bulging rule. Lewis-Beck, M. S. (1993, editor). Regression analysis. London: Sage Publications.]

Note that it is preferable to transform the independent variables rather than the dependent variable, because the latter disturbs the relationship between the dependent variable and the other regressors and because the error distribution is changed (Fox, 1984).

If the transformations shown in Figure 1 are applied and non-linearity persists then an alternative, which may be preferable in certain circumstances, would be to consider trimming the distribution. Since the arithmetic mean can be heavily influenced by outliers there may be an argument in favour of using the median. Estimators of location may be 'robustified' where assumptions about the underlying distribution are not restrictive.

An example of such an approach would be to eliminate the top 10% and bottom 10% of the distribution so that only 80% of the data values are used to establish a trimmed mean. Consideration should be given to using M-estimators (generalised maximum likelihood estimators), methods that do not exclude extreme values of the distribution but instead give less weight to extremes. Different weighting systems have also been proposed, such as those by Andrews, Hampel, Huber, and Tukey (see Hoaglin et al. 1983).

The next section considers the Rank transformation in some detail because it is much less well known than the transformations mentioned above. The procedure for Rank Regression is outlined and compared with OLS and is followed by a discussion of the advantages and disadvantages of this approach.

3. Transformation in accounting disclosure studies

3.1. The Rank Regression Procedure

In many areas of the social sciences, individuals are asked to rank their preferences. In accounting in particular, users of financial accounts have been asked to express their opinion on the usefulness of accounting information (see, for example, work by Epstein, 1975; Lee and Tweedie, 1975; Chang and Most, 1977; Anderson, 1981; Hines, 1982). When a researcher suspects that the underlying scale is only ordinal then one approach is to use ranks.7 Ranks have been found to have a number of useful applications. For example, in the econometric literature, ranks have been used to develop tests of misspecification, particularly when survey data have endogenous and exogenous variables that are considered exchangeable (McCabe, 1989). The advantage in this case is that tests based on ranks, being distribution-free, do not require normality assumptions and so the ranks can be used to develop tests of heteroscedasticity and serial correlation (McCabe, 1989).

7 Another approach would be to use Theil's method based on medians. However, in the case of multiple linear regression this approach is computationally cumbersome. For further details see Hollander and Wolfe (1973).

In the case of disclosure studies, the dependent variable is a metric ratio and therefore can be legitimately transformed, where necessary, and used in regression analysis. One transformation is to rank the dependent and independent variables.

The rank transform procedure has been stated by Iman and Conover (1979). Given a dependent variable y with n observations, the observations are placed in order and ranked from 1 to n (from smallest to largest). The procedure is to rank both dependent and independent variables so that, with $R(y_i)$ being the rank assigned to the ith smallest value of Y, each of the independent variables $X_j$ ($j = 1, \ldots, k$) is replaced with its corresponding ranks 1 to n. Tied values are conventionally assigned the mean of the ranks for which they are tied.

The regression is undertaken on the ranks. For example, a bivariate relationship expressing $R(y_i)$ in terms of $R(x_i)$ would result in a regression of the form $R(y_i) = \hat{\alpha} + \hat{\beta} R(x_i)$. Given that the OLS regression line passes through the mean points $(\bar{R}(x), \bar{R}(y)) = ([n+1]/2, [n+1]/2)$, conventional ordinary least squares may be applied giving the estimated value of the coefficients as $\hat{\alpha} = [(n+1)/2](1 - \hat{\beta})$ and $\hat{\beta} = 1 - [6\sum (R(y_i) - R(x_i))^2]/[n(n^2 - 1)]$. Where no ties exist, Spearman's Rank correlation coefficient (rho) is the Pearson correlation coefficient applied to ranks and can be found as:

$$1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \qquad (1)$$

where $d_i = R(y_i) - R(x_i)$. (In the presence of ties this formula is merely approximate.)

For the case of more than one independent variable the multiple regression is found by fitting:

$$R(y_i) = \alpha + \beta_1 R(x_{1i}) + \beta_2 R(x_{2i}) + \cdots + \beta_k R(x_{ki}) + \varepsilon_i \qquad (2)$$

by least squares.

In effect, the Rank Regression specification shown above is an application of standard multiple regression and, as such, the data must fulfil two main conditions for hypothesis testing: normality of the errors and constant variance (homoscedasticity). Appropriate tests to assess the hypothesis that the observations on the dependent variable are normally distributed have been indicated in the previous section.

The homoscedastic assumption may be assessed by visual inspection of the residuals, or by using a specific test such as the Goldfeld-Quandt test, the Breusch-Pagan test or the White test (see Kennedy, 1992: 117-119).8 It follows from these assumptions that the regressors $x_j$ and the disturbance term are statistically independent, implying $\mathrm{cov}(X_j, \varepsilon) = 0$.

8 In a time series context ARCH or GARCH models of heteroscedasticity may be appropriate.

Alternatively, the regression model may be specified in terms of the conditional mean of the dependent variable $E(y \mid x_1, \ldots, x_k)$ which is determined conditionally on observations on the independent variables. The regression model assumes:

$$E(Y_i \mid x_{1,i}, x_{2,i}, \ldots, x_{k,i}) = \alpha + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i} \qquad (3)$$

and for homoscedasticity,

$$\mathrm{Var}(Y_i \mid x_{1,i}, \ldots, x_{k,i}) = \sigma^2.$$

In order to derive equation (3) the joint distribution of Y and vector X must be known, but to estimate the parameters the exact form of the joint distribution is not required. In order to perform tests of hypotheses about the parameters, the form of the distribution is required. It is commonly assumed that the joint distribution of Y and the independent variables $X_j$ is multivariate normal, though there are other joint distributions which also have the property that the conditional mean of Y given the $X_j$ is linear in $X_j$, for example the Pareto distribution.9

9 Note that in the classical linear model, the experimental values of the independent variables are fixed, whereas in the linear Regression model they are the result of a random process.

In practice, estimators of the parameters in linear regression, with the joint normality of Y with the Xs assumption, have been found to be quite robust even when the normality assumption does not quite hold. In using Rank Regression the requirement of random samples still holds, but the normal distribution of Y does not.

3.2. Advantages/disadvantages of Rank Regression in accounting research

In the accounting literature, rank transformations of the residuals and forecasts from a linear regression model were used by Beaver et al. (1979) in assessing the relationship between unexpected earnings and risk-adjusted returns. As a technique it did not become particularly popular, but has been used recently by Cheng et al. (1992) to evaluate the specification of the cross-sectional OLS model that related unexpected earnings to risk-adjusted security returns. Cheng et al. (1992) argued that the rank transformation is invariant to power transformations that preserve order.

In other words, it is not necessary to standardise, log or undertake any power transformation or any monotonic transformation because they result in the same assignment of ranks. Rank transformations are also relatively insensitive to outliers. They found that '.. the use of power or rank transformation of the forecast error variable produces a substantial improvement in $R^2$' (Cheng et al. 1992: 589). The appropriateness of using $R^2$ in such circumstances is considered later in the paper.
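The rank transform procedure of Section 3.1, and the invariance of ranks to order-preserving transformations noted in Section 3.2, can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`average_ranks`, `ols_bivariate`, `rank_regression`, `spearman_no_ties`) and the eight data points are hypothetical, and only the bivariate case is covered.

```python
from math import log

def average_ranks(values):
    """Rank from 1 (smallest) to n (largest); ties get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def ols_bivariate(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def rank_regression(x, y):
    """Bivariate Rank Regression: OLS of the ranks of y on the ranks of x."""
    return ols_bivariate(average_ranks(x), average_ranks(y))

def spearman_no_ties(x, y):
    """Equation (1): 1 - 6*sum(d_i^2) / (n(n^2 - 1)); exact only without ties."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ry, rx))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical data (not from the paper): firm size and a disclosure index.
size = [12.0, 45.0, 7.0, 300.0, 88.0, 150.0, 23.0, 60.0]
index = [0.21, 0.35, 0.30, 0.62, 0.40, 0.55, 0.25, 0.48]

alpha, beta = rank_regression(size, index)
rho = spearman_no_ties(size, index)
# Taking logs preserves order, so the ranks and the fit are unchanged.
alpha_log, beta_log = rank_regression([log(s) for s in size], index)
```

With no ties, the fitted slope coincides with the value given by equation (1), the intercept equals [(n + 1)/2](1 - slope), and the log-transformed regressor gives the same coefficients because its ranks are identical.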
In 1993, Lang and Lundholm used rank transformations in their paper on cross-sectional determinants of analysts' ratings of corporate disclosures. They argue that the technique of Rank Regression has value when the theoretical relationship between the dependent and independent variables is not known but monotonic.10 Rank Regression is also useful when the relationship between the dependent and independent variables is not strictly linear and there is no theoretical basis for suggesting a relationship between Y and X.

10 An alternative approach is non-parametric regression.

Wallace et al. (1994) used Rank Regression in their study of the relationship between the comprehensiveness of corporate annual reports and firm characteristics in Spanish listed companies. A sample of 50 companies was selected to produce a disclosure index and 60% of the variability of the indexes was explained by nine independent variables. Wallace et al. (1994: 47) suggest that:

'... there is no theoretically correct way of describing the association between the dependent and the explanatory variables. In such a circumstance, Lang and Lundholm (1993) have suggested the use of rank (OLS) regression as a powerful method of coping with data sets with non-linear but monotonic relations between dependent and independent variables. If a dependent variable changes in just one direction (either up or down) as the explanatory variable increases (i.e. if the relationship between them is monotonic) a higher-ranked independent variable will correspond to a higher-ranked variable, regardless of the precise relation between the two unranked variables.'

A similar approach was adopted in Wallace and Naser (1995). As well as these advantages Rank Regression is conceptually simple, preserves order, and is effective in modelling monotonic relationships where outliers are a serious problem.

A weakness of Rank Regression is that it is difficult to interpret $\beta_i$ as the effect on $Y_i$ of a marginal increase of 1 in $X_i$. For values of $\beta_i$ of -1 or +1, $\beta_i$ does have a definite interpretation and for zero there is, of course, no association. However, within the range -1 to +1, ignoring zero, interpretation is difficult. For example, if in a sample Rank Regression $\beta_i = 0.7$ then an increase in the rank of the right-hand-side variable increases the rank of the left-hand-side variable by 0.7, but the ranks of the left-hand-side variable are integers. If the rank of the left-hand-side variable is estimated by the regression to be 8.7 does this imply the value of the left-hand-side variable is close to the value that has rank 9? An additional assumption of (linear) interpolation between the values of observations of ranks 8 and 9 is required in order to proceed to estimate $\beta_i$.

In the case of a bivariate Rank Regression, the Spearman correlation coefficient may be tested for significance using an exact test for which tables are published. In the case of multiple regression, one of the main disadvantages of Rank Regression is that of testing the significance of the estimated coefficients.

To undertake statistical tests of a hypothesis there needs to be knowledge of the distribution of the dependent variable, Y, or the joint distribution of Y and the independent variables. The form of the distribution, for samples of moderate size, determines the type of statistical test that can be performed, and without knowledge of the distribution the correct significance levels cannot be determined. Given (joint) normality of Y and X in bivariate regression, the analysis of variance F-test may be used to test the null hypothesis that there is no linear relationship between the two variables. In testing the hypothesis about $\beta$ in the bivariate case the t test of $\beta$ ($H_0$: $\beta = 0$), or equivalently of r, is the same as the F test. In the multivariate case where Y is normally distributed, the F statistic tests the null hypothesis that all the coefficients except the constant term are zero, i.e. $\beta_1 = \beta_2 = \cdots = \beta_k = 0$.11

11 There is a disagreement in the statistical literature about violations of the regression assumptions. For example, Kerlinger and Pedhazar (1973) argue that OLS is 'robust' to violations whereas Bibby (1977) maintains that violations render the technique almost worthless. In practice, the F-test for equality of variances is sensitive to departures from normality, although t-tests are less sensitive.

Since ranks are distribution-free, testing for significance using the F and t-tests is not appropriate. An additional concern with Rank Regression is that the error structure cannot be normal and the mapping of individual observations to ranks is a somewhat arbitrary transformation. Another feature of using ranks is that the data after transformation are ordinal rather than interval and therefore the tests are effectively non-parametric and as such are weaker than parametric tests. This may be important when the sample size is small, a characteristic of many disclosure studies.

Siegel and Castellan (1988) point out that parametric tests are better than non-parametric tests when the assumptions of a parametric statistical model, in terms of the data, are met. This is because the power-efficiency of the parametric test is much greater than for non-parametric tests. In the context of disclosure studies where data collection is often onerous, parametric tests have obvious advantages.

In summary, Rank Regression has a number of advantages as well as some inherent weaknesses.
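The interpretation difficulty described above, where a regression returns a fitted rank of 8.7, can be made concrete with a minimal sketch. It is not part of the paper: the function name `value_at_fractional_rank` and the ten disclosure scores are hypothetical, and linear interpolation between adjacent ranks is the additional assumption that the text says is required.

```python
def value_at_fractional_rank(sorted_values, rank):
    """Linearly interpolate between the observations with adjacent integer ranks.

    sorted_values must be in ascending order, so sorted_values[0] has rank 1.
    """
    n = len(sorted_values)
    if not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n")
    lower = int(rank)            # integer rank at or below the fitted rank
    if lower == n:
        return sorted_values[-1]
    weight = rank - lower        # distance from the lower rank, in [0, 1)
    lo, hi = sorted_values[lower - 1], sorted_values[lower]
    return lo + weight * (hi - lo)

# Hypothetical disclosure scores (not from the paper), sorted ascending.
scores = [0.10, 0.14, 0.15, 0.22, 0.30, 0.31, 0.33, 0.40, 0.60, 0.62]

fitted_rank = 8.7
estimate = value_at_fractional_rank(scores, fitted_rank)
```

In this illustration the observations ranked 8 and 9 are 0.40 and 0.60, so the interpolated estimate is about 0.54. The answer depends entirely on the spacing of those two observations and on the assumed linearity between them, which is why a fitted rank has no direct reading on the original scale.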
Consequently, the use of normal scores is advocated in this paper as an additional approach to the use of ranks. Normal scores effectively extend the rank approach to eliminate some of the weaknesses while retaining the advantages.

3.3. Development of the normal scores approach

An alternative to Rank Regression and other data transformations when non-linearity is a problem is proposed here and is based on normal scores. The transformation proposed is from actual observations to the normal distribution by dividing the distribution into the number of observations plus one regions on the basis that each region has equal probability. This method is referred to as the van der Waerden approach (van der Waerden, 1952, 1953).12 In effect, the ranks are being substituted by scores on the normal distribution and so the normal scores approach may be considered to represent an extension of the rank method. For example, if there are six observations the normal distribution would be divided into seven equally probable parts so that the original values are replaced by normal scores (here -1.0676, -0.5659, -0.1800, 0.1800, 0.5659, 1.0676) rather than the ranks 1, 2 ..., 6.13 The regression analysis would then proceed using the normal scores as the dependent variable. In addition, continuous independent variables may also be transformed to normal scores.

12 The van der Waerden approach may be summarised as r/(n + 1). Alternatives include Blom = (r - 3/8)/(n + 1/4), Rankit = (r - 1/2)/n, and Tukey = (r - 1/3)/(n + 1/3).

13 These figures were derived using SPSS and represent one approach to deriving normal scores. An alternative approach would be based on expected values, such that in this case the normal distribution would be divided into six parts and the normal score would be taken as the expected value of each part. Thus, the normal score for the first observation may be calculated as follows: $u_1 = E(X_1) = \int_{-\infty}^{c} x f(x)\,dx$, where $f(x)$ is the p.d.f. of $N(\mu, \sigma^2)$ and $P(X < c) = 1/n$. This requires substantial computation but tables are available such as in Lindley and Scott (1984).

In 1938, Fisher and Yates first suggested the replacement of the original observations in standard normal theory tests. Tables of normal scores were developed for different size samples by David et al. (1968), Harter (1961, 1969) and Owen (1962).14 The proposal suggested here is to utilise knowledge about normal scores to transform regression variables. The main advantage of replacing the ranks by normal scores is that the resulting tests would have exact statistical properties because (a) significance levels can now be determined, (b) the F and t-tests are meaningful and (c) the power of the F and t-tests may be used. In addition, the regression coefficients derived using normal scores are meaningful, whereas $\beta_i$ from Rank Regression is difficult to interpret for most values.

14 I am grateful to Dr K. Read who informed me that variations on the suggestion made in this paper, with respect to normal scores, have been used in the medical research literature (e.g. Morgan (1992)).

The Appendix shows the derivation of the normal scores measure and its application to test the null hypothesis that all samples are identical.

A further characteristic of normal scores is that the approach offers a means whereby a non-normal dependent variable may be transformed into a normal one and as such offers a further advantage over ranks. A normally distributed dependent variable implies that the errors are also normally distributed by the assumptions of OLS.

The normal scores approach has the same advantages as ranks when there are problems of monotonicity and non-linearity. Normal scores preserve monotonicity in relationships as do ranks, with higher-ranked values of the independent variables being associated with higher-ranked values of the dependent variables (the converse is also true). In addition, when there is non-linearity with data concentration, normal scores disperse that concentration, an advantage also gained when using ranks. Whether normal scores are better than ranks in dealing with problems of monotonicity and non-linearity depends on the structure of the data.

An implication of the use of normal scores is that if an additional case is added to the sample, all the normal scores would need to be recomputed. However, the same will also usually be true of ranks: if an additional case is added to the sample then some of the ranks will change unless the new observation is ranked higher than all existing observations. The probability of such an event is low since the observation would be in the tail of the distribution. Thus, the recomputing disadvantage when new observations are added applies to ranks in many circumstances, such that the disadvantage of normal scores relative to ranks is negligible.

An issue that so far has not been considered is whether the transformation to normal scores should involve only the dependent variable or should relate to both the dependent and independent variables. In most instances it should be the latter, although if the only reason to use the approach is to normalise the dependent variable, then the former might be considered. Changing only the dependent variable implies changing the relationship between the dependent variable and all independent variables.

3.4. Log of the odds ratio

An alternative transformation to those already outlined is the log of the odds ratio. When it is
SUMMER 1998 215
assumed that the assumptions of the classical lin-

ear regression model hold, then a possible trans-
formation in disclosure studies is the log of the Figure 2
odds ratio of the dependent variable. A problem Stem-and-leaf plot of the untransformed voluntary
in disclosure studies is that the dependent vari- disclosure scores (Japan)
able-the extent of disclosure in corporate annual
reports-is bounded in the sense that no disclosure I Frequency Stem & Leaf
by a company will receive a zero and disclosure 3.00 0 . 778
leads to a positive index that approaches one 8.00 1 * 11334444
(100%) when there is full disclosure. As a result, it 11.oo 1 . 55567778899
is theoretically possible to have an estimated dis- 3.00 2 * 000
closure index outside the zero-one range. One ap- 2.00 2 . 89
proach to overcome this problem is to use the log 3.00 3 * 223
of the odds ratio (In [disclosure index/ (1 - disclo- 4.00 3 . 5779
1.oo 4 * 1
sure index)]), to ensure that the range is that of a Stem width: 0.10
normal distribution from - = to +or thereby over- Each leaf: 1 cases@)
coming the biased prediction problems that may I
affect truncated variables (Ahmed and Nichols,
1994). an explanation our regression model provides. The

This has the additional merit that, given the as- closer the regression line to actual points, the bet-
sumptions of the classical linear regression model, ter the equation ‘fits’ the data. In assessing meas-
a normally distributed dependent variable implies ures of best fit it could be argued that the coeffi-
that the distribution of the errors will also be nor- cient of determination, R2,is a possibility. Perhaps
mal. In most disclosure studies prediction is not RZis not the ideal measure of best fit for judging
the purpose of the study, but rather an explanation differences in right-hand-side variables because it
of the variability of the disclosure scores is sought is not invariant to changes in parameterizations of
and so the problem is of limited importance. left-hand side variables. In this case, it is preferable
for the mean square error (MSE)to be minimised
4. Application of transformations to (MSE = l/n Z(yi - 9J2),yi being calculated from
disclosure studies (Japan and S. Arabia) the appropriate inverse transformation of the re-
gression equation. The MSE is used to compare
This section reports the results using data from regression equations.
disclosure studies on Japan and Saudi Arabia. An
examination of the data is undertaken and is fol-
lowed by regressions using untransformed data, 4.1. Case I: model based on Japanese data
the log odds ratio transformation of the dependent The data on Japanese companies form part of a
variable, ranks, and normal scores (first, with only dataset used by Cooke (1991 and 1992). The data
the dependent variable transformed and then with used here relate to voluntary disclosure in the 1988
all continuous variables being transformed). The annual reports of Japanese listed corporations.
models are reported upon for comparison pur- Figure 2 shows the stem-and-leaf plot of the un-
poses and should be considered in the context of transformed voluntary disclosure score. The stem
the previous discussions. The two countries used indicates that the disclosure index scores range
are those for which the researcher has access and from less than 10% to 41%, with the majority of
in both cases one of the original objectives of the cases having a score of between 15% and 19%. Fig-
studies was to explain the variability in disclosure ure 3 shows the boxplot, which reveals a median
indexes. of 17%, a 25th percentile of 11% and a 75th per-
Japan is a developed country with the second centile of 29%. The smallest observed value that is
largest economy and stock exchange in the world not an outlier is 7% and the largest observed value
after the US.Saudi Arabia, in contrast, is a small that is not an outlier is 41%. Since this is the com-
developing country but the world’s largest ex- plete range no extreme values are reported, i.e.
porter of oil.l5 The two datasets provide contrast there are no cases with values between 1.5 and 3
and help to highlight the point that the data box lengths from the edge of the box.
should be examined in detail before deciding upon Figures 4 and 5 show the normal plot and de-
an appropriate statistical method. trended normal plot of the dependent variable.
An issue not yet considered is how to assess the The normal plot shows that the observations do
regression models. In postulating relationships not cluster around a straight line and the devia-
among variables, we want to know how powerful tions from a straight line are not randomly dis-
tributed around zero. The visual interpretation
l5 The number of disclosure studies on developing countries may be supported by some statistical tests. The
is increasing considerably; e.g. Nicholls and Ahmed, 1995. descriptive statistics are shown in Table 1 and
standard tests on skewness and kurtosis reveal a somewhat skewed dependent variable. The non-parametric Lilliefors test is significant at the 5% level revealing evidence of non-normality.¹⁶

Figure 3
Boxplot of the untransformed voluntary disclosure scores (Japan)

A log odds ratio transformation of the dependent variable leads to a mean of -1.461 and standard deviation of 0.602. The mean of the transformed data is negative because all the disclosure indexes are less than 0.5. Standard tests on skewness and kurtosis reveal a problem in terms of the latter. Since kurtosis exceeds 3, a larger proportion of cases falls into the tails of the distribution than those of a normal distribution. The Lilliefors test is significant at the 5% level, suggesting non-normality.
When the dependent variable is transformed to ranks there is no longer an apparent problem of non-normality, though clearly the ranked data are non-normal, even though both standard tests on skewness and kurtosis are satisfactory and the Lilliefors test statistic suggests normality (K-S greater than 5%).¹⁷ The standard tests are satisfactory for the normal scores since the transformation is to the standard normal distribution [N(0, 1)].¹⁸ The transformation leads to a mean and standard deviation of approximately 0 and 1 respectively (see Table 1).

¹⁶ A variable that takes only positive values cannot be normal since by definition, it must lie in the range -∞ to +∞. For a random variable with small variance and relatively large mean, the probability that it falls less than zero may be minimal.
¹⁷ SPSS provides a significance level up to 0.200 but after that reports >0.200.
¹⁸ Where the dependent variable is converted into normal scores necessarily the Kolmogorov-Smirnov statistic will be unity.

4.2. Independent variables

Listing status
A distinction is made between companies that have multiple listings and those that are listed only on the Tokyo Stock Exchange (TSE). The measure is expressed as a dummy dichotomous variable such that the expected relationship with the independent variable is that disclosure will increase the greater the exposure to foreign capital markets (see
Figure 4
Normal plot of the dependent variable (Japan)

Figure 5
Detrended normal plot of the dependent variable (Japan)
Singvi and Desai, 1971; Spero, 1979; Firth, 1979; Cooke, 1989a). Thus, multiple listed companies are expected to disclose more information than those listed only on the TSE, particularly where the extent of disclosure in foreign stock markets is greater than that in Japan (see for example, Biddle and Saudagaran, 1991).

Industry sector
This variable is included as a dummy in which a distinction is made between manufacturing enterprises and non-manufacturing companies. The expectation based on the literature is that manufacturing companies disclose more information than non-manufacturing enterprises (see Stanga, 1976; Cooke, 1989c), although the direction of the relationship is somewhat uncertain (Wallace et al., 1994).

Borrowing ratio
This ratio measures the proportion of total assets financed by bank borrowings. It has been hypothesised that companies with a higher proportion of their assets financed by bank borrowings will disclose more information in their annual reports to meet some of the needs of their lenders (see Jensen and Meckling, 1976; Myers, 1977; Schipper, 1981; Leftwich et al., 1981; Belkaoui and Kahl, 1978; Malone et al., 1993; and Wallace et al., 1994).
Table 1
Descriptive statistics of the voluntary disclosure indexes - Japan

                                                            Data sources
                                            Untransformed  Log odds ratio    Rank    Normal scores
Mean                                             0.204         -1.461       18.000       0.002
Standard deviation                               0.907          0.602       10.226       0.920
Minimum                                          0.070         -2.590        1.500      -1.732
Maximum                                          0.410         -0.360       35.00        1.915
Skewness                                         0.760          0.183        0.004       0.024
S.E. Skewness                                    0.397          0.398        0.398       0.398
Kurtosis                                         2.404          2.410        1.799       2.478
S.E. Kurtosis                                    0.777          0.778        0.778       0.778
Z test - skewness                                1.914          0.460        0.010       0.060
Z test - kurtosis                               -0.767          3.098       -1.544      -0.671
Kolmogorov-Smirnov (Lilliefors - signif.)        0.000          0.017       >0.200      >0.200
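As a rough check on the log odds ratio column of Table 1, the transformation ln[index/(1 - index)] and its inverse can be written directly. This is a sketch only; the function names are hypothetical, and the inverse (the logistic function) is included because fitted values have to be mapped back to the zero-one range before an MSE on the original scale can be computed.

```python
import math


def log_odds(index):
    """Log of the odds ratio for a disclosure index strictly between 0 and 1."""
    if not 0.0 < index < 1.0:
        raise ValueError("the disclosure index must lie strictly between 0 and 1")
    return math.log(index / (1.0 - index))


def inverse_log_odds(value):
    """Map a log odds value back to an index in the zero-one range."""
    return 1.0 / (1.0 + math.exp(-value))


# Minimum and maximum untransformed indexes reported in Table 1.
low, high = log_odds(0.070), log_odds(0.410)
```

Applied to the reported minimum and maximum of 0.070 and 0.410, the function gives values close to the -2.590 and -0.360 shown in the table; the small differences are consistent with the indexes being rounded to three decimals. Any index below 0.5 maps to a negative value, which is why the mean of the transformed Japanese data is negative.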
Turnover
Turnover is a measure of size and the expected relationship is that disclosure will be higher for larger companies (see Buzby, 1972; Choi, 1973; Firth, 1979; Schipper, 1981; Cooke, 1989a; Cooke, 1989b).
However, there is a theoretical argument that large firms are more visible than smaller firms and as a result are more exposed to political attacks (Jensen and Meckling, 1976). It is possible that firms will respond by reducing the extent of disclosure in corporate annual reports. Thus, the theoretical relationship is somewhat uncertain although assumed to be monotonic (Lang and Lundholm, 1993; Wallace et al., 1994; Wallace and Naser, 1995).

Model specification
The full specification of the regression is:
Yi = β0 + β1 X1i + β2 X2i + β3 X3i + β4 X4i + εi
where
Y = disclosure index scores
Listing status
X1 = 1 if the company is multiple listed
     0 if otherwise
Industry sector
X2 = 1 if the company is a manufacturing enterprise
     0 if otherwise
Continuous variables
X3 = the proportion of total assets financed by bank borrowings
X4 = turnover
ε = error term
i = 1, ..., 35
β = parameters (where the constant β0 adjusts for any excluded dummy variables)

The information provided in Table 2 should be read in the context of the cautionary notes discussed previously, particularly the significance of the t-tests when using Rank Regression. Using ranked data (Model 3, Table 2), all the variables included in the model were found to be significant at the 5% level with the exception that multiple listed companies did not disclose significantly more voluntary information than those corporations with only a listing on the TSE. This finding is consistent with that of Biddle and Saudagaran (1991) and Wallace and Naser (1995).
When the dependent variable was transformed (Model 4, Table 2) into normal scores, both turnover and listing status were found not to be significant. However, when the dependent variable and the continuous independent variables were transformed into normal scores (Model 5, Table 2), only listing status was found not to be significant, a result consistent with the Rank Regression approach. When no transformations are undertaken and when the dependent variable is the log of the odds ratio, the results are consistent with the ranked data and with the approach that transforms both the dependent and continuous independent variables into normal scores.
Thus, non-manufacturing companies disclose significantly less information than manufacturing corporations in all the models. In addition, higher levels of gearing are associated with higher levels of disclosure, and multiple listed companies do not voluntarily disclose more information than those companies listed only on the TSE.
However, the findings with respect to the size variable, turnover, and the constant term are not the same for all models. When the dependent variable is normalised both the constant term and turnover are found not to be significant. When both the dependent and continuous independent variables are normalised the constant term is not significant.
Using different models on this dataset suggests that, in the main, the same variables are significant although with differing coefficients. The coefficients attached to the constant and to turnover are significant in most of the models, but not all.
With regard to the use of ranks, in estimating Y given R(Y) it is necessary to undertake a linear interpolation. In this dataset the untransformed index is constrained to be 0 ≤ index ≤ 1; then for an observation for which R(Y) < 1 or R(Y) > n interpolation is possible taking R(0) = 0 and R(1) = n + 1. This procedure is not available for variables with unconstrained limits, which is particularly important for variables such as sales when the upper limit is unconstrained (i.e. it would be difficult to establish a suitable value for Y when R(Y) exceeds the sample size).
The MSE based on ranks (with interpolation) was the lowest (0.0032), followed by the log odds ratio of the dependent variable (0.0034), the normal scores of the dependent variable (0.0035), the unadjusted dependent variable (0.0039), and finally by the normal scores of both the dependent and independent continuous variables (0.0057).
For information purposes, the adjusted R² based on Rank Regression is highest (0.66346) followed by the normal scores for both the dependent and independent variables (0.64148), the log odds ratio transform of the dependent variable (0.62443), the dependent variable converted to normal scores (0.61926), and the untransformed model (0.58658).

4.3. Case II: model based on Saudi Arabian data
The data on Saudi Arabian companies used here relate to disclosure, voluntary and mandatory, in the 1990 annual reports of Saudi Arabian listed
Table 2
Regression analyses of determinants of disclosure scores by Japanese corporations

                                                      Model
Independent variable        (1)             (2)            (3)            (4)             (5)
Non-manufacturing        -0.084687       -0.632783     -13.911085      -1.060822       -1.226670
companies                (-3.354)*       (-4.245)*      (-5.874)*      (-4.628)*       (-5.572)*
Gearing                   0.165240        1.038538       0.324748       1.732437        0.308971
                         (2.549)*        (2.714)*       (3.168)*       (2.944)*        (2.934)*
Turnover               3.265241E-08    1.844500E-07      0.322267     2.416326E-07      0.287893
                         (2.465)*        (2.359)*       (2.304)*       (2.010)         (2.035)*
Multiple listed           0.059111        0.339115       2.993745       0.476006        0.344652
                         (1.962)         (1.907)        (0.939)        (1.741)         (1.183)
Constant                  0.156742       -1.710596       9.075517      -0.344931        0.217481
                         (7.751)*       (-14.328)*      (3.361)       (-1.879)         (1.502)
MSE#                      0.00391         0.00337        0.00319        0.00354         0.00565
Adjusted R²               0.58658         0.62443        0.66346        0.61926         0.64148
Standard error            0.06252         0.36908        5.93236        0.56754         0.55073
F                        13.06005        15.13198       17.75725       14.82511        16.20827

The upper figures for each variable are coefficients and the lower figures are the t-statistics. The coefficients of the excluded dummy variables are all 1.00000 since they act as benchmarks for the included dummies.
* = significant at the 5% level.
# MSE = (1/n) Σ(Yi - Ŷi)², see text for further discussion.
Model 1 Regression using untransformed data.
Model 2 Regression using the log odds ratio [ln(x/(1-x))].
Model 3 Regression using ranked data.
Model 4 Regression using a transformed dependent variable to normal scores.
Model 5 Regression using normal scores for the dependent and independent variables.
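The MSE row of Table 2 is computed on the original index scale, so fitted values from the rank model first have to be mapped back to a disclosure index. A minimal sketch of that step follows, assuming the linear interpolation described in the text with the end points R(0) = 0 and R(1) = n + 1. The function names, the clamping of fitted ranks to the range 0 to n + 1, and the treatment of tied observations as occupying consecutive ranks are all assumptions made for illustration.

```python
from bisect import bisect_right


def rank_to_index(fitted_rank, observed):
    """Linearly interpolate a disclosure index (0 to 1) from a fitted rank."""
    ys = sorted(observed)
    n = len(ys)
    # Interpolation knots: observed ranks 1..n padded with R(0) = 0 and R(1) = n + 1.
    ranks = [0.0] + [float(r) for r in range(1, n + 1)] + [float(n + 1)]
    values = [0.0] + ys + [1.0]
    # Assumption: fitted ranks outside [0, n + 1] are clamped to the end points.
    r = min(max(fitted_rank, 0.0), float(n + 1))
    hi = min(bisect_right(ranks, r), len(ranks) - 1)
    lo = hi - 1
    weight = (r - ranks[lo]) / (ranks[hi] - ranks[lo])
    return values[lo] + weight * (values[hi] - values[lo])


def mse(actual, predicted):
    """Mean square error, (1/n) times the sum of squared differences."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


# Hypothetical example: three observed indexes and three fitted ranks.
observed = [0.10, 0.20, 0.40]
fitted = [rank_to_index(r, observed) for r in (0.5, 2.5, 3.5)]
error = mse(observed, fitted)
```

The same `mse` function applies to the other models once their fitted values have been passed through the appropriate inverse transformation, for example the logistic function for the log odds ratio model.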
corporations. The disclosure data on 33 companies are chosen because the dependent variable reveals characteristics of non-normality.
Figure 6 shows the stem-and-leaf plot of the untransformed voluntary disclosure scores. The stem indicates that the disclosure index scores range from 33% to 95%, with the majority of cases having a score of between 80% and 83%. Figure 7 shows the boxplot which reveals a median of 80%, a 25th percentile of 68% and a 75th percentile of 75%. The smallest observed value that is not an outlier is 66% and the largest observed value that is not an outlier is 91%. The extreme scores of 33% and 95% may be considered to be outliers.

Figure 6
Stem-and-leaf plot of the untransformed voluntary disclosure scores (Saudi Arabia)

Frequency    Stem & Leaf
  3.00       Extremes (0.33), (0.54), (0.54)
  3.00       6 . 668
  3.00       7 * 124
  6.00       7 . 556679
 14.00       8 * 00001111112223
  1.00       8 . 5
  2.00       9 * 01
  1.00       Extremes (0.95)
Stem width: 0.10
Each leaf: 1 case(s)

Figure 7
Boxplot of the untransformed voluntary disclosure scores (Saudi Arabia)

Figures 8 and 9 show the normal plot and detrended normal plot of the dependent variable. The normal plot shows that the observations do not cluster around a straight line and the deviations from a straight line are not randomly distributed around zero. The visual interpretation may be supported by some statistical tests. The descriptive statistics are shown in Table 3. The mean of the untransformed data is 0.766 with a standard deviation of 0.117 and a range from 0.333 to 0.953. Standard tests on skewness and kurtosis reveal problems of skewness and kurtosis of the dependent variable. The non-parametric Lilliefors test is significant at the 5% level, revealing considerable evidence of non-normality. When the log odds ratio of the dependent variable is used the mean is 1.278 with a standard deviation of 0.653. Standard tests on skewness and kurtosis reveal a problem in terms of the latter. Since kurtosis exceeds 3, a smaller proportion of cases falls into the upper tail of the distribution than those of a normal distribution. The Lilliefors test is significant at the 5% level, confirming problems of non-normality.
When the dependent variable is ranked or transformed into normal scores there is no longer any apparent problem of non-normality. Both standard tests on skewness and kurtosis are satisfactory and the Kolmogorov-Smirnov test statistic suggests normality (K-S greater than 5%).¹⁹

¹⁹ Other transformations in terms of powers, roots and logs were used in an attempt to convert the data to approximately normal. None of these transformations was able to correct for both skewness and kurtosis. Thus, the rank and normal scores approaches have advantages in this instance over other transformations. It should be noted that joint tests of skewness and kurtosis are sometimes thought to be biased (see Doornik and Hendry, 1994).
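The standard tests on skewness and kurtosis referred to above divide each statistic by its standard error. The sketch below assumes the usual small-sample standard error formulas, which depend only on the sample size n; they are not stated in the paper, but they give 0.398 and 0.778 for n = 35 and 0.409 and 0.798 for n = 33, in line with the standard errors tabulated for the Japanese and Saudi Arabian data. Measuring kurtosis against the normal value of 3 is also an assumption, based on most of the tabulated Z values.

```python
import math


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample kurtosis for a sample of size n."""
    return math.sqrt(4.0 * (n * n - 1) * se_skewness(n) ** 2 / ((n - 3) * (n + 5)))


def z_skewness(skewness, n):
    """Z statistic for skewness: the statistic over its standard error."""
    return skewness / se_skewness(n)


def z_kurtosis(kurtosis, n, normal_value=3.0):
    """Z statistic for kurtosis, measured from the normal value of 3."""
    return (kurtosis - normal_value) / se_kurtosis(n)


# Untransformed Japanese data: skewness 0.760, kurtosis 2.404, n = 35.
z_s = z_skewness(0.760, 35)
z_k = z_kurtosis(2.404, 35)
```

An absolute Z value above roughly 1.96 would indicate a departure from normality at the 5% level under this approach.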
Figure 8
Normal plot of the dependent variable (Saudi Arabia)
Figure 9
Detrended normal plot of the dependent variable (Saudi Arabia)
4.4. Independent variables
Regressors that have typically been found to be significant in similar disclosure studies on countries around the world were found not to be significant in the case of Saudi Arabia. The size variable chosen was in terms of share capital (Capital). Because of the uncertainty of the theoretical relationship between size and the dependent variable, it was decided, as one alternative, to transform dependent and independent variables. In addition, the extent of government holdings in listed companies (government investment) was thought to be a possible explanatory variable because the government would have access to inside information. It was therefore hypothesised that the greater the government holding, the lower the level of public disclosure in annual reports.

Model specification
The regression specification is:
Yi = β0 + β1 X1i + β2 X2i + εi
where
Y = disclosure index scores
X1 = government holdings in share capital
X2 = share capital
ε = error term
i = 1, ..., 33
β = parameters

The level of government holdings was found not to be significant in any of the models. With respect to the size variable, share capital, it was found to
Table 3
Descriptive statistics of the disclosure indexes - Saudi Arabia

                                                            Data sources
                                            Untransformed  Log odds ratio    Rank    Normal scores
Mean                                             0.766          1.278       17.000       0.000
Standard deviation                               0.117          0.653        9.669       0.922
Minimum                                          0.333         -0.690        1.000      -1.890
Maximum                                          0.953          3.010       33.000       1.890
Skewness                                        -1.887         -0.411       -0.001      -0.001
S.E. Skewness                                    0.409          0.409        0.409       0.409
Kurtosis                                         8.206          5.870        1.799       2.513
S.E. Kurtosis                                    0.798          0.798        0.798       0.798
Z test - skewness                               -4.619          1.005       -0.002      -0.002
Z test - kurtosis                                6.520          3.596       -1.505      -0.610
Kolmogorov-Smirnov (Lilliefors - signif.)        0.000          0.000       >0.200      >0.200
be significant at the 5% level in the untransformed model, the log odds of the dependent variable model and when the dependent variable was transformed to normal scores. In the other two models, ranked transformation and all continuous variables transformed to normal scores, these two variables were found not to be significant (Table 4).
The constant term was found to be significant in the untransformed model, the log odds of the dependent variable model, and the rank transformation model. Thus, the coefficients attached to certain variables and whether they are significant depend both on the data and on the type of transformation undertaken.
In the case of Saudi Arabia, the measure of best fit used was again the MSE. The log odds ratio of the dependent variable had the lowest MSE (0.0119), followed by the normal scores of both the dependent and independent continuous variables (0.0128), the normal scores of the dependent variable (0.0283), the MSE based on the ranks with interpolation (0.0381), and finally by the unadjusted dependent variable (0.1239). The fact that the MSE of the regression based on an unadjusted dependent variable was substantially different from the others indicates advantages of other approaches, such as the normal scores method.
In terms of the coefficient of determination, the adjusted R² was found to be very low using ranked data. Using normal scores for both the dependent and independent variables (Model 5, Table 4) gives a negative adjusted R² although the coefficient of determination itself was of course positive. The adjusted R² based on transformation of the dependent variable into normal scores is highest (0.14626) followed by the untransformed model (0.09932), the log odds ratio transform of the dependent variable (0.09544), Rank Regression (0.00345), and finally the regression that uses transforms of both dependent and independent variables to normal scores (-0.01712).

5. Summary and conclusions
This paper has reviewed some possible transformations that attempt to deal with theoretical relationships that are not well known or where measures are merely proxies for underlying constructs. The transformations include Rank Regression in disclosure studies when dealing with non-linear and linear relationships when such relationships are, by hypothesis, monotonic.
The shortcomings have been identified and an extension based on normal scores proposed. The normal scores approach, like the rank method, preserves monotonicity and with non-linear relationships disperses the concentration of data. However, the normal scores method has a number
Table 4
Regression analyses of determinants of disclosure scores by Saudi Arabian corporations

                                                      Model
Independent variable        (1)             (2)            (3)            (4)             (5)
Government              8.896046E-04      0.005696      -0.118695       0.012505        0.028172
investment               (0.868)         (0.996)        (0.610)        (1.594)         (0.131)
Capital                -2.41870E-05    -1.36207E-04     -0.275854    -2.29157E-04      -0.225778
                        (-2.194)*       (-2.214)*      (-1.451)       (-2.714)*       (-1.140)
Constant                  0.776554        1.323254      19.671706       0.010300     -9.62725E-04
                        (30.270)*        (9.242)*       (4.868)*       (0.052)        (-0.006)
MSE#                      0.12390         0.01188        0.03805        0.02831         0.01276
Adjusted R²               0.09932         0.09544        0.00345        0.14626        -0.01712
Standard error            0.11130         0.62118        9.65204        0.85232         0.93030
F                         2.76438         2.68823        1.05539        3.74101         0.73065

The upper figures for each variable are coefficients and the lower figures are the t-statistics. The coefficients of the excluded dummy variables are all 1.00000 since they act as benchmarks for the included dummies.
* = significant at the 5% level.
# MSE = (1/n) Σ(Yi - Ŷi)², see text for further discussion.
Model 1 Regression using untransformed data.
Model 2 Regression using the log odds ratio [ln(x/(1-x))].
Model 3 Regression using ranked data.
Model 4 Regression using a transformed dependent variable to normal scores.
Model 5 Regression using normal scores for the dependent and independent variables.
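The negative figure reported for Model 5 can be checked from the F statistic alone. The sketch below assumes that F is the overall regression F statistic for a model with an intercept and k regressors fitted to n cases, and uses the standard identities linking F, the coefficient of determination and its degrees-of-freedom adjusted version. The function names are hypothetical.

```python
def r_squared_from_f(f_stat, k, n):
    """Coefficient of determination implied by the overall F statistic."""
    return k * f_stat / (k * f_stat + (n - k - 1))


def adjusted_r_squared(r_squared, k, n):
    """Adjust R squared for the k regressors used on n cases."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Model 5 of Table 4: F = 0.73065 with k = 2 regressors and n = 33 cases.
r2 = r_squared_from_f(0.73065, k=2, n=33)
adj = adjusted_r_squared(r2, k=2, n=33)
```

Under these assumptions the unadjusted value for Model 5 is small but positive (about 0.046), while the adjusted value is about -0.017, matching the tabulated figure. The adjusted measure turns negative whenever F is below 1.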
of advantages over Rank Regression, namely (a) that a normally distributed dependent variable implies the same property for the distribution of the errors; (b) that the significance tests are meaningful and have greater power than when using ranks; and (c) that the coefficients obtained when using the normal scores approach are more meaningful than for Rank Regression.
Some of the identified transformations were applied to two datasets of information disclosure in the annual reports of companies in Japan, a developed country, and Saudi Arabia, a developing country. Using two datasets has the advantage that the outcomes of transformations can be contrasted and are not specific to one dataset. This approach helps to highlight the point that the data should be examined in detail before deciding upon an appropriate statistical method. The models produced some differences in terms of significance of variables and magnitude of the coefficients. As a result, it is possible that biased estimates may occur and substantive errors could go unnoticed if the data are not analysed appropriately.
The important lesson to be learned from the two case studies is that the success of the transformations in improving the fit of the model is dependent on the structure of the data. In the case of the data on Japan, the MSE based on ranked data with interpolation was found to be best. In the case of the data on companies in Saudi Arabia, it was found that the log odds ratio of the dependent variable provided the best fit and using normal scores (both models) provided a better fit than using ranks.
In conclusion, the normal scores approach has theoretical advantages over the use of pure ranks although when applied to the two case studies there was no overwhelming case for one particular approach. This emphasises the point that it is important to examine the structure of the data and the relationships between the dependent and independent variables if errors in interpretation are to be avoided.

Appendix
Suppose data X1, ..., Xn are ordered from smallest to largest to give the order statistics X(1), X(2), ..., X(n). Following Lehman (1975), the n ordered observations are mapped to a N(0,1) distribution with density function φ(x) by taking the expected value of X(i) as the Normal Score as in
zi = E_Φ[X(i)]    (1)
Since the natural estimate in (1) is difficult to compute for a given dataset an alternative Normal Score proposed by van der Waerden (1952, 1953) is to replace the expectations in (1) by
zi = Φ⁻¹(i/(n + 1))
where Φ⁻¹(x) is the inverse of the N(0,1) cumulative distribution function. This method of obtaining Normal Scores is the approach adopted by SPSS.

References
Ahmed, K. and Nicholls, D. (1994). ‘The impact of non-financial company characteristics on mandatory disclosure compliance in developing countries: the case of Bangladesh’. International Journal of Accounting, 29(1): 62-77.
Al Modahki, J. (1996). ‘An empirical study of accounting disclosure development in the Kingdom of Saudi Arabia’. University of Exeter.
Anderson, R. (1981). ‘The usefulness of accounting and other information disclosures in corporate annual reports to institutional investors in Australia’. Accounting and Business Research, 11 (Autumn): 259-265.
Beaver, W. H., Clarke, R. and Wright, W. F. (1979). ‘The association between unsystematic security returns and the magnitude of earnings forecast errors’. Journal of Accounting Research, Autumn.
Belkaoui, A. and Kahl, A. (1978). Corporate Financial Disclosure in Canada. Research Monograph No. 1 of the Canadian Certified General Accountants Association. Vancouver: Canadian Certified General Accountants Association.
Bibby, J. (1977). ‘The General Linear Model - A Cautionary Tale’. O'Muircheartaigh, C. A. and Payne, C. (eds.), The Analysis of Survey Data, 2: Model Fitting. New York: Wiley.
Biddle, G. and Saudagaran, S. (1991). ‘Foreign stock listings: benefits, costs and the accounting policy dilemma’. Accounting Horizons, September.
Buzby, S. L. (1972). ‘An empirical investigation of the relationship between the extent of disclosure in corporate annual reports and two company characteristics’. Unpublished doctoral dissertation, Pennsylvania State University.
Chang, L. and Most, K. S. (1977). ‘Investor uses of financial statements: an empirical study’. Singapore Accountant.
Cheng, C. S., Hopwood, W. S. and McKeown, J. C. (1992). ‘Non-linearity and specification problems in unexpected earnings response regression model’. The Accounting Review, 67 (July): 579-598.
Choi, F. D. S. (1973). ‘Financial disclosure and entry to the European capital market’. Journal of Accounting Research, 11 (Autumn): 159-175.
Cooke, T. E. (1989a). ‘Disclosure in the corporate annual reports of Swedish companies’. Accounting and Business Research (Spring): 113-124.
Cooke, T. E. (1989b). An empirical study of financial disclosures by Swedish companies. New York: Garland Publishing.
Cooke, T. E. (1989c). ‘Voluntary corporate disclosure by Swedish companies’. Journal of International Financial Management and Accounting, 2 (Summer): 171-195.
Cooke, T. E. (1991). ‘An assessment of voluntary disclosure in the annual reports of Japanese corporations’. The International Journal of Accounting, 26(3): 174-189.
Cooke, T. E. (1992). ‘The impact of size, stock market listing and industry type on disclosure in the annual reports of Japanese listed corporations’. Accounting and Business Research (Summer): 229-237.
David, F. N., Barton, D. E., Ganeshalingam, S., Harter, H. L., Kim, P. J. and Merrington, M. (1968). Normal centroids, median and scores for ordinal data. Cambridge: Cambridge University Press, London.
Doornik, J. A. and Hendry, D. F. (1994). ‘A practical test for univariate and multivariate normality’. Discussion paper. Oxford: Nuffield College.
Draper, D. (1988). ‘Rank-based robust analysis of linear models’. Statistical Science, May.
Epstein, M. (1975). ‘The usefulness of annual reports to corporate stockholders’. California State University. Los Angeles: Bureau of Business and Economic Research.
Firth, M. (1979). ‘The impact of size, stock market listing and auditors on voluntary disclosure in corporate annual reports’. Accounting and Business Research, 9 (Autumn): 273-280.
Fisher, R. A. and Yates, F. (1938). Statistical tables for biological agricultural and medical research. Edinburgh: Oliver and Boyd.
Fox, J. (1984). Linear statistical models and related methods. New York: John Wiley.
Harter, H. L. (1961). ‘Expected values of normal order statistics’. Biometrika, 48: 151-165.
Harter, H. L. (1969). Order statistics and their use in testing and estimation. Washington DC: US Government Printing Office.
Hines, R. D. (1982). ‘The usefulness of annual reports: the anomaly between the efficient markets hypothesis and shareholder surveys’. The Accounting Review (Autumn): 296-309.
Hoaglin, D. C., Mosteller, F. and Tukey, J. W. (1983). Understanding robust and exploratory data analysis. New York: John Wiley.
Hollander, M. and Wolfe, D. A. (1973). Non-parametric statistical methods. New York: John Wiley.
Iman, R. L. and Conover, W. J. (1979). ‘The use of rank transform in regression’. Technometrics, November.
Jensen, M. and Meckling, W. H. (1976). ‘Theory of the firm: managerial behavior, agency costs and ownership structure.’ Journal of Financial Economics, 3 (October): 305-360.
Kennedy, P. (1992). A guide to econometrics. Oxford: Blackwell Publishing.
Kerlinger, F. N. and Pedhazar, E. J. (1973). Multiple regression in behavioral research. New York: Harper and Row.
Lang, M. and Lundholm, R. (1993). ‘Cross-sectional determinants of analyst ratings of corporate disclosures’. Journal of Accounting Research, 31 (Autumn): 246-271.
Lang, M. and Lundholm, R. (1996). ‘Corporate disclosure policy and analyst behavior’. The Accounting Review, 7 (October): 467-492.
Lee, T. A. and Tweedie, D. (1975). ‘Accounting information: an investigation of private shareholder usage’. Accounting and Business Research, 31 (Autumn): 246-271.
Leftwich, R. W., Watts, R. L. and Zimmerman, J. L. (1981). ‘Voluntary corporate disclosure: the case of interim reporting’. Journal of Accounting Research, 19, supplement.
Lehmann, E. L. (1975). Non-parametrics: statistical methods based on ranks. San Francisco: Holden-Day.
Lewis-Beck, M. S. (1993). Regression Analysis. London: Sage Publications.
Lindley, D. V. and Scott, W. F. (1984). New Cambridge Elementary Statistical Tables. Cambridge: Cambridge University Press.
Malone, D., Fries, C. and Jones, T. (1993). ‘An empirical investigation of the extent of corporate financial disclosure in the oil and gas industry’. Journal of Accounting Auditing and Finance, 8, new series, (Summer): 249-273.
McCabe, B. P. M. (1989). ‘Misspecification tests in econometrics: based on ranks’. Journal of Econometrics, 40.
Morgan, B. J. T. (1992). Analysis of quantal response data. London: Chapman and Hall.
Myers, S. C. (1977). ‘Determinants of corporate borrowing’. Journal of Financial Economics, 4 (November).
Nicholls, D. and Ahmed, K. (1995). ‘Disclosure quality in corporate annual reports of non-financial companies in Bangladesh’. Research in Accounting in Emerging Economies, 3: 149-70.
Owen, D. B. (1962). Handbook of statistical tables. Reading, Massachusetts: Addison-Wesley.
Schipper, K. (1981). ‘Discussion of voluntary corporate disclosure: the case of interim reporting’. Journal of Accounting Research, supplement.
Siegel, S. and Castellan, N. J. (1988). Non-parametric statistics for the behavioral sciences. New York: McGraw-Hill.
Singhvi, S. S. and Desai, H. (1971). ‘An empirical analysis of the quality of corporate financial disclosure’. The Accounting Review, January.
Spero, L. L. (1979). ‘The extent and causes of voluntary disclosure of financial information in three European capital markets: an explanatory study’ (unpublished doctoral dissertation, Harvard University).
Stanga, K. (1976). ‘Disclosure in published annual reports’. Financial Management, Winter.
Stuart, A. and Ord, J. K. (1983). Kendall’s advanced theory of statistics.
van der Waerden, B. L. (1952, 1953). ‘Order tests for the Two-sample and their power’. Indagationes Mathematicae, 14, 15. ‘Errata’. ibid. (1953).
Tukey, J. W. (1977). Exploratory data analysis. Reading, Mass: Addison-Wesley.
Wallace, R. S. O. and Naser, K. (1995). ‘Firm-specific determinants of the comprehensiveness of mandatory disclosure in the corporate annual reports of firms listed on the Stock Exchange of Hong Kong’. Accounting and Public Policy, 14: 41-53.
Wallace, R. S. O., Naser, K. and Mora, A. (1994). ‘The relationship between the comprehensiveness of corporate annual reports and firm characteristics in Spain’. Accounting and Business Research, 25 (Winter): 41-53.
Zaman, A. (1996). Statistical foundations for econometric techniques. London: Academic Press.
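
As a rough illustration of the normal scores transformation described in the Appendix, the sketch below computes van der Waerden-type scores in Python. It assumes the usual form of the van der Waerden score, $\Phi^{-1}(r/(n+1))$, where $r$ is the rank of an observation among the $n$ data points and $\Phi$ is the standard normal distribution function; that formula is not reproduced in the text above, so it is an assumption here. The function name `van_der_waerden_scores` and the average-rank treatment of ties are illustrative choices and are not taken from the article.

```python
from statistics import NormalDist


def van_der_waerden_scores(values):
    """Map each observation to Phi^{-1}(r / (n + 1)), r = rank (1 = smallest).

    Assumption: tied observations share the average of the ranks they span.
    """
    n = len(values)
    # Indices of the observations, ordered from smallest to largest value
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the end of the current run of tied values
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank for the run
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()  # N(0, 1)
    # r / (n + 1) always lies strictly between 0 and 1, so inv_cdf is defined
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


if __name__ == "__main__":
    # Hypothetical disclosure-index values, for illustration only
    sample = [0.42, 0.10, 0.87, 0.55, 0.31]
    for x, z in zip(sample, van_der_waerden_scores(sample)):
        print(f"{x:.2f} -> {z:+.4f}")
```

Because the scores depend only on ranks, any monotonic increasing transformation of the data gives the same scores, and without ties the scores are symmetric about zero.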
