
Financial Analysts Journal

Journal homepage: www.tandfonline.com/journals/ufaj20

Fundamental Analysis via Machine Learning

Kai Cao & Haifeng You

To cite this article: Kai Cao & Haifeng You (2024) Fundamental Analysis via Machine Learning,
Financial Analysts Journal, 80:2, 74-98, DOI: 10.1080/0015198X.2024.2313692
To link to this article: https://doi.org/10.1080/0015198X.2024.2313692


Published online: 21 Mar 2024.

Research Financial Analysts Journal | A Publication of CFA Institute
https://doi.org/10.1080/0015198X.2024.2313692

Fundamental Analysis via Machine Learning

Kai Cao and Haifeng You, CFA

Kai Cao is an independent researcher. Haifeng You, CFA, is a chair professor of accounting in the School of Economics and Management and the Shenzhen International Graduate School at Tsinghua University. Send correspondence to Haifeng You at hyou@sem.tsinghua.edu.cn.

We examine the efficacy of machine learning in a central task of fundamental analysis: forecasting corporate earnings. We find that machine learning models not only generate significantly more accurate and informative out-of-sample forecasts than the state-of-the-art models in the literature but also perform better compared to analysts' consensus forecasts. This superior performance appears attributable to the ability of machine learning to uncover new information through identifying economically important predictors and capturing nonlinear relationships. The new information uncovered by machine learning models is of considerable economic value to investors. It has significant predictive power with respect to future stock returns, with stocks in the most favorable new information quintile outperforming those in the least favorable quintile by approximately 34 to 77 bps per month on a risk-adjusted basis.

Keywords: earnings forecasts; equity valuation; fundamental analysis; machine learning; market efficiency

Disclosure: No potential conflict of interest was reported by the author(s).

Research funding support from Hong Kong RGC project no. T31-604/18-N is highly appreciated.

PL Credits: 2.0

© 2024 CFA Institute. All rights reserved. Volume 80, Number 2

Introduction

Fundamental analysis is a cornerstone of capital market operations. It requires the assimilation and processing of vast and varied data, often incurring considerable costs. The primary objective of this paper is to assess the potential of machine learning (ML) to enhance a central task of fundamental analysis: the forecasting of corporate earnings (e.g., Penman 1992; Lee 1999; Richardson, Tuna, and Wysocki 2010; Monahan 2018). The literature has laid a substantial foundation with various earnings prediction models; however, their performance often does not surpass that of the rudimentary random walk (RW) approach (Easton, Kelly, and Neuhierl 2018; Monahan 2018). We posit that the shortcomings of these models, particularly their methodological constraints, could be mitigated by leveraging ML technologies.

Corporate earnings are the cumulative result of a myriad of transactions, each reflected within various financial statement items that can have disparate impacts on future earnings. The intricate nature of these transactions, supported by both economic theory and empirical findings, suggests that the relationships between financial statement items and subsequent earnings are often nonlinear.1 ML algorithms, with their inherent design to process high-dimensional data and discern complex nonlinear and interaction effects (e.g., Gu, Kelly, and Xiu 2020), are potentially well suited to capture these nuanced effects and intricate patterns. However, the intricacy of ML models can also lead to overfitting. Therefore, the true efficacy of ML in providing superior earnings forecasts remains an empirical question.

To answer this research question, we develop an ML earnings forecasting model that combines three popular ML algorithms: two algorithms based on decision trees (i.e., random forest [RF] and gradient boosting regression [GBR]) and one based on artificial neural networks (ANNs). We supply these algorithms with both the level and first-order difference of a comprehensive set of financial statement items,2 let the algorithms "learn" from the historical data to determine the underlying relationships, and generate out-of-sample forecasts for 134,154 firm-year observations from 1975 to 2019. We compare the accuracy, information content, and investment value of the out-of-sample ML forecasts to the forecasts obtained from the RW model and five other models developed in the extant literature: the (first-order) autoregressive model (AR); two models (HVZ and SO) developed by Hou, van Dijk, and Zhang (2012) and So (2013), respectively; and the earnings persistence (EP) and residual income (RI) models proposed by Li and Mohanram (2014).

We find that the ML forecasts are significantly more accurate than not only the RW model but also all extant models. The mean absolute forecast errors of the extant models are approximately 7.79% to 26.53% greater than that of the ML model. Cross-sectional analyses indicate that the ML model leads to even greater accuracy improvements among firms with more difficult-to-forecast earnings. The ML forecast also has greater information content as measured by its predictive power with respect to future actual earnings changes (ECH). Forecasted earnings changes (FECH) based on the ML forecast explain 18.61% of the variation in ECH, whereas FECH based on the extant models only explain about 8.07% to 12.22%.

We then test whether the new information uncovered by the ML model can lead to significant improvements in investment decision-making. To this end, we orthogonalize the ML forecast against the contemporaneous forecasts from the extant models and use the residuals to measure the new information uncovered by the ML forecasts (beyond the extant models). The results show that the new information component has significant predictive power with respect to future stock returns. The top quintile of stocks with the most favorable new information significantly outperforms the least favorable new information quintile by approximately 34 to 77 bps per month on a risk-adjusted basis.

We also find that the ML forecasts perform well against analyst consensus forecasts, even though analysts have access to much more information than the financial statements. First, the ML forecast has significantly lower mean absolute forecast errors than analyst consensus earnings forecasts for all three forecast horizons. Second, the ML forecast has greater relative information content than analyst forecasts in predicting future earnings changes. Third, the ML forecast contains a significant amount of information that analysts fail to consider, even though our ML forecasts are based on financial statement data only.

Our paper contributes to the burgeoning literature on the application of ML in finance and accounting research (e.g., Chen et al. 2022; Gu, Kelly, and Xiu 2020; Bao et al. 2020; Bertomeu et al. 2021; Ding et al. 2020) by demonstrating the decisional usefulness of ML technology in one of the most important tasks in fundamental analysis: corporate earnings forecasting. We show that by identifying economically important predictors and capturing subtle nonlinear relationships, ML can better utilize financial statement information and produce significantly more accurate and informative earnings forecasts.

Our paper is also of interest to investment professionals. First, our model can be used to generate earnings forecasts at a low cost even for firms with a short history and those without analyst coverage. It not only offers a less biased alternative to analyst forecasts as a valuation input by stock pickers (e.g., Ohlson 1995; Ohlson and Juettner-Nauroth 2005) but also can be directly used to value many firms without analyst coverage. Second, investors may also easily modify our models to forecast other fundamental variables (e.g., sales, gross profit) that are of considerable importance for investors. Third, our paper also offers a potential systematic trading strategy to quantitative investors, who can build on our study and refine the model by, for example, forecasting future quarterly earnings and other fundamentals or by incorporating non–financial statement information to further improve returns to the strategy.

Related Literature and Extant Earnings Forecasting Models

As Monahan (2018) summarizes, early research mostly adopted the time-series approach to forecast future earnings (e.g., Ball and Watts 1972; Watts and Leftwich 1977). Overall, their results suggest that the simple RW model, which predicts expected future earnings to equal current earnings, generates more accurate out-of-sample forecasts than the more sophisticated autoregressive integrated moving average (ARIMA) models (e.g., Brown 1993; Kothari 2001).3 The superiority and simplicity of the RW model make it a natural benchmark in evaluating other earnings forecasting models.

There are several potential reasons for the poor performance associated with the ARIMA models. First, these models require a long time series to yield

reliable parameter estimates, but the earnings process may not be stationary over a long period. Second, these firm-specific time-series models ignore the rich information in other financial statement line items. To overcome these limitations, subsequent studies turn to cross-sectional or panel-data approaches, which use a pooled cross-section of firms to estimate forecasting models. These models are considered state-of-the-art earnings prediction models in the literature (e.g., Gerakos and Gramacy 2013; Call et al. 2016).4 We therefore employ them as alternative benchmarks. We discuss these models below and summarize them with detailed variable definitions in Appendix 1 Panel A.

The first cross-sectional model that we examine is the first-order AR model:

E_{i,t+1} = \alpha_0 + \alpha_1 E_{i,t} + \varepsilon_{i,t+1}    (1)

where E_{i,t} is firm i's earnings in year t. Gerakos and Gramacy (2013) show that the AR model performs well relative to the RW model and is more accurate than many sophisticated models.

The second model, the HVZ model, proposed by Hou, van Dijk, and Zhang (2012), extends the Fama and French (2000, 2006) model and forecasts future earnings as follows:

E_{i,t+1} = \alpha_0 + \alpha_1 A_{i,t} + \alpha_2 D_{i,t} + \alpha_3 DD_{i,t} + \alpha_4 E_{i,t} + \alpha_5 NegE_{i,t} + \alpha_6 AC_{i,t} + \varepsilon_{i,t+1}    (2)

This model forecasts future earnings with total assets (A_{i,t}), the dividend payment (D_{i,t}), the dividend-paying dummy (DD_{i,t}), historical earnings (E_{i,t}), an indicator variable for negative earnings (NegE_{i,t}), and accruals (AC_{i,t}).

The third model, the SO model, is developed by So (2013). So modifies the Fama and French (2006) model and forecasts future earnings per share (EPS) using the following model:

EPS_{i,t+1} = \alpha_0 + \alpha_1 EPS^{+}_{i,t} + \alpha_2 NegE_{i,t} + \alpha_3 AC^{-}_{i,t} + \alpha_4 AC^{+}_{i,t} + \alpha_5 AG_{i,t} + \alpha_6 NDD_{i,t} + \alpha_7 DIV_{i,t} + \alpha_8 BTM_{i,t} + \alpha_9 Price_{i,t} + \varepsilon_{i,t+1}    (3)

where EPS^{+}_{i,t} is the EPS for positive earnings and is zero otherwise; NegE_{i,t} is an indicator variable for negative earnings; AC^{-}_{i,t} is accruals per share for negative accruals and is zero otherwise; AC^{+}_{i,t} is accruals per share for positive accruals and is zero otherwise; AG_{i,t} is the percentage change in total assets; NDD_{i,t} indicates a zero dividend; DIV_{i,t} is dividends per share; BTM_{i,t} is the book-to-market ratio; and Price_{i,t} is the stock price at the end of the third month after the end of fiscal year t. In addition to financial statement items, the SO model uses the stock price and book-to-market ratio, allowing it to utilize more forward-looking information.

The final two cross-sectional models are proposed by Li and Mohanram (2014):

E_{i,t+1} = \alpha_0 + \alpha_1 NegE_{i,t} + \alpha_2 E_{i,t} + \alpha_3 NegE_{i,t} \times E_{i,t} + \varepsilon_{i,t+1}    (4)

E_{i,t+1} = \alpha_0 + \alpha_1 NegE_{i,t} + \alpha_2 E_{i,t} + \alpha_3 NegE_{i,t} \times E_{i,t} + \alpha_4 BVE_{i,t} + \alpha_5 TACC_{i,t} + \varepsilon_{i,t+1}    (5)

Equation (4) allows loss firms to have different earnings persistence (the EP model) from profitable firms. Equation (5) is based on the residual income model (the RI model) proposed by Feltham and Ohlson (1996), which further augments the EP model with the book value of equity (BVE_{i,t}) and total accruals (TACC_{i,t}) from Richardson et al. (2005).

Although the above models are the state-of-the-art models in the literature, they fail to consistently outperform the RW model (e.g., Monahan 2018; Easton, Kelly, and Neuhierl 2018). Taken at face value, these results seem to suggest that there is not much incremental information in financial statements beyond the bottom-line earnings, which contradicts both the conventional wisdom and the fundamental tenet of financial statement analysis. Given that "the question of whether historical accounting numbers are useful for forecasting earnings is central to accounting research" (p. 183, Monahan 2018), both Monahan (2018) and Easton, Kelly, and Neuhierl (2018) call for further research in this area. We posit that, rather than indicating a lack of decisional usefulness of financial statements or fundamental analysis, the above results might be driven by the methodological limitations of the extant models, which can be overcome by ML algorithms.

ML-Based Earnings Forecasts

The extant models do not make the best use of information in financial statements to forecast future earnings. First, the extant models focus on a small number of aggregate financial statement items, such as bottom-line earnings and total assets, and fail to fully consider many other financial statement line


items that could be highly valuable for earnings prediction (e.g., Fairfield, Sweeney, and Yohn 1996; Chen, Miao, and Shevlin 2015). Second, even though economic theories and empirical evidence suggest the prevalence of nonlinear relationships between historical accounting information and future earnings (e.g., Lev 1983; Freeman and Tse 1992; Baginski et al. 1999; Chen and Zhang 2007), these models mostly adopt linear functional forms (or with some simple interactions) and are therefore unlikely to capture these subtle yet important relationships. ML algorithms are designed to handle high-dimensional data and are rather flexible with respect to the functional forms of the underlying relationships. Thus, they can potentially overcome the above limitations and generate better earnings forecasts. Below we describe the development of our ML models.

Financial Statement Line Items as Predictors. Both the financial ratios and raw values of financial statement items can be used in forecasting future earnings.5 A priori, it is unclear which choice is better. On the one hand, financial ratios are grounded in economic theories, which may help filter out the noise in the raw values of financial items and better capture the key drivers of future earnings. On the other hand, the transformation from raw values to ratios may cause a loss or distortion of information. In a recent study, Bao et al. (2020) compare the predictive power of ML models based on raw accounting numbers versus those based on financial ratios with respect to accounting fraud. They find that raw value–based models perform significantly better. Following their study, we rely on raw accounting values as our primary set of predictors and examine alternative predictor sets of financial ratios in additional analyses. Furthermore, we follow prior literature (e.g., Li and Mohanram 2014) and scale all raw accounting values (including both the input features and the target variable) by the number of common shares outstanding.

As Richardson, Tuna, and Wysocki (2010) suggest, a concern with the early literature of fundamental analysis is the in-sample identification of predictive variables. To mitigate this concern, we select a comprehensive list of key financial statement items and let ML algorithms "learn" from historical data in terms of how to optimally select and combine these items. The resulting models are then used to generate out-of-sample earnings predictions. The detailed list of financial statement items and the rationale for their inclusion are provided in Appendix 1 Panel B. The input predictors (a total of 60) can be broadly categorized as follows:

i. Historical earnings and their major components (8). We include earnings components, as prior literature has shown that disaggregating earnings provides additional information and improves earnings forecasts (Fairfield, Sweeney, and Yohn 1996; Chen, Miao, and Shevlin 2015).
ii. Other important individual income statement items (5). Specifically, we include advertising and R&D expenses, as they tend to generate long-term future benefits (e.g., Lev and Sougiannis 1996; Dou et al. 2021). We also include special and extraordinary items and discontinued operations, as firms may shift core expenses to these items to manipulate core earnings (e.g., Barnea, Ronen, and Sadan 1976; McVay 2006). Finally, we include common dividends, as they may signal firms' future earning power (e.g., Nissim and Ziv 2001).
iii. Summary and individual balance sheet items (16). The balance sheet summarizes resources with potential future economic benefits and may contain incremental information. For example, the literature shows that the book value of equity is an important driver of future earnings (e.g., Feltham and Ohlson 1995; Li and Mohanram 2014), and Ball et al. (2020) argue that retained earnings may measure average earning power better than the book value of equity.
iv. Operating cash flows (1). We include cash flows from operating activities, as prior studies have found that the cash flow component of earnings is more persistent than the accrual component, and separating them helps predict future earnings (e.g., Sloan 1996; Call et al. 2016).
v. The first-order differences of the above items (30). We include them, as prior studies show that changes in financial statement items often contain incremental information beyond the levels (e.g., Kothari 1992; Ohlson and Shroff 1992; Richardson et al. 2005).

ML Algorithms. Following prior studies (e.g., Rasekhschaffe and Jones 2019; Cao et al. 2024), we develop an ML earnings prediction model by ensembling the out-of-sample earnings forecasts generated from several popular ML algorithms. Among them, two algorithms are based on decision trees and one is based on ANNs. Our two decision


tree–based algorithms are the standard RF and GBR algorithms. In implementing our ANN-based algorithm, we adopt the bootstrap aggregating (i.e., bagging) technique by constructing 10 bootstrap samples, with each sample randomly drawing 60% of the observations from the training set. Thereafter, we train an ANN model for each bootstrapped sample and then average the 10 models to generate predictions.6

Cross-Validation and Hyperparameter Tuning. In the implementation of ML, it is imperative to select a model with an appropriate level of complexity: overly simple models tend to underfit the data, while overly complex models tend to overfit, and both lead to poor out-of-sample predictability. The level of model complexity is largely determined by the values of certain hyperparameters, which must be set before estimating other parameters such as regression coefficients and neural network weights. For example, the values of hyperparameters such as the maximum depth of the decision trees in the RF and GBR models determine the overall model complexity as well as the number/fraction of input features effectively used in the model.

We search for the "optimal" hyperparameter values through hyperparameter tuning via cross-validation. We use fivefold cross-validation to identify the optimal hyperparameter values that generate the most accurate forecasts on the validation samples. Specifically, for each year in our test sample, we use data from the previous 10 years to train the ML models. We randomly split the 10 years of data into five groups/folds of validation sets, with each fold including 20% of the data. For each fold, we use the remaining 80% of the firm-year observations as the training sample. For each of the ML algorithms, we provide a set of reasonable candidate values for the key hyperparameters (see details provided in Appendix 2). For each combination of the hyperparameters, we compute the mean squared forecast errors of the five validation sets using the model estimated from the remaining 80%, which forms the training set. The mean squared forecast errors on the validation sets are used as the basis to select the optimal hyperparameters, which are then used to train a new model on the training set. We then apply the model to the current year's financial statement data to generate out-of-sample earnings forecasts for the following year, which are then compared to the subsequent actual earnings to evaluate the relative performance of various models.

Data, Sample Selection, and Model Estimation Procedure

Our initial sample comprises 267,777 firm-year observations obtained from the intersection of the Compustat fundamentals annual file7 and the Center for Research in Security Prices (CRSP) data up to fiscal year 2019. We further impose the following data requirements: (1) the following financial statement items must be non-missing: total assets, sales revenue, income before extraordinary items, and common shares outstanding; (2) the stocks must be ordinary common shares listed on the NYSE, AMEX, or NASDAQ; (3) the firms cannot be in the financial (SIC 6000–6999) or regulated utilities (SIC 4900–4999) industries; and (4) the stock prices at the end of the third month after the end of the fiscal year must be greater than US$1. Among the remaining firm-year observations, we replace missing values of the cash flow from operating activities with the corresponding numbers computed from the balance sheet approach (Sloan 1996). We then set the missing values of the remaining line items to zero before computing the first-order differences of the 30 items in Appendix 1. This leaves us with a final sample of 156,256 observations from 1965 to 2019. Because we need data from the past 10 years to estimate the models, our testing sample (i.e., prediction set) starts from 1975 and consists of 142,592 firm-year observations. Table 1 presents the number of firms in the final testing sample by year, where the number of annual observations ranges from 2,299 in 2019 to 4,976 in 1997.

At the end of the third month after fiscal year t, we generate the out-of-sample forecasts of one-year-forward earnings E_{t+1} for the above testing sample using the aforementioned machine learning algorithms and the 60 predictors. Following prior literature (e.g., Hou, van Dijk, and Zhang 2012; Li and Mohanram 2014), for each year t between 1975 and 2019, we use all observations from the previous 10 years (i.e., years t − 10, t − 9, …, t − 1) to estimate the models. As discussed earlier, we use fivefold cross-validation to identify the optimal hyperparameters. Using these optimal values, we retrain the model using the observations from the previous 10 years and then apply the trained models to the predictors of year t to generate earnings forecasts for year t + 1. For consistency, all extant models are also estimated using the data of the same previous 10 years, and the resulting linear models are applied to their respective predictors in year t to generate earnings forecasts for year t + 1.8
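The rolling estimation scheme just described (train on the prior 10 years, tune hyperparameters by fivefold cross-validation, retrain on the full window, then forecast the next year out of sample) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: ridge regression stands in for the paper's RF/GBR/ANN learners, the regularization strength plays the role of a tuned hyperparameter, and `panel` is a hypothetical year-keyed container of predictor and target arrays.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression; a stand-in for the paper's RF/GBR/ANN learners."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

def cv_select(X, y, lams, n_folds=5, seed=0):
    """Fivefold cross-validation: pick the hyperparameter value with the lowest
    mean squared forecast error across the validation folds."""
    rng = np.random.default_rng(seed)
    folds = rng.integers(0, n_folds, size=len(y))  # random fold assignment (~20% each)
    best_lam, best_mse = None, np.inf
    for lam in lams:
        mse = 0.0
        for f in range(n_folds):
            w = ridge_fit(X[folds != f], y[folds != f], lam)  # fit on the other ~80%
            err = y[folds == f] - X[folds == f] @ w
            mse += np.mean(err ** 2) / n_folds
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam

def rolling_forecasts(panel, lams=(0.0, 0.1, 1.0), window=10):
    """For each year t with a full training window, pool years t-10..t-1,
    tune via CV, retrain on the whole window, and forecast next-year earnings
    for the firms observed in year t.
    `panel` maps year -> (X, y): predictors and realized next-year earnings."""
    out = {}
    for t in sorted(panel):
        train_years = [s for s in panel if t - window <= s <= t - 1]
        if len(train_years) < window:
            continue  # need a full 10-year history before forecasting
        X_tr = np.vstack([panel[s][0] for s in train_years])
        y_tr = np.concatenate([panel[s][1] for s in train_years])
        lam = cv_select(X_tr, y_tr, lams)
        w = ridge_fit(X_tr, y_tr, lam)   # retrain on the full 10-year window
        out[t] = panel[t][0] @ w         # out-of-sample forecast of E_{t+1}
    return out
```

With exact linear data the procedure recovers the generating coefficients, so the first forecastable year (the eleventh year of the panel) is predicted out of sample with negligible error; any of the paper's learners could be dropped in behind the same loop.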


Table 1. Sample Distribution by Year

Year   # obs    Year   # obs    Year   # obs
1975   2,550    1990   3,029    2005   3,303
1976   2,558    1991   3,140    2006   3,259
1977   2,578    1992   3,484    2007   3,166
1978   2,593    1993   3,816    2008   2,945
1979   2,679    1994   4,236    2009   2,583
1980   2,694    1995   4,373    2010   2,747
1981   2,685    1996   4,690    2011   2,673
1982   2,689    1997   4,976    2012   2,538
1983   2,830    1998   4,930    2013   2,499
1984   2,892    1999   4,620    2014   2,522
1985   3,047    2000   4,540    2015   2,497
1986   3,087    2001   3,969    2016   2,419
1987   3,083    2002   3,595    2017   2,383
1988   3,219    2003   3,310    2018   2,349
1989   3,140    2004   3,378    2019   2,299

Notes: This table reports the number of firms with non-missing input features for all models from 1975 to 2019.

Empirical Results

Comparison of Forecast Accuracy. To evaluate the forecast accuracy of the different models, we compare the mean and median absolute forecast errors. We define the forecast error as the difference between the predicted and actual earnings deflated by the market value of equity at the end of three months after the fiscal year end. A larger absolute forecast error indicates less accurate earnings forecasts.9 Table 2 reports the time-series average of the out-of-sample annual mean and median absolute forecast errors of all models. The ML forecast turns out to be the most accurate forecast, with an average mean absolute forecast error of 0.0687 and an average median absolute forecast error of 0.0291. The benchmark RW model, which prior literature shows to be very difficult to outperform, has an average mean (median) absolute forecast error of 0.0764 (0.0309), approximately 11.20% (6.12%) higher than that of the ML forecast. Consistent with the literature, the extant models are not reliably more accurate than the naïve RW model. More importantly, all extant forecasts are less accurate than the ML forecasts, with mean (median) absolute forecast errors approximately 7.79% to 26.53% (5.87% to 19.45%) higher than that of the ML forecast. Panels B and C of Table 2 present the results for two- and three-year-ahead out-of-sample earnings forecasts, respectively. The results show that ML remains the most accurate forecast. For example, the mean (median) absolute forecast error of RW is approximately 13.06% (6.93%) and 12.54% (6.16%) higher than that of the ML model for two- and three-year-ahead forecasts, respectively.

We also examine whether the ML model predicts the rank and relative level of earnings better than the extant models. For rank prediction, we first rank both the actual earnings and earnings forecasts cross-sectionally and normalize the ranks to [0,1] with the following transformation: RANKnorm = (RANK − RANKmin)/(RANKmax − RANKmin). We then compute the mean absolute difference in the normalized ranks between the actual earnings and the earnings forecast. The results presented in the first five columns of Table 2 Panel D show that the ML forecast still has the lowest mean absolute error of 0.1007, which is about 5% to 14.5% lower than that of the extant model forecasts. For the accuracy of forecasting the relative level, we first remove from both the actual and forecasted earnings their respective cross-sectional median and then compute the mean squared error (MSE) of the median-adjusted numbers. The results presented in the last four columns of Table 2 Panel D show that the ML forecast again has the highest accuracy, with the MSE being about 16.5% to 56.9% lower than that of the extant models.

Cross-Sectional Analysis. The above results suggest that the ML models generate significantly more accurate earnings forecasts than the RW model. For firms with stable performance, historical earnings are quite good indicators for future earnings. The benefit of considering additional financial statement line items and more complex forms of relationships should be of higher importance for firms with more difficult-to-forecast earnings. Table 3 reports the percentage improvement in the forecast accuracy of the ML model relative to the benchmark RW model for subgroups partitioned along the following dimensions: Return on Assets (ROA) volatility, magnitude of accruals, R&D expense, and an indicator variable for loss firms.

The results in Panel A of Table 3 convey the following key messages: first, the ML forecast is significantly more accurate than the RW forecast across the board for all subgroups. Second, ML models lead to significantly greater accuracy improvement among firms with more difficult-to-forecast earnings. For example, relative to the RW model, the ML forecast leads to an accuracy improvement of 15.42%, 17.80%, 11.46%, and 12.99% among firms in the


highest quintiles of the ROA volatility, magnitude of total accruals, R&D expense, and loss makers, respectively. In Panel B of Table 3, we conduct a regression analysis of the difference in the forecast accuracy between RW and ML forecasts on the above determinants. Columns (1) through (5) report the univariate regression results, confirming the results in Panel A that ML brings greater improvement in accuracy among firms with higher earnings volatility, a larger magnitude of accruals, more R&D expense, and loss makers. The multivariate regression results in column (6) are largely consistent, except that the

Table 2. Comparison of Forecast Accuracy


A: Accuracy of one-year-ahead earnings forecasts (N = 134,154 firm-years)
Mean Median
Comparison with ML Comparison with ML
Average*100 DIFF t stat. %DIFF Average*100 DIFF t stat. %DIFF

ML 6.87 2.91
RW 7.64 0.77 7.85 11.20% 3.09 0.18 5.73 6.12%
AR 7.55 0.68 8.96 9.93% 3.08 0.17 6.16 5.87%
HVZ 7.43 0.55 8.31 8.07% 3.11 0.20 7.95 6.93%
EP 7.42 0.55 8.06 8.03% 3.13 0.22 7.89 7.63%
RI 7.41 0.54 7.61 7.79% 3.11 0.20 7.82 6.90%
SO 8.70 1.82 12.72 26.53% 3.47 0.57 14.81 19.45%

B: Accuracy of two-year-ahead earnings forecasts (N = 123,576 firm-years)


Mean Median
Comparison with ML Comparison with ML
Average*100 DIFF t stat. %DIFF Average*100 DIFF t stat. %DIFF

ML 9.09 4.42
RW 10.28 1.19 8.07 13.06% 4.73 0.31 5.25 6.93%
AR 10.18 1.09 9.50 11.99% 4.70 0.28 9.76 6.41%
HVZ 9.71 0.62 7.71 6.84% 4.62 0.19 6.68 4.41%
EP 9.64 0.55 6.03 6.06% 4.66 0.23 6.56 5.31%
RI 9.56 0.47 5.29 5.22% 4.60 0.18 6.01 4.14%
SO 10.31 1.22 11.87 13.42% 4.91 0.49 12.80 11.01%

C: Accuracy of three-year-ahead earnings forecasts (N = 113,601 firm-years)


Mean Median
Comparison with ML Comparison with ML
Average*100 DIFF t stat. %DIFF Average*100 DIFF t stat. %DIFF

ML 10.88 5.58
RW 12.25 1.37 7.28 12.54% 5.92 0.34 3.38 6.16%
AR 12.27 1.38 9.86 12.69% 5.93 0.35 8.02 6.28%
HVZ 11.42 0.53 6.96 4.88% 5.73 0.15 4.07 2.64%
EP 11.38 0.50 5.04 4.56% 5.79 0.21 5.03 3.73%
RI 11.21 0.33 3.73 3.03% 5.70 0.12 3.37 2.20%
SO 12.03 1.14 9.62 10.50% 6.11 0.53 11.41 9.47%

continued


Table 2. (continued)
D: Alternative measures of forecast accuracy (N = 134,154 firm-years)
MAE of Ranks MSE of De-medianed Forecasts
Comparison with ML Comparison with ML
Average*100 DIFF t stat. %DIFF Average*100 DIFF t stat. %DIFF

ML 10.07 1.76
RW 10.58 0.51 10.92 5.03% 2.21 0.45 3.52 25.67%
AR 10.63 0.56 12.46 5.55% 2.13 0.37 3.52 20.87%
HVZ 10.71 0.63 15.04 6.30% 2.06 0.29 3.32 16.66%
EP 10.58 0.51 10.95 5.04% 2.07 0.31 3.72 17.51%
RI 10.63 0.56 12.26 5.53% 2.05 0.29 3.34 16.54%
SO 11.54 1.47 11.79 14.54% 2.77 1.00 8.77 56.90%

Notes: This table compares the accuracy between the machine learning (ML) forecast and the extant models over the sample
period of 1975–2019. Panels A, B, and C report the time-series average of the mean and median absolute forecast errors of one-,
two-, and three-year-ahead earnings forecasts, respectively. The absolute forecast error is calculated as the absolute value of the
difference between the actual future earnings and the earnings forecasts, scaled by the market equity at the end of three months
after the end of the last fiscal year. Panel D reports alternative measures of forecast accuracy, i.e., the time-series average of the
mean absolute errors of the scaled rank of forecasts difference and the mean squared errors of the de-medianed forecasts differ-
ence. DIFF is the time-series average of the difference, calculated as the mean (median) absolute forecast error of each model
minus that of the ML model. The t statistic of DIFF time-series is reported accordingly. The percentage difference (%DIFF) is DIFF
divided by the time-series average of the annual mean (median) absolute forecast error of the ML model.
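As a concrete illustration of how statistics like those in Table 2 are assembled, the sketch below computes annual mean absolute forecast errors scaled by market equity, the annual DIFF series, and its time-series t statistic. This is not the authors' code: the column names (`actual`, `forecast_ml`, `forecast_rw`, `mktcap`) and the toy panel are assumptions, and real inputs would come from Compustat/CRSP.

```python
import numpy as np
import pandas as pd

def annual_accuracy_gap(df):
    """Time-series average of annual mean absolute forecast errors (MAFE),
    with errors scaled by market equity, mirroring the DIFF statistic of Table 2."""
    d = df.copy()
    d["afe_ml"] = (d["actual"] - d["forecast_ml"]).abs() / d["mktcap"]
    d["afe_rw"] = (d["actual"] - d["forecast_rw"]).abs() / d["mktcap"]
    annual = d.groupby("year")[["afe_ml", "afe_rw"]].mean()
    diff = annual["afe_rw"] - annual["afe_ml"]      # one DIFF observation per year
    t_stat = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))
    return annual.mean(), diff.mean(), t_stat

# toy panel: by construction the "ML" forecast is closer to actuals than "RW"
rng = np.random.default_rng(0)
actual = rng.normal(1.0, 0.3, 200)
toy = pd.DataFrame({
    "year": np.repeat([2001, 2002, 2003, 2004], 50),
    "actual": actual,
    "forecast_ml": actual + rng.normal(0, 0.05, 200),
    "forecast_rw": actual + rng.normal(0, 0.15, 200),
    "mktcap": 10.0,
})
avg_mafe, mean_diff, t_stat = annual_accuracy_gap(toy)
```

The key design point is that the t statistic is computed on the time series of annual differences, not on the pooled firm-year observations, which guards against cross-sectional dependence within a year.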

Table 3. Cross-Sectional Analysis of Improvement in Forecast Accuracy


A: The percentage improvement in accuracy of the ML forecast relative to the RW forecasts, i.e., (|RW forecast errors| −
|ML forecast errors|)/|RW forecast errors| *100% (N = 129,310, with an average of 2,874 firms per year over the 45-
year sample period from 1975 to 2019)
Partitioning variable Low 2 3 4 High

ROA Volatility 4.86 5.77 5.88 9.15 15.42


|Total Accruals|/Total Assets 3.51 4.45 6.37 10.12 17.80
|Working Capital Accruals|/Total Assets 6.67 7.67 7.81 8.68 13.81
MISSING Low 2 3 High
R&D Expense/Total Assets 9.11 8.54 10.24 10.78 11.46
Non-Loss Loss
Loss 6.64 12.99
B: Multivariate regression analysis of the improvement in the accuracy of ML forecasts relative to the RW forecasts (N =
129,310)
Y = |RW forecast errors| − |ML forecast errors|
(1) (2) (3) (4) (5) (6)

ROA Volatility 0.050 0.006


(6.25) (1.45)
|Total Accruals|/Total Assets 0.055 0.038
(7.35) (6.44)
|Working Capital Accruals|/Total Assets 0.095 0.053
(7.92) (7.32)
R&D Expense/Total Assets 0.012 −0.037
(2.73) (−4.74)
Loss 0.020 0.016
(6.39) (5.29)

Volume 80, Number 2 81


Financial Analysts Journal | A Publication of CFA Institute

Table 3. (continued)

Const 0.005 0.002 0.001 0.007 0.003 −0.003


(6.90) (3.51) (1.01) (7.36) (5.70) (−3.50)
# years 45 45 45 45 45 45
Avg. # firms per year 2,874 2,874 2,874 2,874 2,874 2,874
Avg adj. R2 0.89% 2.07% 1.83% 0.13% 2.81% 5.12%

Notes: This table presents a cross-sectional analysis of the improvement in the forecast accuracy of the machine learning (ML)
model, relative to that of the random walk (RW) model. The percentage improvement in Panel A is defined as the time-series aver-
age of the annual difference in the mean absolute forecast errors (MAFE) between the ML and RW models, divided by the MAFE
of the RW model. A positive number indicates improved accuracy of the ML model. In Panel A, we sort all firms into quintiles for
each year based on the magnitude of the partition variable (ROA volatility, absolute value of total accruals divided by total assets,
and absolute value of working capital accruals divided by total assets, respectively). For R&D expense, we classify all firms with
missing R&D expense into a separate group and sort the remaining firms into quartiles for each year based on their R&D expense
divided by total assets. We also divide all firms into two groups for each year, depending on whether their earnings are negative.
In Panel B, we regress the difference in the forecast accuracy on the determinants each year and report the mean coefficients of
the annual regressions, as well as the corresponding Fama–MacBeth t statistics.
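The annual regressions in Panel B follow the standard Fama–MacBeth procedure: a cross-sectional OLS each year, then inference from the time series of coefficients. The sketch below is a minimal, hedged illustration on simulated data in which the accuracy improvement loads on volatility and accruals by construction; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

def fama_macbeth(df, ycol, xcols):
    """Fama-MacBeth: cross-sectional OLS each year, then average the annual
    coefficients; t-stats come from the time series of those coefficients."""
    coefs = []
    for _, g in df.groupby("year"):
        X = np.column_stack([np.ones(len(g))] + [g[c].to_numpy() for c in xcols])
        beta, *_ = np.linalg.lstsq(X, g[ycol].to_numpy(), rcond=None)
        coefs.append(beta)
    coefs = np.array(coefs)                        # shape: (years, 1 + len(xcols))
    mean = coefs.mean(axis=0)
    se = coefs.std(axis=0, ddof=1) / np.sqrt(len(coefs))
    return mean, mean / se

rng = np.random.default_rng(1)
n = 300
vol = rng.uniform(0, 1, n)
accr = rng.uniform(0, 1, n)
panel = pd.DataFrame({
    "year": np.repeat(np.arange(1975, 1985), 30),
    # the improvement loads on both determinants by construction
    "improve": 0.05 * vol + 0.04 * accr + rng.normal(0, 0.01, n),
    "vol": vol,
    "accr": accr,
})
mean_coef, t_stats = fama_macbeth(panel, "improve", ["vol", "accr"])
```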

ROA volatility becomes statistically insignificant and the coefficient on R&D expense flips its sign in the presence of other determinants.

Information Content Analysis. Forecast accuracy is not the sole determinant of the decision usefulness of earnings forecasts. For example, although the RW forecast is more accurate than other forecasts, it provides no information with respect to future earnings changes. In this section, we evaluate the information content of various models by investigating their (out-of-sample) predictive power with respect to the future earnings change, ECH. ECH is computed as the difference between earnings in year t + 1 and those in year t, scaled by market capitalization at the end of the third month after the end of fiscal year t. We calculate the forecasted earnings change, or FECH, as the predicted earnings for year t + 1 minus the actual earnings for year t, scaled by market capitalization at the end of the third month after the end of fiscal year t.

We first compare the mean correlation coefficients between the ECH and FECH calculated from different models. The left two panels of Table 4 show that the Pearson (Spearman) correlation coefficients between FECH and ECH range from 0.199 to 0.321 (0.117 to 0.179) for the extant models, all lower than the correlations between the ECH and FECH of the ML forecast, 0.413 (0.300). We then estimate the univariate Fama–MacBeth regression of ECH on the FECH of different models. To facilitate the comparison of the coefficients, we follow the literature and standardize FECH so that it has a zero mean and unit variance each year. The three columns in the middle panel of Table 4 show that the coefficients on FECH for the extant models range from 0.0304 to 0.0480, explaining between 8.07% and 12.22% of the cross-sectional variation in realized earnings changes. In contrast, the FECH based on the ML forecast has a regression coefficient of 0.0606 and an explanatory power of 18.61%. Next, we estimate a multivariate regression of ECH on the FECH based on the ML model, controlling for the FECH of all extant models. The right panel of Table 4 shows that the coefficient on FECHML is significantly positive, with a t statistic of 17.98. In contrast, most of the FECH coefficients based on the extant models become statistically insignificant (or have the wrong sign), except for that of the SO model, which also uses forward-looking nonfinancial statement predictors such as the stock price and the book-to-market ratio.

Economic Significance Analysis. The above results suggest that ML technology helps generate more accurate earnings forecasts that contain (statistically) significant incremental information beyond the extant models. However, it is unclear whether such new information is economically significant. To shed light on the economic significance of the results, we test


Table 4. Information Content Analysis


Univariate regression Multivariate regression
Correlation with ECH Depvar: ECH Depvar: ECH
Pearson Spearman Coeff. t stat. Avg. R2 Coeff. t stat.

FECHML 0.413 0.300 0.0606 12.01 18.61% 0.0589 17.98


FECHAR 0.199 0.117 0.0304 4.81 8.07% 0.0078 1.59
FECHHVZ 0.283 0.179 0.0422 8.93 9.98% −0.009 −2.48
FECHEP 0.321 0.154 0.0480 9.96 12.22% 0.0067 0.52
FECHRI 0.313 0.148 0.0467 9.95 11.68% −0.0159 −1.54
FECHSO 0.291 0.153 0.0440 10.47 9.66% 0.0132 4.46
Avg. R2 20.87%

Notes: This table performs the information content analysis of the machine learning (ML) forecast against the extant models. The
left panel reports the average annual cross-sectional Pearson (Spearman) correlation coefficients between the forecasted earnings
changes calculated using various models and the actual earnings changes for a total of 134,154 firm-year observations, with an
average of 2,981 firms per year over the 45-year sample period from 1975 to 2019. The middle panel reports the univariate
Fama–MacBeth regression results. In the regression, all forecasted earnings changes are standardized to have a zero mean and
unit variance each year. The right panel reports the multivariate Fama–MacBeth regression results. Specifically, we regress earn-
ings changes (ECH) on the forecasted earnings changes (FECH) of the ML forecasts and control for all earnings changes predicted
using the extant models. All independent variables are standardized to have a zero mean and unit variance each year. All earnings
changes are scaled by the market equity at the end of three months after the end of the last fiscal year. The table presents the
average coefficients, along with the Fama–MacBeth t statistics and the average adjusted R2. The subscripts are omitted for
brevity.
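The within-year standardization and univariate regressions behind the middle panel of Table 4 can be sketched as follows. The data are simulated; `fech_noise` stands in for an uninformative benchmark and is not one of the paper's models.

```python
import numpy as np
import pandas as pd

def info_content(df, fcol):
    """Yearly cross-sectional regression of realized earnings changes (ECH) on a
    forecasted change (FECH) standardized to zero mean / unit variance within
    the year; returns the average slope and average R^2 across years."""
    slopes, r2s = [], []
    for _, g in df.groupby("year"):
        z = ((g[fcol] - g[fcol].mean()) / g[fcol].std(ddof=0)).to_numpy()
        y = g["ech"].to_numpy()
        X = np.column_stack([np.ones(len(g)), z])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        slopes.append(beta[1])
        r2s.append(1.0 - resid.var() / y.var())
    return float(np.mean(slopes)), float(np.mean(r2s))

rng = np.random.default_rng(2)
fech_ml = rng.normal(0, 1, 300)                    # informative forecast revision
fech_noise = rng.normal(0, 1, 300)                 # uninformative benchmark
df = pd.DataFrame({
    "year": np.repeat([1, 2, 3], 100),
    "ech": 0.06 * fech_ml + rng.normal(0, 0.1, 300),
    "fech_ml": fech_ml,
    "fech_noise": fech_noise,
})
slope_ml, r2_ml = info_content(df, "fech_ml")
slope_noise, r2_noise = info_content(df, "fech_noise")
```

Because FECH is standardized each year, the slopes are directly comparable across models, which is the point of the standardization step in the text.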

whether the new information in the ML forecasts has (statistically and economically) significant predictive power with respect to future stock returns.10 To capture the new information uncovered by ML models, we orthogonalize the ML-based forecasts against the forecasts generated using the RW and extant models. Specifically, we run an annual cross-sectional regression of the ML forecasts on the RW forecasts and the forecasts of the five extant models each year and use the residual to measure the new information uncovered by the ML models. Then, we estimate the following model to test whether the residual forecasts predict future stock returns:

EXRET12M_{i,t+1} = β_0 + β_1 ML_RESD_{i,t} + Σ_{s=1}^{S} γ_s X_{i,s,t} + IndustryFE + ε_{i,t+1}   (6)

where EXRET12M_{i,t+1} is the one-year-ahead cumulative excess return starting from the fourth month of fiscal year t + 1 for firm i. ML_RESD_{i,t} is the residual from the regression that orthogonalizes the ML forecast against the RW model and the five extant models. X_{i,s,t} is the end-of-year-t value of firm i's control characteristics. We follow Bartram and Grinblatt (2018) to select the list of control variables and also control for industry fixed effects. Furthermore, we add ROE (return on equity) and INV (growth rate of total assets) in light of Fama and French (2015). The timeline for the computation of EXRET12M_{i,t+1} and ML_RESD_{i,t} is provided in Figure 1, and the detailed variable definitions are provided in Panel C of Appendix 1. If the new information component has already been fully priced (when the year t financial statements are announced), the coefficient on ML_RESD_{i,t}, i.e., β_1, would be statistically insignificant.

Table 5 presents the Fama–MacBeth regression results of model (6). Column (1) reports the regression results using the same set of control variables as in Bartram and Grinblatt (2018). The results show that the new information component of the ML forecast, ML_RESD, is significantly associated with future 12-month excess returns, even after controlling for various return-predictive factors. In column (2), we further control for ROE and INV (Fama and French 2015). The results remain robust, showing that the new information component still exhibits a positive and statistically significant association with future stock returns.11

We also conduct a portfolio analysis. Specifically, at the beginning of each month, we estimate the new information component as the residual from the regression of the ML forecasts on the contemporaneous forecasts generated from the RW model and the five extant




Figure 1. Timeline of Return Prediction Analysis

Notes: This figure presents the timeline for the variables used in the future return prediction analysis. Assume that fiscal year t + 1 (2011) of firm i ends on 12/31/2011. In the future return prediction analysis, we regress EXRET12M_{i,t+1}, which is the excess cumulative return over the period of 04/01/2011 to 03/31/2012, on ML_RESD_{i,t}, which is a proxy for the new information component of a machine learning forecast, E_t^MODEL[X_{i,t+1}], estimated on 03/31/2011.

models. We then sort all stocks into quintiles based on the residuals for each three-digit SIC industry. We construct a hedge portfolio that takes long positions in quintiles with the most favorable new information and short positions in quintiles with the least favorable new information. Table 6 reports the mean monthly return, Sharpe ratio (annualized), CAPM alpha, Fama–French three-factor alpha, Carhart four-factor alpha, Fama–French five-factor alpha (i.e., the three-factor model plus the Conservative Minus Aggressive (CMA) and Robust Minus Weak (RMW) factors), and the alpha after controlling for all factors in the Fama and French dataset, for the five new information component quintiles, as well as the hedge portfolios that take long positions in the top quintiles and short positions in the bottom quintiles.

Panel A of Table 6 reports the results for the equally weighted portfolios, showing that the mean monthly excess returns increase monotonically from 0.50% for the lowest quintile to 1.23% for the highest quintile. The hedge portfolio generates a monthly mean return of 0.73%. The Sharpe ratio also increases monotonically from 0.25 to 0.69 for the five quintiles. The hedge portfolio generates a Sharpe ratio of 1.29. Furthermore, we find that the risk-adjusted returns (or alphas) increase consistently with the quintile rank. Finally, even after controlling for all factors in the Fama and French dataset, the hedge portfolio still earns a monthly alpha of 51 bps.

The results for the value-weighted portfolios appearing in Panel B of Table 6 are slightly weaker but still significant both statistically and economically. The hedge portfolio generates a monthly mean return of 0.48%, with a Sharpe ratio of 0.60. Furthermore, all Fama–French factor alphas are still significant for the hedge portfolio. For example, the hedge portfolio yields a Fama–French five-factor alpha of approximately 45 bps per month. In Figure 2, we plot the cumulative log returns (value-weighted) for the five quintiles and the hedge portfolio. The plot shows that the cumulative returns increase monotonically with the quintile rank, and the hedge portfolio returns are reasonably consistent over time.12

Additional Analyses

The Importance of Nonlinear Effects. We conduct several additional analyses to better understand the underlying reasons for the superior performance of the ML forecasts. We first plot the feature (Gini) importance charts of the RF model to check whether it uses economically sensible features to generate predictions. Figure 3 shows that past earnings and operating cash flows are important predictors of future earnings, ranked first and third, respectively. Interestingly, total income tax and its first-order difference are the second and fourth most important features, respectively. These findings are consistent with recent literature on the important role of tax income or expenses in capturing the quality of earnings and predicting future fundamentals and stock returns (e.g., Lev and Nissim 2004; Hanlon 2005; Hanlon, Laplante, and Shevlin 2005; Thomas and Zhang 2011, 2014). Panels A through C of Figure 4 present the accumulated local effects (ALE) plots (Apley and Zhu 2020) of the top five most important features of the RF models13 for 1975, 1995, and 2015, respectively. The figures show obvious nonlinear relationships between these input features and future earnings.14
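The industry-neutral quintile sort and hedge portfolio described above can be sketched as below. The data are simulated, with returns loading on the residual signal by construction, so the positive hedge return is built in rather than an empirical claim; the grouping variable is a stand-in for three-digit SIC codes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 1000
df = pd.DataFrame({
    "industry": rng.integers(0, 10, n),            # stand-in for 3-digit SIC codes
    "resid": rng.normal(0, 1, n),                  # new-information component
})
# simulated future returns load positively on the signal by construction
df["ret"] = 0.01 * df["resid"] + rng.normal(0, 0.05, n)

# quintile rank of the residual within each industry, as in the text
df["q"] = df.groupby("industry")["resid"].transform(
    lambda s: pd.qcut(s, 5, labels=False)
)
mean_by_q = df.groupby("q")["ret"].mean()
hedge = mean_by_q[4] - mean_by_q[0]                # long top, short bottom quintile
```

Sorting within industry rather than unconditionally is what makes the resulting hedge portfolio approximately industry-neutral.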


In order to provide further empirical evidence on the importance of nonlinear and interaction effects, we develop a linear forecast (LR) using the same set of input features and combine the out-of-sample predictions of the OLS, LASSO, and Ridge algorithms. Untabulated results suggest that on a standalone basis, the LR forecast is statistically more accurate and informative (about future ECH) than the extant models but is significantly less accurate and informative than the ML forecast. Furthermore, Panel A of Table 7 shows that in the multivariate regression of ECH on FECH based on both the LR and ML forecasts, both the coefficients and the t statistics are greater for the ML forecast than for the linear forecast. Finally, to better understand the collective importance of nonlinear effects in the ML forecast, we regress the ML forecast on the 60 inputs using OLS to filter out the linear effects. The fitted value of the regression (FITTED) captures the linear effect of the predictors, while the residuals capture the nonlinear and interaction effects in the ML forecasts (denoted as NONLR). We then decompose FECHML into FECHFITTED (= FITTED − current earnings) and NONLR and examine their predictive information content. The results reported in Panel B of Table 7 show that NONLR has significant predictive power with respect to ECH in the presence of FECHFITTED and the FECHs of the extant models, again suggesting that the ability to accommodate nonlinearity and interaction effects allows the ML forecast to uncover a significant amount of new information.

Comparison between ML Forecasts and Analyst Consensus Forecasts. We compare the ML forecasts to the consensus analyst forecasts (Analyst) issued around the same time, that is, the third month after the fiscal year end. Because Bradshaw et al. (2012) find that the superiority of analyst forecasts over RW varies over forecast horizons, we conduct a comparison for horizons ranging from one to three years. Panel A of Table 8 shows that the mean absolute forecast errors of the ML forecast are significantly lower than those of the analyst consensus forecasts for all three forecast horizons (0.0541, 0.0679, and 0.0776 for ML vs. 0.0588, 0.0742, and 0.0922 for analyst forecasts). The median results are largely similar, except for the one-year horizon, where the ML forecast has a slightly higher median absolute forecast error (0.0219 vs. 0.0202). Panel B of Table 8 further shows that the ML forecast not only has slightly greater relative

Table 5. Regression Analysis of Future 12-Month Cumulative Excess Returns on the New Information Component of Machine Learning Forecasts

Dep = EXRET12M
(1) (2)
ML_RESD 0.807 0.789
(5.55) (5.28)
Beta −0.005 −0.006
(−1.08) (−1.28)
SIZE 0.063 0.061
(5.87) (5.66)
BM −0.044 −0.041
(−2.35) (−2.29)
MOM −0.246 −0.216
(−5.09) (−4.57)
ACC 0.006 0.007
(0.79) (0.94)
ST_Reversal −0.064 −0.067
(−1.37) (−1.43)
LT_Reversal −0.013 −0.011
(−2.58) (−2.16)
SUE 0.113 0.117
(2.62) (2.64)
Gross Profitability 0.130 0.113
(7.55) (6.41)
Earnings yield 0.208 0.174
(5.39) (5.13)
ROE 0.042
(2.85)
INV −0.042
(−3.07)
Const YES YES
Industry FE YES YES
Number of years 44 44
Avg # firms per year 2,009 2,009
Avg. R2 5.59% 4.32%

Notes: This table reports the Fama–MacBeth regression results that regress future one-year-ahead cumulative excess returns (EXRET12M) on the new information component of the machine learning (ML) forecast (ML_RESD) and various known return-predicting factors and industry fixed effects (three-digit SIC). EXRET12M is the cumulative stock return starting from the fourth month of the next fiscal year minus the cumulative risk-free rate (from the Fama and French database) over the same period. ML_RESD is the residual from the annual cross-sectional regression of the ML forecast on forecasts of the random walk (RW) model and the five extant models. The definitions of the control variables are given in Appendix 1. All independent variables are winsorized at 1% and 99% each year. The table presents the average coefficients with the corresponding Fama–MacBeth t statistics, along with the average adjusted R-squares. The subscripts are omitted for brevity.
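The orthogonalization that defines ML_RESD is a plain cross-sectional OLS residual. Below is a minimal sketch on simulated data; in the paper this is run separately each year, the regressors are the RW and five extant-model forecasts, and the variable names here are illustrative stand-ins.

```python
import numpy as np

def orthogonalize(ml, benchmarks):
    """Residual from an OLS of the ML forecast on benchmark forecasts: the
    component of the ML forecast the benchmarks cannot explain (ML_RESD)."""
    X = np.column_stack([np.ones(len(ml))] + list(benchmarks))
    beta, *_ = np.linalg.lstsq(X, ml, rcond=None)
    return ml - X @ beta

rng = np.random.default_rng(4)
rw = rng.normal(0, 1, 500)                         # stand-in for the RW forecast
hvz = 0.8 * rw + rng.normal(0, 0.3, 500)           # stand-in for an extant model
new_info = rng.normal(0, 0.2, 500)                 # signal unique to the ML model
ml = 0.6 * rw + 0.3 * hvz + new_info
ml_resd = orthogonalize(ml, [rw, hvz])
```

By construction the residual is uncorrelated with every benchmark forecast, so any return predictability it carries cannot be attributed to information already in the extant models.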




Table 6. Portfolio Analysis of the New Information Component of Machine Learning Forecasts
A: Monthly return/alpha of the equal-weighted portfolios (N = 1,630,370, with an average of 3,088 stocks per month in
the 528-month sample period)
Lowest 2 3 4 Highest Hedge

Mean Return 0.50 0.70 0.84 1.01 1.23 0.73


(1.64) (2.64) (3.36) (4.19) (4.56) (8.53)
Sharpe Ratio 0.25 0.40 0.51 0.63 0.69 1.29
CAPM Alpha −0.33 −0.09 0.06 0.25 0.44 0.77
(−1.73) (−0.67) (0.52) (2.34) (2.99) (9.14)
FF3 Alpha −0.46 −0.19 −0.02 0.16 0.28 0.74
(−3.2) (−2.34) (−0.4) (2.85) (2.92) (8.97)
Carhart4 Alpha −0.18 −0.03 0.10 0.27 0.47 0.65
(−1.13) (−0.34) (1.73) (4.23) (4.63) (7.05)
FF5 Alpha −0.18 −0.04 0.04 0.18 0.35 0.53
(−1.11) (−0.39) (0.59) (2.55) (3) (6.35)
Alpha (all factors) −0.06 0.02 0.10 0.23 0.45 0.51
(−0.38) (0.2) (1.74) (3.75) (4.33) (5.89)
B: Monthly return/alpha of the value-weighted portfolios (N = 1,630,370, with an average of 3,088 stocks per month in
the 528-month sample period)
Lowest 2 3 4 Highest Hedge

Mean Return 0.48 0.55 0.69 0.77 0.96 0.48


(1.98) (2.68) (3.3) (3.81) (4.68) (3.98)
Sharpe Ratio 0.30 0.40 0.50 0.57 0.71 0.60
CAPM Alpha −0.30 −0.15 −0.03 0.07 0.28 0.59
(−3.09) (−2.9) (−0.52) (1.81) (3.9) (4.79)
FF3 Alpha −0.37 −0.15 0.02 0.10 0.28 0.65
(−3.88) (−3.13) (0.47) (2.65) (4.11) (5.45)
Carhart4 Alpha −0.20 −0.13 0.03 0.12 0.30 0.50
(−2.2) (−2.85) (0.8) (3.03) (4.14) (4.15)
FF5 Alpha −0.26 −0.14 0.04 0.06 0.18 0.45
(−2.63) (−2.98) (0.85) (1.3) (2.68) (3.78)
Alpha (all factors) −0.14 −0.12 0.06 0.06 0.20 0.34
(−1.48) (−2.69) (1.25) (1.38) (2.99) (2.88)

Notes: This table summarizes the monthly return/alpha to quintiles sorted based on the new information uncovered using the
machine learning (ML) models. At the beginning of each month, we estimate the new information component as the residual from
the cross-sectional regression of ML forecasts on the forecasts generated from the random walk (RW) model and the five extant
models. We sort the stocks into quintiles based on the resulting residuals for each three-digit SIC industry and report the return
performance of the hedge portfolio, which takes long positions in quintiles with the most favorable new information and short
positions in quintiles with the least favorable new information. Panel A reports the results for the equal-weighted portfolios. Panel
B reports the results for the value-weighted hedge portfolios. We report the mean monthly returns and risk-adjusted returns
(alpha) to the portfolios with the corresponding t statistics in the parentheses.
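The alphas in Table 6 come from time-series regressions of portfolio returns on factor returns. The sketch below estimates a CAPM alpha and its plain OLS t statistic on simulated data calibrated loosely to the 73 bps hedge return; the factor series and the standard-error treatment are simplifications, not the paper's procedure.

```python
import numpy as np

def capm_alpha(port_ret, mkt_ret):
    """Time-series OLS of portfolio excess returns on the market excess return;
    the intercept is the CAPM alpha, reported with a plain OLS t-statistic."""
    X = np.column_stack([np.ones(len(mkt_ret)), mkt_ret])
    beta, *_ = np.linalg.lstsq(X, port_ret, rcond=None)
    resid = port_ret - X @ beta
    s2 = resid @ resid / (len(port_ret) - 2)       # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)              # OLS coefficient covariance
    return beta[0], beta[0] / np.sqrt(cov[0, 0])

rng = np.random.default_rng(5)
mkt = rng.normal(0.005, 0.04, 528)                 # 528 months, as in Table 6
hedge = 0.0073 + 0.05 * mkt + rng.normal(0, 0.01, 528)  # ~73 bps/month by design
alpha, t_alpha = capm_alpha(hedge, mkt)
```

Multifactor alphas (FF3, Carhart, FF5) follow the same template with additional factor columns stacked into X.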

information content than analyst forecasts (see models 1 and 2) but also has significant incremental information beyond them (see models 3 and 4 of the panel). These results suggest that (i) analysts do not fully incorporate financial statement information into their forecasts and (ii) investors can benefit from the ML forecasts even if they already have access to analysts’ forecasts, as the ML forecasts contain a considerable amount of incremental information beyond analyst forecasts.15

Transaction Costs and Other Implementation Considerations. In this section, we discuss the effect of transaction costs and other implementation issues. Our investment strategy is based on the ML earnings forecasts derived from annual financial statement data and has relatively low turnover. Untabulated analyses show that for the value-weighted portfolio, the monthly portfolio turn-


Figure 2. Cumulative Value-Weighted Returns to Quintiles of the New Information Component of the Machine Learning (ML) Forecasts

Notes: This figure plots the cumulative (log) returns to the value-weighted quintile portfolios sorted on the new information compo-
nent of the ML forecasts (i.e., regression residual of the ML forecast on the extant forecasts) and the hedge portfolio, which is the
difference between the two extreme quintiles. The left vertical axis is for the cumulative log returns to the five quintile portfolios
while the right vertical axis is for the cumulative log return to the hedge portfolio (the solid red line).

Figure 3. Top 10 Influential Features of the Random Forest Model

Notes: This figure plots the average feature importance extracted from the fitted models of the random forest (RF) that we train with data from the 1975–2019 period. The higher the importance score, the more important the feature is. To facilitate representation, we set the maximum of the y axis at 0.05; the average feature importance value for earnings (“E”) is 0.7932.
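The Gini (impurity-based) importances plotted in Figure 3 are exposed directly by standard random forest implementations. The following is an illustrative sketch, not the paper's code: the three simulated series merely stand in for the financial statement inputs, and the loadings are chosen so that current earnings dominate by construction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(6)
n = 2000
earnings = rng.normal(0, 1, n)                     # stand-ins for statement inputs
cash_flow = rng.normal(0, 1, n)
noise_feat = rng.normal(0, 1, n)
# next-year earnings load mainly on current earnings, then on cash flow
future_e = 0.7 * earnings + 0.2 * cash_flow + rng.normal(0, 0.2, n)

X = np.column_stack([earnings, cash_flow, noise_feat])
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, future_e)
imp = rf.feature_importances_                      # impurity (Gini) importance
ranking = np.argsort(imp)[::-1]                    # most important feature first
```

The importances sum to one across features, so they can be read as relative shares of the impurity reduction achieved by the forest.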




Figure 4. Accumulated Local Effects (ALE) of the Top Five Most Influential Features

Notes: (A) Accumulated local effects (ALE) plots of the random forest (RF) model for 1975. (B) Accumulated local effects (ALE) plots
of the random forest (RF) model for 1995. (C) Accumulated local effects (ALE) plots of the random forest (RF) model for 2015.
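First-order ALE curves like those in Figure 4 average local prediction differences within bins of one feature and then accumulate them. The sketch below is a simplified, equal-weight-centered implementation of the Apley and Zhu (2020) estimator applied to a toy model; it is not the paper's code, and production work would typically use a dedicated library.

```python
import numpy as np

def ale_1d(predict, X, j, bins=10):
    """Simplified first-order ALE of feature j: average the local prediction
    differences within quantile bins of x_j, accumulate, then center."""
    x = X[:, j]
    edges = np.quantile(x, np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, bins - 1)
    effects = np.zeros(bins)
    for k in range(bins):
        mask = idx == k
        if not mask.any():
            continue
        lo, hi = X[mask].copy(), X[mask].copy()
        lo[:, j], hi[:, j] = edges[k], edges[k + 1]   # move x_j across the bin
        effects[k] = (predict(hi) - predict(lo)).mean()
    ale = np.cumsum(effects)
    return edges, ale - ale.mean()                 # equal-weight centering

rng = np.random.default_rng(7)
X = rng.normal(0, 1, (1000, 3))
model = lambda A: np.maximum(A[:, 0], 0.0)         # kinked (nonlinear) in x_0
edges, ale = ale_1d(model, X, j=0)
```

Because ALE perturbs each observation only within its own bin, it avoids the extrapolation into unrealistic feature combinations that partial-dependence plots can suffer from when inputs are correlated.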


Table 7. The Importance of Nonlinear and Interaction Effects


A. Information content comparison between the ML and linear forecasts (N = 134,154, with an average of 2,982
observation per year over the 45-year sample period from 1975 to 2019)
Model: ECH = β0+β1FECHML+β2FECHLR+β3FECHAR+β4FECHHVZ+β5FECHEP+ β6FECHRI+β7FECHSO+ε
β0 β1 β2 β3 β4 β5 β6 β7 Avg. R2

Model 1 0.0016 0.0515 0.011 0.1913


(0.67) (10.91) (3.62)
Model 2 0.0016 0.0466 0.0163 0.0071 −0.0097 0.0081 −0.02 0.0149 0.2149
(0.67) (12.19) (5.11) (1.48) (−2.66) (0.62) (−1.89) (4.97)
B. Information content analysis on the linear (FITTED) vs. nonlinear (NONLR) components of the ML forecasts (N =
134,154, with an average of 2,982 observation per year over the 45-year sample period from 1975 to 2019)
Model: ECH = β0+β1NONLR + β2FECHFITTED+β3FECHAR+β4FECHHVZ+β5FECHEP+ β6FECHRI+β7FECHSO+ε
β0 β1 β2 β3 β4 β5 β6 β7 Avg. R2

Model 1 0.0016 0.0368 0.0451 0.1765


(0.67) (6.21) (9.29)
Model 2 0.0016 0.0844 0.0525 −0.018 −0.013 −0.0056 −0.0384 0.0153 0.2119
(0.67) (11.95) (13.35) (−3.7) (−3.4) (−0.54) (−4.01) (5.08)

Notes: This table examines the importance of nonlinear and interaction effects in explaining the superior information content of the machine learning (ML) forecasts. Panel A presents the Fama–MacBeth regression of earnings changes (ECH) on the forecasted earnings changes (FECH) of both the linear and nonlinear composite forecasts, without (Model 1) and with (Model 2) controlling for the FECHs of the extant models. Panel B presents the Fama–MacBeth regression of ECH on the nonlinear component of the ML forecasts (i.e., the residual from the annual cross-sectional regression of the ML forecasts on the 60 input features) and the FECH of the fitted value from the regression, without (Model 1) and with (Model 2) controlling for the FECHs of the extant models. All independent variables are standardized to have a zero mean and unit variance each year. All earnings changes are scaled by the market equity at the end of three months after the end of the last fiscal year. The table presents the average coefficients, along with the corresponding Fama–MacBeth t statistics in brackets and the average adjusted R-squares. The subscripts are omitted for brevity.
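The FITTED/NONLR decomposition is a single OLS projection of the ML forecast on its own inputs. A minimal sketch with five simulated inputs standing in for the 60 features, and an interaction term playing the role of the nonlinear component:

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 1000, 5
X = rng.normal(0, 1, (n, p))                       # stand-ins for the 60 inputs
# an ML-style forecast with a linear part plus one interaction term
ml_forecast = X @ np.array([0.5, 0.3, 0.0, 0.0, 0.0]) + 0.4 * X[:, 0] * X[:, 1]

A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, ml_forecast, rcond=None)
fitted = A @ beta                                  # FITTED: linear effect of inputs
nonlr = ml_forecast - fitted                       # NONLR: nonlinear / interactions
```

The residual is orthogonal to every input by construction, so whatever it predicts about future earnings changes must come from the nonlinear and interaction structure rather than the linear effects of the predictors.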

over averages approximately 28.5%. Using an aggressive estimate of the round-trip cost of 50 bps, the return would be reduced by approximately 7 bps per month (= 50 bps / 2 × 28.5%).16 However, for retail investors who trade aggressively with market orders and buy (sell) stocks at the ask (bid) prices, the average monthly return to the value-weighted strategies would drop substantially to about 20 bps, with a Sharpe ratio of 0.25.

However, it is worth pointing out that simple decile-/quintile-ranked portfolios are far from an efficient way to implement a quantitative strategy. Sophisticated investors mostly utilize risk and transaction cost models together with portfolio optimizers to construct more (risk and t-cost) efficient portfolios, which tend to yield much better performance (e.g., Sivaramakrishnan, Brown, and Kasturi 2018). Furthermore, investors may also be able to improve their strategy performance by using more advanced ML technology, incorporating non-accounting information, and forecasting quarterly earnings, etc. We leave these further refinements to interested readers.

Conclusions

Our exploration of ML in the context of fundamental analysis, particularly in forecasting corporate earnings, yields enlightening results. ML models outstrip contemporary earnings prediction models from the literature in terms of forecast accuracy and the richness of information provided. This enhanced performance of ML models is primarily due to their ability to unearth new economically significant information from the publicly available financial statement data. They achieve this by identifying the key predictors and capturing complex nonlinear relationships that traditional models and methodologies might overlook or be unable to process effectively.




Table 8. Comparison between the Machine Learning Forecasts and Analyst Forecasts
A. Forecast accuracy comparison
Mean absolute forecast errors*100 Median absolute forecast errors*100
t+1 t+2 t+3 t+1 t+2 t+3

ML 5.41 6.79 7.76 2.19 3.14 3.68


Analyst 5.88 7.42 9.22 2.02 3.30 4.41
ML-Analyst −0.47 −0.63 −1.46 0.17 −0.16 −0.73
(t-stat) (−4.75) (−5.77) (−7.77) (5.46) (−2.99) (−5.74)
# years 35 34 33 35 34 33
Avg # firms per year 2,279 1,839 696 2,279 1,839 696
N 79,766 62,531 22,956 79,766 62,531 22,956
B. Information content comparison (N = 79,766, with an average of 2,279 observations per year in the 35-year period
from 1985 to 2019)
Model: ECH = β0+β1FECHML+β2FECHAnalyst+β3FECHAR+β4FECHHVZ+β5FECHEP+ β6FECHRI+β7FECHSO+ε
β0 β1 β2 β3 β4 β5 β6 β7 Avg. R2

Model 1 −0.0019 0.051 19.36%


(−0.80) (8.99)
Model 2 −0.0019 0.0492 18.92%
(−0.80) (7.53)
Model 3 −0.0019 0.031 0.0289 23.61%
(−0.80) (11.12) (5.14)
Model 4 −0.0019 0.0352 0.0307 0.018 −0.0135 0.0077 −0.0293 0.0124 26.85%
(−0.80) (13.78) (7.10) (2.67) (−2.83) (0.84) (−3.86) (5.94)

Notes: This table compares the composite forecasts based on the machine learning model (ML) and analyst consensus forecasts.
Panel A reports the time-series average of the mean and median absolute forecast errors of the one- to three-year-ahead earnings
forecasts for the ML model and the analyst consensus forecasts issued in the third month after the last fiscal year end. Panel B
reports the Fama–MacBeth regression results. All independent variables are standardized to have a zero mean and unit variance
each year. The panel presents the average coefficients, along with the Fama–MacBeth t statistics in brackets and the average
adjusted R-square.
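The incremental-information regressions in Panel B can be illustrated with a two-signal multiple regression: when each forecast carries information the other lacks, both coefficients survive jointly. The data below are simulated, and a full replication would run this cross-sectionally by year, Fama–MacBeth style, with standardized regressors.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 2000
common = rng.normal(0, 1, n)                       # information shared by both
fech_ml = common + rng.normal(0, 1, n)             # ML forecast: shared + unique
fech_an = common + rng.normal(0, 1, n)             # analyst: shared + unique
ech = 0.05 * fech_ml + 0.02 * fech_an + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), fech_ml, fech_an])
beta, *_ = np.linalg.lstsq(X, ech, rcond=None)
b_ml, b_an = beta[1], beta[2]                      # both survive jointly
```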

The new information that ML models bring to light is not merely statistically significant; it bears considerable economic value for investors. The residuals from ML forecasts—representing information not captured by existing models—show a robust predictive relationship with future stock returns. This finding indicates that the market has not fully priced in the information revealed by ML models, thus allowing investors using ML-derived insights to potentially achieve superior returns.

Additionally, the comparison between ML forecasts and consensus analyst forecasts reveals noteworthy findings: (i) ML forecasts are as accurate as consensus analyst forecasts over a one-year forecast horizon and are significantly more accurate than them over longer forecast horizons and (ii) ML forecasts contain significant incremental information beyond analyst consensus forecasts even if analysts have access to all the financial statements used in ML models (and much more), suggesting that analysts fail to fully incorporate the information in key financial statement items into their forecasts.

The overarching conclusion is that ML technology holds considerable promise in refining investment decision-making processes. By more effectively extracting and utilizing value-relevant information from financial statements, ML can play a transformative role in enhancing the accuracy and efficacy of earnings forecasts and, by extension, in the broader domain of financial analysis and investment.
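Concretely, the residual predictor described above (ML_RESD in Appendix 1, Panel C) is obtained by regressing the ML forecast on the extant-model forecasts each year and keeping the residual. A minimal numpy sketch on simulated forecasts, with two stand-in extant models instead of the paper's full set:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical one-year-ahead forecasts from two extant models and an ML model
f_rw = rng.normal(size=n)
f_hvz = 0.8 * f_rw + rng.normal(scale=0.3, size=n)
f_ml = 0.7 * f_rw + 0.2 * f_hvz + rng.normal(scale=0.2, size=n)  # last term plays the role of "new" information

# Regress the ML forecast on the extant forecasts; the residual is ML_RESD
X = np.column_stack([np.ones(n), f_rw, f_hvz])
beta, *_ = np.linalg.lstsq(X, f_ml, rcond=None)
ml_resd = f_ml - X @ beta

# By construction, the residual is orthogonal to every extant forecast
print(abs(ml_resd @ f_rw) < 1e-6, abs(ml_resd @ f_hvz) < 1e-6)  # → True True
```

In the return tests, this residual would then enter cross-sectional regressions of one-year-ahead excess returns, as defined in Appendix 1, Panel C.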

90
Fundamental Analysis via Machine Learning

Appendix 1:
Models and Variable Definitions
A: Summary of extant models and variable definitions

Extant Models

RW:  Ei,t+1 = Ei,t + ei,t+1
AR:  Ei,t+1 = a0 + a1 Ei,t + ei,t+1
HVZ: Ei,t+1 = a0 + a1 Ai,t + a2 Di,t + a3 DDi,t + a4 Ei,t + a5 NegEi,t + a6 ACi,t + ei,t+1
SO:  EPSi,t+1 = a0 + a1 EPS+i,t + a2 NegEi,t + a3 AC−i,t + a4 AC+i,t + a5 AGi,t + a6 NDDi,t + a7 DIVi,t + a8 BTMi,t + a9 Pricei,t + ei,t+1
EP:  Ei,t+1 = a0 + a1 NegEi,t + a2 Ei,t + a3 NegEi,t × Ei,t + ei,t+1
RI:  Ei,t+1 = a0 + a1 NegEi,t + a2 Ei,t + a3 NegEi,t × Ei,t + a4 BVEi,t + a5 TACCi,t + ei,t+1
Variable    Definition
Et+1        Earnings (ib − spi) in year t + 1
EPSt+1      Earnings (ib − spi) in year t + 1 scaled by shares outstanding (csho)
Et          Earnings (ib − spi) in year t
At          Total assets (at)
Dt          Dividend payment (dvc)
DDt         Dummy variable indicating dividend payment
NegEt       Dummy variable indicating negative earnings
ACt         Accruals calculated as the change in non-cash current assets (act − che) minus the
            change in current liabilities, excluding short-term debt and taxes payable (lct − dlc − txp),
            minus depreciation and amortization (dp)
EPS+t       Earnings per share when earnings are positive and zero otherwise
AC−t        Accruals per share when accruals are negative and zero otherwise
AC+t        Accruals per share when accruals are positive and zero otherwise
AGt         Percentage change in total assets
NDDt        Dummy variable indicating a zero dividend per share
DIVt        Dividend per share (dvpsx_f)
BTMt        Book-to-market ratio, calculated as the book value of equity divided by the market equity
            as of three months after the end of the last fiscal year
Pricet      Stock price as of three months after the end of fiscal year t
BVEt        Book value of equity (ceq)
TACCt       Total accruals defined in Richardson et al. (2005), which is the sum of the change in
            WC (i.e., (act − che) − (lct − dlc)), the change in NCO (i.e., (at − act − ivao) − (lt − lct − dltt)),
            and the change in FIN (i.e., (ivst + ivao) − (dltt + dlc + pstk))
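The TACC definition in the last row maps directly to code. A sketch on a toy two-year DataFrame whose columns use the Compustat mnemonics from the table (the figures are made up for illustration):

```python
import pandas as pd

# Toy Compustat-style data for one firm, fiscal years 2018-2019
df = pd.DataFrame(
    {"at": [100.0, 120.0], "act": [40.0, 50.0], "che": [10.0, 12.0],
     "lct": [30.0, 34.0], "dlc": [5.0, 6.0], "dltt": [20.0, 22.0],
     "lt": [60.0, 66.0], "ivst": [2.0, 3.0], "ivao": [4.0, 4.0],
     "pstk": [0.0, 0.0]},
    index=[2018, 2019],
)

wc = (df["act"] - df["che"]) - (df["lct"] - df["dlc"])                           # working capital
nco = (df["at"] - df["act"] - df["ivao"]) - (df["lt"] - df["lct"] - df["dltt"])  # net non-current operating assets
fin = (df["ivst"] + df["ivao"]) - (df["dltt"] + df["dlc"] + df["pstk"])          # net financial assets

tacc = wc.diff() + nco.diff() + fin.diff()   # TACC = change in WC + change in NCO + change in FIN
print(tacc.loc[2019])  # → 13.0
```

The first year is NaN by construction, since the changes require a prior-year observation.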

B: Rationales and variables (or predictors) for machine learning models

Variables Definition

Category I: Historical earnings and their major components


Rationale: The random walk model is difficult to beat, which suggests that historical earnings are one of the most important determinants of future earnings (Monahan 2018; Gerakos and Gramacy 2013). The disaggregation of earnings provides additional information about future fundamentals and improves future earnings forecasts (Fairfield, Sweeney, and Yohn 1996; Chen, Miao, and Shevlin 2015). According to the balancing model of Compustat, earnings are computed as E = SALE − COGS − XSGA − DP − XINT + NOPI − TXT − MII. We include all the components except MII to avoid perfect multicollinearity.
Et Earnings (ib − spi)
SALEt Sales (sale)
COGSt Cost of goods sold (cogs)
XSGAt Selling, general, and administrative expenses (xsga)
DPt Depreciation and amortization (dp)

Volume 80, Number 2 91


Financial Analysts Journal | A Publication of CFA Institute


XINTt Interest and related expense (xint)


NOPIt Non-operating income (expense) (nopi)
TXTt Income taxes (txt)
Category II: Other important individual items on the income statement
Rationale: We also include several individual items on income statements that prior literature has shown to
have different/important implications with regard to future earnings: (i) advertising expense (XAD) and research
and development expense (XRD) may generate long-term future benefits (e.g., Bublitz and Ettredge 1989;
Sougiannis 1994; Lev and Sougiannis 1996; Chan, Lakonishok, and Sougiannis 2001; Vitorino 2014;
Dou et al. 2021); (ii) firms may shift core expenses to special items (SPI), extraordinary items, and discontinued
operations (XIDO) to manipulate/smooth core earnings (e.g., Barnea, Ronen, and Sadan 1976; McVay 2006;
Barua, Lin, and Sbaraglia 2010; Kaplan, Kenchington, and Wenzel 2020); and (iii) a common dividend (DVC)
signals a firm’s future earning power (e.g., Nissim and Ziv 2001).
XADt Advertising expense (xad)
XRDt Research and development (R&D) expense (xrd)
SPIt Special items (spi)
XIDOt Extraordinary items and discontinued operations (xido)
DVCt Common dividend (dvc)
Category III: Summary and individual accounts on the balance sheet
Rationale: A balance sheet summarizes the resources that have potential future economic benefit and may contain
incrementally useful information regarding future earnings. For example, the literature shows that the book value
of equity is one of the most important drivers of future earnings (e.g., Feltham and Ohlson 1995; Li and
Mohanram 2014). Furthermore, Ball et al. (2020) contend that retained earnings may measure average earning
power better than the book value of equity. Finally, the balance sheet is an earnings management constraint
that may affect the reversal of accruals (e.g., Baber, Kang, and Li 2011). Thus, we include the summary balance
sheet accounts of AT, ACT, LCT, LT, and CEQ as well as other important individual balance sheet items: CHE,
INVT, RECT, PPENT, IVA, INTAN, AP, DLC, TXP, DLTT, and RE. We again omit some summary and individual
accounts to avoid perfect multicollinearity.
ATt Total assets (at)
ACTt Total current assets (act)
LCTt Total current liabilities (lct)
LTt Total liabilities (lt)
CEQt Common/ordinary equity (ceq)
CHEt Cash and short-term investments (che)
INVTt Inventories (invt)
RECTt Receivables (rect)
PPENTt Property, plant, and equipment – net (ppent)
IVAt Investment and advances, equity (ivaeq) + investments and advances, other (ivao)
INTANt Intangible assets (intan)
APt Accounts payable (ap)
DLCt Debt in current liabilities (dlc)
TXPt Income taxes payable (txp)
DLTTt Long-term debt (dltt)
REt Retained earnings (re)
Category IV: Cash flow from operating activities
Rationale: Cash is king. The cash flow component of earnings is more persistent than the accrual component, and
separating them helps predict future earnings (e.g., Sloan 1996; Call et al. 2016). Because cash flow statements
were not available until 1989, we only include the cash flow from operating activities (CFO) and estimate it
using the indirect approach with balance sheet data when it is missing.
CFOt Cash flow from operating activities (oancf − xidoc); if missing, it is computed using the balance
sheet approach (ib − accruals)


Category V: The first-order difference of the above variables


Rationale: The literature shows that changes in these items often contain incremental information beyond
the levels of the financial statement items (e.g., Kothari 1992; Ohlson and Shroff 1992; Richardson et al. 2005).
ΔEt … ΔCFOt Computed as the corresponding item in year t minus the same item in year t − 1
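The CFO fallback in Category IV above (use oancf − xidoc when the cash flow statement exists, otherwise back it out from earnings and accruals) can be sketched as follows, with made-up numbers and a precomputed accruals column standing in for the AC definition in Panel A:

```python
import numpy as np
import pandas as pd

# Toy firm-years; the NaN oancf mimics a pre-1989 missing cash flow statement
df = pd.DataFrame({
    "oancf": [np.nan, 25.0], "xidoc": [np.nan, 1.0],
    "ib": [20.0, 24.0],
    "accruals": [3.0, 2.0],  # assumed precomputed as in Panel A's AC definition
})

cfo_direct = df["oancf"] - df["xidoc"]       # direct measure from the cash flow statement
cfo_indirect = df["ib"] - df["accruals"]     # balance sheet (indirect) approach
df["cfo"] = cfo_direct.fillna(cfo_indirect)  # fall back only where oancf is missing
print(df["cfo"].tolist())  # → [17.0, 24.0]
```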

C: Variable definitions of return prediction analyses

Dependent variable

EXRET12Mtþ1 One-year-ahead excess return, computed as the 12-month cumulative return less that of
the risk-free rate, starting from the fourth month after the end of the last fiscal year
Main independent variables
ML RESDt The new information uncovered by machine learning models, estimated as the residual by
regressing the one-year-ahead machine learning-based forecasts on the one-year-ahead
earnings forecasts from the RW model and the five extant models in year t
Controls
Betat Annual market beta using the market model calculated in WRDS
SIZEt Logarithm of market capitalization at the end of the third month after the end of the
last fiscal year
BMt Book-to-market ratio, calculated as the book value of equity divided by the market equity
at the end of three months after the end of the last fiscal year
MOMt Momentum calculated as the cumulative return during the 11-month period starting
12 months ago
ACCt Accruals scaled by total assets
ST Reversalt Return in the prior month
LT Reversalt Return in the prior five years, excluding the prior year
SUEt Quarterly unexpected earnings surprise based on a rolling seasonal random walk model
(Livnat and Mendenhall 2006)
Gross Profitabilityt (sales (sale) − cost of goods sold (cogs)) / total assets (at)
Earnings yieldt Earnings to price (Penman et al. 2015)
ROEt Earnings (ib − spi) divided by common equity (ceq)
INVt Growth rate of total assets (att / att−1 − 1)
IndustryFE Three-digit SIC industry fixed effects
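As one concrete example of these control definitions, MOM compounds the 11 monthly returns from month t − 12 through t − 2, skipping the most recent month (which is used separately as ST Reversal). A sketch on a toy monthly return series (values are illustrative):

```python
import numpy as np

# The 12 most recent completed monthly simple returns, oldest first; last entry = month t-1
monthly_ret = np.array([0.02, -0.01, 0.03, 0.01, 0.00, 0.02,
                        -0.02, 0.01, 0.04, -0.01, 0.02, 0.03])

mom = np.prod(1.0 + monthly_ret[:-1]) - 1.0  # compound months t-12 .. t-2 (11 returns)
st_reversal = monthly_ret[-1]                # prior-month return
print(round(mom, 4), st_reversal)
```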


Appendix 2:
Tuning of Hyperparameters for the Machine Learning Models

LASSO
  Candidate values: alphas = np.linspace(1e-3, 1e-1, 1000)
  Algorithm in sklearn: LassoCV(alphas=np.linspace(1e-3, 1e-1, 1000), fit_intercept=False, max_iter=25000, n_jobs=-1)

Ridge
  Candidate values: alphas = np.linspace(5e1, 1e3, 500)
  Algorithm in sklearn: RidgeCV(alphas=np.linspace(5e1, 1e3, 500), fit_intercept=False, cv=5)

RF
  Candidate values: parameters = {'max_features': ['auto'], 'max_depth': [20, 25, 30, 35], 'min_samples_leaf': [15, 20, 25, 50]}
  Algorithm in sklearn: GridSearchCV(RandomForestRegressor(n_estimators=500, criterion='mse', oob_score=True, n_jobs=-1, random_state=10), parameters, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')

GBR
  Candidate values: parameters = {'max_features': ['auto'], 'max_depth': [1, 3, 5], 'min_samples_leaf': [75, 100, 125, 150]}
  Algorithm in sklearn: GridSearchCV(GradientBoostingRegressor(learning_rate=0.1, n_estimators=500, loss='huber', alpha=0.7, subsample=0.9, random_state=10), parameters, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')

ANN
  Candidate values: parameters = {'activation': ['relu', 'tanh'], 'hidden_layer_sizes': [(64,32,16,8), (32,16,8,4), (16,8,4,2), (64,32,16), (32,16,8), (16,8,4), (8,4,2), (64,32), (32,16), (16,8), (8,4), (4,2), (64,), (32,), (16,), (8,), (4,)], 'alpha': [1e-4, 1e-5]}
  Algorithm in sklearn: GridSearchCV(MLPRegressor(max_iter=1000, random_state=10, early_stopping=True, tol=1e-6), parameters, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')
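The grid searches above rely on scikit-learn, but the underlying k-fold tuning loop is simple enough to write out. A numpy-only sketch of selecting the Ridge penalty by 5-fold cross-validation on synthetic data (a simplified stand-in for the RidgeCV call; the grid here is coarser than the 500 points listed above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, alpha):
    # Closed-form ridge solution without an intercept (fit_intercept=False, as in the table)
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cv_mse(X, y, alpha, k=5):
    # Mean squared validation error averaged over k folds
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        b = ridge_fit(X[train], y[train], alpha)
        errs.append(np.mean((y[fold] - X[fold] @ b) ** 2))
    return float(np.mean(errs))

alphas = np.linspace(5e1, 1e3, 20)
best_alpha = min(alphas, key=lambda a: cv_mse(X, y, a))
print(best_alpha)
```

Note that in recent scikit-learn releases some of the listed settings have been renamed or removed (criterion='mse' is now 'squared_error', and max_features='auto' is no longer accepted for regressors), so a present-day replication needs those substitutions.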

Acknowledgments

We thank Utpal Bhattacharya, Sean Cao, Kevin Chen, Shuping Chen, Peter Chen, Zhihong Chen, Patty Dechow, Pingyang Gao, Allen Huang, Mingyi Hung, Steve Monahan, Arthur Morris, Kevin Li, Xinlei Li, Mei Luo, Ting Luo, Scott Murray, Stephen Penman, David Reeb, Richard Sloan, Ayung Tseng, Ranik Raaen Wahlstrøm, Kun Wang, Rongfei Wang, Zheng Wang, Jian Xue, Amy Zang, Guochang Zhang, Yue Zheng, and participants at workshops at HKUST, Tsinghua University, the Accounting Design Project of Columbia University, the 4th Shanghai Edinburgh Fintech Conference, the 2021 Winter Research Conference on Machine Learning and Business at University of Miami, the 4th Annual CFA NY/SQA Data Science in Finance Conference and the 3rd Future of Financial Information Conference for their helpful comments and suggestions.

Editor's Note
Submitted 28 September 2022
Accepted 30 January 2024 by William N. Goetzmann

Notes
1. For example, the law of diminishing returns predicts a nonlinear relationship between capital investment and future earnings. Prior literature has also shown that the relationship between current and future earnings is nonlinear and varies across firms (e.g., Lev 1983; Freeman and Tse 1992; Baginski et al. 1999; Chen and Zhang 2007).

2. The financial statement items include (i) earnings and their major components; (ii) other income statement items that prior studies have shown to produce long-term benefits or be associated with earnings manipulation; (iii) balance sheet accounts that are important for earnings prediction; and (iv) operating cash flows, which will be discussed in greater detail in Section 3.

3. Subsequent research finds that the RW model performs well even when compared with analyst forecasts. For example, Bradshaw et al. (2012) find that analysts' earnings forecasts are not economically more accurate than the naïve RW forecasts, and those for longer horizons are even less accurate than the naïve RW forecasts.

4. There was also a flurry of early studies on fundamental analysis using financial ratios (see reviews by Kothari [2001] and Richardson et al. [2010]). However, most of these studies examine in-sample associations and are subject to concerns of the in-sample identification of predictive variables (p. 424 of Richardson et al. 2010). They provide little (if any) evidence on the accuracy or informativeness of out-of-sample forecasts. Furthermore, these studies focus on the sample period before the 1990s. It is unclear whether their conclusions still hold in recent years.

5. We limit the input variables to the financial statement data because we are interested in the decision usefulness of fundamental analysis using financial statement information. Furthermore, we compare the ML models with the extant models, most of which also only use financial statement items (with the exception of the SO model).

6. These ML algorithms have been widely adopted in the literature. For the sake of brevity, we do not discuss the technical details about them. Interested readers can refer to Gu et al. (2020) for these details. We select these models rather than other more complex ones, which incorporate time-series dynamics such as Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), or reinforcement learning, because all benchmark extant models are cross-sectional. To the extent that more sophisticated models may be able to produce better forecasts and that other inputs such as industry and macroeconomic variables might contain additional information, our results should be interpreted as a lower bound that ML can attain in earnings forecasting tasks.

7. Ideally, we would like to use the point-in-time data to avoid any subsequent restatements in accounting numbers. However, we do not have access to the data. We therefore follow related studies (e.g., Hou et al. 2012; So 2013; Li and Mohanram 2014) and use the Compustat fundamental annual file.

8. For each round of training, cross-validation (CV) and prediction, we normalize all independent variables to [0,1] for the training set. We also transform the independent variables in the prediction set using the same scaler obtained from the training set. Furthermore, to alleviate the influence of outliers, we winsorize the input features (i.e., per-share accounting numbers) at the top and bottom 1% for the training sample.

9. We also conduct robustness tests and compare the forecast accuracy for earnings scaled by total assets and shares outstanding (i.e., EPS), and the inference remains the same. Furthermore, we partition the sample into subgroups based on the market cap, and the results still show that the ML forecast is significantly more accurate than the RW and extant model forecasts in all subgroups. The results are untabulated but available upon request.

10. Note that this is a test of two joint hypotheses: (i) the new information uncovered by the ML model is economically significant and (ii) the market does not fully understand the new information. This test also sheds light on the potential decision usefulness of fundamental analysis and financial information. As Fama (1965) suggests, fundamental analysis is only of value if it provides new information not yet fully priced. Thus, this analysis also sheds light on whether fundamental analysis is useful for investors.

11. It is possible that the results might not be robust if we control for a large number of other characteristics that the literature has discovered. Similarly, although we have controlled for all factors in the Fama–French database in the subsequent time-series test, the risk-adjusted returns may not be significant if we control for other additional factors that are theoretically or empirically related to the new predictor. We thank one of the referees for pointing this out.

12. The left vertical axis is for the cumulative log returns to the five quintile portfolios, while the right vertical axis is for the cumulative log return to the hedge portfolio. The two axes have different scales. It is worth pointing out that the hedge portfolio has a lower volatility than the highest quintile. The annualized volatility of the two portfolios is 9.5% and 16.3%, respectively.

13. The plots for the GBR models are similar but untabulated for the sake of brevity.

14. The nonlinear models also accommodate the interaction effects between predictors. Untabulated analyses show that the interaction effects between the following economically related pairs contribute the most to the explanatory power of the GBR model: the change in sales revenue and change in the cost of goods sold, change in short-term debts and change in total current liabilities, sales revenue and accounts payable, depreciation and amortization expense and net property, plant and equipment, and cost of goods sold and inventories. The finding that the ML models pick up the interaction between income statement items and the corresponding gross accrual items (e.g., depreciation and amortization (DA) and net PP&E, COGS and inventories) resonates remarkably well with the recent call for research by Dichev (2020) on the role of gross accruals in determining earnings quality.

15. In untabulated analyses, we find that ML forecasts are also a useful benchmark to assess the ex-ante bias in analyst forecasts. When analysts' forecasts are substantially higher (lower) than the ML forecasts, these forecasts tend to be overly optimistic (pessimistic) relative to the actual earnings. Furthermore, these stocks tend to have significantly negative (positive) returns over subsequent periods.

16. The literature has yet to arrive at a consensus on the level of trading costs. While Novy-Marx and Velikov (2016) suggest that "[r]ound-trip transaction costs for typical value-weighted strategies average in excess of 50 basis points (bps)," Frazzini et al. (2018) argue that this estimate is "an order of magnitude larger than [what] our model or live costs suggest." They suggest that for a large institutional trader, "the effective bid-ask spread across all trades averages less than 0.015% per year," and the market impact is just under nine basis points on average for all trades completed within a day. The impact of the transaction cost would be much lower if we use Frazzini et al.'s estimate.
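The preprocessing described in note 8 can be sketched without scikit-learn: winsorize the training features, fit a [0, 1] scaler on the training years only, and reuse that same scaler on the prediction year. A toy numpy version (a stand-in for a min-max scaler; the data are simulated):

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 3))    # training-years features
predict = rng.normal(size=(200, 3))   # prediction-year features

# Winsorize training features at the 1st/99th percentiles to curb outliers
lo, hi = np.percentile(train, [1, 99], axis=0)
train_w = np.clip(train, lo, hi)

# Fit the [0, 1] scaler on the (winsorized) training set only ...
mn, mx = train_w.min(axis=0), train_w.max(axis=0)
scale = lambda x: (x - mn) / (mx - mn)

# ... and apply the same scaler to the prediction set (values there may fall outside [0, 1])
train_s, predict_s = scale(train_w), scale(predict)
print(train_s.min(), train_s.max())  # → 0.0 1.0
```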

References
Apley, D. W., and J. Zhu. 2020. "Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models." Journal of the Royal Statistical Society Series B: Statistical Methodology 82 (4): 1059–1086. doi:10.1111/rssb.12377.

Baber, W. R., S. H. Kang, and Y. Li. 2011. "Modeling Discretionary Accrual Reversal and the Balance Sheet as an Earnings Management Constraint." The Accounting Review 86 (4): 1189–1212. doi:10.2308/accr-10037.

Baginski, S. P., K. S. Lorek, G. L. Willinger, and B. C. Branson. 1999. "The Relationship between Economic Characteristics and Alternative Annual Earnings Persistence Measures." The Accounting Review 74 (1): 105–120. doi:10.2308/accr.1999.74.1.105.

Ball, R., J. Gerakos, J. T. Linnainmaa, and V. Nikolaev. 2020. "Earnings, Retained Earnings, and Book-to-Market in the Cross Section of Expected Returns." Journal of Financial Economics 135 (1): 231–254. doi:10.1016/j.jfineco.2019.05.013.

Ball, R., and R. Watts. 1972. "Some Time Series Properties of Accounting Income." The Journal of Finance 27 (3): 663–681. doi:10.1111/j.1540-6261.1972.tb00991.x.

Bao, Y., B. Ke, B. Li, Y. J. Yu, and J. Zhang. 2020. "Detecting Accounting Fraud in Publicly Traded US Firms Using a Machine Learning Approach." Journal of Accounting Research 58 (1): 199–235. doi:10.1111/1475-679X.12292.

Barnea, A., J. Ronen, and S. Sadan. 1976. "Classificatory Smoothing of Income with Extraordinary Items." The Accounting Review 51 (1): 110–122.

Bartram, S. M., and M. Grinblatt. 2018. "Agnostic Fundamental Analysis Works." Journal of Financial Economics 128 (1): 125–147. doi:10.1016/j.jfineco.2016.11.008.

Barua, A., S. Lin, and A. M. Sbaraglia. 2010. "Earnings Management Using Discontinued Operations." The Accounting Review 85 (5): 1485–1509. doi:10.2308/accr.2010.85.5.1485.

Bertomeu, J., E. Cheynel, E. Floyd, and W. Pan. 2021. "Using Machine Learning to Detect Misstatements." Review of Accounting Studies 26 (2): 468–519. doi:10.1007/s11142-020-09563-8.

Bradshaw, M. T., M. S. Drake, J. N. Myers, and L. A. Myers. 2012. "A Re-Examination of Analysts' Superiority over Time-Series Forecasts of Annual Earnings." Review of Accounting Studies 17 (4): 944–968. doi:10.1007/s11142-012-9185-8.

Brown, L. D. 1993. "Earnings Forecasting Research: Its Implications for Capital Markets Research." International Journal of Forecasting 9 (3): 295–320. doi:10.1016/0169-2070(93)90023-G.

Bublitz, B., and M. Ettredge. 1989. "The Information in Discretionary Outlays: Advertising, Research, and Development." The Accounting Review 64 (1): 108–124.

Cao, S., W. Jiang, J. L. Wang, and B. Yang. 2024. "From Man vs. Machine to Man + Machine: The Art and AI of Stock Analyses." Journal of Financial Economics. Forthcoming.

Call, A. C., M. Hewitt, T. Shevlin, and T. L. Yohn. 2016. "Firm-Specific Estimates of Differential Persistence and Their Incremental Usefulness for Forecasting and Valuation." The Accounting Review 91 (3): 811–833. doi:10.2308/accr-51233.

Chan, L. K., J. Lakonishok, and T. Sougiannis. 2001. "The Stock Market Valuation of Research and Development Expenditures." The Journal of Finance 56 (6): 2431–2456. doi:10.1111/0022-1082.00411.

Chen, P. F., and G. Zhang. 2007. "How Do Accounting Variables Explain Stock Price Movements? Theory and Evidence." Journal of Accounting and Economics 43 (2-3): 219–244. doi:10.1016/j.jacceco.2007.01.001.

Chen, S., B. Miao, and T. Shevlin. 2015. "A New Measure of Disclosure Quality: The Level of Disaggregation of Accounting Data in Annual Reports." Journal of Accounting Research 53 (5): 1017–1054. doi:10.1111/1475-679X.12094.

Chen, X., Y. H. Cho, Y. Dou, and B. Lev. 2022. "Predicting Future Earnings Changes Using Machine Learning and Detailed Financial Data." Journal of Accounting Research 60 (2): 467–515. doi:10.1111/1475-679X.12429.

Dichev, I. D. 2020. "Fifty Years of Capital Markets Research in Accounting: Achievements so Far and Opportunities Ahead." China Journal of Accounting Research 13 (3): 237–249. doi:10.1016/j.cjar.2020.07.005.

Ding, K., B. Lev, X. Peng, T. Sun, and M. A. Vasarhelyi. 2020. "Machine Learning Improves Accounting Estimates: Evidence from Insurance Payments." Review of Accounting Studies 25 (3): 1098–1134. doi:10.1007/s11142-020-09546-9.

Dou, W. W., Y. Ji, D. Reibstein, and W. Wu. 2021. "Inalienable Customer Capital, Corporate Liquidity, and Stock Returns." The Journal of Finance 76 (1): 211–265. doi:10.1111/jofi.12960.

Easton, P. D., P. Kelly, and A. Neuhierl. 2018. "Beating a Random Walk." Available at SSRN: https://ssrn.com/abstract=3040354 or doi:10.2139/ssrn.3040354.

Fairfield, P. M., R. J. Sweeney, and T. L. Yohn. 1996. "Accounting Classification and the Predictive Content of Earnings." The Accounting Review 71 (3): 337–355.

Fama, E. F. 1965. "Random Walks in Stock Market Prices." Financial Analysts Journal 21 (5): 55–59. doi:10.2469/faj.v21.n5.55.

Fama, E. F., and K. R. French. 2000. "Forecasting Profitability and Earnings." The Journal of Business 73 (2): 161–175. doi:10.1086/209638.

Fama, E. F., and K. R. French. 2006. "Profitability, Investment and Average Returns." Journal of Financial Economics 82 (3): 491–518. doi:10.1016/j.jfineco.2005.09.009.

Fama, E. F., and K. R. French. 2015. "A Five-Factor Asset Pricing Model." Journal of Financial Economics 116 (1): 1–22. doi:10.1016/j.jfineco.2014.10.010.

Feltham, G. A., and J. A. Ohlson. 1995. "Valuation and Clean Surplus Accounting for Operating and Financial Activities." Contemporary Accounting Research 11 (2): 689–731. doi:10.1111/j.1911-3846.1995.tb00462.x.

Feltham, G. A., and J. A. Ohlson. 1996. "Uncertainty Resolution and the Theory of Depreciation Measurement." Journal of Accounting Research 34 (2): 209–234. doi:10.2307/2491500.

Frazzini, A., R. Israel, and T. J. Moskowitz. 2018. "Trading Costs." Available at SSRN 3229719.

Freeman, R. N., and S. Y. Tse. 1992. "A Nonlinear Model of Security Price Responses to Unexpected Earnings." Journal of Accounting Research 30 (2): 185–209. doi:10.2307/2491123.

Gerakos, J. J., and R. Gramacy. 2013. "Regression-Based Earnings Forecasts." Chicago Booth Research Paper, 12–26. Available at SSRN: https://ssrn.com/abstract=2112137 or doi:10.2139/ssrn.2112137.

Gu, S., B. Kelly, and D. Xiu. 2020. "Empirical Asset Pricing via Machine Learning." The Review of Financial Studies 33 (5): 2223–2273. doi:10.1093/rfs/hhaa009.

Hanlon, M. 2005. "The Persistence and Pricing of Earnings, Accruals, and Cash Flows When Firms Have Large Book-Tax Differences." The Accounting Review 80 (1): 137–166. doi:10.2308/accr.2005.80.1.137.

Hanlon, M., S. Laplante, and T. Shevlin. 2005. "Evidence for the Possible Information Loss of Conforming Book Income and Taxable Income." The Journal of Law and Economics 48 (2): 407–442. doi:10.1086/497525.

Hou, K., M. A. van Dijk, and Y. Zhang. 2012. "The Implied Cost of Capital: A New Approach." Journal of Accounting and Economics 53 (3): 504–526. doi:10.1016/j.jacceco.2011.12.001.

Kaplan, S. E., D. G. Kenchington, and B. S. Wenzel. 2020. "The Valuation of Discontinued Operations and Its Effect on Classification Shifting." The Accounting Review 95 (4): 291–311. doi:10.2308/tar-2016-0235.

Kothari, S. P. 1992. "Price-Earnings Regressions in the Presence of Prices Leading Earnings: Earnings Level versus Change Specifications and Alternative Deflators." Journal of Accounting and Economics 15 (2-3): 173–202. doi:10.1016/0165-4101(92)90017-V.

Kothari, S. P. 2001. "Capital Markets Research in Accounting." Journal of Accounting and Economics 31 (1-3): 105–231. doi:10.1016/S0165-4101(01)00030-1.

Lee, C. M. 1999. "Accounting-Based Valuation: Impact on Business Practices and Research." Accounting Horizons 13 (4): 413–425. doi:10.2308/acch.1999.13.4.413.

Lev, B. 1983. "Some Economic Determinants of Time-Series Properties of Earnings." Journal of Accounting and Economics 5: 31–48. doi:10.1016/0165-4101(83)90004-6.

Lev, B., and D. Nissim. 2004. "Taxable Income, Future Earnings, and Equity Values." The Accounting Review 79 (4): 1039–1074. doi:10.2308/accr.2004.79.4.1039.

Lev, B., and T. Sougiannis. 1996. "The Capitalization, Amortization, and Value-Relevance of R&D." Journal of Accounting and Economics 21 (1): 107–138. doi:10.1016/0165-4101(95)00410-6.

Li, K. K., and P. Mohanram. 2014. "Evaluating Cross-Sectional Forecasting Models for Implied Cost of Capital." Review of Accounting Studies 19 (3): 1152–1185. doi:10.1007/s11142-014-9282-y.

Livnat, J., and R. R. Mendenhall. 2006. "Comparing the Post–Earnings Announcement Drift for Surprises Calculated from Analyst and Time Series Forecasts." Journal of Accounting Research 44 (1): 177–205. doi:10.1111/j.1475-679X.2006.00196.x.

McVay, S. E. 2006. "Earnings Management Using Classification Shifting: An Examination of Core Earnings and Special Items." The Accounting Review 81 (3): 501–531. doi:10.2308/accr.2006.81.3.501.

Monahan, S. J. 2018. "Financial Statement Analysis and Earnings Forecasting." Foundations and Trends in Accounting 12 (2): 105–215. doi:10.1561/1400000036.

Nissim, D., and A. Ziv. 2001. "Dividend Changes and Future Profitability." The Journal of Finance 56 (6): 2111–2133. doi:10.1111/0022-1082.00400.

Novy-Marx, R., and M. Velikov. 2016. "A Taxonomy of Anomalies and Their Trading Costs." Review of Financial Studies 29 (1): 104–147. doi:10.1093/rfs/hhv063.

Ohlson, J. A. 1995. "Earnings, Book Values, and Dividends in Equity Valuation." Contemporary Accounting Research 11 (2): 661–687. doi:10.1111/j.1911-3846.1995.tb00461.x.

Ohlson, J. A., and B. E. Juettner-Nauroth. 2005. "Expected EPS and EPS Growth as Determinants of Value." Review of Accounting Studies 10 (2-3): 349–365. doi:10.1007/s11142-005-1535-3.

Ohlson, J. A., and P. K. Shroff. 1992. "Changes versus Levels in Earnings as Explanatory Variables for Returns: Some Theoretical Considerations." Journal of Accounting Research 30 (2): 210–226. doi:10.2307/2491124.

Penman, S. H. 1992. "Return to Fundamentals." Journal of Accounting, Auditing & Finance 7 (4): 465–483. doi:10.1177/0148558X9200700403.

Penman, S. H., F. Reggiani, S. A. Richardson, and A. Tuna. 2015. "An Accounting-Based Characteristic Model for Asset Pricing." Available at SSRN 1966566.

Rasekhschaffe, K. C., and R. C. Jones. 2019. "Machine Learning for Stock Selection." Financial Analysts Journal 75 (3): 70–88. doi:10.1080/0015198X.2019.1596678.

Richardson, S. A., R. G. Sloan, M. T. Soliman, and I. Tuna. 2005. "Accrual Reliability, Earnings Persistence and Stock Prices." Journal of Accounting and Economics 39 (3): 437–485. doi:10.1016/j.jacceco.2005.04.005.

Richardson, S. A., I. Tuna, and P. Wysocki. 2010. "Accounting Anomalies and Fundamental Analysis: A Review of Recent Research Advances." Journal of Accounting and Economics 50 (2-3): 410–454. doi:10.1016/j.jacceco.2010.09.008.

Sivaramakrishnan, K., M. Brown, and B. Kasturi. 2018. "Smart Beta: Even Smarter with an Optimizer and a Custom Risk Model." Axioma Research Paper 126.

Sloan, R. G. 1996. "Do Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future Earnings?" The Accounting Review 71 (3): 289–315.

So, E. C. 2013. "A New Approach to Predicting Analyst Forecast Errors: Do Investors Overweight Analyst Forecasts?" Journal of Financial Economics 108 (3): 615–640. doi:10.1016/j.jfineco.2013.02.002.

Sougiannis, T. 1994. "The Accounting Based Valuation of Corporate R&D." The Accounting Review 69 (1): 44–68.

Thomas, J., and F. X. Zhang. 2011. "Tax Expense Momentum." Journal of Accounting Research 49 (3): 791–821. doi:10.1111/j.1475-679X.2011.00409.x.

Thomas, J., and F. X. Zhang. 2014. "Valuation of Tax Expense." Review of Accounting Studies 19 (4): 1436–1467. doi:10.1007/s11142-013-9274-3.

Vitorino, M. A. 2014. "Understanding the Effect of Advertising on Stock Returns and Firm Value: Theory and Evidence from a Structural Model." Management Science 60 (1): 227–245. doi:10.1287/mnsc.2013.1748.

Watts, R. L., and R. W. Leftwich. 1977. "The Time Series of Annual Accounting Earnings." Journal of Accounting Research 15 (2): 253–271. doi:10.2307/2490352.
