Prediction-Based Portfolio Optimization Model using Neural Networks

Fabio D. Freitas a,∗, Alberto F. De Souza b, Ailson R. de Almeida a

a Secretaria da Receita Federal do Brasil – RFB, Programa de Pós Graduação em Engenharia Elétrica – UFES, Pietrangelo de Biase, 56 sala 308, 29.010-190 – Vitoria ES, Brazil.
b Programa de Pós Graduação em Informática – UFES, Av. Fernando Ferrari, s/n, 29075-910 – Vitória, ES, Brazil.

∗ Corresponding author. Tel/Fax +552732228201. Email addresses: freitas@computer.org (Fabio D. Freitas), alberto@lcad.inf.ufes.br (Alberto F. De Souza), ailson@ele.ufes.br (Ailson R. de Almeida).

Preprint submitted to Neurocomputing, 8 June 2008

Abstract

This work presents a new prediction-based portfolio optimization model that can capture short-term investment opportunities. We used neural network predictors to predict stocks' returns and derived a risk measure, based on the prediction errors, that has the same statistical foundation as the mean-variance model. The efficient diversification effect holds thanks to the selection of predictors with low and complementary pairwise error profiles. We employed a large set of experiments with real data from the Brazilian stock market to examine our portfolio optimization model, which included the evaluation of the Normality of the prediction errors. Our results showed that it is possible to obtain Normal prediction errors with non-Normal time series of stock returns, and that the prediction-based portfolio optimization model took advantage of short-term opportunities, outperforming the mean-variance model and beating the market index.

Key words: Neural Networks; Time Series Prediction; Portfolio Optimization.

1. Introduction

Investment selection is a central problem in financial theory and practice, and it is primarily concerned with the future performance of investments, mainly their expected returns. When investments are exposed to uncertainties, the investment selection framework must include a quantitative measure of the uncertainty of obtaining the expected return, i.e. a quantitative measure of risk.

The mean-variance model, proposed by Harry Markowitz [1], is a landmark in Modern Portfolio Theory (MPT). In this model, the risk of investment in a portfolio of stocks is minimized through optimal selection of stocks with low joint risk, which provides a mechanism of loss compensation known as Efficient Diversification. The portfolio optimization process consists of finding, in a large collection of stocks, the participation of each stock (i.e. its percentage of the portfolio value) that minimizes the portfolio's risk at a desired portfolio return, or, in the dual problem, maximizes the portfolio's return at a given risk. Varying the desired portfolio return (or risk) among all possible values for optimal portfolios outlines a locus in the risk-return space named the Efficient Frontier. The portfolios that lie on the Efficient Frontier are named Efficient Portfolios, and have the singular property that there are no other portfolios in the opportunity set that exhibit lower risk for the same return, or greater return for the same risk [2,3]. The model has the fundamental assumption that the time series of returns of each stock follows a Normal distribution. It uses the mean of the series as a prediction of the stock's future return, its variance as a measure of the stock's risk, and the covariance of each pair of time series as a measure of the joint risk of each pair of stocks.

After Markowitz's mean-variance model, many other models that use its fundamental assumption appeared [4, pp. 219–252]. In all these models, known today as classic models, the portfolio's expected return is given by the linear combination of the participations of the stocks in the portfolio and their expected returns (the mean returns).
The portfolio risk measure varies, but it is often based on the linear combination of the participations and the moments about the mean of the time series of returns of its stocks.

Despite the wide adoption of the classical models of portfolio selection, their fundamental assumption has been challenged by real-world data in many ways. The distributions of the series of returns often depart from Normality, exhibiting kurtosis and skewness [5,6], which makes the variance (or standard deviation) of the returns an inappropriate measure of stocks' risk [3, p. 156]. In addition, the use of mean returns as predictions of stocks' future returns imposes a low-pass filtering effect on the dynamic behavior of the stock markets, leading to imprecise estimates of short-term future returns, which is detrimental to the performance of the models on short-term investments.

The predictability of stock markets is still an open question in finance theory. The Efficient Market Hypothesis (EMH), the theoretical framework that guides the discussion around this question, has been under empirical testing and review for the past several decades [7,8,9,10]. Market efficiency implies a random walk model for the prices of stocks, but pricing irregularities and predictable patterns like serial correlations, calendar effects, and even sports results effects do appear [10,11].

The forecasting of time series was traditionally tackled by the linear methods of time series analysis [12]. However, in the last two decades, many machine learning methods for time series prediction appeared, ranging from neural network models to support vector machines and fuzzy sets [13,14,15,16]. Among these methods, neural networks offer non-linear mapping capabilities and non-assisted estimation of the structural model's parameters. Both are advantageous for the prediction of stocks' future returns and make neural networks suitable for large-scale applications [17,18].
The development of accurate neural predictors is still the subject of many research efforts. Recently, Chen et al. [19] proposed a flexible neural tree ensemble to predict the NASDAQ–100 and S&P CNX NIFTY stock indexes, while Liu et al. [20] presented a reinforcement learning-based scheme for simultaneously determining the input dimension and time delay of a neural predictor that was employed to predict daily stock prices of General Motors Corp. These predictors were not used for building automated investment strategies, though.

Nevertheless, in recent years, attention has been devoted to using neural networks for implementing automated investment strategies called trading systems. Neural-based trading systems typically employ neural network predictors to predict stocks' future prices, returns or other related measures, and use these to generate signals to buy, sell or hold assets. Lee et al. [21] proposed a divide-and-conquer approach that uses multiple specialized Q-Learning neural-based agents to build a trading system. The investment problem was divided into timing and pricing problems, and four separate agents were used for generating buy and sell signals for the respective timing and pricing problems. The system was used for trading on the Korean KOSPI index and presented results superior to those of the other methods examined by the authors.

Thawornwong and Enke [22] used a data mining technique to select economic and financial variables with predictive power and used the selected variables as input of a neural predictor employed for predicting the direction (sign) of future return movements. The predictions were used to implement a trading strategy for deciding whether to invest in the S&P 500 index portfolio or in the T-Bill (risk-free) during one month. The results showed that the monthly returns obtained with this adaptive strategy were always higher than those of the (non-adaptive) compared methods.
Moody and Saffell [23] developed a trading system based on the direct reinforcement method that does not have to learn a value function. In their method, the trading strategies are learned directly from data, without the need to forecast intermediate values. They used their system to trade on currency, and for S&P 500/T-Bill asset allocation, achieving better results than the compared systems.

Pantazopoulos et al. [24] presented a neurofuzzy volatility predictor that used partitioned training subsets with data vectors that preceded an increase, sustain or decrease in volatility, to derive a trading strategy for options on the S&P 500 index over 10,000 days. A simulated portfolio based on the S&P 500 index and initiated with $1,000 obtained a final balance of $3,862, while a portfolio based on the authors' trading system obtained a final balance of $625,120, with the same initial value and in the same trading period, although with a higher volatility.

Hellstrom [25] developed a rank measure based on the relative returns of a large number of securities. Ranks built with this ranking measure were predicted with a linear model, and the sign of daily threshold-selected 1-day predictions achieved an accuracy of 63% (i.e. a Hit Rate, HR, of 63%). The author used the predicted ranks to build a trading strategy for the Swedish stock market that significantly outperformed the market index. In a subsequent work [26], the author optimized a set of technical indicators through a sliding time window modeling that generates different parameters for the optimized trading rules for each time window. Solutions covering too few examples were rejected, which enhanced the generalization of the optimized trading rules. His results achieved HR between 59% and 64%, while the benchmark methods performed below 53%.

The trading systems approach has produced remarkable results, but these applications are typically designed for maximizing the investment's return by trading a single asset.
Therefore, higher returns are expected in periods of high volatility of the asset, which means high risk associated with the asset. But, although rational investors are profit-seekers, they are also averse to losses [27,28, p. 595]. This makes a risk control mechanism in the trading strategy imperative. The efficient diversification framework, allied with predictors of future returns and proper risk measures, is a suitable alternative for addressing the single-asset and risk control issues associated with trading systems.

As stated previously, neural networks are advantageous for large-scale applications such as portfolio selection. In an earlier work [29], we investigated the Normality of the errors of weekly stock return predictions produced by a new autoregressive neural network predictor we have proposed (the autoregressive moving reference neural network predictor, AR-MRNN). We found more evidence of Normality in these errors of prediction than in the series of returns, and explored this in a subsequent work [30], where we proposed a portfolio selection model that uses predicted returns as expected returns and the variance of the errors of prediction as risk measure.

In this work, we present an extensive evaluation of our prediction-based portfolio optimization model, comparing its short-term investment performance with that of the mean-variance model. This evaluation involved a large set of experiments with real data from the Brazilian stock market. Our experimental results showed that the prediction-based portfolio optimization model outperforms the mean-variance model and the Brazilian IBOVESPA market index, achieving higher returns with lower risks, while using the same stocks in the same periods of time.

The remainder of the paper is organized as follows. After this introduction, Section 2 presents the AR-MRNN predictor [29] and Section 3 presents our prediction-based portfolio optimization model.
Our experimental methods and results are presented in Section 4. We conclude with a brief discussion in Section 5 and our final conclusions in Section 6.

2. Autoregressive moving reference neural network (AR-MRNN) predictors

An investment project is typically planned and executed over a time frame named investment horizon. Investments are compared using a relative performance measure named return on investment, or simply return, that quantifies the wealth variation over the investment horizon. The one-period stock return at time t is defined as the difference between the price of the stock at time t and the price at time t − 1, divided by the price at time t − 1, as shown in Eq.1:

$$r_t = \frac{P_t - P_{t-1}}{P_{t-1}}, \qquad t \geq 1 \tag{1}$$

where $r_t$ is the one-period stock return at time t, and $P_t$ and $P_{t-1}$ are the stock prices at times t and t − 1, respectively. The series of N past returns of a stock, $r'$, is defined as:

$$r' = (r_1, r_2, \ldots, r_N) \tag{2}$$

The one-period prediction of the future return of a stock can be defined as the process of using $r'$ for obtaining an estimate of $r_{t+1}$. The classical neural network time series predictor is the autoregressive neural network (ARNN) predictor [13], $S_R$, with p inputs (the present value and the p − 1 past values of the series), whose output is an estimate of the value of the next time period, as shown in Eq.3:

$$(r_{t-(p-1)}, \ldots, r_{t-1}, r_t) \rightarrow S_R \rightarrow \hat{r}_{t+1} \tag{3}$$

The number of inputs, p, is the regression order, and some methods for obtaining it are shown in [12]. After being trained, the neural network predictor implements a non-linear multiple regression model of the time series of returns [31, pp. 635–660].

We proposed a new autoregressive neural network model, named autoregressive moving reference neural network (AR-MRNN) predictor, that tries to mimic the way one typically inspects a time series graph to guess its future value [29].
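The return definition of Eq.1 and the ARNN input-output mapping of Eq.3 can be illustrated with a short sketch. The function names `one_period_returns` and `arnn_pairs`, the toy prices, and the choice p = 2 are illustrative assumptions, not part of the original work; the sketch only builds the data that would be fed to a predictor and does not train one.

```python
from typing import List, Sequence, Tuple


def one_period_returns(prices: Sequence[float]) -> List[float]:
    """Eq.1: r_t = (P_t - P_{t-1}) / P_{t-1}, for t >= 1."""
    return [(prices[t] - prices[t - 1]) / prices[t - 1]
            for t in range(1, len(prices))]


def arnn_pairs(returns: Sequence[float],
               p: int) -> List[Tuple[List[float], float]]:
    """Eq.3: each input holds p consecutive returns (oldest first);
    the target is the return of the next period."""
    pairs = []
    for nxt in range(p, len(returns)):
        pairs.append((list(returns[nxt - p:nxt]), returns[nxt]))
    return pairs


# Made-up weekly closing prices, used only for illustration.
prices = [100.0, 110.0, 99.0, 108.9, 108.9]
returns = one_period_returns(prices)   # four one-period returns
pairs = arnn_pairs(returns, p=2)       # two (input, target) pairs
```

With five prices there are four returns, and a regression order of 2 leaves two input-output pairs; a larger p trades the number of available pairs for a longer input history.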
When inspecting a time series graph to guess its next value, one tends to concentrate visual attention on the last points of the graph, creating an imaginary frame that delimits this region of the graph and offers an image with, hopefully, sufficient visual information for extrapolating the next value of the series. In this prediction scheme, one uses some point inside this region as a reference from which the future value of the series is estimated. We mimic this in the AR-MRNN predictor by subtracting one of the past return values (the reference) from the returns that would be presented to the neural network inputs, and by presenting these differences as input instead. The AR-MRNN(p,k) predictor has regression order p, the return at time t − (p − 1) − k as reference, and the inputs and outputs shown in Eq.4:

$$(r_{t-(p-1)} - z, \ldots, r_t - z) \rightarrow S_R \rightarrow \widehat{r_{t+1} - z} \tag{4}$$

where z is the reference value given by:

$$z = r_{t-(p-1)-k} \tag{5}$$

After training, $\hat{r}_{t+1}$ is obtained from the prediction $\widehat{r_{t+1} - z}$ using:

$$\hat{r}_{t+1} = \widehat{r_{t+1} - z} + z \tag{6}$$

The values encoded into the AR-MRNN weights are usually smaller than those encoded in the weights of a standard ARNN, which reduces the possibility of saturation at the neurons' outputs. This increases the dynamic range of the network and its ability to represent the series of returns. Also, preprocessing demands like normalization and detrending are alleviated, and smaller weight values are produced. As a result, the network can be regularized, which enhances its generalization capabilities [31].

3. Prediction-based portfolio optimization model

Since the proposition of the mean-variance portfolio optimization model by Harry Markowitz, computational feasibility, model simplifications and the development of risk measures have received considerable research attention [32,33,34,35,36].
The prediction of future returns in the context of portfolio selection has received little attention, however, and most models use the same prediction method employed in the mean-variance model, i.e., the mean of past returns. Nevertheless, the mean returns are expected to be realized only in the long term; therefore, they are inadequate predictions of future returns in short-term investment strategies, such as active portfolio management. Better prediction methods for estimating future returns, together with associated risk measures, can be used to produce predictive portfolio selection models that are suitable for short-term investment strategies.

This section describes a prediction-based portfolio optimization model that uses predicted returns as expected returns, instead of the mean returns used by the mean-variance model, and uses the variance of the errors of prediction as risk measure, instead of the variance of the returns.

3.1. Expected return and risk of a stock

Let the return of a stock and its predicted return be related by:

$$r_t = \hat{r}_t + \varepsilon_t \tag{7}$$

where $r_t$ is the stock return at time t, $\hat{r}_t$ is the predicted return for time t, obtained at time t − 1, and $\varepsilon_t$ is the prediction error at time t. Rearranging terms, the prediction error is defined as:

$$\varepsilon_t = r_t - \hat{r}_t \tag{8}$$

The time series of n errors of prediction is then given by:

$$\varepsilon' = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n) \tag{9}$$

For a non-biased predictor, the series of errors of prediction must be statistically independent and identically distributed (iid), with mean and variance given by:

$$\mu_\varepsilon = \bar{\varepsilon} = 0 \tag{10}$$

$$\hat{\nu} = \sigma_\varepsilon^2 = \frac{1}{n-1} \sum_{t=1}^{n} \varepsilon_t^2 \tag{11}$$

The variance of the errors of prediction (Eq.11) reflects the uncertainty about the realization of the predicted return and is used in the model as the measure of the individual risk of each stock (the higher the variance, the higher the risk).

3.2. Expected return and risk of a portfolio

A portfolio is a collection of M stocks and M weights, or participations.
Each participation $X_i$, $i = 1, \ldots, M$, with $0 \leq X_i \leq 1$ and $\sum_{i=1}^{M} X_i = 1$, represents the fraction of the portfolio value invested in stock i. The predicted return of the portfolio, or portfolio expected return, $\hat{r}_p$, is the linear combination of the participation and predicted return of each stock in the portfolio:

$$\hat{r}_p = \sum_{i=1}^{M} X_i \hat{r}_i \tag{12}$$

Assuming that the time series of errors of prediction follows a Normal distribution, we model the portfolio risk as the variance of the joint Normal distribution of the linear combination of the participations and prediction errors of the stocks of the portfolio:

$$\hat{V} = \hat{\sigma}_p^2 = \sum_{i=1}^{M} \sum_{j=1}^{M} X_i X_j \gamma_{\varepsilon ij} \tag{13}$$

where $\hat{V}$ is the total portfolio risk, which is equal to the variance of the linear combination of the participations and prediction errors of each stock in the portfolio, $\hat{\sigma}_p^2$; M is the number of stocks in the portfolio; $X_i$ and $X_j$ are the participations of stocks i and j in the portfolio, respectively; and $\gamma_{\varepsilon ij}$ is the interactive prediction risk of stocks i and j, which is the covariance of the errors of prediction of stocks i and j (see also Eq.10 and Eq.11), and is given by:

$$\hat{\gamma}_{ij} = \gamma_{\varepsilon ij} = \frac{1}{n-1} \sum_{t=1}^{n} \varepsilon_{it}\, \varepsilon_{jt} \tag{14}$$

Eq.13 can be rewritten as:

$$\hat{V} = \hat{\sigma}_p^2 = \sum_{i=1}^{M} X_i^2 \sigma_{\varepsilon i}^2 + \sum_{i=1}^{M} \sum_{\substack{j=1 \\ j \neq i}}^{M} X_i X_j \gamma_{\varepsilon ij} \tag{15}$$

where the first sum represents the contribution of the risk associated with each stock to the portfolio risk (sum of the square of the participation $X_i$ times the variance of the errors of prediction $\sigma_{\varepsilon i}^2$ of each stock i), and the second group of sums represents the contribution of the interactive prediction risk of each pair of stocks i and j (sum of the participations $X_i$ and $X_j$ times the covariance $\gamma_{\varepsilon ij}$ of stocks i and j).

3.3. Portfolio optimization model

With the fundamental measures defined, the prediction-based portfolio optimization model can be formulated as:

Minimize

$$\hat{V} = \sum_{i=1}^{M} X_i^2 \hat{\nu}_i + \sum_{i=1}^{M} \sum_{\substack{j=1 \\ j \neq i}}^{M} X_i X_j \hat{\gamma}_{ij} \tag{16}$$

Subject to

$$\sum_{i=1}^{M} X_i \hat{r}_i = R_d \tag{17}$$

$$\sum_{i=1}^{M} X_i = 1 \tag{18}$$

$$X_i \geq 0, \qquad i = 1, \ldots, M \tag{19}$$

Eq.16 is the objective function to be minimized, the prediction-based portfolio risk; Eq.17 is the desired return constraint that guarantees the desired portfolio return $R_d$; Eq.18 guarantees total resource allocation; and Eq.19 restricts the model to purchase trades only.

As in the mean-variance model, a set of minimum risk portfolios can be obtained by solving the minimization problem above for a range of desired portfolio return values. This set is named the Efficient Frontier, these portfolios are named Efficient Portfolios, and the investment strategy based on this framework is named Efficient Diversification. Each efficient portfolio has the particular property that, according to the model, there is no other portfolio in the opportunity set that has a lower risk for a given return, or, in the dual problem, a higher return for a given risk.

3.4. Prediction-based portfolio optimization model versus mean-variance model

The prediction-based portfolio optimization model differs from the mean-variance model because:
(i) in the prediction-based portfolio optimization model, the expected return of each stock is its predicted return, instead of the mean of its time series of returns, as is the case in the mean-variance model;
(ii) in the prediction-based portfolio optimization model, the individual risk of each stock and the interactive risk between each pair of stocks are obtained from the variance and covariance of the time series of the errors of prediction (Eqs.11 and 14), instead of from the variances and covariances of the time series of returns;
(iii) although both models are based on the Normal framework, in the prediction-based portfolio optimization model the Normal variable of interest is the error of prediction of the return of the stocks, while, in the mean-variance model, the Normal variable of interest is the return of the stocks.

Correctly predicting the time series of stock returns is recognized as a difficult task [37]. Stock return predictors typically perform poorly, exhibiting high error levels. However, predictors can be combined in a way that exploits the complementarities of their errors, leading to good combined predictions, as is the case for the portfolio return of our model.

The prediction-based portfolio optimization model is based on the assumptions that the mean of the errors of prediction is zero (Eq.10) and that the errors of prediction are Normal. These assumptions are supported by the experimental results shown in Section 4.2.

4. Experiments

This section presents the experiments we have used to evaluate our neural predictors, the Normality of the errors of prediction, and the performance of the prediction-based portfolio optimization model on artificial data, and on real data under several risk levels.

4.1. Data

From the 82 stocks that participated in the IBOVESPA index between December 2004 and September 2007, we selected a subset of 52 stocks with long enough time series for training the neural networks and calculating the necessary parameters of the portfolio optimization models¹. For each one of these 52 stocks, we computed the 413 weekly returns of the period between 27-Oct-1999 and 19-Sep-2007 using closing prices sampled on Wednesdays to avoid the beginning-of-the-week and end-of-the-week effects [10,4, p. 404]. In all cases of missing data, we used the last daily closing price available.

4.2. Prediction of returns

The topological and training parameters of the neural network predictors evaluated were determined empirically in previous works [29,30] and are as follows. We used AR-MRNN(4,1) predictors (Section 2) implemented with a fully connected feedforward neural network with 2 hidden layers, sigmoidal activation function, and a 4:16:4:1 topology (4 input neurons, 16 neurons in the first hidden layer, 4 neurons in the second hidden layer, and 1 output neuron)².

To train and test (use for predictions) the neural networks involved in the experiments, we used a sliding window of 168 of the 413 weekly returns available. This allowed 246 predictions, i.e. the 413 returns (from our full data set) minus 168 (sliding window, including the 4 returns necessary as the first 4 inputs of the AR-MRNN(4,1)), plus 1 (the prediction of the initial window). Therefore, we had 12,792 training sessions (246 train and test cycles × 52 stocks = 12,792). Each training session was conducted during 200,000 epochs using the back-propagation algorithm [31] with learning rate of 0.009 and inertia of 0.95.

The sliding window contains the training set (163 input-output pairs) and the testing set (1 input-output pair). To reduce overfitting, we employed a technique, described below, that required dividing the training set into two segments: a training segment with the first 156 input-output pairs, and a validating segment with the last 7 input-output pairs. The training segment was used for updating the neural networks' weights, while the validating segment was used to select, every 1,000 training epochs (a training block), the weights that provided the smallest root mean square prediction error (RMSE; see Eq.21 in Section 4.2.1).
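The window arithmetic above and the AR-MRNN(4,1) transformation of Section 2 (Eqs.4 to 6) can be made concrete with a minimal sketch. The function names, the half-open index convention, and the toy series are illustrative assumptions. In particular, the exact alignment of the 163 training pairs and the test pair inside each 168-week window is not specified in enough detail here to reproduce, so the sketch only counts window positions and shows the moving-reference transformation on a toy series.

```python
from typing import List, Sequence, Tuple

N_RETURNS, WINDOW, N_STOCKS = 413, 168, 52   # values reported in the text


def window_bounds(n_returns: int, window: int) -> List[Tuple[int, int]]:
    """Half-open [start, stop) index range of every sliding-window position."""
    return [(s, s + window) for s in range(n_returns - window + 1)]


def ar_mrnn_pairs(returns: Sequence[float], p: int,
                  k: int) -> List[Tuple[List[float], float, float]]:
    """Eqs.4-5: inputs and target are taken relative to the reference
    z = r_{t-(p-1)-k}. Returns (inputs, target, z) triples."""
    triples = []
    for t in range(p - 1 + k, len(returns) - 1):   # t is the last input
        z = returns[t - (p - 1) - k]
        inputs = [r - z for r in returns[t - (p - 1):t + 1]]
        target = returns[t + 1] - z
        triples.append((inputs, target, z))
    return triples


def recover(predicted_difference: float, z: float) -> float:
    """Eq.6: add the reference back to obtain the predicted return."""
    return predicted_difference + z


windows = window_bounds(N_RETURNS, WINDOW)
sessions = len(windows) * N_STOCKS     # one training session per window/stock

# Toy returns (made up) to show the transformation for AR-MRNN(4,1).
toy = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.04]
toy_triples = ar_mrnn_pairs(toy, p=4, k=1)
```

Here `len(windows)` reproduces the 246 window positions and `sessions` the 12,792 training sessions. Applying `recover` to a target from `toy_triples` returns the original next-period return, which is the identity a trained network's output would be passed through.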
The traditional use of training and validating segments for controlling overfitting [31] is challenged by non-stationary time series prediction, precisely because the structural regime of such series changes over time, so more recent data are needed for both the training segment (for encoding more recent information into the network's weights) and the validating segment (for controlling overfitting). There are no standard methods to address this trade-off, but some alternatives do apply, like shifting the validation segment to other locations in time [24].

The overfitting control procedure we have used is inspired by the Non-linear Cross Validation (NCV) proposed in [39]. In our procedure, we first obtain a set of network weights at the end of a training block. Next, before starting the next training block, we perturb the weights by inserting new information through additional training with the validating segment during a small number of epochs (10 epochs), under the assumption that this perturbation will help in finding a better set of weights for predicting the next time series value. Then, we use the network to predict the returns in the validation segment and examine its RMSE prediction error with the current set of weights. If it is the best performance so far, we save the current set of weights. We then proceed with the following training block. This procedure is repeated for the 200,000 epochs, at the end of which we have the best set of weights, which is used to predict the test set of the current sliding window. The sizes of the training and validating segments were obtained using the heuristic presented in [31, p. 217].

The training and testing procedure described above was repeated for all 246 predictions by advancing the sliding window of 168 weeks, one week at a time. We performed the resulting 12,792 training sessions on the 64-node (ATHLON XP 1800) Enterprise cluster of the High Performance Computing Lab of the Departamento de Informática of Universidade Federal do Espírito Santo (http://www.lcad.inf.ufes.br).

¹ Since the end of the 1990s, the Brazilian stock market has been experiencing a growing and maturing process, with a steady increase in the number of initial public offerings (IPOs). Many of these new stocks entered the IBOVESPA index in the beginning of the 2000s, and did not have long enough time series to be included in our analysis.

² Smaller networks typically underfitted the training data. This network size provides good generalization [38], but demands a smoother training, with a small learning rate and a large number of epochs to avoid saturation of the neurons' outputs.

4.2.1. Evaluation metrics

We used the Mean Error, Root Mean Square Error, Mean Absolute Percentage Error, and Hit Rates metrics to evaluate our predictors' performance. The Mean Error (ME) is the average difference between the realized and predicted returns, defined as:

$$ME = \frac{1}{n} \sum_{t=1}^{n} (r_t - \hat{r}_t) \tag{20}$$

where n is the length of the time series, and $r_t$ and $\hat{r}_t$ are the realized and predicted returns at time t, respectively. The ME is used to evaluate the assumption of Eq.10 and the Normality of the errors of prediction.

The Root Mean Square Error (RMSE) is a standard metric for comparing the differences between two time series, and it is defined as:

$$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (r_t - \hat{r}_t)^2} \tag{21}$$

The RMSE may be interpreted as the standard deviation of the errors of prediction (see Eq.8) with respect to a zero mean, and it shows the distance of these errors from the ideal situation of zero mean error (see Eq.10). The RMSE has low outlier protection, good sensitivity for small changes in data, and does not display data asymmetry [40].

The Mean Absolute Percentage Error (MAPE) is defined as:

$$MAPE = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{r_t - \hat{r}_t}{r_t} \right| \tag{22}$$

The MAPE is a unit-free measure, has good sensitivity for small changes in data, does not display data asymmetry and has very low outlier protection [40].
It is important to note that, when $r_t$ is very close to zero and $\hat{r}_t$ is not, the MAPE can average to a very large value.

The Hit Rates HR, HR+ and HR− measure how often the signs of $\hat{r}$ and $r$ coincide: HR is the percentage of predictions where both have the same sign, among those where both are different from zero; HR+ is the percentage of positive predictions for which the realized return was also positive; and HR− is the percentage of negative predictions for which the realized return was also negative. These metrics are suitable for evaluating predictors as trading signal generators [37], and can be written as:

$$HR = \frac{\mathrm{Count}_{t=1}^{n}(r_t \hat{r}_t > 0)}{\mathrm{Count}_{t=1}^{n}(r_t \hat{r}_t \neq 0)} \tag{23}$$

$$HR^{+} = \frac{\mathrm{Count}_{t=1}^{n}(r_t > 0 \ \mathrm{AND}\ \hat{r}_t > 0)}{\mathrm{Count}_{t=1}^{n}(\hat{r}_t > 0)} \tag{24}$$

$$HR^{-} = \frac{\mathrm{Count}_{t=1}^{n}(r_t < 0 \ \mathrm{AND}\ \hat{r}_t < 0)}{\mathrm{Count}_{t=1}^{n}(\hat{r}_t < 0)} \tag{25}$$

where Count(·) is the number of occurrences of its argument.

Table 1
Summary of the results obtained with the 52 predictors.

Error    mean          σ²            σ
ME       −0.000578     0.000011      0.003241
RMSE     0.072500      0.000254      0.015950
MAPE     9.398462      163.102558    12.771161
HR       0.512926      0.001255      0.035419
HR+      0.610100      0.003013      0.054890
HR−      0.383322      0.003304      0.057476

4.2.2. Prediction performance

Table 1 presents the results of the experiments we employed to assess the prediction performance of the 52 AR-MRNN(4,1) predictors (one for each stock). It shows the mean, variance (σ²), and standard deviation (σ) of the results obtained with the 246 predictions of each predictor, measured according to each metric of the previous section.

As Table 1 shows, the average of the ME was close to zero with a low standard deviation, thus validating the assumption of Eq.10. The RMSE, MAPE and HR achieved typical levels for this application [20,37]. The HR+ of 61% showed that the predictors achieved a performance 11 percentage points above pure chance (50%) in predicting the positive returns of the market, while the prediction of the negative returns obtained an HR− near 38%.
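As a reading aid for the metrics of Section 4.2.1, Eqs.20 to 25 can be written as a few plain functions. This is a minimal sketch with made-up toy values; the function names are illustrative, and the code assumes that no realized return is exactly zero (for MAPE) and that at least one positive and one negative prediction exist (for HR+ and HR−), since the equations are undefined otherwise.

```python
import math
from typing import Sequence, Tuple


def mean_error(r: Sequence[float], r_hat: Sequence[float]) -> float:
    """Eq.20: average of realized minus predicted returns."""
    return sum(a - b for a, b in zip(r, r_hat)) / len(r)


def rmse(r: Sequence[float], r_hat: Sequence[float]) -> float:
    """Eq.21: root of the mean squared prediction error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_hat)) / len(r))


def mape(r: Sequence[float], r_hat: Sequence[float]) -> float:
    """Eq.22: mean absolute error relative to the realized return.
    Undefined if any realized return is exactly zero."""
    return sum(abs((a - b) / a) for a, b in zip(r, r_hat)) / len(r)


def hit_rates(r: Sequence[float],
              r_hat: Sequence[float]) -> Tuple[float, float, float]:
    """Eqs.23-25: (HR, HR+, HR-). Raises ZeroDivisionError when a
    denominator count is zero."""
    same_sign = sum(1 for a, b in zip(r, r_hat) if a * b > 0)
    nonzero = sum(1 for a, b in zip(r, r_hat) if a * b != 0)
    pos_hits = sum(1 for a, b in zip(r, r_hat) if a > 0 and b > 0)
    pos_pred = sum(1 for b in r_hat if b > 0)
    neg_hits = sum(1 for a, b in zip(r, r_hat) if a < 0 and b < 0)
    neg_pred = sum(1 for b in r_hat if b < 0)
    return same_sign / nonzero, pos_hits / pos_pred, neg_hits / neg_pred


# Toy realized and predicted returns (made up for illustration).
realized = [0.02, -0.01, 0.03, -0.02]
predicted = [0.01, 0.01, 0.02, -0.01]
hr, hr_pos, hr_neg = hit_rates(realized, predicted)
```

In the toy data the second prediction has the wrong sign, so HR is 3/4, HR+ is 2/3 (two of the three positive predictions were right), and HR− is 1 (the single negative prediction was right). The second point also shows the MAPE sensitivity noted above: its relative error alone is 2.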
Apart from the HR+ and the ME, the performance of the predictors was modest; however, our prediction-based portfolio optimization model was able to capture these two good collective results to achieve good portfolio performance, as we show in Section 4.4.

4.2.3. Normality of the errors of prediction and returns

We examined the Normality of the errors of prediction and the Normality of the returns of the precise time series used for portfolio optimization. For that, we measured the percentage of these series that had Normality accepted (i.e. not rejected) according to the Chi-square goodness-of-fit test [41] at a significance level of 1%. We called this measure the Normality Index.

Not all 52 stocks participated in all portfolio optimizations of our trading experiments (Section 4.4), but only those that were in the IBOVESPA index at the time of each portfolio optimization (the number of stocks used in each week varied between 40 and 49 of the 52). We did that to allow better performance comparisons of the portfolio optimization models under analysis with the IBOVESPA index (see Section 4.4).

We performed trading experiments during the last 142 weeks (from 5-Jan-2005 to 19-Sep-2007) of the 413 weeks available in our data set. To compute the parameters of the portfolio optimization models (the variances and covariances of the prediction errors for the prediction-based portfolio optimization model, and the means, variances and covariances of returns for the mean-variance model) we used a sliding window of 104 weeks preceding each of the 142 weeks mentioned above.

The Normality Indexes computed at each one of the 142 weeks are shown in Fig.1. As Fig.1 shows, the Normality Indexes of the series of errors of prediction were slightly higher and varied less than the Normality Indexes of the series of returns. Table 2 summarizes the 142 Normality Indexes of each series, showing the mean, variance (σ²), and standard deviation (σ) of the Normality Indexes.
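One way to compute a Normality Index of the kind described above is sketched below. The text only specifies a Chi-square goodness-of-fit test at the 1% level; everything else here is an assumption made for illustration: the use of 10 equiprobable bins under a Normal fitted with the sample mean and standard deviation, the resulting 7 degrees of freedom (10 bins minus 1, minus 2 estimated parameters), and the tabulated critical value 18.475 for that case.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev
from typing import Sequence

N_BINS = 10                  # assumed number of equiprobable bins
CRITICAL_1PCT_DF7 = 18.475   # chi-square critical value, alpha = 0.01, df = 7


def chi_square_statistic(series: Sequence[float]) -> float:
    """Chi-square statistic of the series against a fitted Normal,
    using bins that each have probability 1 / N_BINS under the fit."""
    fitted = NormalDist(mean(series), stdev(series))
    edges = [fitted.inv_cdf(i / N_BINS) for i in range(1, N_BINS)]
    observed = [0] * N_BINS
    for x in series:
        observed[bisect_right(edges, x)] += 1
    expected = len(series) / N_BINS
    return sum((o - expected) ** 2 / expected for o in observed)


def normality_index(series_list: Sequence[Sequence[float]]) -> float:
    """Fraction of series whose Normality is not rejected."""
    accepted = sum(1 for s in series_list
                   if chi_square_statistic(s) <= CRITICAL_1PCT_DF7)
    return accepted / len(series_list)


# Two made-up series: evenly spaced Normal quantiles, and a two-point series.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
two_point = [-1.0] * 50 + [1.0] * 50
index = normality_index([normal_like, two_point])
```

The quantile-based series is accepted and the two-point series is rejected, so the index of this toy pair is 0.5. In the setting of the text, `series_list` would hold the 104-week error (or return) series of the stocks used in a given week.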
We performed the t-test [41] of the means of the Normality Indexes at a significance level of 5%. The result indicated that the mean of the Normality Index of the series of errors of prediction is larger than that of the series of returns (the t-statistic value of 3.41 was larger than the critical value t_0.05 of 1.645).

[Figure 1: two panels, (a) Normality Indexes of the errors of prediction and (b) Normality Indexes of the series of returns; x-axis: Date (2005 to 2007), y-axis: Normality Index (0.75 to 1).]

Fig. 1. Normality Indexes for (a) the series of errors of prediction, and (b) the series of returns, at each trading week. The series of errors of prediction are slightly more Normal than the series of returns.

Table 2. Summary of the Normality Indexes for the series of errors of prediction and for the series of returns.

                                                         mean       σ²         σ
Normality Index for the series of errors of prediction   0.921621   0.001336   0.036561
Normality Index for the series of returns                0.905141   0.001980   0.044503

4.3. Trading with artificial data

We present a trading experiment with artificial data to show how the prediction-based portfolio optimization model takes advantage of predictive opportunities that are not visible to the mean-variance model. For that, we designed a trading experiment using the two simple artificial stocks of Fig. 2(a): a stock R1 with constant weekly returns of 0.005, and a stock S1 with zero-mean sinusoidal weekly returns between −0.01 and 0.01. The experiment consisted of implementing the trading strategy of the next section to invest in portfolios that maximize the expected return. To evaluate the portfolios' performance we used the metrics of Section 4.3.2 over 142 weeks.
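The setup just described can be mimicked with a small simulation. The sketch below is illustrative only: the sinusoid's period (`PERIOD`) and phase are assumptions, since the text does not state them, and the neural predictor is replaced by perfect foresight. Because of these assumptions, the prediction-based figure it prints will not match the paper's reported value; only the constant-return stock R1 is fully specified.

```python
import math

WEEKS = 142
PERIOD = 32  # hypothetical sinusoid period in weeks; not given in the text

# R1: constant weekly return of 0.005; S1: zero-mean sinusoid in [-0.01, 0.01].
r1 = [0.005] * WEEKS
s1 = [0.01 * math.sin(2.0 * math.pi * t / PERIOD) for t in range(WEEKS)]


def accumulate(returns):
    """Compound weekly arithmetic returns, starting from a wealth of 1."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth


# Mean-variance view: S1 has zero mean return, so the maximum-return portfolio is always R1.
mv_returns = r1
# Prediction-based view (assuming a perfect predictor): hold whichever stock returns more this week.
pred_returns = [max(a, b) for a, b in zip(r1, s1)]

mv_acc = accumulate(mv_returns)
pred_acc = accumulate(pred_returns)
print(round(mv_acc, 4), round(pred_acc, 4))
```

The mean-variance side compounds 0.005 for 142 weeks, which lands close to 2.03 regardless of the sinusoid's parameters; the prediction-based side can only do better, since it never earns less than R1 in any week.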
[Figure 2: two panels, (a) Artificial series of returns (curves R1 and S1; y-axis: Return) and (b) Accumulated returns of the artificial portfolios (curves PRED and MV; x-axis: Date, y-axis: Accumulated Return).]

Fig. 2. (a) The artificial series of returns of R1 and S1; (b) Accumulated Returns obtained with the artificial stocks R1 and S1. The prediction-based portfolio optimization model (PRED) switched positions between R1 and S1, using the latter when it predicted a superior return (shaded regions), while the mean-variance model (MV) used only stock R1, because the mean return of stock S1 is zero.

4.3.1. Trading strategy

The trading strategy employed [3, p. 139] consisted of: (i) obtaining an Efficient Frontier from the data available at time t, (ii) selecting the desired efficient portfolio, (iii) investing all the available wealth in the stocks according to the weights of the portfolio, and (iv) selling the entire portfolio at time t + 1 and reinvesting the whole wealth obtained in a new portfolio selected in the same way. As usual, we implemented the above trading strategy for evaluating our models under the following assumptions: (i) the stocks are perfectly divisible; (ii) it is possible to buy and sell any selected portfolio; (iii) there is no friction (transaction costs, taxation, commissions, etc.); and (iv) it is possible to buy and sell stocks at closing prices at any time t.

4.3.2. Portfolio evaluation metrics

The portfolio performance evaluation was based on the return and risk measures defined in Section 2 and Section 3, and on the Accumulated Return, Portfolio Change Measure, and Turnover Index metrics described below. The Accumulated Return is defined as:

$$\text{Accumulated Return}_t = \prod_{i=0}^{t} (1 + r_i) \tag{26}$$

where r_i is the arithmetic return at time i, as defined in Eq. 1.
This is a standard performance measure for comparing investments and relates the wealth at time t, W_t, to the initial wealth, W_0, as W_t = W_0 × Accumulated Return_t. All the trading experiments in this work used an initial wealth W_0 = 1.

Table 3. Summary of the artificial trading during the 142 weeks.

Prediction-based portfolios selected at maximum r̂_p
Accumulated Return   2.3210
                     mean          σ²            σ
weekly returns       0.00594865    2.82215e−06   0.00167993
TI                   0.12676       0.240729      0.490642
PCM                  5.75046e−05
                     mean   min   max
number of stocks     1      1     1

Mean-variance portfolios selected at maximum r̄_p
Accumulated Return   2.0303
                     mean          σ²            σ
weekly returns       0.005         3.79465e−16   1.94799e−08
TI                   0             0             0
PCM                  0
                     mean   min   max
number of stocks     1      1     1

The Portfolio Change Measure (PCM) [42,43] is the covariance in time between the returns of the stocks and the variation of the participations of the same stocks, and is defined as:

$$PCM = \frac{1}{T} \sum_{t=1}^{T} \sum_{i=1}^{M} r_{it} \, (X_{it} - X_{it-1}) \tag{27}$$

where T is the length of the time period under analysis (in our case, the number of weeks), M is the number of stocks in the portfolio, r_it is the return of stock i at time t, and X_it and X_it−1 are the participations of stock i in the portfolios of times t and t − 1, respectively. A stock contributes positively to the portfolio's PCM when it has a positive return at time t (see Eq. 1) and its participation increases between times t − 1 and t, or when it has a negative return at time t and its participation decreases between times t − 1 and t.

We propose the Turnover Index (TI), which measures the percentage of the invested wealth that is exposed to frictions at time t. The TI is defined as:

$$TI_t = \sum_{i=1}^{M^{\{t-1,t\}}} |X_{it} - X_{it-1}| \tag{28}$$

where M^{t−1,t} is the number of the stocks belonging to both the portfolios of times t and t − 1, X_it and X_it−1 are the participations of stock i in the portfolios of times t and t − 1, respectively, and | · | is the absolute value of its argument.
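Eqs. 26–28 translate directly into code. The following is a minimal sketch, assuming each portfolio is stored as a list of participations over one fixed, common list of stocks (a stock absent from a portfolio has participation 0); the function names and the toy numbers are hypothetical.

```python
def accumulated_return(returns):
    """Eq. 26: product of (1 + r_i) over the weekly arithmetic returns."""
    acc = 1.0
    for r in returns:
        acc *= 1.0 + r
    return acc


def turnover_index(prev_weights, weights):
    """Eq. 28: sum of absolute changes in participation between t - 1 and t."""
    return sum(abs(x - x_prev) for x_prev, x in zip(prev_weights, weights))


def portfolio_change_measure(stock_returns, weights):
    """Eq. 27. stock_returns[t][i] and weights[t][i] are indexed t = 0..T; t = 0 is the initial portfolio."""
    T = len(weights) - 1
    total = 0.0
    for t in range(1, T + 1):
        for r, x, x_prev in zip(stock_returns[t], weights[t], weights[t - 1]):
            total += r * (x - x_prev)
    return total / T


# Toy example with two stocks: hold stock 0 for two weeks, then switch fully to stock 1.
weights = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
stock_returns = [[0.005, 0.000], [0.005, 0.002], [0.005, 0.010]]

ti_hold = turnover_index(weights[0], weights[1])    # same portfolio kept
ti_switch = turnover_index(weights[1], weights[2])  # complete switch
pcm = portfolio_change_measure(stock_returns, weights)
acc = accumulated_return([0.005, 0.005])
print(ti_hold, ti_switch, pcm, acc)
```

In the toy example the switch moves wealth into the stock that earns more in that week, so the PCM comes out positive, in line with the interpretation given after Eq. 27.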
TI_t is bounded in the interval [0, 2]. The lower bound corresponds to holding the same portfolio between times t − 1 and t (a 0% exposure to frictions), and the upper bound corresponds to selling the entire portfolio at time t − 1 and buying a completely different one at time t (a 200% exposure to frictions).

4.3.3. Experimental results

Fig. 2(b) shows the performance of the prediction-based portfolio optimization model and the mean-variance model, presenting their Accumulated Returns during the trading period defined in Section 4.2.3, i.e. the last 142 weeks (from 5-Jan-2005 to 19-Sep-2007) of the 413 weeks available in our data set. The neural predictors learned to predict both series of Fig. 2(a) with zero error, allowing the prediction-based portfolio optimization model (the PRED curve in Fig. 2(b)) to accumulate higher returns by alternating participations between R1 and S1, using whichever had the higher predicted return, while the mean-variance model (MV in Fig. 2(b)) used only the R1 stock, because this stock always had a higher mean return than S1.

[Figure 3: two panels, (a) Efficient Frontiers of Prediction-based Model (curves PRED 001 and PRED 142) and (b) Efficient Frontiers of Mean-variance Model (curves MV 001 and MV 142); x-axis: Risk, y-axis: Return.]

Fig. 3. Two Efficient Frontiers of the (a) prediction-based portfolio optimization model (PRED), and (b) mean-variance model (MV). The prediction-based portfolio optimization model obtained higher returns for the same risk levels, and has fewer negative r̂_p − σ̂_p values than the mean-variance model's r̄_p − σ_p values (vertical bars).

Table 3 summarizes the results obtained with all metrics employed in the evaluation. As Table 3 shows, the prediction-based portfolio optimization model presented a final Accumulated Return of 2.3210, a value 14.31% higher than the 2.0303 achieved by the mean-variance model.
The prediction-based portfolio optimization model accumulated returns at weekly rates of 0.005 or 0.01 (shaded area of Fig. 2), with a mean weekly return of 0.00594865, while the mean-variance model accumulated returns only at the 0.005 rate. The prediction-based portfolio optimization model presented a small positive PCM due to the periods when the model correctly switched between R1 and S1, and a mean TI of 0.12676, also due to the 9 switches between R1 and S1 observed during the 142 weeks. The mean-variance model presented zero PCM and zero mean TI because it used only the R1 stock in all weeks. Both models obtained portfolios with 1 stock at all times.

4.4. Trading with real data

This section presents the trading experiments we performed with real data from the Brazilian stock market. With these experiments, we evaluated the prediction-based portfolio optimization model by comparing its performance with that of the mean-variance model and the IBOVESPA market index. We used the real data described in Section 4.1, and the same trading strategy, evaluation metrics and periods of time as in Section 4.3.

4.4.1. Efficient frontiers

In all the following experiments, we computed Efficient Frontiers with 30 portfolios for both the prediction-based portfolio optimization model and the mean-variance model at each of the 142 weeks of the trading period. A total of 8,520 ((30 + 30) × 142 = 8,520) quadratic optimization problems were solved in the experiments. Fig. 3(a) shows two of the 142 Efficient Frontiers of the prediction-based portfolio optimization model (PRED), obtained at the first and last trading weeks (t = 1 and t = 142), respectively; and Fig. 3(b) shows the same for the mean-variance model (MV). The efficient portfolios are shown as the squares and circles of Fig. 3, while the 1σ vertical bars, obtained by calculating the square root of the portfolio risk (variance), are plotted for each efficient portfolio.
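To make concrete what one point of an Efficient Frontier and its 1σ bar represent, the sketch below traces a frontier for just two assets. It is a deliberate simplification, not the quadratic program solved in the experiments: with two fully invested, long-only assets the target return pins down the weights, so no solver is needed. The function name and all numeric inputs are hypothetical; the risk inputs stand for variances and a covariance of prediction errors, as in the prediction-based model.

```python
import math


def two_asset_frontier(r_hat, err_var, err_cov, n_points=30):
    """Return the efficient (return, variance) points for two fully invested, long-only assets.

    Assumes the two predicted returns differ; otherwise the weight formula divides by zero.
    """
    (r_a, r_b), (var_a, var_b) = r_hat, err_var
    lo, hi = min(r_a, r_b), max(r_a, r_b)
    points = []
    for k in range(n_points):
        target = lo + (hi - lo) * k / (n_points - 1)
        w_a = (target - r_b) / (r_a - r_b)  # weight that achieves the target return
        w_b = 1.0 - w_a
        variance = w_a ** 2 * var_a + w_b ** 2 * var_b + 2.0 * w_a * w_b * err_cov
        points.append((target, variance))
    # Keep only the efficient branch: from the minimum-variance point upward in return.
    k_min = min(range(n_points), key=lambda k: points[k][1])
    return points[k_min:]


# Hypothetical inputs: predicted weekly returns, prediction-error variances and covariance.
frontier = two_asset_frontier(r_hat=(0.020, 0.005), err_var=(0.006, 0.001), err_cov=0.0005)

# 1-sigma bar of each efficient portfolio: expected return plus/minus sqrt(variance).
bars = [(ret - math.sqrt(var), ret + math.sqrt(var)) for ret, var in frontier]
for (ret, var), (low, high) in zip(frontier[:3], bars[:3]):
    print(f"return={ret:.5f} risk={var:.6f} bar=[{low:.5f}, {high:.5f}]")
```

A bar whose lower end is above zero marks a portfolio whose expected return exceeds one standard deviation of its risk, which is the visual cue read off the vertical bars of Fig. 3.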
The 1σ bars of Fig. 3 outline the regions, or confidence intervals, where the portfolio returns are expected to lie with a probability of 68.27%, according to the Normal distribution [41]. As the curves and respective confidence intervals of Fig. 3 show, the prediction-based portfolio optimization model produced higher expected returns than the mean-variance model for the same levels of risk. That is, the efficient portfolios of the prediction-based portfolio optimization model with risk above 0.001 have a probability of negative returns p(r̂_p − σ̂_p < 0) < 15.87% ((100 − 68.27)/2 = 15.87%), while all the efficient portfolios of the mean-variance model have a probability of negative returns p(r̄_p − σ_p < 0) significantly higher than that. To our knowledge, this is the first time in the literature that efficient frontiers are analyzed with the help of the confidence intervals of the portfolios' expected returns.

4.4.2. Trading with real data – low risk

This experiment evaluates the portfolio models when selecting portfolios with a low level of risk. We selected portfolios with weekly expected returns r̄_p and r̂_p equal to 0.0135, since they typically lie at the beginning of the Efficient Frontiers of the models with our data set (see Fig. 3). Fig. 4 shows the Accumulated Returns obtained in the trading period with the prediction-based portfolio optimization model (PRED), the mean-variance model (MV), and the IBOVESPA index (IBOV), according to the trading strategy of Section 4.3. As Fig. 4 shows, the prediction-based portfolio optimization model outperformed the mean-variance model at the end of the 142 weeks, and tracked the market index slightly better than the mean-variance model, which oscillated around the IBOVESPA index. It is important to note that the mean-variance model performed below IBOVESPA in the period.

Fig. 5 shows an evaluation of the Accumulated Returns of the portfolio models in the context of the Normal framework.
The figure shows the realized and expected Accumulated Returns curves for both models. The realized Accumulated Returns curve is obtained as usual (see Eq. 26), while the expected Accumulated Returns curve is obtained as follows. At each time t, the expected return of the portfolio selected at time t − 1 is compounded onto the portfolio's Accumulated Return at time t − 1 and plotted as the portfolio's expected Accumulated Return for time t. Then, we plot a 3σ bar (3 times the square root of the variance, or risk, of the portfolio selected at time t − 1) centered at the expected Accumulated Return for time t. This 3σ bar outlines a region where the portfolio's Accumulated Return for time t is expected to lie with a probability of 99.73%, according to the Normal distribution [41]. As Fig. 5 shows, most of the realized Accumulated Returns of both models are inside the 3σ bars.

Table 4 summarizes the results of the IBOVESPA market index, and Table 5 summarizes the results of the portfolio models. As the tables show, the prediction-based portfolio optimization model achieved an Accumulated Return of 2.184123, which is 29% higher than the 1.691034 achieved by the mean-variance model, and is very close to the 2.188907 achieved by IBOVESPA (see Table 4). The prediction-based portfolio optimization model obtained an ex-post risk (variance) of 0.000744, which is 57.40% below the 0.001747 of the mean-variance model, and 25.40% below the 0.000998 of the IBOVESPA (see Table 4). Thus, the prediction-based portfolio optimization model performed at market return rates with significantly lower risk.

[Figure 4: Weekly accumulated returns of the portfolios selected at Rd = 0.0135 and the IBOVESPA index (curves PRED, MV and IBOV); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 4. Trading with real data – low risk: Accumulated Returns of the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV), with r̂_p = 0.0135 and r̄_p = 0.0135, respectively, and of the IBOVESPA (IBOV) market index. The prediction-based portfolio optimization model outperformed the mean-variance model at the end of the 142 weeks, and tracked the IBOVESPA index better.

Table 4. Summary of the IBOVESPA market index.

Accumulated Return   2.188907
                     mean       σ²         σ
weekly returns       0.006031   0.000998   0.031591
                     mean   min   max
number of stocks     57     53    63

We used a paired t-test with a significance level of 5% to compare the means of the 11 quarterly accumulated returns of the models (see footnote 3) and found a p-value of 0.384303, indicating no statistically significant difference between the mean of the prediction-based portfolio optimization model and the mean of the mean-variance model. Both models presented a very small negative PCM, and the mean TI of 42% for the prediction-based portfolio optimization model was slightly better than the 45% for the mean-variance model. The number of stocks in the prediction-based portfolios varied between 12 and 34, with an average of 21 stocks, while the number of stocks in the mean-variance portfolios varied between 2 and 21, with an average of 7 stocks. Thus, the prediction-based portfolio optimization model achieved a better diversification level, and approached Statman's reference level of 30 stocks [44].

Footnote 3. Each quarterly accumulated return was calculated with Eq. 26 over 13 weeks (52/4 = 13). We repeated the last weekly Accumulated Return of both models to obtain 142 + 1 = 143 trading periods and exactly 11 quarterly Accumulated Returns (143/13 = 11).

[Figure 5: Realized vs. expected weekly accumulated returns of the portfolios selected at Rd = 0.0135 with the 3σ bars; two panels (PRED realized and expected; MV realized and expected); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 5. Trading with real data – low risk: evaluation of the Accumulated Returns under the Normal framework's expected performance for the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV).

Table 5. Trading with real data – low risk: summary of the trading during the 142 weeks.

Prediction-based portfolios selected at r̂_p = 0.0135
Accumulated Return   2.184123
                     mean        σ²         σ
weekly returns       0.005887    0.000744   0.027284
TI                   0.421122    0.057968   0.240766
PCM                  −0.000590
                     mean   min   max
number of stocks     21     12    34

Mean-variance portfolios selected at r̄_p = 0.0135
Accumulated Return   1.691034
                     mean        σ²         σ
weekly returns       0.004564    0.001747   0.041804
TI                   0.454311    0.104235   0.322854
PCM                  −0.000636
                     mean   min   max
number of stocks     7      2     21

4.4.3. Trading with real data – medium risk

[Figure 6: Weekly accumulated returns of the portfolios selected at the risk-free intercept and the IBOVESPA index (curves PRED, MV and IBOV); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 6. Trading with real data – medium risk: Accumulated Returns for the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV), selecting portfolios at the risk-free point of tangency, and for the IBOVESPA (IBOV) market index. The prediction-based portfolio optimization model outperformed the mean-variance model and the IBOVESPA index in almost all weeks.

Investors frequently face the problem of diversifying into a portfolio of stocks and a risk-free asset (e.g. the Treasury Bill). In this case, it can be shown that the optimal values of risk and return lie on the line that is tangent to the Efficient Frontier (plotted with risks as standard deviations instead of variances) and intercepts the return axis at the risk-free return (R_rf) value.
These optimal values correspond to the various combinations of the participation of the risk-free asset and the participation of the efficient portfolio that touches the tangent; all these combinations are also efficient [3,4]. The location of this point of tangency in the Efficient Frontier depends on the frontier's shape and the value of R_rf. In this experiment, we selected portfolios at this point of tangency, for a weekly risk-free return R_rf = 0.001136, which corresponds to a 6% annualized interest rate. These portfolios were located near the middle of the Efficient Frontiers of the models (see Fig. 3).

Fig. 6 shows the Accumulated Returns obtained in the trading period with the prediction-based portfolio optimization model (PRED), the mean-variance model (MV), and the IBOVESPA index (IBOV), according to the same trading strategy as in the previous experiment. As Fig. 6 shows, the prediction-based portfolio optimization model outperformed the mean-variance model and the IBOVESPA index during almost all the weeks. It is important to note that the mean-variance model performed below IBOVESPA in the period.

Fig. 7 shows the evaluation of the Accumulated Returns of the portfolio models in the context of the Normal framework, as in the previous experiment. As Fig. 7 shows, several Accumulated Returns of the prediction-based portfolio optimization model are outside the 3σ bars, while most of the Accumulated Returns of the mean-variance model are inside the 3σ bars. In this experiment, we are selecting portfolios with higher predicted returns (and risks) than in the previous experiment of Section 4.4.2. Hence, the prediction-based portfolio optimization model produced several predicted returns above the returns realized in the market, but in the right direction (see HR+ in Table 1), thus separating its expected and realized curves at several points.

[Figure 7: Realized vs. expected weekly accumulated returns of the portfolios selected at the risk-free intercept with the 3σ bars; two panels (PRED realized and expected; MV realized and expected); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 7. Trading with real data – medium risk: evaluation of the Accumulated Returns under the Normal framework's expected performance for the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV).

Table 6. Trading with real data – medium risk: summary of the trading during the 142 weeks.

Prediction-based portfolios selected at risk-free tangency point
Accumulated Return   3.078600
                     mean        σ²         σ
weekly returns       0.008924    0.002003   0.044756
TI                   1.767750    0.071966   0.268266
PCM                  0.0013431
                     mean   min   max
number of stocks     6      2     14

Mean-variance portfolios selected at risk-free tangency point
Accumulated Return   1.699988
                     mean        σ²         σ
weekly returns       0.004292    0.001101   0.033184
TI                   0.263251    0.015723   0.125393
PCM                  −0.000645
                     mean   min   max
number of stocks     11     6     24

Table 6 summarizes the results of the portfolio models. As the table shows, the prediction-based portfolio optimization model achieved an Accumulated Return of 3.078600, which is 81.1% above the 1.699988 achieved by the mean-variance model, and 40.6% above the 2.188907 achieved by IBOVESPA (see Table 4). The prediction-based portfolio optimization model obtained an ex-post risk (variance) of 0.002003, which is 81.9% above the 0.001101 of the mean-variance model, and 100.7% above the 0.000998 of the IBOVESPA (see Table 4). We used a paired t-test with a significance level of 5% to compare the means of the 11 quarterly accumulated returns of the models and found a p-value of 0.021143, indicating that the mean of the prediction-based portfolio optimization model is greater than the mean of the mean-variance model.

The prediction-based portfolio optimization model achieved a PCM of 0.0013431 while the mean-variance model achieved a PCM of −0.000645. This shows that the prediction-based portfolio optimization model exploited the predictive opportunities better than the mean-variance model. The prediction-based portfolio optimization model achieved a mean TI of 176% while the mean-variance model achieved 26%. This shows that the prediction-based portfolio optimization model used a more aggressive strategy than the mean-variance model, changing its participations more intensively. The number of stocks in the prediction-based portfolios varied between 2 and 14, with an average of 6 stocks, while the number of stocks in the mean-variance portfolios varied between 6 and 24, with an average of 11 stocks. The prediction-based portfolio optimization model showed modest diversification because of its more aggressive strategy.

4.4.4. Trading with real data – high risk

[Figure 8: Weekly accumulated returns of the portfolios selected at maximum Rd and the IBOVESPA index (curves PRED, MV and IBOV); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 8. Trading with real data – high risk: Accumulated Returns for the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV), selecting portfolios with maximum expected return, and for the IBOVESPA (IBOV) market index. The prediction-based portfolio optimization model strongly outperformed the mean-variance model and the IBOVESPA index in almost all weeks.

This experiment evaluates the performance of the portfolio models employed for maximizing returns, as in traditional trading systems.
In this application, the models are expected to select the stock with the highest expected return (the predicted return for the prediction-based portfolio optimization model, and the mean return for the mean-variance model), thus selecting portfolios at the far right end of the Efficient Frontiers.

[Figure 9: Realized vs. expected weekly accumulated returns of the portfolios selected at maximum Rd with the 3σ bars; two panels (PRED realized and expected; MV realized and expected); x-axis: Date (Jan 2005 to Oct 2007), y-axis: Accumulated Return.]

Fig. 9. Trading with real data – high risk: evaluation of the Accumulated Returns under the Normal framework's expected performance for the prediction-based portfolio optimization model (PRED) and the mean-variance model (MV).

Fig. 8 shows the Accumulated Returns obtained in the trading period with the prediction-based portfolio optimization model (PRED), the mean-variance model (MV), and the IBOVESPA index (IBOV), according to the same trading strategy as in the previous experiments. As Fig. 8 shows, the prediction-based portfolio optimization model strongly outperformed the mean-variance model and the IBOVESPA index during almost all weeks. It is important to note that the mean-variance model performed far below IBOVESPA in the period.

Fig. 9 shows the evaluation of the Accumulated Returns of the portfolio models in the context of the Normal framework, as in the previous experiments. As Fig. 9 shows, several Accumulated Returns of the prediction-based portfolio optimization model are outside the 3σ bars, while most of the Accumulated Returns of the mean-variance model are inside the 3σ bars, a behavior similar to that of the previous experiment of Section 4.4.3.

Table 7 summarizes the results of the portfolio models.
As the table shows, the prediction-based portfolio optimization model achieved an Accumulated Return of 3.893728, which is 291.9% above the 0.993455 achieved by the mean-variance model, and 77.9% above the 2.188907 achieved by IBOVESPA (see Table 4). The prediction-based portfolio optimization model obtained an ex-post risk (variance) of 0.003835, which is 21.2% above the 0.003164 of the mean-variance model, and 284.3% above the 0.000998 of the IBOVESPA (see Table 4). We used a paired t-test with a significance level of 5% to compare the means of the 11 quarterly accumulated returns of the models and found a p-value of 0.025636, indicating that the mean of the prediction-based portfolio optimization model is greater than the mean of the mean-variance model. The prediction-based portfolio optimization model achieved a PCM of 0.006012 while the mean-variance model achieved a PCM of 0.002843. This shows that the prediction-based portfolio optimization model exploited the predictive opportunities more than twice as much as the mean-variance model.

Table 7. Trading with real data – high risk: summary of the trading during the 142 weeks.

Prediction-based portfolios selected at maximum r̂_p
Accumulated Return   3.893728
                     mean        σ²         σ
weekly returns       0.011444    0.003835   0.061933
TI                   1.900710    0.190071   0.435971
PCM                  0.006012
                     mean   min   max
number of stocks     1      1     1

Mean-variance portfolios selected at maximum r̄_p
Accumulated Return   0.993455
                     mean        σ²         σ
weekly returns       0.001495    0.003164   0.056255
TI                   0.453901    0.706788   0.840707
PCM                  0.002843
                     mean   min   max
number of stocks     1      1     1

The prediction-based portfolio optimization model achieved a mean TI of 190% while the mean-variance model achieved 45%. This shows that the prediction-based portfolio optimization model used a much more aggressive strategy than the mean-variance model, selecting a different stock for its portfolio at almost every rebalancing, since both models used only one stock in all of their portfolios.

5. Discussion

Portfolio selection is still an open question in financial theory. Investor behavior, expected utility and probabilistic frameworks, dynamic models, risk models and distributions of stock returns, transaction costs, and predictive and robust models, to cite only a few, remain on the research agenda [45,46,47,48,49,50]. The interest in exploiting prediction in portfolio selection is not new, but was traditionally focused on the difficult task of determining the distributions of the series of returns [51].

The performance of a neural network predictor in predicting future stock returns depends on many aspects of the network and the data, including the network's topology, training methods and noise. The central idea of this work is that, although predictors may not achieve good individual performance, when they are used to invest in many stocks their individual performances can be combined to produce superior results. Prediction-based portfolios can be more suitable for short-term investment than mean-variance portfolios. The experimental results shown in Section 4 are evidence of the validity of this hypothesis.

There are other works in the literature that explore prediction and diversification as a strategy for short-term investment. Lazo et al. [52] proposed a hybrid genetic-neural and statistical system for optimizing portfolios. A genetic algorithm was used for optimizing the mean-variance model with a set of stocks, and a subset of them with significant participations was selected for investment. A GARCH model was then built to forecast the volatility (variance) of these selected stocks, and a neural network was used for predicting their weekly returns using their past returns and this predicted volatility. A second genetic algorithm used the predicted returns, predicted variances and the covariances of returns for selecting portfolios with maximum return or minimum risk. Although the system proposed by Lazo et al.
uses predicted returns and efficient diversification, its risk measures are derived from the series of returns only (in the estimation of GARCH parameters and in the covariance of the stocks), thus giving equal treatment to predictors with possibly different performances. The prediction-based portfolio optimization model addresses this problem by deriving its risk measures from the predictors' individual and collective performances, using the variances and covariances of their prediction errors.

Hung et al. [53] extended the ASLD trading system [54] by adding a risk control mechanism based on neural networks. During training, when future prices are "known", the extended ASLD system (EASLD) examines a set of N assets and generates instructions to buy, sell or hold assets based on a sequence of their "known" future prices. Optimal portfolio participations are then calculated for each asset that has a buy signal using the mean-variance or other classical portfolio selection models. N neural networks, one for each asset, are then trained to generate these participations, or hold and sell signals. During testing, when future prices are unknown, the neural networks' outputs are used as signals to buy, hold and sell participations; i.e., each network is expected to learn the full operation of the system (trading instructions and portfolio optimization). The EASLD trading system differs from our prediction-based portfolio optimization model in two fundamental ways: (i) the EASLD trading system uses neural networks to predict trading signals, including participations, based on past prices and past trading signals, while the prediction-based portfolio optimization model uses neural networks to predict returns and employs these predicted returns and associated risks for optimizing portfolios; (ii) the EASLD trading system uses diversification for addressing the issues of single asset and lack of risk control associated with trading systems.
This is done by training neural networks on optimal portfolios, built according to classical diversification frameworks. However, there is no guarantee that the resulting predicted portfolios will be optimal, as they are computed directly by neural networks. The prediction-based portfolio optimization model, on the other hand, assures optimality of its predicted portfolios by always selecting efficient portfolios at portfolio transitions.

6. Conclusion and Future Works

In this paper we predict stock returns using a new method named autoregressive moving reference neural network (AR-MRNN), and use AR-MRNN in the implementation of a new portfolio optimization model named prediction-based portfolio optimization model. For each prediction, the AR-MRNN uses regression variables that are differences between the values of the time series and a determined past value used as reference. Some of the distributional aspects of the prediction errors produced by the AR-MRNN predictors were examined, and their Normality was verified on average for 92% of the stocks used. This showed that it is possible to have Normal prediction errors with non-Normal series of returns and supported the development of our new prediction-based portfolio optimization model, which uses the Normal framework. Simulations with our model achieved returns 291% above the mean-variance model, with similar levels of risk. Also, the predictive portfolios showed a better market index tracking capability, achieving returns 77% above the IBOVESPA market index.

Our future works include: researching better neural predictors and training methods for minimizing the prediction errors, inserting market frictions in the re-balancing strategies, and studying other risk measures.

7. Acknowledgements

We would like to thank Delegacia da Receita Federal do Brasil em Vitória-ES, Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq-Brasil (grants 308207/2004-1, 471898/2004-0, 620165/2006-5, and 309831/2007-5) and Financiadora de Estudos e Projetos, FINEP-Brasil (grants CT-INFRA-PRO-UFES/2005, CT-INFRA-PRO-UFES/2006) for their support to this research work.

References

[1] H. M. Markowitz, Portfolio selection, Journal of Finance VII (1) (1952) 77–91.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investments, 2nd Edition, John Wiley & Sons, New York, 1991.
[3] W. F. Sharpe, G. J. Alexander, J. V. Bailey, Investments, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 1999.
[4] E. J. Elton, M. J. Gruber, S. J. Brown, W. N. Goetzmann, Modern Portfolio Theory and Investment Analysis, 7th Edition, John Wiley & Sons, Inc., 2007.
[5] E. F. Fama, Portfolio analysis in a stable paretian market, Management Science 11 (3 Series A) (1965) 404–419.
[6] S. J. Kon, Models of stock returns–a comparison, The Journal of Finance 39 (1) (1984) 147–165.
[7] E. F. Fama, Efficient capital markets: A review of theory and empirical work, The Journal of Finance 25 (2) (1970) 383–417.
[8] E. F. Fama, Efficient capital markets II, Journal of Finance 26 (5) (1991) 1575–1617.
[9] E. F. Fama, Market efficiency, long-term returns, and behavioral finance, Journal of Financial Economics 49 (1998) 283–306.
[10] B. G. Malkiel, The efficient market hypothesis and its critics, The Journal of Economic Perspectives 17 (1) (2003) 59–82.
[11] A. Edmans, D. García, Ø. Norli, Sports sentiment and stock returns, Journal of Finance 62 (4) (2007) 1967–1998.
[12] G. E. P. Box, G. M. Jenkins, G. C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd Edition, Prentice Hall, 1994.
[13] H. White, Economic prediction using neural networks: The case of IBM daily stock returns, in: Proceedings of the IEEE International Conference on Neural Networks, 1988, pp. 451–458.
[14] L. J. Cao, F. Tay, Support vector machine with adaptive parameters in financial time series forecasting, IEEE Transactions on Neural Networks 14 (6) (2003) 1506–1518.
[15] K. Huarng, T. H. Yu, Ratio-based lengths of intervals to improve fuzzy time series forecasting, IEEE Transactions on Systems, Man and Cybernetics – Part B: Cybernetics 36 (2) (2006) 328–340.
[16] K. Huarng, T. H. Yu, Y. W. Hsu, A multivariate heuristic model for fuzzy time-series forecasting, IEEE Transactions on Systems, Man and Cybernetics – Part B: Cybernetics 37 (4) (2007) 836–846.
[17] R. Sharda, R. B. Patil, A connectionist approach to time series prediction: An empirical test, Journal of Intelligent Manufacturing 3 (5) (1992) 317–323.
[18] J. V. Hansen, R. Nelson, Neural networks and traditional time series methods: a synergistic combination in state economic forecasts, IEEE Transactions on Neural Networks 8 (4) (1997) 863–873.
[19] Y. Chen, B. Yang, A. Abraham, Flexible neural trees ensemble for stock index modeling, Neurocomputing 70 (4–6) (2007) 697–703.
[20] F. Liu, G. S. Ng, C. Quek, RLDDE: A novel reinforcement learning-based dimension and delay estimator for neural networks in time series prediction, Neurocomputing 70 (7–9) (2007) 1331–1341.
[21] J. W. Lee, J. Park, J. O, J. Lee, E. Hong, A multiagent approach to Q–Learning for daily stock trading, IEEE Transactions on Systems, Man and Cybernetics – Part A: Systems and Humans 37 (6) (2007) C1–853.
[22] S. Thawornwong, D. Enke, The adaptive selection of financial and economic variables for use with artificial neural networks, Neurocomputing 56 (2004) 205–232.
[23] J. Moody, M. Saffell, Learning to trade via direct reinforcement, IEEE Transactions on Neural Networks 12 (4) (2001) 875–889.
[24] K. Pantazopoulos, L. Tsoukalas, N. Bourbakis, M. Brun, E.
Houstis, Financial prediction and trading strategies using neurofuzzy approaches, IEEE Transactions on Systems, Man and Cybernetics – Part B: Cybernetics 28 (4) (Aug 1998) 520–531. T. Hellström, Predicting a rank measure for stock returns, Theory of Stochastic Processes 22 (6) (2000) 64–83. T. Hellström, Optimization of trading rules with a penalty term for increased risk-adjusted performance, Advanced Modeling and Optimization 2 (4) (2000) 135–149. H. M. Markowitz, The utility of wealth, The Journal of Political Economy 60 (2) (1952) 151–158. D. E. Fischer, R. J. Jordan, Security Analysis and Portfolio Management, 6th Edition, Prentice Hall International, 1995. F. D. Freitas, A. F. De Souza, A. R. Almeida, Autoregressive neural network predictors in the Brazilian stock markets, in: VII Simpósio Brasileiro de Automação Inteligente (SBAI)/II IEEE Latin American Robotics Symposium (IEEE-LARS), São Luis, Brasil, 2005, pp. 1–8. F. D. Freitas, A. F. De Souza, A. R. Almeida, A prediction-based portfolio optimization model, in: 5th International Symposium On Robotics and Automation - ISRA 2006, Hidalgo, Mexico, 2006, pp. 520–525. S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice Hall, Inc., 1999. H. M. Markowitz, The optimization of a quadratic function subject to linear constraints, Naval Research Logistics Quarterly (3) (1956) 111–133. F. Hamza, J. Janssen, Linear approach for solving large-scale portfolio optimization problems in a lognormal market, in: IAA/AFIR Colloquium, Nürnberg, Germany, 1996, pp. 1019–1039. W. F. Sharpe, A simplified model for portfolio analysis, Management Science 2 (9) (1963) 277–293. 24 [35] H. Konno, H. Yamazaki, Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market, Management Science 37 (5) (1991) 519–531. [36] F. A. Sortino, R. van der Meer, Downside risk–capturing what’s at stake in investment situations, Journal of Portfolio Management (1991) 27–31. [37] T. 
Hellström, Data snooping in the stock market, Theory of Stochastic Processes 5 (21) (1999) 33–50. [38] S. Lawrence, C. L. Giles, A. C. Tsoi, Lessons in neural network training: Overfitting may be harder than expected, in: Proceedings of the Fourteenth National Conference on Artificial Intelligence (AAAI-97), 1997, pp. 540–545. [39] J. Moody, Prediction risk and architecture selection for neural networks, in: V. Cherkassky, J. H. Friedman, H. Wechsler (Eds.), From Statistics to Neural Networks: Theory and Pattern Recognition Applications, Springer, NATO ASI Series F, 1994. [40] J. S. Armstrong, F. Collopy, Error measures for generalizing about forecasting methods: Empirical comparisons, International Journal of Forecasting 8 (1) (1992) 69–80. [41] M. R. Spiegel, L. J. Stephens, Schaum’s Outline of Theory and Problems of Statistics, 3rd Edition, McGraw-Hill, 1998. [42] M. Grinblatt, S. Titman, Performance measurement without benchmarks: An examination of mutual fund returns, The Journal of Business 66 (1) (1993) 47–68. [43] R. R. Grauer, N. H. Hakansson, Applying portfolio change and conditional performance measures: The case of industry rotation via the dynamic investment model, Review of Quantitative Finance and Accounting 17 (3) (2001) 237–265. [44] M. Statman, How many stocks make a diversified portfolio?, The Journal of Financial and Quantitative Analysis 22 (3) (1987) 353–363. [45] L. Huang, H. Liu, Rational inattention and portfolio selection, Journal of Finance LXII (4) (2001) 1999–2217. [46] M. W. Brandt, Dynamic portfolio selection by augmenting the asset space, Journal of Finance LXI (5) (2006) 2187–2217. [47] D. Maspero, Portfolio selection for financial planners, newfin Working Paper 3/04 (2004). [48] M. Kritzman, S. Myrgren, S. Page, Optimal execution for portfolio transitions, Journal of Portfolio Management (2007) 33–39. [49] Y. Ait-Sahalia, M. W. Brandt, Variable selection for portfolio choice, Journal of Finance LVI (4) (2001) 1297–1351. [50] F. J. 
Fabozzi, P. N. Kolm, Robust portfolio optimization: Recent trends and future directions., Journal of Portfolio Management (2007) 40–48. [51] J. Fried, Forecasting and probability distributions for models of portfolio selection, Journal of Finance XXV (3) (1970) 539–554. [52] J. G. L. Lazo, M. A. C. Pacheco, M. M. B. R. Vellasco, Portfolio selection and management using a hybrid intelligent and statistical system, in: S. Chen (Ed.), Genetic Algorithms and Genetic Programming in Computational Finance, Kluwer Academic Publishers, Boston, 2002, pp. 221–238. [53] K. Hung, Y. Cheung, L. Xu, An extended ASLD trading system to enhance portfolio management, IEEE Transactions on Neural Networks 14 (2) (2003) 413–425. [54] L. Xu, Y. Cheung, Adaptive supervised learning decision networks for trading and portfolio management, Journal of Computational Intelligence in Finance 5 (6) (1997) 11–16. 25