Computers & Operations Research 35 (2008) 34–46
www.elsevier.com/locate/cor

Neural network-based mean–variance–skewness model for portfolio selection ☆

Lean Yu (a, b), Shouyang Wang (a, b, c), Kin Keung Lai (c, d, *)

(a) Institute of Systems Science, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China
(b) School of Management, Graduate School of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100039, China
(c) College of Business Administration, Hunan University, Changsha 410082, China
(d) Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

Available online 23 March 2006

Abstract

In this study, a novel neural network-based mean–variance–skewness model for optimal portfolio selection is proposed, integrating different forecasts and trading strategies as well as investors' risk preferences. Based on the Lagrange multiplier theory in optimization and the radial basis function (RBF) neural network, the model seeks to provide solutions satisfying the trade-off conditions of mean–variance–skewness. The feasibility of the RBF network-based mean–variance–skewness model is verified with a simulation experiment. The experimental results show that, for all examined investor risk preferences and investment assets, the proposed model is a fast and efficient way of solving the trade-off in the mean–variance–skewness portfolio problem. In addition, we find that the proposed approach can also be used as an alternative tool for evaluating various forecasting models.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Mean–variance–skewness model; Portfolio selection; Radial basis function neural network; Forecasting; Trading strategy; Risk preference

1. Introduction

The mean–variance model originally introduced by Markowitz [1] plays an important and critical role in modern portfolio theory.
Markowitz's portfolio model is a bi-criteria optimization problem where a reasonable trade-off between return and risk is considered: minimizing risk for a given level of expected return, or equivalently, maximizing expected return for a given level of risk. Since Markowitz's pioneering work [1] was published, the mean–variance model has revolutionized the way people think about a portfolio of assets. This model has gained widespread acceptance as a practical tool for portfolio optimization, and numerous later studies have examined the issue of risk diversification. With the continuous effort of various researchers, Markowitz's seminal work has been widely extended. Extended models mainly include the mean semi-variance model [2], mean absolute deviation model [3–5], mean target model [6] and some frictional models, such as those reported in [2,7–14]. A distinct characteristic of these studies is that only the first two moments of return distributions are taken into account. But there is a controversy over the issue of whether higher moments should be accounted for in portfolio selection. Many academic researchers (see, e.g., Arditti [15,16], Samuelson [17], Rubinstein [18], and Konno and Suzuki [19]) argued that the higher moments cannot be neglected unless there is a reason to believe that the asset returns are normally distributed or the utility function is quadratic, or that the higher moments are irrelevant to the investor's decision.

☆ This study is partially supported by NSFC, CAS and SRG of City University of Hong Kong.
* Corresponding author. Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong. Tel.: +852 27888563; fax: +852 27888560. E-mail address: mskklai@cityu.edu.hk (K.K. Lai).
0305-0548/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.cor.2006.02.012

As a result, in some recent studies, such as [19–24], the concept of mean–variance trade-off has been extended to include the skewness of return in portfolio selection. In other words, a mean–variance–skewness trade-off model for portfolio selection has been generated. One problem with this model is that it is not easy to find a trade-off between the three objectives because this is a nonsmooth multi-objective optimization problem. Until now, many methods used to tackle this problem have been restricted to goal programming or linear programming techniques. For example, Lai [20] gave a goal programming procedure that performs portfolio selection based on competing and conflicting objectives by maximizing both expected return and skewness while minimizing the risk associated with the return (i.e., variance). Similarly, Leung et al. [22] provided a goal programming algorithm to solve a mean–variance–skewness model with the aid of the general Minkowski distance. Diverging from previous studies, Wang and Xia [23] transformed the mean–variance–skewness model into a parametric linear programming problem by maximizing the skewness under given levels of mean and variance. Likewise, Liu et al. [24] transformed the mean–variance–skewness model with transaction costs into a linear programming problem and verified its efficiency via a numerical example. However, the main disadvantage of these algorithms is that they generally converge slowly, if at all [25]. Furthermore, most existing studies only present some numerical examples with artificial data. Therefore, we introduce an artificial intelligence technique and propose a fast and efficient radial basis function (RBF) neural network-based methodology to solve the trade-off of the mean–variance–skewness model from a new perspective.
The reason for choosing the RBF network model is that this network itself contains trained weight matrices that combine the different objectives; thus the RBF model is appropriate for our problem. In addition, most portfolio selection models in the literature only consider the distribution properties of investment returns; other factors, such as investors' risk preferences and trading strategies, are not taken into account. Unlike others, our study considers these important factors. Furthermore, our study also investigates the effect of various forecasts of the portfolio selection return distribution. Because investment decisions are oriented towards the future, realized returns are of no use here. In a sense, our proposed model is an integrated intelligent model. In order to verify the feasibility of the proposed model, a simulation study is performed.

In summary, the primary focuses of this study are to propose an integrated neural network-based mean–variance–skewness model for portfolio optimization based on investors' risk preferences, different forecasts and trading schemes, and to provide empirical evidence of the performance of our proposed approach. The remainder of this paper is organized as follows. In Section 2, an RBF network model is described briefly. In Section 3, a new and integrated RBF network-based portfolio selection model is proposed to realize the mean–variance–skewness trade-off among three competing and conflicting objectives. To verify the feasibility and efficiency of the proposed approach, an empirical example is presented in Section 4. Finally, Section 5 concludes the paper.

2. Brief description of the radial basis function (RBF) neural network

An extremely powerful neural network type is the RBF neural network, which differs strongly from the multilayer perceptron (MLP) network both in the activation functions and in how it is used [26,27].
Generally, an RBF network can be regarded as a feed-forward network composed of three layers of neurons with different roles. The first layer is the input layer, which feeds the input data to each of the nodes in the second, or hidden, layer. The nodes of the second layer differ from those of other neural networks in that each node uses a Gaussian function as its nonlinear processing element. The third and final layer is linear, supplying each network response as a linear combination of the hidden responses. It sums the outputs of the second layer of nodes to yield the decision value. A graphical representation of an RBF neural network with one output node is shown in Fig. 1.

[Fig. 1. An RBF neural network with one output. Input layer: a_1, a_2, ..., a_k; hidden layer: C_1, C_2, ..., C_n; output layer: b, with weights W_0, W_1, W_2, ..., W_n.]

As can be seen from Fig. 1, an RBF neural network can compute a decision function from a given input. The RBF network receives a k-dimensional input vector a and outputs a scalar value using the general formula:

\[ b = f(\bar{a}) = w_0 + \sum_{i=1}^{n} w_i \, g(a_i), \tag{1} \]

where $w_0$ is a bias and $w_i$ ($i = 1, 2, \ldots, n$) are weight values, $n$ represents the number of nodes in the hidden layer, $a_i$ is input data, and $g(\cdot)$ is a Gaussian function with center $c$ and radius $r$, i.e.,

\[ g(a_i) = \exp\left( -\sum_{j=1}^{k} (a_{ij} - c_i)^2 / 2r^2 \right). \tag{2} \]

Usually, in an RBF network, the mean and standard deviation of the input vector are used as the cluster center and radius, respectively. Once the center and radius have been computed, the output layer can be nonlinearly mapped using the standard combination technique.
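To make Eqs. (1) and (2) concrete, the following minimal sketch evaluates a single-output RBF network in plain Python. The function names (`center_and_radius`, `gaussian_rbf`, `rbf_output`) are illustrative and not from the paper, and the use of the population standard deviation for the radius is an assumption, since the text does not say which estimator is meant.

```python
import math
import statistics


def center_and_radius(a):
    """Center and radius of one hidden node, taken as the mean and the
    standard deviation of the input vector a.
    Assumption: population standard deviation (the text does not specify)."""
    return statistics.mean(a), statistics.pstdev(a)


def gaussian_rbf(a, center, radius):
    """Gaussian basis of Eq. (2): exp(-sum_j (a_j - c)^2 / (2 r^2)).
    The radius must be nonzero."""
    sq_dist = sum((a_j - center) ** 2 for a_j in a)
    return math.exp(-sq_dist / (2.0 * radius ** 2))


def rbf_output(inputs, weights, bias, centers, radii):
    """Network output of Eq. (1): b = w0 + sum_i w_i * g(a_i),
    with one hidden node per input vector a_i."""
    return bias + sum(
        w * gaussian_rbf(a, c, r)
        for a, w, c, r in zip(inputs, weights, centers, radii)
    )
```

An input that coincides with its center gives a hidden response of exactly 1, so that node contributes its full weight to the output; responses decay towards 0 as the input moves away from the center relative to the radius.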
The advantages of RBF networks over MLP networks are:
(a) an RBF network does not get stuck in local minima;
(b) an RBF network generally has a simple architecture consisting of two layers of weights, in which the first layer contains the parameters of the basis functions and the second layer forms linear combinations of the activations of the basis functions to generate the outputs. On the other hand, MLP networks often have many layers of weights and a complex connectivity pattern, in which not all possible weights in a given layer are present;
(c) the training time is much shorter for an RBF network (up to 1000 times faster than back propagation);
(d) no momentum coefficient is needed for an RBF network;
(e) exemplars (cases) that are far from decision boundaries have little influence in RBF networks, while in an MLP network they influence the training; and
(f) an RBF network does not become saturated during training [26–28].

3. Proposed approach for portfolio selection

3.1. Formulation of the mean–variance–skewness model

Previous studies [19–24] revealed that maximizing the skewness of return could efficiently improve performance of the traditional Markowitz mean–variance portfolio model. That is, we can obtain a better portfolio by maximizing expected return and skewness of return and minimizing the variance of return simultaneously. Typically, the mean–variance–skewness model can be represented by

\[
(P_1)\quad
\begin{cases}
\text{Maximize} & R(x) = X^T \bar{R} = \sum_{i=1}^{n} x_i \bar{R}_i, \qquad \bar{R}_i = \sum_{t=1}^{p} \bar{R}_{it}/p, \\
\text{Minimize} & V(x) = X^T V X = \sum_{i=1}^{n} x_i^2 \sigma_i^2 + \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j \sigma_{ij} \quad (i \neq j), \\
\text{Maximize} & S(x) = E(X^T (R - \bar{R}))^3 = \sum_{i=1}^{n} x_i^3 s_i^3 + 3 \sum_{i=1}^{n} \left( \sum_{j=1}^{n} x_i^2 x_j s_{iij} + \sum_{j=1}^{n} x_i x_j^2 s_{ijj} \right) \quad (i \neq j), \\
\text{subject to} & X^T I = 1,
\end{cases}
\tag{3}
\]

where $\sigma_i^2$ and $\sigma_{ij}$ are the variance and covariance of the excess returns based on various forecasting techniques $i$, whereas $s_i^3$, $s_{iij}$ and $s_{ijj}$ are the skewness and coskewness of the excess returns based on each forecasting method $i$, respectively. $X$ is the proportion invested in various assets when the best trade-off is found. It is noted that, in this study, negative $x$ represents a short sale.

According to the theory of multi-objective optimization, a general way to solve the above multi-objective programming problem $(P_1)$ is to consolidate the various objectives into a single objective function. That is, the efficient solution can be generated by solving the following nonlinear programming problem $(P_2)$:

\[
(P_2)\quad
\begin{cases}
\text{Minimize} & Z(x) = -\lambda_1 (X^T \bar{R}) + \lambda_2 (X^T V X) - \lambda_3 (E(X^T (R - \bar{R}))^3) \\
\text{subject to} & X^T I = 1,
\end{cases}
\tag{4}
\]

where $\lambda_1$, $\lambda_2$, $\lambda_3$ can be interpreted as the risk aversion factor or risk preference of investors and are associated with the objectives $R(x)$, $V(x)$ and $S(x)$, respectively. However, problem $(P_2)$ is not easily solved because it is a nonlinear programming problem with constraints. Generally, this problem $(P_2)$ is solved using nonlinear programming techniques. But the process of solving nonlinear programming problems is rather difficult. Existing methods, including goal programming [20,22] and linear programming [23,24] approaches, are very complex and time consuming. In order to solve the nonlinear programming problem $(P_2)$ efficiently, we therefore introduce a neural network-based optimization approach.

3.2.
Neural network modeling for mean–variance–skewness optimization

A neural network is a large dimensional nonlinear dynamic system composed of 'neurons'. From the viewpoint of a dynamic system, the final behavior of the system is fully determined by its attraction points, if they exist. For a stable system, if certain inputs are given, the system will reach a stable attraction point. This point is seen as the optimum solution of some practical problems, and the evolution process through which the neural network reaches the stable state from any initial state is just a process of seeking optimization of the objective function within the active domain. Therefore, the key to designing a neural network is in seeing how to set the corresponding relationships between the problem and the network's stable attraction points.

Now, we consider the problem $(P_2)$: to design a neural network algorithm for multi-objective optimization. From $(P_2)$, the Lagrange function can be defined as

\[ L(X, \mu) = -\lambda_1 (X^T \bar{R}) + \lambda_2 (X^T V X) - \lambda_3 (E(X^T (R - \bar{R}))^3) + \mu (X^T I - 1), \tag{5} \]

where $\mu$ is the Lagrange multiplier. According to the classical limit theory, the necessary condition for $X^*$ to be the optimum solution of $(P_2)$ is that there exists $\mu^*$ such that $X^*$ satisfies the following relations:

\[
\begin{cases}
\nabla_x L(X^*, \mu^*) = -\lambda_1 \bar{R} + 2\lambda_2 V X^* - 3\lambda_3 (R - \bar{R})(X^{*T} V X^*) + I^T \mu^* = 0, \\
\nabla_\mu L(X^*, \mu^*) = X^{*T} I - 1 = 0.
\end{cases}
\tag{6}
\]

[Fig. 2. The overall procedure for the integrated approach to portfolio selection. Blocks: various forecasts; trade strategies; sub-objective aspired level; investors' risk preference; sub-objective optimization (PT); multi-objective optimization (RBFN); RBFN incremental learning; results comparison; minimal result; investment decision; investor feedback.]

Our aim now is to design a neural network that will settle down to an optimum point satisfying Eq. (6).
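The idea of a dynamic system settling at a Lagrange point can be illustrated with a small numerical sketch. It integrates a descent in X and an ascent in the multiplier on the Lagrange function (5) with explicit Euler steps. To keep the flow linear and its convergence easy to check, the sketch drops the skewness term (that is, it assumes $\lambda_3 = 0$); this simplification, the function name `lagrangian_flow`, the step size and the iteration count are all assumptions made here for illustration, not the authors' implementation.

```python
def lagrangian_flow(r, V, lam1=1.0, lam2=1.0, step=0.01, iters=5000):
    """Euler-integrate dx/dt = -grad_x L and dmu/dt = +grad_mu L for
    L(x, mu) = -lam1 * x.r + lam2 * x'Vx + mu * (sum(x) - 1).

    Assumption: the skewness term of Eq. (5) is omitted (lam3 = 0).
    r is the list of mean excess returns, V the covariance matrix
    given as a list of rows.
    """
    n = len(r)
    x = [1.0 / n] * n  # start from equal weights
    mu = 0.0           # Lagrange multiplier of the budget constraint
    for _ in range(iters):
        Vx = [sum(V[i][j] * x[j] for j in range(n)) for i in range(n)]
        grad_x = [-lam1 * r[i] + 2.0 * lam2 * Vx[i] + mu for i in range(n)]
        grad_mu = sum(x) - 1.0
        # Both gradients are evaluated at the current state before updating.
        x = [x[i] - step * grad_x[i] for i in range(n)]
        mu = mu + step * grad_mu
    return x, mu
```

At a rest point of this flow both gradients vanish, so the returned pair satisfies the stationarity and budget conditions of the mean-variance part of Eq. (6). With a positive definite V the linear flow is stable, so a sufficiently small step converges; once the cubic skewness term is restored this guarantee no longer holds in general.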
The transient behaviors of the neural network are defined by the following equations:

\[
\begin{cases}
\dfrac{dX}{dt} = -\nabla_x L(X^*, \mu^*) = \lambda_1 \bar{R} - 2\lambda_2 V X^* + 3\lambda_3 (R - \bar{R})(X^{*T} V X^*) - I^T \mu^*, \\[2ex]
\dfrac{d\mu}{dt} = \nabla_\mu L(X^*, \mu^*) = X^{*T} I - 1.
\end{cases}
\tag{7}
\]

Using nonlinear programming theory [25,29] and the techniques of system realization [30,31], we find that if the network is physically stable, the attraction point or optimum point $(X^*, \mu^*)$, described by $(dX/dt)|_{(X^*, \mu^*)} = 0$ and $(d\mu/dt)|_{(X^*, \mu^*)} = 0$, clearly meets Eq. (6) and thus provides a Lagrange solution to $(P_2)$. Thus such a nonlinear programming problem with constraints can be solved by neural network learning. In the following a detailed procedure is described.

3.3. An RBF-based mean–variance–skewness model for portfolio selection

In this section, an integrated RBF network-based mean–variance–skewness model for portfolio selection is proposed considering risk preference, trading strategies and various forecasts, as shown in Fig. 2. The overall procedure for the proposed integrated approach is described as follows.

Various forecasts: As noted earlier, investment decisions are oriented towards the future; realized returns are of no use. Therefore, we use different models to predict the excess return for different investment assets.

Trading strategies: Based on various forecasts, corresponding trading strategies for portfolio selection can be made. For brevity, we use a relatively simple trading strategy to guide the decision in this study. Of course, more complex trading strategies, such as the probability rule, can also be used. Let $\hat{R}_{k,t+1}$ denote the forecast excess return of asset $k$ for the next period (i.e., $t + 1$). The trading rule for asset component $k$ can be summarized below:

Rule I: If ($\hat{R}_{k,t+1} > 0$) then "buy", else "sell".

If there are no short sales, then the trading rule is changed as follows:

Rule II: If ($\hat{R}_{k,t+1} > 0$) then "buy", else "no action is taken".
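Rules I and II translate directly into a small selection function. The sketch below is illustrative only: the function name `select_assets`, the `allow_short` flag and the forecast values in the usage note are hypothetical and not taken from the paper.

```python
def select_assets(forecasts, allow_short=False):
    """Map each asset's forecast excess return for period t+1 to an action.

    Rule I  (short sales allowed): buy if forecast > 0, else sell.
    Rule II (no short sales):      buy if forecast > 0, else no action.
    """
    fallback = "sell" if allow_short else "no action"
    return {
        name: ("buy" if r_hat > 0 else fallback)
        for name, r_hat in forecasts.items()
    }
```

For instance, with hypothetical forecasts `{"A": 0.01, "B": -0.02}`, Rule II keeps asset A as a buy candidate and takes no action on asset B, whereas Rule I would sell B. A forecast of exactly zero is not strictly positive, so it falls into the "else" branch under both rules.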
In this study, we include an extra constraint to prohibit short sales and borrowings; hence Rule II is used here.

Sub-objective optimization: In the light of the results from various forecasts and trading strategies, we optimize for three different sub-objectives (mean, variance and skewness) individually. That is, problem $(P_1)$ can be decomposed into three sub-problems $(P_3)$–$(P_5)$ as follows:

\[ (P_3)\quad \begin{cases} \text{Maximize} & R(x) = X^T \bar{R} \\ \text{subject to} & X^T I = 1, \end{cases} \tag{8} \]

\[ (P_4)\quad \begin{cases} \text{Minimize} & V(x) = X^T V X \\ \text{subject to} & X^T I = 1, \end{cases} \tag{9} \]

\[ (P_5)\quad \begin{cases} \text{Maximize} & S(x) = E(X^T (R - \bar{R}))^3 \\ \text{subject to} & X^T I = 1. \end{cases} \tag{10} \]

The optimal results obtained from $(P_3)$–$(P_5)$ are considered to be aspired levels, and are denoted by $\tilde{R}$, $\tilde{V}$ and $\tilde{S}$. As can be seen, the aspired levels indicate the best case scenario for a particular objective without considering other objectives.

It is interesting to note that the solution to sub-problem $(P_3)$ is trivial because linear programming will always assign all weights to the highest yielding asset when other constraints are ignored. In order to avoid this situation, we adjust the constraints according to the requirements of practical applications, such as x1 ? 0.1 if R1 ? 0. On the other hand, solving sub-problems $(P_4)$ and $(P_5)$ does provide meaningful solutions which lead to tighter bounds.

Multi-objective optimization: In this step, we use the RBF network to solve problem $(P_1)$ as a nonlinear combining optimization form. That is, consensus results of multi-objective optimization are obtained from the RBF network. The concrete process of multi-objective optimization is described below.

For convenience, we first introduce some notation: $R_i$, $V_i$ and $S_i$ ($i = 1, 2, \ldots, n$) are the mean, variance and skewness of excess returns of every asset; $x_i$ ($i = 1, 2, \ldots, n$) represents the investment proportion of different assets; $\tilde{R}$, $\tilde{V}$ and $\tilde{S}$ are three sub-objective aspired levels defined in the previous stages, and are considered to be the centers of the RBF network; $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the investors' risk preferences over the mean, variance and skewness of excess return; and $r_R$, $r_V$ and $r_S$ are the radii of the RBF network obtained from the standard deviation of mean, variance and skewness of returns. According to the theory in Sections 2, 3.1 and 3.2, problem $(P_2)$ can be reformulated as

\[
(P_6)\quad \text{Minimize } f(x) = -\lambda_1 \exp\left( -\frac{\sum_{i=1}^{n} (x_i R_i - \tilde{R})^2}{2 r_R^2} \right) + \lambda_2 \exp\left( -\frac{\sum_{i=1}^{n} (x_i V_i - \tilde{V})^2}{2 r_V^2} \right) - \lambda_3 \exp\left( -\frac{\sum_{i=1}^{n} (x_i S_i - \tilde{S})^2}{2 r_S^2} \right) + \left( \sum_{i=1}^{n} x_i - 1 \right).
\tag{11}
\]

Using the results of Section 3.2, the nonlinear programming problem $(P_6)$ can be solved using a neural network. In this study, we use the RBF network to reach the solution. For further explanation, the RBF architecture in connection with $(P_6)$ is illustrated in Fig. 3.

From the problem $(P_6)$ and Fig. 3, the consensus result can be obtained by considering the mean–variance–skewness simultaneously. As indicated in $(P_6)$, the smaller the objective function is, the better the solution is. Thus we can select the best portfolio according to the minimal output. Accordingly, the optimal weight set $X = \{x_1, x_2, \ldots, x_n\}$ can be obtained from the RBF network.

Results comparison: The "results comparison" is a logic frame. According to the different forecasts and trading strategies, we can obtain different consensus results. From these consensus results, we can find the minimum of the consensus results. In accordance with this minimum, investors can choose the best investment portfolio as their investment decision.

Investment decision: Investors can make their corresponding decisions using the best investment portfolio from the minimal consensus result.
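The consensus objective of problem (P6) can be evaluated directly for any candidate weight vector. The sketch below assumes one particular reading of Eq. (11): each exponent sums the squared deviations of $x_i R_i$ (or $x_i V_i$, $x_i S_i$) from the aspired level, in line with Eq. (2), and the budget term enters without a multiplier. The function name `consensus_objective` and its argument layout are illustrative choices, not the authors' code.

```python
import math


def consensus_objective(x, R, V, S, aspired, radii, prefs):
    """Evaluate f(x) of (P6) for weights x.

    R, V, S : per-asset mean, variance and skewness of excess returns
    aspired : (R_tilde, V_tilde, S_tilde), the RBF centers
    radii   : (r_R, r_V, r_S), the RBF radii (must be nonzero)
    prefs   : (lam1, lam2, lam3), the investor's risk preferences
    Assumption: the exponent is sum_i (x_i * m_i - center)^2 / (2 r^2).
    """
    def basis(moments, center, radius):
        sq_dist = sum((x_i * m_i - center) ** 2 for x_i, m_i in zip(x, moments))
        return math.exp(-sq_dist / (2.0 * radius ** 2))

    (R_t, V_t, S_t), (r_R, r_V, r_S), (lam1, lam2, lam3) = aspired, radii, prefs
    return (
        -lam1 * basis(R, R_t, r_R)
        + lam2 * basis(V, V_t, r_V)
        - lam3 * basis(S, S_t, r_S)
        + (sum(x) - 1.0)
    )
```

Smaller values are better under this objective, so candidate weight sets produced for different forecasting models can be compared simply by their `consensus_objective` values.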
Investor feedback: In the process of the investment decision, if it is not appropriate to use the optimal results in practical application, investors can send the information back to the RBF network for incremental learning.

RBFN incremental learning: This incrementally trains the RBFN according to investor feedback, together with the results of various forecasts, trading strategies and sub-objective optimization to receive other optimal consensus results.

The above procedure can be summarized as follows:

Step 1: Make various predictions using different forecasting models.
Step 2: Select the corresponding assets according to the trading rules.
Step 3: Using programming techniques (PT), obtain three optimal values of mean, variance and skewness individually.
Step 4: RBFN optimizes the three objectives simultaneously and consensus results can be obtained.
Step 5: Investors can make their decisions from the consensus results or feasible sets according to the minimum principle. Investors can choose the optimal investment portfolio as their investment decision. Meanwhile, if the optimal solution cannot be applied practically, investors can feed the information back to the RBF network. In addition, the performance of different forecasting models can also be evaluated.
Step 6: RBF neural network incrementally trains according to the feedback.
Step 7: Go to Step 4 with the adjusted RBF network.

[Fig. 3. A practical network architecture of RBF for problem (P6). Inputs X_1, ..., X_n, weighted by R_i, V_i and S_i, feed hidden nodes centered at the aspired levels of R, V and S; the hidden outputs are combined into the output Y.]

Steps 4–7 are the essential and important parts of the process based on the neural network.

An additional condition, i.e., the initial principal allocation problem, needs to be satisfied in the portfolio decision. Suppose the total principal for investment at the beginning of the period is fixed at T dollars.
The asset allocation to different portions of the portfolio is $T x_i$ such that

\[ \sum_{i} T x_i = T. \tag{12} \]

Thus, the mean–variance–skewness portfolio optimization problem is solved in the integrated RBF neural network-based optimization framework. The following experiments are therefore needed to validate the proposed approach.

4. Simulation experiments

4.1. Data description and experiment design

In the light of the earlier description, our analysis uses various forecasts and trading strategies to guide investments. In this study, we adopt four forecasting models to predict these financial series: the random walk (RW) model, the adaptive exponential smoothing (AES) model, the autoregressive integrated moving average (ARIMA) model and the multilayer feed-forward neural network (MLFNN) model. To test the versatility and robustness of the proposed approach for portfolio selection, three globally traded stock market indices (S&P500 for the US, FTSE100 for the UK, and Nikkei 225 for Japan) and three globally traded foreign exchanges (euros (EUR), British pounds (GBP) and Japanese yen (JPY)) against the US dollar (USD) are examined in our empirical experiment. The stock indices and exchange data used in this paper are daily and are obtained from Datastream and Pacific Exchange Rate Service (http://fx.sauder.ubc.ca/), respectively. The entire data set covers the period from January 1, 2000 to September 30, 2003.

Table 1
Average excess return of index and exchange (average rate of excess return)

Model    S&P500     FTSE100    Nikkei 225   USD/EUR    USD/GBP    USD/JPY
RW       0.000338   −0.00010   0.001351     −0.00056   −0.00080   −0.00204
AES      0.000638   0.000377   0.001350     0.000126   −0.00023   −0.00204
ARIMA    2.65E−05   −0.00071   0.001488     −0.00047   −0.00074   −0.00193
MLFNN    0.000707   0.000982   0.002061     −0.00033   −0.00053   −0.00078

The data sets are divided into two periods: the first period covers January 1, 2000 to July 31, 2003 while the second period is from August 1, 2003 to September 30, 2003. The first period, which is assigned to in-sample estimation, is used to determine the specifications of the models and parameters for the various forecasting techniques. The second period is reserved for out-of-sample evaluation and comparison of performance between various forecasting models. For brevity, the original data are not listed in the paper, and detailed data can be obtained from the sources.

To facilitate forecasting and portfolio optimization, we choose the daily excess returns of these indices and exchange rates as forecast variables. As shown in Eq. (13), the excess return on a certain index or exchange rate is defined as the continuously compounded return on the price minus the risk-free interest rate:

\[ R_t = \log(P_t / P_{t-1}) - r_{t-1}, \tag{13} \]

where $P_t$ is the price of the stock index or exchange rate traded at time $t$ and $r_t$ is the risk-free interest rate at time $t$. The reason for choosing the excess returns (rather than the index or exchange rate) is that they can provide a measurement of how well our models perform relative to the minimum returns gained from depositing the money in a risk-free account. Furthermore, the forecasting results for excess returns can be used directly by the portfolio selection model, which is convenient for investors.

4.2. Experiment results

The empirical experiment performed in this study consists of five main stages that are partially described in earlier sections. In the first stage, we estimate the forecasts of a particular stock market index and foreign exchange rate using different forecasting models. In the light of the previous plan, 2-month period forecasts of daily returns can be obtained.
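The excess return of Eq. (13) can be computed from a price series and a risk-free rate series as in the sketch below. It takes the continuously compounded return as $\log(P_t / P_{t-1})$, which is how the sketch reads Eq. (13); the function name `excess_returns` and the requirement that the risk-free series be expressed per period (here, per day) are assumptions for illustration.

```python
import math


def excess_returns(prices, risk_free):
    """Excess returns R_t = log(P_t / P_{t-1}) - r_{t-1} for t = 1..len(prices)-1.

    prices    : positive prices P_0, P_1, ...
    risk_free : per-period risk-free rates r_0, r_1, ... (assumed to be on
                the same period length as the prices), at least
                len(prices) - 1 entries
    """
    return [
        math.log(prices[t] / prices[t - 1]) - risk_free[t - 1]
        for t in range(1, len(prices))
    ]
```

A series of N prices yields N − 1 excess returns, each of which is positive only when the log return over the period exceeds the risk-free rate observed at the start of that period.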
Based on these forecasts, the distributional properties (i.e., mean, variance and skewness of the excess return on index and exchange rate) are computed. Table 1 shows the average excess return on index and exchange trading using the forecasts supplied by each forecasting technique. In addition, the second and third moments are also computed for the forecasts made within the in-sample period. The estimated variance–covariance and skewness/coskewness matrices are shown in Tables 2 and 3 for each stock index or foreign exchange rate. Since the variance/covariance represents the square term while the skewness/coskewness is a cube term, the magnitudes of these descriptive statistics are quite small.

Table 2
Variance–covariance matrices of the distributions of returns with different models

Various assets   S&P500       FTSE100      Nikkei 225   USD/EUR      USD/GBP      USD/JPY

Panel A: Random walk (RW) model
S&P500           0.0000633    0.0000174    0.0000230    0.0000144    0.0000078    −0.000006
FTSE100          0.0000174    0.0000518    0.0000267    0.0000179    0.0000126    0.0000033
Nikkei 225       0.0000230    0.0000267    0.0001800    0.0000226    0.0000168    0.0000123
USD/EUR          0.0000144    0.0000179    0.0000226    0.0000395    0.0000222    0.0000036
USD/GBP          0.0000078    0.0000126    0.0000168    0.0000222    0.0000197    0.0000048
USD/JPY          −0.000006    0.0000033    0.0000123    0.0000036    0.0000048    0.0000226

Panel B: Adaptive exponential smoothing (AES) model
S&P500           0.0000015    0.0000009    0.0000050    0.0000008    0.0000005    −0.0000007
FTSE100          0.0000009    0.0000013    0.0000064    0.0000006    0.0000008    0.0000007
Nikkei 225       0.0000050    0.0000064    0.0001800    0.0000064    0.0000074    0.0000121
USD/EUR          0.0000008    0.0000006    0.0000064    0.0000037    0.0000026    0.0000016
USD/GBP          0.0000005    0.0000008    0.0000074    0.0000026    0.0000027    0.0000021
USD/JPY          −0.0000007   0.0000007    0.0000121    0.0000016    0.0000021    0.0000198

Panel C: Autoregressive integrated moving average (ARIMA) model
S&P500           0.0000659    0.0000279    0.0000241    0.0000189    0.0000064    −0.000009
FTSE100          0.0000279    0.0001600    0.0000293    0.0000298    0.0000142    0.0000003
Nikkei 225       0.0000241    0.0000293    0.0001780    0.0000275    0.0000158    0.0000138
USD/EUR          0.0000189    0.0000298    0.0000275    0.0000457    0.0000208    0.0000037
USD/GBP          0.0000064    0.0000142    0.0000158    0.0000208    0.0000197    0.0000054
USD/JPY          −0.0000009   0.0000003    0.0000138    0.0000037    0.0000054    0.0000242

Panel D: Multilayer feed-forward neural network (MLFNN) model
S&P500           0.0000486    0.0000191    0.0000243    0.0000102    0.0000028    −0.0000050
FTSE100          0.0000191    0.0000776    0.0000267    0.0000096    0.0000086    0.0000049
Nikkei 225       0.0000243    0.0000267    0.0001740    0.0000300    0.0000091    0.0000029
USD/EUR          0.0000102    0.0000096    0.0000300    0.0000379    0.0000180    −0.0000034
USD/GBP          0.0000028    0.0000086    0.0000091    0.0000180    0.0000257    0.0000021
USD/JPY          −0.0000050   0.0000049    0.0000029    −0.0000034   0.0000021    0.0000228

In the second stage, according to the previous trading rules, we can select the corresponding assets for further analysis.

In the third stage, based on the statistical moments mentioned in the first stage, the aspired levels $\tilde{R}$, $\tilde{V}$ and $\tilde{S}$ are found by solving the problems $(P_3)$, $(P_4)$ and $(P_5)$ using the PT, respectively. The results are reported in Table 4.

In the fourth stage, based on the previous results and various forecasting techniques as well as the different risk preferences, the proportion of the portfolio can be obtained using the RBF neural network, as indicated in Eq. (11). For exposition purposes, the optimal weight sets used to construct the portfolio given the investors' preference for ($\lambda_1 = \lambda_2 = \lambda_3 = 1$) are reported in Table 5. It is worth noting that preference set (1, 1, 1) is a compromise case where the weights for mean, variance and skewness are equivalent.

It should be noted that an important factor, investors' risk preference, is not changed in the previous stages. In order to verify the sensitivity of the proposed approach to changes in the investors' risk preference ($\lambda_1$, $\lambda_2$, $\lambda_3$), different levels of risk preference are investigated. Specifically, risk preferences of (1, 1, 1), (1, 2, 2), (2, 2, 1), (1, 2, 1), (2, 1, 1) and (1, 1, 0) are included in our experiment. The results based on the (1, 1, 1) preference structure indicate that mean, variance and skewness of return are of equal importance to the investor, as indicated in Table 5. Likewise, the results based on the (2, 1, 1) preference structure show that the investors are willing to pursue more excess returns regardless of risk level, while those of (1, 2, 2) and (1, 2, 1) give more emphasis to risk control. (1, 1, 0) is a benchmark case, representing the Markowitz mean–variance portfolio. Detailed results are shown in Table 6 below.

In the fifth and final stage, following the minimization principle (as indicated in Eq. (11)), the minimal value of consensus results can be found. Accordingly, the best portfolio proportion and the best forecasting model can be identified. The results of four different forecasting models are shown in Table 7. From Table 7, we find that the result obtained from the last forecasting model is the best at any level of risk preference except for risk preference level (1, 1, 0). Accordingly, the corresponding weight sets of different risk preference levels yield the optimal investment portfolio.
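The minimum principle of the fifth stage amounts to picking, for each risk preference, the forecasting model with the smallest consensus value. The sketch below applies it to the Table 7 values that are available in this text. It assumes that each group of four values in Table 7 lists RW, AES, ARIMA and MLFNN for one preference; the entries for preference (1, 1, 0) are not available here and are left out. The names `TABLE7` and `best_model` are illustrative.

```python
# Minimum objective values from Table 7, keyed by risk preference.
# Assumption: each preference lists the models in the order RW, AES, ARIMA, MLFNN.
# The (1, 1, 0) entries are not available in this text and are omitted.
TABLE7 = {
    (1, 1, 1): {"RW": 2.5471, "AES": 2.2706, "ARIMA": 2.4110, "MLFNN": 2.0465},
    (1, 2, 2): {"RW": 4.1721, "AES": 3.6406, "ARIMA": 3.8578, "MLFNN": 3.1331},
    (2, 2, 1): {"RW": 4.5334, "AES": 4.2686, "ARIMA": 4.3959, "MLFNN": 4.0389},
    (1, 2, 1): {"RW": 3.5882, "AES": 3.2915, "ARIMA": 3.4289, "MLFNN": 3.0416},
    (2, 1, 1): {"RW": 3.5816, "AES": 3.2687, "ARIMA": 3.4236, "MLFNN": 3.0469},
}


def best_model(results):
    """For each preference, return the model with the minimal consensus value."""
    return {pref: min(values, key=values.get) for pref, values in results.items()}
```

Under this reading, `best_model(TABLE7)` selects MLFNN for all five listed preferences, which agrees with the statement that the last forecasting model gives the best result.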
For example, for the case of risk preference level (1, 2, 2), the optimal proportion of six different assets is vector (0.0872 0.4412 0.0709 0.3948 0.0000 0.0059); while for the case of risk preference level (2, 2, 1), the best proportion of six assets is vector (0.2861 0.2112 0.4922 0.0000 0.0000 0.0105).

Table 3
Skewness–coskewness matrices of distributions of returns with different models (a)

Various assets        S&P500       FTSE100      Nikkei 225   USD/EUR      USD/GBP      USD/JPY
                      (s_ii1)      (s_ii2)      (s_ii3)      (s_ii4)      (s_ii5)      (s_ii6)

Panel A: Random walk (RW) model
S&P500 (s_1jj)        −0.480883    0.125578     −0.183229    0.024170     0.057413     −0.303717
FTSE100 (s_2jj)       0.0022131    −0.078617    0.189065     0.095803     0.173851     −0.178357
Nikkei 225 (s_3jj)    −0.353687    −0.144587    −0.626094    −0.256502    −0.465972    −0.684112
USD/EUR (s_4jj)       0.1122376    0.2093805    0.0676663    0.1391594    0.152388     −0.278846
USD/GBP (s_5jj)       0.1110669    0.0716599    −0.028704    0.0018061    −0.299149    −0.632863
USD/JPY (s_6jj)       −0.017235    −0.373642    −0.812647    −0.564531    −0.710214    −0.933453

Panel B: Adaptive exponential smoothing (AES) model
S&P500 (s_1jj)        −0.431241    −0.457729    −0.210677    −0.266502    −0.349488    0.091951
FTSE100 (s_2jj)       −0.397415    −0.304806    −0.220709    −0.226864    −0.261407    0.029908
Nikkei 225 (s_3jj)    −0.069946    −21.78567    −0.626094    −0.314988    −0.472046    −0.780072
USD/EUR (s_4jj)       0.0245973    −0.438029    −0.057571    0.4270127    −0.195935    −0.300213
USD/GBP (s_5jj)       −0.389790    −0.561944    −0.514118    −0.553485    −0.939281    −0.445175
USD/JPY (s_6jj)       0.0630338    −0.201177    −0.819557    −1.118498    −0.663008    −0.933780

Panel C: Autoregressive integrated moving average (ARIMA) model
S&P500 (s_1jj)        −0.398190    0.130751     0.410521     −0.397626    −0.011150    −0.223926
FTSE100 (s_2jj)       −0.008059    −0.035178    0.188095     −0.127317    0.036948     −0.082209
Nikkei 225 (s_3jj)    0.0374245    0.0801771    −0.659871    −0.210949    −0.078079    −0.100065
USD/EUR (s_4jj)       0.4110735    0.3196692    0.1441668    0.2762301    0.082136     −0.299440
USD/GBP (s_5jj)       −0.153756    −0.042369    −0.198978    −0.113597    −0.396387    −0.666417
USD/JPY (s_6jj)       −0.451169    −0.232998    −0.379669    −0.470561    −0.658468    −0.890108

Panel D: Multilayer feed-forward neural network (MLFNN) model
S&P500 (s_1jj)        −0.550176    0.106540     −0.255390    0.055261     0.129211     −0.151076
FTSE100 (s_2jj)       0.0635264    1.1309108    −0.061082    −0.089385    −0.032107    0.026677
Nikkei 225 (s_3jj)    −0.370498    −0.236815    −0.792033    −0.403015    0.126464     −0.457580
USD/EUR (s_4jj)       0.1695834    −0.000113    0.0645350    0.5021456    0.241597     −0.378057
USD/GBP (s_5jj)       0.1047806    −0.264383    −0.011812    −0.352080    −1.239753    −0.933087
USD/JPY (s_6jj)       −0.038874    −0.396269    −0.312856    −0.373512    −0.525513    −0.429079

(a) The value of skewness and coskewness is calculated by the following equations (Campbell and Siddique [32]): Skewness$(X) = s_i^3 = E[(X - E(X))^3]$, Coskewness$(X_{ii}, Y_j) = s_{iij} = E[(X - E(X))^2 (Y - E(Y))]$ or Coskewness$(X_i, Y_{jj}) = s_{ijj} = E[(X - E(X))(Y - E(Y))^2]$.

Table 4
Results for (P3), (P4) and (P5) (a)

         Problem 3 (P3)   Problem 4 (P4)   Problem 5 (P5)
Model    R̃(x)             Ṽ(x)             S̃(x)
RW       0.001351         0.00000545       0.142110
AES      0.001350         0.00000052       0.442101
ARIMA    0.001488         0.00000572       0.268347
MLFNN    0.002061         0.00000468       1.171740

(a) Solving (P3), (P4) and (P5) yields the aspired levels, R̃(x), Ṽ(x), S̃(x), respectively. Each aspired level indicates the best case scenario for a particular objective without considering other objectives. RW represents the random walk model, AES denotes the adaptive exponential smoothing model, ARIMA represents the autoregressive integrated moving average model and MLFNN represents the multilayer feed-forward neural network model.

In addition, the calculated average excess returns based on different forecasting techniques and risk preferences as well as the optimal investment proportion also demonstrate that the consensus results of MLFNN are the best; and the average excess returns obtained from MLFNN model are the highest, as shown in Table 8. This implies that the

Table 5
Proportions of different assets in the portfolio for risk preference (1, 1, 1)

Model    x (S&P500)   x (FTSE100)   x (Nikkei 225)   x (USD/EUR)   x (USD/GBP)   x (USD/JPY)
RW       0.3557       0.1588        0.4430           0.0425        0.0000        0.0000
AES      0.3817       0.1622        0.4199           0.0362        0.0000        0.0000
ARIMA    0.4130       0.1635        0.4174           0.0000        0.0061        0.0000
MLFNN    0.1979       0.2112        0.4540           0.1334        0.0035        0.0000

Table 6
Proportions of different assets in the portfolio for different risk preferences

Model    x (S&P500)   x (FTSE100)   x (Nikkei 225)   x (USD/EUR)   x (USD/GBP)   x (USD/JPY)

Investor's risk preference (1, 2, 2)
RW       0.2117       0.1548        0.3532           0.1529        0.1274        0.0000
AES      0.0918       0.2843        0.1814           0.4336        0.0089        0.0000
ARIMA    0.1028       0.0035        0.3745           0.1983        0.3104        0.0105
MLFNN    0.0872       0.4412        0.0709           0.3948        0.0000        0.0059

Investor's risk preference (2, 2, 1)
RW       0.3573       0.1514        0.4826           0.0038        0.0049        0.0000
AES      0.4011       0.1760        0.3903           0.0326        0.0000        0.0000
ARIMA    0.4108       0.0000        0.4517           0.0081        0.1223        0.0071
MLFNN    0.2861       0.2112        0.4922           0.0000        0.0000        0.0105

Investor's risk preference (1, 2, 1)
RW       0.2542       0.1429        0.2918           0.1615        0.1109        0.0387
AES      0.2814       0.3556        0.2174           0.1427        0.0029        0.0000
ARIMA    0.2443       0.0728        0.3124           0.1723        0.1011        0.0971
MLFNN    0.3437       0.1011        0.3205           0.0478        0.0843        0.1026

Investor's risk preference (2, 1, 1)
RW       0.4225       0.0113        0.2147           0.2198        0.1317        0.0000
AES      0.2543       0.1324        0.5117           0.1016        0.0000        0.0000
ARIMA    0.2427       0.0767        0.4263           0.2017        0.0526        0.0000
MLFNN    0.1516       0.3057        0.3529           0.1854        0.0000        0.0044

Investor's risk preference (1, 1, 0)
RW       0.1963       0.1118        0.4347           0.0000        0.1705        0.0867
AES      0.1178       0.3526        0.1987           0.2735        0.0552        0.0022
ARIMA    0.3456       0.0000        0.3057           0.1058        0.1464        0.0965
MLFNN    0.3497       0.2717        0.2226           0.0000        0.0747        0.0813

Table 7 The minimum of the objective function with different models and risk preference Preference (1, 1, 1) (1, 2, 2) (2, 2, 1) (1, 2, 1) (2, 1, 1) (1, 1, 0) RW AES ARIMA MLFNN 2.5471 2.2706 2.4110 2.0465 4.1721 3.6406 3.8578 3.1331 4.5334 4.2686 4.3959 4.0389 3.5882 3.2915 3.4289 3.0416 3.5816 3.2687 3.4236 3.0469
0.9298 1.9969 1.9963 1.9953 proposed approach is a fast and efficient way for optimal portfolio selection considering the mean–variance–skewness simultaneously. From Table 8, we can see that the expected excess return of MLFNN is the highest at any level of risk preference, which indicates that the MLFNN is the best forecasting technique of the four selected forecasting models. This also 45 L. Yu et al. / Computers & Operations Research 35 (2008) 34 – 46 Table 8 Comparison of expected excess returns with different models and risk preferences Preference (1, 1, 1) (1, 2, 2) (2, 2, 1) (1, 2, 1) (2, 1, 1) (1, 1, 0) RW AES ARIMA MLFNN 0.0006790 0.0008761 0.0005114 0.0012000 0.0003457 0.0004632 0.0002143 0.0005000 0.0007516 0.0008533 0.0005750 0.0014000 0.0002077 0.0006244 0.0000764 0.0009000 0.0002033 0.0009158 0.0004526 0.0011000 0.0003292 0.0004936 0.0001197 0.0009000 implies that the proposed approach can be used as an alternative for evaluating the performance of various forecasting models. 5. Conclusions This study proposes an integrated RBF neural network-based mean–variance–skewness optimization model for portfolio selection. This model examines the historical performance of the various series based on forecasts and trading schemes in terms of the mean, variance, and skewness of investment returns simultaneously. Through the use of the RBF neural network model, an investor can construct a portfolio which matches his or her risk preference based on forecasts and trading strategies as well as the mean–variance–skewness objectives simultaneously. We test our proposed approach with three widely traded stock market indices (S&P500, FTSE100 and Nikkei225) and three widely traded foreign exchanges (USD/EUR, USD/GBP and USD/JPY). The empirical results indicate that the proposed approach is an efficient way of solving the trade-off in the mean–variance–skewness portfolio problem. 
Moreover, we find that this approach can also be used as an alternative method of evaluating the performance of different forecasting models.

Acknowledgments

The authors would like to thank the guest editor and the anonymous reviewers for their valuable comments and suggestions, which have improved the quality of the paper immensely.

References

[1] Markowitz HM. Portfolio selection. Journal of Finance 1952;7:77–91.
[2] Mao JCT. Models of capital budgeting, E-V vs. E-S. Journal of Financial and Quantitative Analysis 1970;5:657–75.
[3] Konno H, Yamazaki H. Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market. Management Science 1991;37:519–31.
[4] Feinstein CD, Thapa MN. A reformulation of a mean-absolute deviation portfolio optimization model. Management Science 1993;39:1552–3.
[5] Simaan Y. Estimation risk in portfolio selection: the mean variance model versus the mean absolute deviation model. Management Science 1997;43:1437–46.
[6] Fishburn DC. Mean-risk analysis with risk associated with below-target returns. American Economic Review 1977;67:117–26.
[7] Pogue JA. An extension of the Markowitz portfolio selection model to include variable transactions costs, short sales, leverage policies, and taxes. Journal of Finance 1970;25:1005–28.
[8] Brennan MJ. The optimal number of securities in a risky asset portfolio when there are fixed costs of transaction: theory and some empirical results. Journal of Financial and Quantitative Analysis 1975;10:483–96.
[9] Levy H. Equilibrium in an imperfect market: a constraint on the number of securities in the portfolio. American Economic Review 1978;68:643–58.
[10] Patel NR, Subrahmanyam MG. A simple algorithm for optimal portfolio selection with fixed transaction costs. Management Science 1982;28:303–14.
[11] Gennotte G, Jung A. Investment strategies under transaction costs: the finite horizon case. Management Science 1994;40:385–404.
[12] Morton AJ, Pliska SR. Optimal portfolio management with transaction costs. Mathematical Finance 1995;5:337–56.
[13] Yoshimoto A. The mean-variance approach to portfolio optimization subject to transaction costs. Journal of the Operations Research Society of Japan 1996;39:99–117.
[14] Li ZF, Wang SY, Deng XT. A linear programming algorithm for optimal portfolio selection with transaction costs. International Journal of Systems Science 2000;31:107–17.
[15] Arditti FD. Risk and required return on equity. Journal of Finance 1967;22:19–36.
[16] Arditti FD. Another look at mutual fund performance. Journal of Financial and Quantitative Analysis 1971;6:909–12.
[17] Samuelson P. The fundamental approximation theorem of portfolio analysis in terms of means variances and higher moments. Review of Economic Studies 1958;25:65–86.
[18] Rubinstein ME. A comparative static analysis of risk premiums. The Journal of Business 1973;12:605–15.
[19] Konno H, Suzuki K. A mean-variance-skewness optimization model. Journal of the Operations Research Society of Japan 1995;38:137–87.
[20] Lai T. Portfolio selection with skewness: a multiple-objective approach. Review of Quantitative Finance and Accounting 1991;1:293–305.
[21] Chunhachinda P, Dandapani K, Hamid S, Prakash AJ. Portfolio selection and skewness: evidence from international stock market. Journal of Banking and Finance 1997;21:143–67.
[22] Leung MT, Daouk H, Chen AS. Using investment portfolio return to combine forecasts: a multiobjective approach. European Journal of Operational Research 2001;134:84–102.
[23] Wang SY, Xia YS. Portfolio selection and asset pricing. Berlin: Springer; 2002.
[24] Liu SC, Wang SY, Qiu WH. A mean–variance–skewness model for portfolio selection with transaction costs. International Journal of Systems Science 2003;34(4):255–62.
[25] Kennedy MP, Chua LO. Neural networks for nonlinear programming. IEEE Transactions on Circuits and Systems 1988;35(3):554–62.
[26] Broomhead DS, Lowe D. Multivariable functional interpolation and adaptive networks. Complex Systems 1988;2:321–55.
[27] Wedding II DK, Cios KJ. Time series forecasting by combining RBF networks, certainty factors, and the Box-Jenkins model. Neurocomputing 1996;10:149–68.
[28] Loukas YL. Radial basis function networks in host-guest interactions: instant and accurate formation constant calculations. Analytica Chimica Acta 2000;417:221–9.
[29] Zhang XS. Neural networks in optimization. Norwell: Kluwer Academic Publishers; 2000.
[30] Nise NS. Control system engineering. Menlo Park: Benjamin/Cummings; 1992.
[31] Chua LO, Lin GN. Nonlinear programming without computation. IEEE Transactions on Circuits and Systems 1984;31:182–8.
[32] Campbell RH, Siddique A. Conditional skewness in asset pricing models tests. Journal of Finance 2000;55:1263–95.
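
Illustrative note. The skewness and coskewness definitions quoted in the footnote to Table 3, together with the three portfolio moments that the model trades off, can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`coskewness`, `portfolio_moments`) are hypothetical, moments are taken as plain sample averages (dividing by the number of observations), and portfolio skewness is taken as the third central moment of the weighted return series.

```python
from typing import Sequence, Tuple


def mean(xs: Sequence[float]) -> float:
    """Sample average, used here as the estimate of E[.]."""
    return sum(xs) / len(xs)


def coskewness(x: Sequence[float], y: Sequence[float]) -> float:
    """s_iij = E[(X - E(X))^2 (Y - E(Y))]; with y == x this is the
    third central moment (the skewness measure of the Table 3 footnote)."""
    mx, my = mean(x), mean(y)
    return mean([(a - mx) ** 2 * (b - my) for a, b in zip(x, y)])


def portfolio_moments(
    returns: Sequence[Sequence[float]],  # one row per period, one column per asset
    weights: Sequence[float],            # investment proportions, assumed to sum to 1
) -> Tuple[float, float, float]:
    """Return (mean, variance, third central moment) of the portfolio return."""
    # Portfolio return in each period is the weighted sum of asset returns.
    port = [sum(w * r for w, r in zip(weights, row)) for row in returns]
    m = mean(port)
    variance = mean([(p - m) ** 2 for p in port])
    skewness = mean([(p - m) ** 3 for p in port])
    return m, variance, skewness
```

With a weight vector such as one of the rows of Table 6 and a matrix of historical asset returns, `portfolio_moments` would give the three quantities that the risk preference triple weighs against one another. The return data needed for that are not reproduced here, so no numerical output is shown.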