Computers & Operations Research 35 (2008) 34 – 46
www.elsevier.com/locate/cor
Neural network-based mean–variance–skewness model for
portfolio selection☆
Lean Yu a,b, Shouyang Wang a,b,c, Kin Keung Lai c,d,∗
a Institute of Systems Science, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China
b School of Management, Graduate School of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100039, China
c College of Business Administration, Hunan University, Changsha 410082, China
d Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Available online 23 March 2006
Abstract
In this study, a novel neural network-based mean–variance–skewness model for optimal portfolio selection is proposed integrating
different forecasts and trading strategies, as well as investors’ risk preference. Based on the Lagrange multiplier theory in optimization
and the radial basis function (RBF) neural network, the model seeks to provide solutions satisfying the trade-off conditions of
mean–variance–skewness. The feasibility of the RBF network-based mean–variance–skewness model is verified with a simulation
experiment. The experimental results show that, for all examined investor risk preferences and investment assets, the proposed model
is a fast and efficient way of solving the trade-off in the mean–variance–skewness portfolio problem. In addition, we find that
the proposed approach can be used as an alternative tool for evaluating various forecasting models.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Mean–variance–skewness model; Portfolio selection; Radial basis function neural network; Forecasting; Trading strategy; Risk
preference
1. Introduction
The mean–variance model originally introduced by Markowitz [1] plays an important and critical role in modern
portfolio theory. Markowitz’s portfolio model is a bi-criteria optimization problem where a reasonable trade-off between
return and risk is considered—minimizing risk for a given level of expected return, or equivalently, maximizing expected
return for a given level of risk. Since Markowitz’s pioneering work [1] was published, the mean–variance model has
revolutionized the way people think about a portfolio of assets. This model has gained widespread acceptance as a
practical tool for portfolio optimization, and numerous later studies have examined the issue of risk diversification.
With the continuous effort of various researchers, Markowitz’s seminal work has been widely extended. Extended
models mainly include the mean semi-variance model [2], mean absolute deviation model [3–5], mean target model
[6] and some frictional models, such as those reported in [2,7–14]. A distinct characteristic of these studies is that only
the first two moments of return distributions are taken into account. But there is a controversy over the issue of whether
higher moments should be accounted for in portfolio selection. Many academic researchers (see, e.g., Arditti [15,16],
Samuelson [17], Rubinstein [18], and Konno and Suzuki [19]) argued that the higher moments cannot be neglected
unless there is a reason to believe that the asset returns are normally distributed or the utility function is quadratic, or
that the higher moments are irrelevant to the investor’s decision. As a result, in some recent studies, such as [19–24],
the concept of mean–variance trade-off has been extended to include the skewness of return in portfolio selection. In
other words, a mean–variance–skewness trade-off model for portfolio selection has been generated.
☆ This study is partially supported by NSFC, CAS and SRG of City University of Hong Kong.
∗ Corresponding author. Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong.
Tel.: +852 27888563; fax: +852 27888560.
E-mail address: mskklai@cityu.edu.hk (K.K. Lai).
0305-0548/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cor.2006.02.012
One problem with the mean–variance–skewness trade-off model for portfolio selection is that it is not easy to
find a trade-off between the three objectives because this is a nonsmooth multi-objective optimization problem. Until
now, many methods used to tackle this problem have been restricted to goal programming or linear programming
techniques. For example, Lai [20] gave a goal programming procedure that performs portfolio selection based on
competing and conflicting objectives by maximizing both expected return and skewness while minimizing the risk
associated with the return (i.e., variance). Similarly, Leung et al. [22] provided a goal programming algorithm to solve
a mean–variance–skewness model with the aid of the general Minkovski distance. Diverging from previous studies,
Wang and Xia [23] transformed the mean–variance–skewness model into a parametric linear programming problem
by maximizing the skewness under given levels of mean and variance. Likewise, Liu et al. [24] also transformed the
mean–variance–skewness model with transaction costs into a linear programming problem and verified its efficiency
via a numerical example. However, the main disadvantage of these algorithms is that they generally converge slowly, if
at all [25]. Furthermore, most existing studies only present some numerical examples with artificial data. Therefore, we
try to introduce a new artificial intelligence technique and propose a fast and efficient radial basis function (RBF) neural
network-based methodology to solve the trade-off of the mean–variance–skewness model from a new perspective. The
reason for choosing the RBF network model is that this network itself contains trained weight matrices that combine
the different objectives; thus the RBF model is appropriate for our problem.
In addition, most portfolio selection models in the literature only consider the distribution properties of investment
returns; other factors, such as investors’ risk preferences and trading strategies, are not taken into account. Unlike others,
our study considers these important factors. Furthermore, our study also investigates the effect of various forecasts of
the portfolio selection return distribution. Because investment decisions are oriented towards the future, realized returns
are of no use here. In a sense, our proposed model is an integrated intelligent model. In order to verify the feasibility
of the proposed model, a simulation study is performed.
In summary, the primary focuses of this study are to propose an integrated neural network-based mean–variance–
skewness model for portfolio optimization based on investors’ risk preferences, different forecasts and trading schemes,
and to provide empirical evidence of the performance of our proposed approach. The remainder of this paper is organized
as follows. In Section 2, an RBF network model is described briefly. In Section 3, a new and integrated RBF network-based portfolio selection model is proposed to realize mean–variance–skewness trade-off among three competing
and conflicting objectives. To verify the feasibility and efficiency of the proposed approach, an empirical example is
presented in Section 4. Finally, Section 5 concludes the paper.
2. Brief description of the radial basis function (RBF) neural network
An extremely powerful neural network type is the RBF neural network, which differs strongly from the multilayer
perceptron (MLP) network both in the activation functions and in how it is used [26,27]. Generally, an RBF network
can be regarded as a feed-forward network composed of three layers of neurons with different roles. The first layer is
the input layer, and this feeds the input data to each of the nodes in the second or hidden layer. The nodes of second
layer differ greatly from other neural networks in that each node has a Gaussian function as the nonlinearity processing
element. The third and final layer is linear, supplying each network response as a linear combination of the hidden
responses. It acts to sum the outputs of the second layer of nodes to yield the decision value. A graphical representation
of an RBF neural network with an output node is shown in Fig. 1.
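[Figure 1 (diagram): inputs a1, a2, ..., ak in the input layer; hidden nodes C1, C2, ..., Cn in the hidden layer; weights W1, W2, ..., Wn and bias W0 leading to the single output b in the output layer.]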
Fig. 1. An RBF neural network with one output.
As can be seen from Fig. 1, an RBF neural network can compute a decision function from a given input. The RBF
network receives a k-dimensional input vector ā and outputs a scalar value using the general formula:
$$b = f(\bar{a}) = w_0 + \sum_{i=1}^{n} w_i\, g(a_i), \tag{1}$$
where w0 is a bias and wi (i = 1, 2, . . . , n) are weight values, n represents the number of nodes in the hidden layer, ai
is input data, and g(·) is a Gaussian function with center ci and radius r, i.e.,
$$g(a_i) = \exp\left(-\sum_{j=1}^{k} (a_{ij} - c_i)^2 / 2r^2\right). \tag{2}$$
Usually, in an RBF network, the mean and standard deviation of the input vector are used as the cluster center and
radius, respectively. Once the center and radius have been computed, the output layer can be nonlinearly mapped using
the standard combination technique.
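As a minimal sketch of Eqs. (1) and (2), the forward pass of a one-output RBF network can be written in Python with NumPy. The function name `rbf_output`, the array shapes and the toy numbers are illustrative assumptions and are not taken from the study; each hidden node i is assumed to receive its own k-dimensional input a_i with a scalar center c_i and a shared radius r, following the notation of Eq. (2).

```python
import numpy as np

def rbf_output(a, centers, radius, weights, bias):
    """Eq. (1): b = w0 + sum_i w_i * g(a_i), with g taken from Eq. (2).

    a       : (n, k) array, row i is the input a_i seen by hidden node i
    centers : (n,) array of scalar centers c_i
    radius  : positive scalar r
    weights : (n,) array of output weights w_i
    bias    : scalar w0
    """
    a = np.asarray(a, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Eq. (2): g(a_i) = exp(-sum_j (a_ij - c_i)^2 / (2 r^2))
    sq_dist = ((a - centers[:, None]) ** 2).sum(axis=1)
    g = np.exp(-sq_dist / (2.0 * radius ** 2))
    return bias + float(np.dot(weights, g))

# Toy check (made-up numbers): an input lying exactly on its center gives g = 1.
b = rbf_output(a=[[0.5, 0.5]], centers=[0.5], radius=1.0, weights=[2.0], bias=0.1)
```

In this toy call the hidden activation equals 1, so the output reduces to the bias plus the single weight.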
The advantages of RBF networks over MLP networks are: (a) an RBF network does not get stuck in local minima;
(b) an RBF network generally has a simple architecture consisting of two layers of weights, in which the first layer
contains the parameters of the basis functions and the second layer forms linear combinations of the activations of
the basis functions to generate the outputs. On the other hand, MLP networks often have many layers of weights and
a complex connectivity pattern, but not all possible weights are present in any layer; (c) the training time is much
shorter for an RBF network (up to 1000 times faster than back propagation); (d) no momentum coefficient is needed
for an RBF network; (e) exemplars (cases) that are far from decision boundaries have little influence in RBF networks,
while in an MLP network they influence the training; and (f) an RBF network does not become saturated during
training [26–28].
3. Proposed approach for portfolio selection
3.1. Formulation of the mean–variance–skewness model
Previous studies [19–24] revealed that maximizing the skewness of return could efficiently improve performance
of the traditional Markowitz mean–variance portfolio model. That is, we can obtain a better portfolio by maximizing expected return and skewness of return and minimizing the variance of return simultaneously. Typically,
the mean–variance–skewness model can be represented by
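$$
(P_1)\quad
\begin{cases}
\text{Maximize } R(x) = X^{T}\bar{R} = \sum_{i=1}^{n} x_i \bar{R}_i \quad \left(\bar{R}_i = \sum_{t=1}^{p} R_{it}/p\right),\\[4pt]
\text{Minimize } V(x) = X^{T} V X = \sum_{i=1}^{n} x_i^{2}\sigma_i^{2} + \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j \sigma_{ij} \quad (i \neq j),\\[4pt]
\text{Maximize } S(x) = E\left(X^{T}(R-\bar{R})\right)^{3} = \sum_{i=1}^{n} x_i^{3} s_i^{3} + 3\sum_{i=1}^{n}\left(\sum_{j=1}^{n} x_i^{2} x_j s_{iij} + \sum_{j=1}^{n} x_i x_j^{2} s_{ijj}\right) \quad (i \neq j),\\[4pt]
\text{subject to } X^{T} I = 1,
\end{cases}
\tag{3}
$$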
where σ_i^2 and σ_ij are the variance and covariance of the excess returns based on various forecasting techniques i, whereas
s_i^3, s_iij, and s_ijj are the skewness and coskewness of the excess returns based on each forecasting method i, respectively.
X is the proportion invested in various assets when the best trade-off is found. It is noted that, in this study, negative x
represents a short sale.
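The three objectives of (P1) can be estimated directly from a sample of excess returns. The sketch below is only an illustration: the function name `portfolio_moments` and the use of population (1/p) moments are assumptions, and the toy data are made up rather than taken from the study.

```python
import numpy as np

def portfolio_moments(returns, x):
    """Sample estimates of R(x), V(x) and S(x) from problem (P1).

    returns : (p, n) array, p periods of excess returns on n assets
    x       : (n,) array of portfolio proportions
    """
    returns = np.asarray(returns, dtype=float)
    x = np.asarray(x, dtype=float)
    r_bar = returns.mean(axis=0)      # mean excess return of each asset
    dev = (returns - r_bar) @ x       # X^T (R - R_bar) in each period
    mean = float(x @ r_bar)           # R(x) = X^T R_bar
    var = float(np.mean(dev ** 2))    # V(x) = X^T V X (1/p normalization)
    skew = float(np.mean(dev ** 3))   # S(x) = E[(X^T (R - R_bar))^3]
    return mean, var, skew

# Toy single-asset sample: deviations are (-1, -1, 2).
m, v, s = portfolio_moments([[0.0], [0.0], [3.0]], [1.0])
```

Computing the moments from the portfolio deviations avoids building the full coskewness array while giving the same value as the expanded sums in Eq. (3).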
According to the theory of multi-objective optimization, a general way to solve the above multi-objective programming problem (P1 ) is to consolidate the various objectives into a single objective function. That is, the efficient solution
can be generated by solving the following nonlinear programming problem (P2 ):
$$
(P_2)\quad
\begin{cases}
\text{Minimize } Z(x) = -\lambda_1 (X^{T}\bar{R}) + \lambda_2 (X^{T} V X) - \lambda_3 \left(E(X^{T}(R-\bar{R}))^{3}\right)\\
\text{subject to } X^{T} I = 1,
\end{cases}
\tag{4}
$$
where λ1, λ2, λ3 can be interpreted as the risk aversion factor or risk preference of investors and are associated with the
objectives R(x), V (x) and S(x), respectively.
However, problem (P2 ) is not easily solved because it is a nonlinear programming problem with constraints. Generally,
this problem (P2 ) is solved using nonlinear programming techniques. But the process of solving nonlinear programming
problems is rather difficult. Existing methods, including goal programming [20,22] and linear programming [23,24]
approaches, are very complex and time consuming. In order to solve the nonlinear programming problem (P2 ) efficiently,
we therefore introduce a neural network-based optimization approach.
3.2. Neural network modeling for mean–variance–skewness optimization
A neural network is a large dimensional nonlinear dynamic system composed of ‘neurons’. From the viewpoint of a
dynamic system, the final behavior of the system is fully determined by its attraction points, if they exist. For a stable
system, if certain inputs are given, the system will reach a stable attraction point. This point is seen as the optimum
solution of some practical problems, and the evolution process through which the neural network reaches the stable
state from any initial state is just a process of seeking optimization of the objective function within the active domain.
Therefore, the key to designing a neural network is in seeing how to set the corresponding relationships between the
problem and the network’s stable attraction points. Now, we consider the problem (P2 ): to design a neural network
algorithm for multi-objective optimization. From (P2 ), the Lagrange function can be defined as
$$L(X, \mu) = -\lambda_1 (X^{T}\bar{R}) + \lambda_2 (X^{T} V X) - \lambda_3 \left(E(X^{T}(R-\bar{R}))^{3}\right) + \mu (X^{T} I - 1), \tag{5}$$
where μ is the Lagrange multiplier.
According to the classical limit theory, the necessary condition for X∗ to be the optimum solution of (P2 ) is that
there exists μ∗ such that (X∗, μ∗) satisfies the following relations:
$$
\begin{aligned}
\nabla_x L(X^{*}, \mu^{*}) &= -\lambda_1 \bar{R} + 2\lambda_2 V X^{*} - 3\lambda_3 (R-\bar{R})(X^{*T} V X^{*}) + I^{T}\mu^{*} = 0,\\
\nabla_\mu L(X^{*}, \mu^{*}) &= X^{*T} I - 1 = 0.
\end{aligned}
\tag{6}
$$
[Figure 2 (flowchart) with the following boxes: Various forecasts; Trade strategies; Sub-objective optimization (PT); Sub-objective aspired level; Investors' risk preference; Multi-objective optimization (RBFN); RBFN incremental learning; Results comparison; Minimal result; Investment decision; Investor feedback.]
Fig. 2. The overall procedure for the integrated approach to portfolio selection.
Our aim now is to design a neural network that will settle down to an optimum point satisfying Eq. (6). The transient
behaviors of the neural network are defined by the following equations:
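$$
\begin{cases}
\dfrac{dX}{dt} = -\nabla_x L(X^{*}, \mu^{*}) = \lambda_1 \bar{R} - 2\lambda_2 V X^{*} + 3\lambda_3 (R-\bar{R})(X^{*T} V X^{*}) - I^{T}\mu^{*},\\[6pt]
\dfrac{d\mu}{dt} = \nabla_\mu L(X^{*}, \mu^{*}) = X^{*T} I - 1.
\end{cases}
\tag{7}
$$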
Using nonlinear programming theory [25,29] and the techniques of system realization [30,31], we find that if
the network is physically stable, the attraction point or optimum point (X∗, μ∗), described by (dX/dt)|(X∗,μ∗) = 0
and (dμ/dt)|(X∗,μ∗) = 0, clearly meets Eq. (6) and thus provides a Lagrange solution to (P2 ). Thus, such a nonlinear
programming problem with constraints can be solved by neural network learning. A detailed procedure is described
in the following.
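To illustrate how dynamics of the kind in Eq. (7) settle at a point satisfying Eq. (6), the sketch below integrates them with a forward-Euler scheme. It is restricted to the benchmark mean–variance case (λ3 = 0), so the gradient is linear and the result can be checked against a direct solution of the stationarity conditions. The function names, the step size, the iteration count and the toy values of R̄ and V are all assumptions made for illustration; the gradient is evaluated at the current state (X, μ).

```python
import numpy as np

def lagrangian_flow(r_bar, cov, lam1=1.0, lam2=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dX/dt = -grad_x L, dmu/dt = grad_mu L
    for L(X, mu) = -lam1 X^T R_bar + lam2 X^T V X + mu (X^T I - 1)."""
    n = len(r_bar)
    x = np.full(n, 1.0 / n)   # start from equal weights
    mu = 0.0
    ones = np.ones(n)
    for _ in range(steps):
        grad_x = -lam1 * r_bar + 2.0 * lam2 * cov @ x + mu * ones
        grad_mu = x @ ones - 1.0
        x = x - dt * grad_x       # descend in X
        mu = mu + dt * grad_mu    # ascend in the multiplier
    return x, mu

def kkt_solution(r_bar, cov, lam1=1.0, lam2=1.0):
    """Solve the stationarity conditions of Eq. (6) directly (lam3 = 0)."""
    n = len(r_bar)
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * lam2 * cov
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.append(lam1 * r_bar, 1.0)
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n]

# Toy two-asset problem (made-up numbers, not the paper's data).
r_bar = np.array([0.1, 0.2])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
x_flow, mu_flow = lagrangian_flow(r_bar, cov)
x_kkt, mu_kkt = kkt_solution(r_bar, cov)
```

For this convex toy problem the flow and the direct solve agree; with λ3 > 0 the problem is no longer convex in general, and convergence of a plain Euler scheme is not guaranteed by this sketch.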
3.3. An RBF-based mean–variance–skewness model for portfolio selection
In this section, an integrated RBF network-based mean–variance–skewness model for portfolio selection is proposed
considering risk preference, trading strategies and various forecasts, as shown in Fig. 2. The overall procedure for the
proposed integrated approach is described as follows.
Various forecasts: As noted earlier, investment decisions are oriented towards the future; realized returns are of no
use. Therefore, we use different models to predict the excess return for different investment assets.
Trading strategies: Based on various forecasts, corresponding trading strategies for portfolio selection can be made.
For brevity, we use a relatively simple trading strategy to guide the decision in this study. Of course, more complex
trading strategies, such as the probability rule, can also be used. Let R̂k,t+1 denote the excess return for next period
(i.e., t + 1) forecast for different assets. The trading rule for asset component k can be summarized below:
Rule I: If (R̂k,t+1 > 0) then “buy”, else “sell”.
If there are no short sales, then the trading rule is changed as follows:
Rule II: If (R̂k,t+1 > 0) then “buy”, else “no action is taken”.
In this study, we include an extra constraint to prohibit short sales and borrowings; hence Rule II is used here.
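Rules I and II can be written as one small function. This is a sketch only: the function name `trading_signal`, the flag `allow_short_sales` and the forecast values in the usage lines are hypothetical.

```python
def trading_signal(forecast_excess_return, allow_short_sales=False):
    """Rule I (short sales allowed) or Rule II (no short sales).

    forecast_excess_return : forecast of the next-period excess return
    """
    if forecast_excess_return > 0:
        return "buy"
    # A non-positive forecast: sell under Rule I, do nothing under Rule II.
    return "sell" if allow_short_sales else "no action"

# Hypothetical forecasts for two assets under Rule II.
signals = [trading_signal(f) for f in (0.0012, -0.0020)]
```

Under Rule II, an asset whose forecast excess return is not positive is simply left out of the candidate set.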
Sub-objective optimization: In the light of the results from various forecasts and trading strategies, we optimize for
three different sub-objectives—mean, variance and skewness—individually. That is, problem (P1 ) can be decomposed
into three sub-problems (P3 )–(P5 ) as follows. The optimal results obtained from (P3 )–(P5 ) are considered to be aspired
levels, and are denoted by R̃, Ṽ and S̃. As can be seen, the aspired levels indicate the best case scenario for a particular
objective without considering other objectives.
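$$(P_3)\quad \text{Maximize } R(x) = X^{T}\bar{R} \quad \text{subject to } X^{T} I = 1, \tag{8}$$
$$(P_4)\quad \text{Minimize } V(x) = X^{T} V X \quad \text{subject to } X^{T} I = 1, \tag{9}$$
$$(P_5)\quad \text{Maximize } S(x) = E\left(X^{T}(R-\bar{R})\right)^{3} \quad \text{subject to } X^{T} I = 1. \tag{10}$$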
It is interesting to note that the solution to sub-problem (P3 ) is trivial because linear programming will always assign
all weights to the highest yielding asset when other constraints are ignored. In order to avoid this situation, we adjust
the constraints according to the requirements of practical applications, such as x_1 ≤ 0.1 if R_1 ≫ 0. On the other hand,
solving sub-problems (P4 ) and (P5 ) does provide meaningful solutions which lead to tighter bounds.
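The contrast between the trivial solution of (P3) and the more informative solution of (P4) can be sketched as follows. The mean returns are the RW row of Table 1; the 2 × 2 covariance matrix in the test of (P4) is made up. Two assumptions are made here and are not stated in this form in the text: (P3) is solved with x ≥ 0 (no short sales), and (P4) is solved with the budget constraint only, which admits the closed form x = V⁻¹I / (IᵀV⁻¹I) and may produce negative weights. The function names are hypothetical.

```python
import numpy as np

ASSETS = ["S&P500", "FTSE100", "Nikkei225", "USD/EUR", "USD/GBP", "USD/JPY"]
# Average excess returns under the random walk (RW) model, from Table 1.
RW_MEAN = np.array([0.000338, -0.00010, 0.001351, -0.00056, -0.00080, -0.00204])

def max_return_portfolio(mean_returns):
    """(P3) with x >= 0 and sum(x) = 1: all weight goes to the best asset."""
    x = np.zeros(len(mean_returns))
    x[int(np.argmax(mean_returns))] = 1.0
    return x, float(x @ mean_returns)

def min_variance_portfolio(cov):
    """(P4) with only the budget constraint: x = V^-1 1 / (1^T V^-1 1)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    x = w / w.sum()
    return x, float(x @ cov @ x)

x3, r_aspired = max_return_portfolio(RW_MEAN)
x4, v_aspired = min_variance_portfolio(np.array([[2.0, 0.5], [0.5, 1.0]]))
```

With the RW means, the whole budget goes to the Nikkei 225 index, which is exactly the degenerate allocation that the adjusted constraints are meant to avoid.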
Multi-objective optimization: In this step, we use the RBF network to solve problem (P1 ) as a nonlinear combining
optimization form. That is, consensus results of multi-objective optimization are obtained from the RBF network. The
concrete process of multi-objective optimization is described below.
For convenience, we firstly make a few notations as follows: Ri , Vi and Si (i = 1, 2, . . . , n) are the mean, variance
and skewness of excess returns of every asset; xi (i = 1, 2, . . . , n) represents the investment proportion of different
assets; R̃, Ṽ and S̃ are three sub-objective aspired levels defined in the previous stages, and are considered to be the
centers of the RBF network; λ1, λ2, and λ3 are the investors’ risk preferences over the mean, variance and skewness
of excess return; and rR , rV and rS are the radii of the RBF network obtained from the standard deviation of mean,
variance, and skewness of returns.
According to the theory in Sections 2, 3.1 and 3.2, problem (P2 ) can be reformulated by
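$$
(P_6)\quad \text{Minimize } f(x) = -\lambda_1 \exp\left(-\frac{\sum_{i=1}^{n} (x_i R_i - \tilde{R})^{2}}{2 r_R^{2}}\right) + \lambda_2 \exp\left(-\frac{\sum_{i=1}^{n} (x_i V_i - \tilde{V})^{2}}{2 r_V^{2}}\right) - \lambda_3 \exp\left(-\frac{\sum_{i=1}^{n} (x_i S_i - \tilde{S})^{2}}{2 r_S^{2}}\right) + \mu\left(\sum_{i=1}^{n} x_i - 1\right).
\tag{11}
$$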
Using the results of Section 3.2, the nonlinear programming problem (P6 ) can be solved using a neural network. In
this study, we use the RBF network to reach the solution. For further explanation, the RBF architecture in connection
with (P6 ) is illustrated in Fig. 3.
From the problem (P6 ) and Fig. 3, the consensus result can be obtained by considering the mean–variance–skewness
simultaneously. As indicated in (P6 ), the smaller the objective function is, the better the solution is. Thus we can select
the best portfolio according to the minimal output. Accordingly, the optimal weight set X = {x1 , x2 , . . . , xn } can be
obtained from the RBF network.
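The consensus value of Eq. (11) for a candidate weight vector can be evaluated as in the sketch below. The function name `rbf_consensus`, the argument layout and the toy inputs are assumptions for illustration; the multiplier is passed in as a fixed number, and the search over x that the RBF network performs is not shown.

```python
import numpy as np

def rbf_consensus(x, R, V, S, aspired, radii, lam=(1.0, 1.0, 1.0), mu=0.0):
    """Objective f(x) of (P6), Eq. (11), for a candidate weight vector x.

    R, V, S : per-asset mean, variance and skewness of excess returns
    aspired : (R_tilde, V_tilde, S_tilde), the aspired levels (RBF centers)
    radii   : (r_R, r_V, r_S), the RBF radii
    lam     : (lam1, lam2, lam3), the investor's risk preferences
    mu      : multiplier on the budget constraint sum(x) = 1
    """
    x, R, V, S = (np.asarray(v, dtype=float) for v in (x, R, V, S))
    lam1, lam2, lam3 = lam

    def gauss(values, center, radius):
        # exp(-sum_i (x_i * value_i - center)^2 / (2 radius^2))
        return np.exp(-np.sum((x * values - center) ** 2) / (2.0 * radius ** 2))

    return (-lam1 * gauss(R, aspired[0], radii[0])
            + lam2 * gauss(V, aspired[1], radii[1])
            - lam3 * gauss(S, aspired[2], radii[2])
            + mu * (x.sum() - 1.0))

# Toy single-asset check: every term sits exactly on its center.
f = rbf_consensus([1.0], [0.5], [0.2], [0.1],
                  aspired=(0.5, 0.2, 0.1), radii=(1.0, 1.0, 1.0))
```

Candidate portfolios produced under different forecasts can then be ranked by this value, with the smallest one preferred.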
Results comparison: The “results comparison” is a logic frame. According to the different forecasts and trading
strategies, we can obtain different consensus results. From these consensus results, we can find the minimum of
the consensus results. In accordance with this minimum, investors can choose the best investment portfolio as their
investment decision.
Investment decision: Investors can make their corresponding decisions using the best investment portfolio from the
minimal consensus result.
Investor feedback: In the process of the investment decision, if it is not appropriate to use the optimal results in
practical application, investors can send the information back to the RBF network for incremental learning.
RBFN incremental learning: This incrementally trains the RBFN according to investor feedback, together with the
results of various forecasts, trading strategies and sub-objective optimization to receive other optimal consensus results.
The above procedure can be summarized as follows:
Step 1: Make various predictions using different forecasting models.
Step 2: Select the corresponding assets according to the trading rules.
Step 3: Using programming techniques (PT), obtain three optimal values of mean, variance and skewness individually.
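[Figure 3 (network diagram): inputs X1, ..., Xn; connection weights R1, ..., Rn, V1, ..., Vn and S1, ..., Sn into hidden nodes centered at R̃, Ṽ and S̃; hidden-to-output weights −λ1, λ2 and −λ3; summed output Y.]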
Fig. 3. A practical network architecture of RBF for problem (P6 ).
Step 4: RBFN optimizes the three objectives simultaneously and consensus results can be obtained.
Step 5: Investors can make their decisions from the consensus results or feasible sets according to the minimum
principle. Investors can choose the optimal investment portfolio as their investment decision. Meanwhile, if the optimal
solution cannot be applied practically, investors can feed the information back to the RBF network. In addition, the
performance of different forecasting models can also be evaluated.
Step 6: RBF neural network incrementally trains according to the feedback.
Step 7: Go to Step 4 with the adjusted RBF network.
Steps 4–7 are the essential and important parts of the process based on the neural network.
An additional condition, i.e., the initial principal allocation problem, needs to be satisfied in the portfolio decision.
Suppose the total principal for investment at the beginning of the period is fixed at T dollars. The asset allocation to
different portions of the portfolio is T x_i such that
$$\sum_{i} T x_i = T. \tag{12}$$
Thus, the mean–variance–skewness portfolio optimization problem is solved in the integrated RBF neural network-based optimization framework. Experiments are then needed to validate the proposed approach.
4. Simulation experiments
4.1. Data description and experiment design
As described earlier, our analysis uses various forecasts and trading strategies to guide
investment. In this study, we adopt four forecasting models, namely the random walk (RW) model, adaptive exponential smoothing
(AES) model, autoregressive integrated moving average (ARIMA) model and multilayer feed-forward neural network
(MLFNN) model, to predict these financial series. To test the versatility and robustness of the proposed approach for
portfolio selection, three globally-traded stock market indices (S&P500 for the US, FTSE100 for the UK, and Nikkei
225 for Japan) and three globally traded foreign exchanges (euros (EUR), British pounds (GBP) and Japanese yen (JPY))
against the US dollar (USD) are examined in our empirical experiment. The stock indices and exchange data used in
this paper are daily and are obtained from Datastream and Pacific Exchange Rate Service (http://fx.sauder.ubc.ca/),
respectively. The entire data set covers the period from January 1, 2000 to September 30, 2003. The data sets are divided
into two periods: the first period covers January 1, 2000 to July 31, 2003 while the second period is from August 1,
2003 to September 30, 2003. The first period, which is assigned to in-sample estimation, is used to determine the
specifications of the models and parameters for the various forecasting techniques. The second period is reserved for
out-of-sample evaluation and comparison of performance between various forecasting models. For brevity, the original
data are not listed in the paper, and detailed data can be obtained from the sources.

Table 1
Average excess return of index and exchange

Average rate of excess return
Model    S&P500      FTSE100     Nikkei225   USD/EUR     USD/GBP     USD/JPY
RW       0.000338    −0.00010    0.001351    −0.00056    −0.00080    −0.00204
AES      0.000638    0.000377    0.001350    0.000126    −0.00023    −0.00204
ARIMA    2.65E−05    −0.00071    0.001488    −0.00047    −0.00074    −0.00193
MLFNN    0.000707    0.000982    0.002061    −0.00033    −0.00053    −0.00078
For the sake of facilitating forecast and portfolio optimization, we choose the daily excess returns of these indices
and exchange rates as forecast variables. As shown in Eq. (13), the excess return on a certain index or exchange rate is
defined as the continuously compounded return on the price minus the risk-free interest rate:
$$R_t = \log\left(P_t / P_{t-1}\right) - r_{t-1}, \tag{13}$$
where Pt is the price of the stock index or exchange rate traded at time t and rt is the risk-free interest rate at time t. The
reason for choosing the excess returns (rather than the index or exchange rate) is that they can provide a measurement of
how well our models perform relative to the minimum returns gained from depositing the money in a risk-free account.
Furthermore, the forecasting results for excess returns can be used directly by the portfolio selection model, which is
convenient for investors.
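The excess-return series can be computed from a price series in a few lines. The sketch reads the continuously compounded return as log(P_t / P_{t−1}), which is an assumption about the intended definition; the function name `excess_returns` and the toy inputs are likewise illustrative.

```python
import numpy as np

def excess_returns(prices, risk_free):
    """R_t = log(P_t / P_{t-1}) - r_{t-1} for t = 1, ..., T.

    prices    : (T + 1,) array of prices P_0, ..., P_T
    risk_free : (T + 1,) array of per-period risk-free rates r_0, ..., r_T
    """
    prices = np.asarray(prices, dtype=float)
    risk_free = np.asarray(risk_free, dtype=float)
    log_ret = np.diff(np.log(prices))   # log(P_t) - log(P_{t-1})
    return log_ret - risk_free[:-1]     # subtract the lagged rate r_{t-1}

# Toy series (made-up numbers): log returns are 1 and 0.
r = excess_returns([1.0, np.e, np.e], [0.5, 0.25, 0.0])
```

The last risk-free observation is not used, because each return is matched with the rate known at the start of its period.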
4.2. Experiment results
The empirical experiment performed in this study consists of five main stages that are partially described in earlier
sections. In the first stage, we estimate the forecasts of a particular stock market index and foreign exchange rate using
different forecasting models. In the light of the previous plan, 2-month period forecasts of daily returns can be obtained.
Based on these forecasts, the distributional properties—i.e., mean, variance and skewness of the excess return on index
and exchange rate—are computed. Table 1 shows the average excess return on index and exchange trading using the
forecasts supplied by each forecasting technique.
In addition, the second and third moments are also computed for the forecasts made within the in-sample period. The
estimated variance–covariance and skewness/coskewness matrices are shown in Tables 2 and 3 for each stock index
or foreign exchange rate. Since the variance/covariance represents the square term while the skewness/coskewness is
a cube term, the magnitudes of these descriptive statistics are quite small.
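The coskewness terms s_iij and s_ijj can be estimated from a return sample with two matrix products. This is a sketch under assumptions: the function name `coskewness_matrices` is hypothetical, population (1/p) moments are used, and no standardization is applied, so the numbers it returns need not be on the same scale as those reported in Table 3.

```python
import numpy as np

def coskewness_matrices(returns):
    """Return (s_iij, s_ijj) as (n, n) arrays of third central co-moments.

    returns : (p, n) array, p periods of excess returns on n assets
    """
    d = np.asarray(returns, dtype=float)
    d = d - d.mean(axis=0)          # deviations from each asset's mean
    p = d.shape[0]
    s_iij = (d ** 2).T @ d / p      # entry [i, j] = E[d_i^2 d_j]
    s_ijj = d.T @ (d ** 2) / p      # entry [i, j] = E[d_i d_j^2]
    return s_iij, s_ijj

# Toy sample with two identical assets: every entry equals the skewness term.
a, b = coskewness_matrices([[0.0, 0.0], [0.0, 0.0], [3.0, 3.0]])
```

The diagonals of both arrays hold the own third central moments s_i^3, and the two arrays are transposes of each other.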
In the second stage, according to the previous trading rules, we can select the corresponding assets for further
analysis.
In the third stage, based on the statistical moments mentioned in the first stage, the aspired levels R̃, Ṽ and S̃ are
found by solving the problems (P3 ), (P4 ), and (P5 ) using the PT, respectively. The results are reported in Table 4.
In the fourth stage, based on the previous results and various forecasting techniques as well as the different risk
preferences, the proportion of the portfolio can be obtained using the RBF neural network, as indicated in Eq. (11).
For exposition purposes, the optimal weight sets used to construct the portfolio given the investors’ preference for
(λ1 = λ2 = λ3 = 1) are reported in Table 5. It is worth noting that preference set (1, 1, 1) is a compromise case where
the weights for mean, variance and skewness are equivalent.
It should be noted that an important factor, investors’ risk preference, is not changed in the previous stages. In order
to verify the sensitivity of the proposed approach to changes in the investors’ risk preference (λ1, λ2, λ3), different
levels of risk preference are investigated. Specifically, risk preferences of (1, 1, 1), (1, 2, 2), (2, 2, 1), (1, 2, 1), (2, 1, 1)
and (1, 1, 0) are included in our experiment. The results based on the (1, 1, 1) preference structure indicate that mean,
variance and skewness of return are of equal importance to the investor, as indicated in Table 5. Likewise, the results
based on the (2, 1, 1) preference structure show that the investors are willing to pursue more excess returns regardless
of risk level while those of (1, 2, 2) and (1, 2, 1) give more emphasis to risk control. (1, 1, 0) is a benchmark case,
representing the Markowitz mean–variance portfolio. Detailed results are shown in Table 6 below.

Table 2
Variance–covariance matrices of the distributions of returns with different models

Various assets   S&P500       FTSE100      Nikkei225    USD/EUR      USD/GBP      USD/JPY

Panel A: Random walk (RW) model
S&P500           0.0000633    0.0000174    0.0000230    0.0000144    0.0000078    −0.000006
FTSE100          0.0000174    0.0000518    0.0000267    0.0000179    0.0000126    0.0000033
Nikkei225        0.0000230    0.0000267    0.0001800    0.0000226    0.0000168    0.0000123
USD/EUR          0.0000144    0.0000179    0.0000226    0.0000395    0.0000222    0.0000036
USD/GBP          0.0000078    0.0000126    0.0000168    0.0000222    0.0000197    0.0000048
USD/JPY          −0.000006    0.0000033    0.0000123    0.0000036    0.0000048    0.0000226

Panel B: Adaptive exponential smoothing (AES) model
S&P500           0.0000015    0.0000009    0.0000050    0.0000008    0.0000005    −0.0000007
FTSE100          0.0000009    0.0000013    0.0000064    0.0000006    0.0000008    0.0000007
Nikkei225        0.0000050    0.0000064    0.0001800    0.0000064    0.0000074    0.0000121
USD/EUR          0.0000008    0.0000006    0.0000064    0.0000037    0.0000026    0.0000016
USD/GBP          0.0000005    0.0000008    0.0000074    0.0000026    0.0000027    0.0000021
USD/JPY          −0.0000007   0.0000007    0.0000121    0.0000016    0.0000021    0.0000198

Panel C: Autoregressive integrated moving average (ARIMA) model
S&P500           0.0000659    0.0000279    0.0000241    0.0000189    0.0000064    −0.000009
FTSE100          0.0000279    0.0001600    0.0000293    0.0000298    0.0000142    0.0000003
Nikkei225        0.0000241    0.0000293    0.0001780    0.0000275    0.0000158    0.0000138
USD/EUR          0.0000189    0.0000298    0.0000275    0.0000457    0.0000208    0.0000037
USD/GBP          0.0000064    0.0000142    0.0000158    0.0000208    0.0000197    0.0000054
USD/JPY          −0.0000009   0.0000003    0.0000138    0.0000037    0.0000054    0.0000242

Panel D: Multilayer feed-forward neural network (MLFNN) model
S&P500           0.0000486    0.0000191    0.0000243    0.0000102    0.0000028    −0.0000050
FTSE100          0.0000191    0.0000776    0.0000267    0.0000096    0.0000086    0.0000049
Nikkei225        0.0000243    0.0000267    0.0001740    0.0000300    0.0000091    0.0000029
USD/EUR          0.0000102    0.0000096    0.0000300    0.0000379    0.0000180    −0.0000034
USD/GBP          0.0000028    0.0000086    0.0000091    0.0000180    0.0000257    0.0000021
USD/JPY          −0.0000050   0.0000049    0.0000029    −0.0000034   0.0000021    0.0000228
In the fifth and final stage, as a result of minimization principle (as indicated in Eq. (11)), the minimal value of
consensus results can be found. Accordingly, the best portfolio proportion and the best forecasting model can be
identified. The results of four different forecasting models are shown in Table 7.
From Table 7, we find that the result obtained from the last forecasting model is the best at any level of risk preference
except for risk preference level (1, 1, 0). Accordingly, the corresponding weight sets of different risk preference level
yield the optimal investment portfolio. For example, for the case of risk preference level (1, 2, 2), the optimal proportion
of six different assets is vector (0.0872 0.4412 0.0709 0.3498 0.0000 0.0059); while for the case of risk preference
level (2, 2, 1), the best proportion of six assets is vector (0.2861 0.2112 0.4922 0.0000 0.0000 0.0105).
Table 3
Skewness–coskewness matrices of distributions of returns with different modelsa
Various assets
S&P500
(sii1 )
FTSE100
(sii2 )
Nikkie225
(sii3 )
USD/EUR
(sii4 )
USD/GBP
(sii5 )
USD/JPY
(sii6 )
−0.183229
0.189065
−0.626094
0.0676663
−0.028704
−0.812647
0.024170
0.095803
−0.256502
0.1391594
0.0018061
−0.564531
0.057413
0.173851
−0.465972
0.152388
−0.299149
−0.710214
−0.303717
−0.178357
−0.684112
−0.278846
−0.632863
−0.933453
Panel B: Adaptive exponential smoothing (AES) model
−0.431241
−0.457729
S&P500 (s1jj )
FTSE100 (s2jj )
−0.397415
−0.304806
−0.069946
−21.78567
Nikkie225 (s3jj )
0.0245973
−0.438029
USD/EUR (s4jj )
−0.389790
−0.561944
USD/GBP (s5jj )
0.0630338
−0.201177
USD/JPY (s6jj )
−0.210677
−0.220709
−0.626094
−0.057571
−0.514118
−0.819557
−0.266502
−0.226864
−0.314988
0.4270127
−0.553485
−1.118498
−0.349488
−0.261407
−0.472046
−0.195935
−0.939281
−0.663008
0.091951
0.029908
−0.780072
−0.300213
−0.445175
−0.933780
Panel C: Autoregressive integrated moving average (ARIMA) model
−0.398190
0.130751
S&P500(s1jj )
−0.008059
−0.035178
FTSE100 (s2jj )
Nikkei225 (s3jj )
0.0374245
0.0801771
0.4110735
0.3196692
USD/EUR (s4jj )
−0.153756
−0.042369
USD/GBP (s5jj )
−0.451169
−0.232998
USD/JPY (s6jj )
0.410521
0.188095
−0.659871
0.1441668
−0.198978
−0.379669
−0.397626
−0.127317
−0.210949
0.2762301
−0.113597
−0.470561
−0.011150
0.036948
−0.078079
0.082136
−0.396387
−0.658468
−0.223926
−0.082209
−0.100065
−0.299440
−0.666417
−0.890108
Panel D: Multilayer feed-forward neural network (MLFNN) model
−0.550176
0.106540
S&P500 (s1jj )
0.0635264
1.1309108
FTSE100 (s2jj )
−0.370498
−0.236815
Nikkei225 (s3jj )
0.1695834
−0.000113
USD/EUR (s4jj )
0.1047806
−0.264383
USD/GBP (s5jj )
−0.038874
−0.396269
USD/JPY (s6jj )
−0.255390
−0.061082
−0.792033
0.0645350
−0.011812
−0.312856
0.055261
−0.089385
−0.403015
0.5021456
−0.352080
−0.373512
0.129211
−0.032107
0.126464
0.241597
−1.239753
−0.525513
−0.151076
0.026677
−0.457580
−0.378057
−0.933087
−0.429079
Panel A: Random walk (RW) model
−0.480883
S&P500 (s1jj )
0.0022131
FTSE100 (s2jj )
−0.353687
Nikkei225 (s3jj )
0.1122376
USD/EUR (s4jj )
0.1110669
USD/GBP (s5jj )
−0.017235
USD/JPY (s6jj )
0.125578
−0.078617
−0.144587
0.2093805
0.0716599
−0.373642
^a The value of skewness and coskewness is calculated by the following equations (Campbell and Siddique [32]):
$\mathrm{Skewness}(X) = s_i^3 = E[(X - E(X))^3]$,
$\mathrm{Coskewness}(X_{ii}, Y_j) = s_{iij} = E[(X - E(X))^2 (Y - E(Y))]$ or
$\mathrm{Coskewness}(X_i, Y_{jj}) = s_{ijj} = E[(X - E(X))(Y - E(Y))^2]$.
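The moment definitions in the footnote to Table 3 can be written out directly. The following is a minimal sketch that assumes the expectations are estimated by plain sample averages (dividing by n) and that the moments are not standardized; the function names are illustrative and do not come from the paper.

```python
def mean(values):
    """Sample average, used here as the estimate of E[.]."""
    return sum(values) / len(values)


def skewness(x):
    """Third central moment E[(X - E(X))^3], not standardized."""
    mx = mean(x)
    return mean([(xi - mx) ** 3 for xi in x])


def coskewness_iij(x, y):
    """s_iij = E[(X - E(X))^2 (Y - E(Y))]."""
    mx, my = mean(x), mean(y)
    return mean([(xi - mx) ** 2 * (yi - my) for xi, yi in zip(x, y)])


def coskewness_ijj(x, y):
    """s_ijj = E[(X - E(X)) (Y - E(Y))^2]."""
    mx, my = mean(x), mean(y)
    return mean([(xi - mx) * (yi - my) ** 2 for xi, yi in zip(x, y)])


if __name__ == "__main__":
    # Hypothetical return series, used only to exercise the functions.
    x = [1.0, 2.0, 3.0, 10.0]
    y = [0.0, 1.0, 0.0, 3.0]
    print(skewness(x), coskewness_iij(x, y), coskewness_ijj(x, y))
```

Note that the two coskewness terms are not symmetric in general, although `coskewness_iij(x, y)` always equals `coskewness_ijj(y, x)`, and both reduce to `skewness(x)` when the two series coincide.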
Table 4
Results for (P3), (P4) and (P5)^a

          Problem 3 (P3)   Problem 4 (P4)   Problem 5 (P5)
          R̃(x)             Ṽ(x)             S̃(x)
RW        0.001351         0.00000545       0.142110
AES       0.001350         0.00000052       0.442101
ARIMA     0.001488         0.00000572       0.268347
MLFNN     0.002061         0.00000468       1.171740

^a Solving (P3), (P4) and (P5) yields the aspired levels, R̃(x), Ṽ(x), S̃(x), respectively. Each aspired level indicates the
best case scenario for a particular objective without considering other objectives. RW represents the random walk
model, AES denotes the adaptive exponential smoothing model, ARIMA represents the autoregressive integrated moving
average model and MLFNN represents the multilayer feed-forward neural network model.
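The idea of an aspired level, i.e., the best value of one objective when the other two are ignored, can be sketched as follows. This is only an illustration under stated assumptions: weights are taken to be non-negative and to sum to one, the return data are hypothetical, and a coarse grid search over the weight simplex stands in for whatever solver is used for (P3), (P4) and (P5). All function names are introduced here for illustration.

```python
from itertools import product


def portfolio_moments(weights, returns):
    """Mean, variance and third central moment of the portfolio return series."""
    series = [sum(w * r for w, r in zip(weights, row)) for row in returns]
    n = len(series)
    mu = sum(series) / n
    var = sum((p - mu) ** 2 for p in series) / n
    skew = sum((p - mu) ** 3 for p in series) / n
    return mu, var, skew


def simplex_grid(n_assets, steps):
    """Yield all weight vectors with entries k/steps (k >= 0) that sum to one."""
    for combo in product(range(steps + 1), repeat=n_assets - 1):
        if sum(combo) <= steps:
            last = steps - sum(combo)
            yield tuple(k / steps for k in combo) + (last / steps,)


def aspired_levels(returns, steps=20):
    """Best return, variance and skewness, each optimized on its own."""
    n_assets = len(returns[0])
    moments = [portfolio_moments(w, returns) for w in simplex_grid(n_assets, steps)]
    best_return = max(m[0] for m in moments)      # analogue of R~(x)
    best_variance = min(m[1] for m in moments)    # analogue of V~(x)
    best_skewness = max(m[2] for m in moments)    # analogue of S~(x)
    return best_return, best_variance, best_skewness


# Hypothetical return observations (rows: periods, columns: three assets).
SAMPLE_RETURNS = [
    (0.010, 0.002, -0.004),
    (-0.006, 0.001, 0.009),
    (0.004, 0.003, -0.002),
    (0.012, -0.001, 0.001),
    (-0.003, 0.002, 0.015),
]

if __name__ == "__main__":
    print(aspired_levels(SAMPLE_RETURNS, steps=10))
```

The three levels are generally attained at different weight vectors, which is why a trade-off among the objectives is needed in the subsequent stage.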
Table 5
Proportions of different assets in the portfolio for risk preference (1, 1, 1)

         x (S&P500)   x (FTSE100)   x (Nikkei225)   x (USD/EUR)   x (USD/GBP)   x (USD/JPY)
RW       0.3557       0.1588        0.4430          0.0425        0.0000        0.0000
AES      0.3817       0.1622        0.4199          0.0362        0.0000        0.0000
ARIMA    0.4130       0.1635        0.4174          0.0000        0.0061        0.0000
MLFNN    0.1979       0.2112        0.4540          0.1334        0.0035        0.0000

Table 6
Proportions of different assets in the portfolio for different risk preferences

         x (S&P500)   x (FTSE100)   x (Nikkei225)   x (USD/EUR)   x (USD/GBP)   x (USD/JPY)
Investor’s risk preference (1, 2, 2)
RW       0.2117       0.1548        0.3532          0.1529        0.1274        0.0000
AES      0.0918       0.2843        0.1814          0.4336        0.0089        0.0000
ARIMA    0.1028       0.0035        0.3745          0.1983        0.3104        0.0105
MLFNN    0.0872       0.4412        0.0709          0.3948        0.0000        0.0059
Investor’s risk preference (2, 2, 1)
RW       0.3573       0.1514        0.4826          0.0038        0.0049        0.0000
AES      0.4011       0.1760        0.3903          0.0326        0.0000        0.0000
ARIMA    0.4108       0.0000        0.4517          0.0081        0.1223        0.0071
MLFNN    0.2861       0.2112        0.4922          0.0000        0.0000        0.0105
Investor’s risk preference (1, 2, 1)
RW       0.2542       0.1429        0.2918          0.1615        0.1109        0.0387
AES      0.2814       0.3556        0.2174          0.1427        0.0029        0.0000
ARIMA    0.2443       0.0728        0.3124          0.1723        0.1011        0.0971
MLFNN    0.3437       0.1011        0.3205          0.0478        0.0843        0.1026
Investor’s risk preference (2, 1, 1)
RW       0.4225       0.0113        0.2147          0.2198        0.1317        0.0000
AES      0.2543       0.1324        0.5117          0.1016        0.0000        0.0000
ARIMA    0.2427       0.0767        0.4263          0.2017        0.0526        0.0000
MLFNN    0.1516       0.3057        0.3529          0.1854        0.0000        0.0044
Investor’s risk preference (1, 1, 0)
RW       0.1963       0.1118        0.4347          0.0000        0.1705        0.0867
AES      0.1178       0.3526        0.1987          0.2735        0.0552        0.0022
ARIMA    0.3456       0.0000        0.3057          0.1058        0.1464        0.0965
MLFNN    0.3497       0.2717        0.2226          0.0000        0.0747        0.0813

Table 7
The minimum of the objective function with different models and risk preferences

Preference   (1, 1, 1)   (1, 2, 2)   (2, 2, 1)   (1, 2, 1)   (2, 1, 1)   (1, 1, 0)
RW           2.5471      4.1721      4.5334      3.5882      3.5816      0.9298
AES          2.2706      3.6406      4.2686      3.2915      3.2687      1.9969
ARIMA        2.4110      3.8578      4.3959      3.4289      3.4236      1.9963
MLFNN        2.0465      3.1331      4.0389      3.0416      3.0469      1.9953

In addition, the calculated average excess returns based on different forecasting techniques and risk preferences, as
well as the optimal investment proportions, also demonstrate that the consensus results of MLFNN are the best and
that the average excess returns obtained from the MLFNN model are the highest, as shown in Table 8. This implies that the
proposed approach is a fast and efficient way of selecting an optimal portfolio that considers mean, variance and
skewness simultaneously.
From Table 8, we can see that the expected excess return of MLFNN is the highest at every level of risk preference,
which indicates that MLFNN is the best forecasting technique among the four selected forecasting models. This also
implies that the proposed approach can be used as an alternative for evaluating the performance of various forecasting
models.

Table 8
Comparison of expected excess returns with different models and risk preferences

Preference   (1, 1, 1)   (1, 2, 2)   (2, 2, 1)   (1, 2, 1)   (2, 1, 1)   (1, 1, 0)
RW           0.0006790   0.0003457   0.0007516   0.0002077   0.0002033   0.0003292
AES          0.0008761   0.0004632   0.0008533   0.0006244   0.0009158   0.0004936
ARIMA        0.0005114   0.0002143   0.0005750   0.0000764   0.0004526   0.0001197
MLFNN        0.0012000   0.0005000   0.0014000   0.0009000   0.0011000   0.0009000
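As a rough illustration of how the Table 8 figures could be used to compare forecasting models, the sketch below finds the model with the highest expected excess return at each preference level and ranks the models by their average across levels. Averaging across preference levels is an assumption made here for illustration; it is not a step of the proposed procedure, and the names `TABLE8`, `winners`, `average_by_model` and `ranking` are hypothetical.

```python
# Expected excess returns from Table 8, keyed by risk preference,
# listed in the model order RW, AES, ARIMA, MLFNN.
MODELS = ("RW", "AES", "ARIMA", "MLFNN")
TABLE8 = {
    (1, 1, 1): (0.0006790, 0.0008761, 0.0005114, 0.0012000),
    (1, 2, 2): (0.0003457, 0.0004632, 0.0002143, 0.0005000),
    (2, 2, 1): (0.0007516, 0.0008533, 0.0005750, 0.0014000),
    (1, 2, 1): (0.0002077, 0.0006244, 0.0000764, 0.0009000),
    (2, 1, 1): (0.0002033, 0.0009158, 0.0004526, 0.0011000),
    (1, 1, 0): (0.0003292, 0.0004936, 0.0001197, 0.0009000),
}


def winners():
    """Model with the highest expected excess return at each preference level."""
    return {
        pref: MODELS[max(range(len(vals)), key=vals.__getitem__)]
        for pref, vals in TABLE8.items()
    }


def average_by_model():
    """Average expected excess return of each model across preference levels."""
    n = len(TABLE8)
    return {
        name: sum(vals[i] for vals in TABLE8.values()) / n
        for i, name in enumerate(MODELS)
    }


def ranking():
    """Models ordered from highest to lowest average expected excess return."""
    averages = average_by_model()
    return sorted(averages, key=averages.get, reverse=True)


if __name__ == "__main__":
    print(winners())
    print(average_by_model())
    print(ranking())
```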
5. Conclusions
This study proposes an integrated RBF neural network-based mean–variance–skewness optimization model for
portfolio selection. This model examines the historical performance of the various series based on forecasts and trading
schemes in terms of the mean, variance, and skewness of investment returns simultaneously. Through the use of the
RBF neural network model, an investor can construct a portfolio which matches his or her risk preference based on
forecasts and trading strategies as well as the mean–variance–skewness objectives simultaneously. We test our proposed
approach with three widely traded stock market indices (S&P500, FTSE100 and Nikkei225) and three widely traded
foreign exchanges (USD/EUR, USD/GBP and USD/JPY). The empirical results indicate that the proposed approach
is an efficient way of solving the trade-off in the mean–variance–skewness portfolio problem. Moreover, we find that
this approach can also be used as an alternative method of evaluating the performance of different forecasting models.
Acknowledgments
The authors would like to thank the guest editor and the anonymous reviewers for their valuable comments and
suggestions, which have improved the quality of the paper immensely.
References
[1] Markowitz HM. Portfolio selection. Journal of Finance 1952;7:77–91.
[2] Mao JCT. Models of capital budgeting, E-V vs. E-S. Journal of Financial and Quantitative Analysis 1970;5:657–75.
[3] Konno H, Yamazaki H. Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market. Management Science
1991;37:519–31.
[4] Feinstein CD, Thapa MN. A reformulation of a mean-absolute deviation portfolio optimization model. Management Science 1993;39:1552–3.
[5] Simaan Y. Estimation risk in portfolio selection: the mean variance model versus the mean absolute deviation model. Management Science
1997;43:1437–46.
[6] Fishburn DC. Mean-risk analysis with risk associated with below-target returns. American Economic Review 1977;67:117–26.
[7] Pogue JA. An extension of the Markowitz portfolio selection model to include variable transactions costs, short sales, leverage policies, and
taxes. Journal of Finance 1970;25:1005–28.
[8] Brennan MJ. The optimal number of securities in a risky asset portfolio when there are fixed costs of transaction: theory and some empirical
results. Journal of Financial and Quantitative Analysis 1975;10:483–96.
[9] Levy H. Equilibrium in an imperfect market: a constraint on the number of securities in the portfolio. American Economic Review 1978;68:
643–58.
[10] Patel NR, Subrahmanyam MG. A simple algorithm for optimal portfolio selection with fixed transaction costs. Management Science 1982;28:
303–14.
[11] Gennotte G, Jung A. Investment strategies under transaction costs: the finite horizon case. Management Science 1994;40:385–404.
[12] Morton AJ, Pliska SR. Optimal portfolio management with transaction costs. Mathematical Finance 1995;5:337–56.
[13] Yoshimoto A. The mean-variance approach to portfolio optimization subject to transaction costs. Journal of the Operations Research Society
of Japan 1996;39:99–117.
[14] Li ZF, Wang SY, Deng XT. A linear programming algorithm for optimal portfolio selection with transaction costs. International Journal of
Systems Science 2000;31:107–17.
[15] Arditti FD. Risk and required return on equity. Journal of Finance 1967;22:19–36.
[16] Arditti FD. Another look at mutual fund performance. Journal of Financial and Quantitative Analysis 1971;6:909–12.
[17] Samuelson P. The fundamental approximation theorem of portfolio analysis in terms of means variances and higher moments. Review of
Economic Studies 1958;25:65–86.
[18] Rubinstein ME. A comparative static analysis of risk premiums. The Journal of Business 1973;12:605–15.
[19] Konno H, Suzuki K. A mean-variance-skewness optimization model. Journal of the Operations Research Society of Japan 1995;38:137–87.
[20] Lai T. Portfolio selection with skewness: a multiple-objective approach. Review of Quantitative Finance and Accounting 1991;1:293–305.
[21] Chunhachinda P, Dandapani K, Hamid S, Prakash AJ. Portfolio selection and skewness: evidence from international stock market. Journal of
Banking and Finance 1997;21:143–67.
[22] Leung MT, Daouk H, Chen AS. Using investment portfolio return to combine forecasts: a multiobjective approach. European Journal of
Operational Research 2001;134:84–102.
[23] Wang SY, Xia YS. Portfolio selection and asset pricing. Berlin: Springer; 2002.
[24] Liu SC, Wang SY, Qiu WH. A mean–variance–skewness model for portfolio selection with transaction costs. International Journal of Systems
Science 2003;34(4):255–62.
[25] Kennedy MP, Chua LO. Neural networks for nonlinear programming. IEEE Transactions on Circuits and Systems 1988;35(3):554–62.
[26] Broomhead DS, Lowe D. Multivariable functional interpolation and adaptive networks. Complex Systems 1988;2:321–55.
[27] Wedding II DK, Cios KJ. Time series forecasting by combining RBF networks, certainty factors, and the Box-Jenkins model. Neurocomputing
1996;10:149–68.
[28] Loukas YL. Radial basis function networks in host-guest interactions: instant and accurate formation constant calculations. Analytica Chimica
Acta 2000;417:221–9.
[29] Zhang XS. Neural networks in optimization. Norwell: Kluwer Academic Publishers; 2000.
[30] Nise NS. Control system engineering. Menlo Park: Benjamin, Cummings; 1992.
[31] Chua LO, Lin GN. Nonlinear programming without computation. IEEE Transactions on Circuits and Systems 1984;31:182–8.
[32] Campbell RH, Siddique A. Conditional skewness in asset pricing models tests. Journal of Finance 2000;55:1263–95.