
Information Sciences 367–368 (2016) 41–57

Contents lists available at ScienceDirect

Information Sciences
journal homepage: www.elsevier.com/locate/ins

A novel forecasting method based on multi-order fuzzy time series and technical analysis
Furong Ye a, Liming Zhang b, Defu Zhang a,∗, Hamido Fujita c, Zhiguo Gong b
a Department of Computer Science, Xiamen University, Xiamen, 361005, China
b Department of Computer and Information Science, University of Macau, Macau, China
c Faculty of Software and Information Science, Iwate Prefectural University, Iwate, Japan

Article info

Article history:
Received 30 November 2015
Revised 8 March 2016
Accepted 25 May 2016
Available online 30 May 2016

Keywords:
Financial forecasting
Genetic algorithm
Fuzzy time series
Technical analysis

Abstract

Financial trading is one of the most common risk investment actions in the modern economic environment because financial market systems are complex non-linear dynamic systems. It is a challenge to develop the inherent rules using the traditional time series prediction technique. In this paper, we proposed a new forecasting method based on multi-order fuzzy time series, technical analysis, and a genetic algorithm. Multi-order fuzzy time series (first-order, second-order and third-order) are applied in the proposed algorithm, and to improve the performance, genetic algorithm is used to find a good domain partition. Technical analysis such as the Rate of Change (ROC), Moving Average Convergence/Divergence (MACD), and Stochastic Oscillator (KDJ) are introduced to construct multi-variable fuzzy time series, and exponential smoothing is used to eliminate noise in the time series. In addition to the root mean square error and mean square error, the directional accuracy rate (DAR) is also used in our empirical studies. We apply the proposed method to forecast five well-known stock indexes and the NTD/USD exchange rates. Experimental results demonstrate that our proposed method outperforms other existing models based on fuzzy time series.

© 2016 Elsevier Inc. All rights reserved.

1. Introduction

Fuzzy intervals are widely regarded as the fundamental problem in modeling fuzzy time series and are essential for model calculation and trend prediction [1]. As a result, the choice of fuzzy intervals is often treated as a key research problem in data analysis. Since Song and Chissom [1] introduced the concept of fuzzy time series, such models have received much attention from researchers, and considerable progress has been made. The existing works can be grouped into the following four categories according to the technique used to partition the fuzzy intervals.
In the first category, the works by Song [1,2], Chen [3], Hwang [4] and Lee [5] are regarded as the pioneering research efforts in this area. In their models, the minimum and maximum values of the sample data were rounded downward and upward, respectively, to determine the universe of discourse. Then, based on the size of the universe, they took an integer as the interval length to divide the universe uniformly.
The representative scholars of the second category include Huarng [6], Teoh [7], Jilani [8], and Yu [9]. They proposed to divide the interval based on the distribution of the sample data. The techniques used include adjusting the interval lengths according to the density of the samples, defining a new distance formula and dividing intervals according to the distance distribution between samples, and determining the number of intervals according to the statistical peak of the samples.

∗ Corresponding author. Tel.: +86 5925918207; fax: +86 5922580035.
E-mail address: dfzhang@xmu.edu.cn (D. Zhang).
http://dx.doi.org/10.1016/j.ins.2016.05.038
0020-0255/© 2016 Elsevier Inc. All rights reserved.

Aladag [10,11], Yolcu [12] and Egrioglu [13,14], as the representatives of the third category, proposed finding the optimal partition into fuzzy subsets by using optimization algorithms. Aladag [10] used neural networks to forecast with high-order fuzzy time series, Yolcu [12] focused on a single-variable constraint and proposed an efficient approach to identify the interval length, and the method of Egrioglu is based on SARIMA. Chen and Cheng’s genetic algorithm [15–17] also falls into this category. The basic idea of these methods is to use the prediction error as the objective function and to search for its minimum value with a certain step length. The partition with the minimum objective value is taken as the final partition of the model.
The fourth category of methods includes newly proposed clustering algorithms [18–21] whose idea is similar to the fuzzy
clustering method (FCM) by Li [22]. The basic idea of these methods is to use an appropriate algorithm to perform cluster
analysis on the sample data, and then determine the partition of each subinterval according to the clustering results.
Many investors use technical indicators to analyze the stock market and predict its future trend [23]. Stevenson and John [24] applied a new technique that uses the yearly percentage change, instead of the enrollments themselves, as the universe of discourse. Multivariable (or multi-factor) fuzzy time series based on technical indicators are used to solve the prediction problem. Lee et al. [25] developed further prediction techniques that consider more factors and higher orders to achieve better results. To avoid complicated matrix computations, Huarng et al. [26] handled the forecasting problem with a multivariate heuristic model. To forecast the TAIEX, Yu and Huarng [27,28] proposed a bivariate model using neural networks. Chen and Chang [29] handled fuzzy rules with clustering algorithms and assigned different weights to clusters. Chen and Chen [30] investigated TAIEX forecasting with fuzzy time series by considering a main factor and a secondary factor. Chen et al. [31] used a particle swarm optimization algorithm to improve their research. Kim et al. [32] used technical indicators to construct multiple classifiers for predicting a stock price index. Egrioglu et al. [33] predicted Belgian traffic accident casualties from 1974 to 2004 and used a feedforward neural network to deal with fuzzy relationships in bivariate time series. Avazbeigi et al. [34] applied three-variable fuzzy time series to predict automobile production in Iran; in addition, tabu search is used in their work. Park et al. [35] developed a bivariate fuzzy time series, in which the underlying price is used as the second variable, to predict the TAIEX and South Korea’s KOSPI 200 index.
Data processing is an important step of data mining that improves the performance of data mining algorithms. In general, data mining algorithms fail to extract valuable nonlinear patterns from noisy data; therefore, many data smoothing methods are used to address this task. By “averaging out” the noise, they can extract nonlinear relations from the time series. For example, Zhang et al. [36] used nonparametric kernel regression to filter the noise in the time series. This paper preprocesses the training data using the exponential smoothing method [37] to obtain smooth training data values, but does not perform any processing on the testing data.
As the global financial markets become deregulated, the modeling and forecasting of financial market systems are becoming more complex for risk management and derivatives rating. One of the key aspects of a complex statistical model of a financial market is accurate forecasting, which could yield significant profits and could also decrease investment risks [38]. For stock prediction, the most frequently used forecasting methods are nonlinear models, for example, neural networks [36,39], Markov modeling [40], genetic algorithms [41], fuzzy logic [42], support vector machines [43] and hybrid models [44]. The fuzzy time series method has also come to be regarded as one of the important novel methods in this area. Thus far, various studies have handled stock index forecasting using fuzzy time series [45–49].
For fuzzy time series forecasting, Chen et al. [15,16,18,19,29–31,49] and Yu and Huarng [4,9,26–28,39,45] have made many excellent contributions. Recently, Wei et al. [50] forecasted the trend of the TAIEX by combining a linear model and a moving average technical index. Chen and Chen [51] introduced binning-based partition and entropy-based discretization and proposed a new fuzzy time series model based on granular computing. Chen and Chen [52] proposed a new fuzzy forecasting model, replacing the fuzzy logical relationship groups with fuzzy-trend logical relationship groups and introducing the probabilities of trends. Cai et al. [53] used ant colony optimization to obtain a good partition of the universe of discourse, and auto-regression was introduced to make better use of historical information. These algorithms have achieved good forecasting results.
In this paper, a new hybrid multi-order fuzzy time series model is proposed for financial forecasting, and a genetic algorithm is applied to obtain good partitions of the universe of discourse. Most work in this area uses only first-order fuzzy time series, which we think is not appropriate, because a stock price is influenced by its history. The price of a certain stock on a certain day is related not only to the price of the day before but also to the prices of the near past, although they might not have the same impact strength. A first-order fuzzy time series neglects the influence of the prices of the near past, which may cause inaccurate forecasts. To establish a connection between the predicted day
and the near past days, a hybrid multi-order fuzzy time series is applied. Specifically, we extract first-order, second-order
and third-order fuzzy time series and average the three fuzzy values to obtain the final predicted value. We stop at the
third-order because the effect of mixing higher order fuzzy time series to the results is negligible, and it affects the compu-
tation speed. Additionally, higher orders need more time series data to train a forecasting model. After many experiments,
we found that using uniform weights across the orders gives better forecasting results than using different weights. The reason for using a genetic algorithm is that its operators, such as selection, crossover, and mutation, can help the model find an excellent domain partition iteratively. In particular, we introduce three indicators, the Rate of Change (ROC), Moving Average Convergence/Divergence (MACD), and Stochastic Oscillator (KDJ), to construct the multi-variable fuzzy time series,
where the main variable is ROC, and MACD and KDJ are the secondary variables. According to the experimental results, the
proposed method is superior to other existing models based on fuzzy time series.
The proposed method differs from our previous methods [47] and [53]. FTSGA in [47] combines fuzzy time series and a genetic algorithm for stock forecasting and uses only a first-order model and a single variable. ACO-AR in [53] uses fuzzy time series, an ant colony algorithm, and auto-regression for forecasting, and it is a multi-order, single-variable model. A few other researchers have concentrated on multi-variable, multi-order fuzzy time series models, such as [54]. In those papers, the variables are used independently; therefore, the whole method is like a combination of several single-variable models. The proposed method in this paper is a multi-variable and multi-order fuzzy time series model, and it uses technical analysis to further improve the results.
To the best of our knowledge, the directional accuracy rate (DAR) is seldom considered in other forecasting models. In fact, for an investment trading strategy, the directional accuracy of a forecasting model matters more than the root mean square error (RMSE) [36]. In addition, most forecasting models based on fuzzy time series are tested on only one stock index, which is not sufficient to compare the generalization abilities of different models. This paper uses both the RMSE and the DAR as evaluation criteria and calculates them for several well-known stock indexes such as Dow Jones, NASDAQ, HSI, and SP500.
The rest of this paper is organized as follows. The basic concepts of fuzzy time series are introduced in Section 2, and technical analysis is briefly introduced in Section 3. In Section 4, we present a novel framework for financial forecasting. In Section 5, we develop a new forecasting method based on multi-order fuzzy time series, a genetic algorithm and technical indicators. In Section 6, the forecasting results of the proposed method are compared with those of the existing models for forecasting six well-known financial time series. The conclusions are given in Section 7.

2. Fuzzy time series

Fuzzy time series were proposed by Song and Chissom in 1993 [1] and have since been used successfully to solve many practical problems. To help the reader understand and apply fuzzy time series, we introduce their basic concepts as follows.
Let $U = \{u_1, u_2, \ldots, u_n\}$ denote the universe of discourse. A fuzzy set $A$ in $U$ is given as follows:
$$A = f_A(u_1)/u_1 + f_A(u_2)/u_2 + \cdots + f_A(u_n)/u_n \qquad (1)$$
where $f_A : U \to [0, 1]$ is the membership function of the fuzzy set $A$, and $f_A(u_i) \in [0, 1]$ ($1 \le i \le n$) is the membership degree of $u_i$ in $A$.

Definition 2.1. Let $Y(t)$ ($t = 0, 1, \ldots$), a subset of the real numbers $\mathbb{R}$, be the universe of discourse on which the fuzzy sets $f_i(t)$ ($i = 1, 2, \ldots$) are defined, and let $F(t)$ be the collection of $f_i(t)$ ($i = 1, 2, \ldots$). Then, $F(t)$ is called a fuzzy time series on $Y(t)$ ($t = 0, 1, \ldots$).

Definition 2.2. Assume that $R(t, t-1)$ is the fuzzy relation between $F(t-1)$ and $F(t)$, where both $F(t-1)$ and $F(t)$ are fuzzy sets. If the relationship can be expressed as $F(t) = F(t-1) \circ R(t, t-1)$, where the symbol “$\circ$” is the max-min composition operator, then $F(t)$ is said to be derived from $F(t-1)$, written $F(t-1) \to F(t)$. $F(t) = F(t-1) \circ R(t, t-1)$ is called the first-order model of $F(t)$.

Definition 2.3. Assume that $F(t-1) = A_i$ and $F(t) = A_j$. Then $A_i \to A_j$ represents the fuzzy logic relationship (FLR) between $F(t-1)$ and $F(t)$, where $A_i$ (the left-hand side of $A_i \to A_j$) is called the current state of the FLR and $A_j$ (the right-hand side of $A_i \to A_j$) is called the next state of the FLR.

Definition 2.4. Let $F(t)$ denote a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \ldots, F(t-n)$, then $F(t)$ is called an $n$-order fuzzy time series. For example, in this paper the first-order fuzzy time series is expressed as $F(t-1) \to F(t)$; the second-order as $F(t-2), F(t-1) \to F(t)$; and the third-order as $F(t-3), F(t-2), F(t-1) \to F(t)$.

Definition 2.5. Let R(t, t − 1) be the fuzzy relation defined on F(t − 1) and F(t). If R(t, t − 1) = R(t − 1, t − 2) for any t, then
F(t) is called a time invariant fuzzy time series. Otherwise, it is called a time variant fuzzy time series.

Definition 2.6. Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $((F_1(t-1), F_1(t-2), \ldots, F_1(t-n)), (F_2(t-1), F_2(t-2), \ldots, F_2(t-n)), \ldots, (F_m(t-1), F_m(t-2), \ldots, F_m(t-n)))$, then it is called an $m$-variate $n$-order fuzzy time series. For example, a 4-variate 3-order fuzzy logic relationship is as follows:
$$\{(F_1(t-1), F_1(t-2), F_1(t-3)), (F_2(t-1), F_2(t-2), F_2(t-3)), (F_3(t-1), F_3(t-2), F_3(t-3)), (F_4(t-1), F_4(t-2), F_4(t-3))\} \to F(t)$$

3. Technical analysis

Technical analysis is considered a good analysis methodology in finance; it predicts the direction of prices by using historical information. Indicators that derive patterns from trading volume and price and describe the behavior of relative prices have become the most important tools of technical analysis. Well-known indicators include the relative strength index, moving averages, regressions, business cycles and, classically, the recognition of chart patterns. Basic explanations of these financial terms can be found in [23]. Based on many experiments, ROC, MACD and KDJ are selected as variables for the model in this paper.
The Rate of Change (ROC) reflects the percentage difference between today’s closing price and the closing price N days ago. It fluctuates around zero. The ROC rises as the price increases and drops as the price decreases. If the price changes sharply, the ROC changes sharply as well.
The ROC can be seen as a momentum technical indicator, which identifies highs, lows and zero-line crossovers. The ROC reflects the level of buying and selling in the market: a higher ROC reflects a more overbought security, and a lower ROC a more oversold one. However, in many cases, an extremely overbought/oversold ROC may indicate a continuation of the recent trend.
This paper sets $N = 1$, and the ROC of the $t$th day is computed as follows:
$$\mathrm{ROC}_t = \frac{\mathrm{Close}_t - \mathrm{Close}_{t-1}}{\mathrm{Close}_{t-1}} \times 100\% \qquad (2)$$
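As a quick check of Eq. (2), the following minimal Python sketch computes the ROC from a list of closing prices. The function name `rate_of_change` and its parameter `n` are illustrative choices rather than part of the paper; the sample prices are the first four closing prices in Table 1.

```python
def rate_of_change(closes, n=1):
    """Return the ROC in percent for every day t >= n (Eq. (2) when n = 1)."""
    return [
        (closes[t] - closes[t - n]) / closes[t - n] * 100.0
        for t in range(n, len(closes))
    ]


# First four closing prices of Table 1 (2000-1-4 to 2000-1-7).
closes = [8756.55, 8849.87, 8922.03, 8845.47]
roc = rate_of_change(closes)
print([round(r, 2) for r in roc])
```

Rounded to two decimals, the values agree with the ROC column of Table 1 for the same dates.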
MACD (Moving Average Convergence/Divergence) helps to reveal changes in stock prices and the trend over a period of time. MACD is a set of three time series calculated from historical price data, most commonly the closing price. These three series are the MACD series proper, the “signal” or “average” series, and the “divergence” series, which is the difference between the two. The MACD series is the difference between a “fast” (short-period) exponential moving average (EMA) and a “slow” (longer-period) EMA of the price series. The average series is an EMA of the MACD series itself.
The notation “MACD(a, b, c)” usually denotes the indicator where the MACD series is the difference of EMAs with characteristic times a and b, and the average series is an EMA of the MACD series with characteristic time c. These parameters are usually measured in days. MACD(12, 26, 9) uses the most commonly selected parameters and is computed as follows:
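EMA of 12 days:
$$\mathrm{EMA}_{12}^{\text{today}} = \frac{11}{13} \times \mathrm{EMA}_{12}^{\text{last day}} + \frac{2}{13} \times \mathrm{Close}^{\text{today}}$$
EMA of 26 days:
$$\mathrm{EMA}_{26}^{\text{today}} = \frac{25}{27} \times \mathrm{EMA}_{26}^{\text{last day}} + \frac{2}{27} \times \mathrm{Close}^{\text{today}}$$
DIF:
$$\mathrm{DIF} = \mathrm{EMA}_{12} - \mathrm{EMA}_{26}$$
DEA:
$$\mathrm{DEA}^{\text{today}} = \frac{8}{10} \times \mathrm{DEA}^{\text{last day}} + \frac{2}{10} \times \mathrm{DIF}^{\text{today}}$$
MACD:
$$\mathrm{MACD} = (\mathrm{DIF} - \mathrm{DEA}) \times 2$$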
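The recurrences above can be sketched in a few lines of Python. The text does not say how the first values are initialized, so this sketch assumes both EMAs are seeded with the first closing price and DEA is seeded with zero; the function name `macd_12_26_9` is likewise only illustrative.

```python
def macd_12_26_9(closes):
    """Return the (DIF, DEA, MACD) series for MACD(12, 26, 9)."""
    ema12 = ema26 = closes[0]  # assumption: seed both EMAs with the first close
    dea = 0.0                  # assumption: seed DEA with zero
    dif_series, dea_series, macd_series = [], [], []
    for close in closes:
        ema12 = 11 / 13 * ema12 + 2 / 13 * close   # fast EMA
        ema26 = 25 / 27 * ema26 + 2 / 27 * close   # slow EMA
        dif = ema12 - ema26
        dea = 8 / 10 * dea + 2 / 10 * dif          # EMA of DIF
        dif_series.append(dif)
        dea_series.append(dea)
        macd_series.append((dif - dea) * 2)
    return dif_series, dea_series, macd_series


# A steadily rising price keeps the fast EMA above the slow EMA, so DIF > 0.
dif, dea, macd = macd_12_26_9([100.0 + day for day in range(50)])
print(dif[-1] > 0)
```

Different seeding conventions change only the first few dozen values; the recurrences themselves are the ones stated in the text.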


Another momentum indicator, the Stochastic Oscillator, shows the location of the close relative to the high-low range over a set number of periods. The term “stochastic” refers to the position of the current price within its price range over a period of time. This method aims to forecast price turning points by comparing the closing price of a security to its price range. The KDJ indicator is an extension of the Stochastic Oscillator: in addition to the traditional K and D lines, it includes a J line, which assists traders in identifying overbought and oversold markets. The KDJ indicator can be used for devising trading strategies. A trader can either buy when all lines (K, D, and J) are below 20 and sell when they are above 80, or use the J line to construct a momentum-based technique. The KDJ is superior to the stochastic indicator in identifying overbought and oversold stocks.
To understand the KDJ indicator, the construction of the stochastic indicator is introduced first. The stochastic indicator consists of the K and D lines, which move between 0 and 100. The K line in the stochastic indicator is computed as follows:
$$K = \frac{\mathrm{Close}_t - \mathrm{Lowest}_6}{\mathrm{Highest}_6 - \mathrm{Lowest}_6} \times 100\%$$
where $\mathrm{Lowest}_6$ denotes the lowest closing price of the last 6 days and $\mathrm{Highest}_6$ denotes the highest closing price of the last 6 days.
The D line in the stochastic indicator is a simple moving average of the K line. The averaging window is chosen by the trader and is most often three days. D is computed as follows:
$$D = N\text{-day simple moving average of the K line}$$
where $N$ is usually three or a value decided by the trader.
The J line in the KDJ indicator is computed as follows:
$$J = (3 \times D) - (2 \times K)$$
In the J line, D is usually given greater weight than K. In the KDJ indicator, K is the fastest line and J is the slowest line.
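The K, D and J formulas above can be sketched as follows. This is only an illustration under stated assumptions: the 6-day window is taken to include the current day, K is set to 50 when the 6-day range is zero (the text does not cover that case), and the function name `kdj` and its parameters are hypothetical.

```python
def kdj(closes, window=6, d_period=3):
    """Return (K, D, J) lists; entries stay None until enough history exists."""
    n = len(closes)
    k = [None] * n
    for t in range(window - 1, n):
        recent = closes[t - window + 1 : t + 1]  # last `window` closes, day t included
        low, high = min(recent), max(recent)
        # assumption: use 50 when the range is zero to avoid dividing by zero
        k[t] = 50.0 if high == low else (closes[t] - low) / (high - low) * 100.0
    d = [None] * n
    j = [None] * n
    for t in range(window + d_period - 2, n):
        d[t] = sum(k[t - d_period + 1 : t + 1]) / d_period  # simple moving average of K
        j[t] = 3 * d[t] - 2 * k[t]                           # J as defined in the text
    return k, d, j


k, d, j = kdj([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(k[5], d[7], j[7])
```

In a monotonically rising series the close always sits at the top of its 6-day range, so K, D and J all equal 100 once enough history is available.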
Exponential smoothing is applied to smooth time series data; it acts as a low-pass filter that removes high-frequency noise. Exponential smoothing is developed from the moving average method; it does not need to store much past data,
Table 1
Taiwan weighted index in January 2000.

Date_t       Index_t   Close_t    ROC       Fuzzy value

2000-1-4     1         8756.55    -         -
2000-1-5     2         8849.87    1.07%     A4
2000-1-6     3         8922.03    0.82%     A3
2000-1-7     4         8845.47    −0.86%    A2
2000-1-10    5         9102.6     2.91%     A5
2000-1-11    6         8927.03    −1.93%    A1
2000-1-12    7         9144.65    2.44%     A5
2000-1-13    8         9107.19    −0.41%    A2
2000-1-14    9         9023.24    −0.92%    A2
2000-1-15    10        9191.37    1.86%     A4
2000-1-17    11        9315.43    1.35%     A4
2000-1-18    12        9250.19    −0.70%    A2
2000-1-19    13        9151.44    −1.07%    A1
2000-1-20    14        9136.95    −0.16%    A2
2000-1-21    15        9255.94    1.30%     A4
2000-1-24    16        9387.07    1.42%     A4
2000-1-25    17        9372.37    −0.16%    A2
2000-1-26    18        9581.96    2.24%     A5
2000-1-27    19        9628.98    0.49%     A3
2000-1-28    20        9696.91    0.71%     A3
2000-1-29    21        9636.38    −0.62%    A2
2000-1-31    22        9744.89    1.13%     A4

but accounts for the importance of the data in every period and uses all of the historical data. The exponential smoothing method seeks to minimize the mean square error (MSE) between the measured value and the predicted value, and its estimation is nonlinear.
The simplest form of exponential smoothing is given by the formula:
$$s_t = \alpha x_t + (1 - \alpha) s_{t-1}$$
where $\alpha$ is the smoothing factor, with $0 < \alpha < 1$; $\{x_t\}$ denotes the raw time series data, and the output of the exponential smoothing algorithm is commonly written as $\{s_t\}$. The smoothed statistic $s_t$ is a simple weighted average of the current observation $x_t$ and the previous smoothed statistic $s_{t-1}$. The term “smoothing factor” applied to $\alpha$ is something of a misnomer, as larger values of $\alpha$ actually reduce the level of smoothing, and in the limiting case $\alpha = 1$ the output series is the same as the original series. As soon as two observations are available, a smoothed statistic can be produced. Values of $\alpha$ close to one have less of a smoothing effect and give greater weight to recent changes in the data, while values of $\alpha$ closer to zero have a greater smoothing effect and are less responsive to recent changes. There is no formally correct procedure for choosing $\alpha$.
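The smoothing recurrence can be illustrated with a short Python sketch. The text does not specify the initial value, so the sketch assumes the common convention $s_0 = x_0$; the function name `exponential_smoothing` is illustrative. The limiting case α = 1 is allowed so that the behavior described above can be observed directly.

```python
def exponential_smoothing(xs, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [xs[0]]  # assumption: the first smoothed value equals the first observation
    for x in xs[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


raw = [1.0, 3.0, 5.0]
print(exponential_smoothing(raw, 0.5))  # smoothed series lags behind the raw one
print(exponential_smoothing(raw, 1.0))  # alpha = 1 reproduces the raw series
```

In the setting of this paper such a function would be applied to the training data only, since the testing data are left unprocessed.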

4. Fuzzy time series model

Constructing a forecasting model with fuzzy time series involves four main steps: (1) fuzzify the training data; (2) extract the fuzzy logic relationships from the training data; (3) fuzzify the testing data; and (4) predict the testing data using the fuzzy logic relationships extracted in (2). To explain the proposed forecasting model in detail, the Taiwan Weighted Index in January 2000 (Table 1) is taken as an example, in a manner similar to Chen’s model [3].
Step 1: First, calculate the ROC from the closing prices of the training data, as shown in Table 1, and define the range of the domain. Let $D_{\max}$ and $D_{\min}$ be the maximum and minimum values of the percentage change, respectively; then the universe of discourse is $U = [D_{\min}, D_{\max}]$.
Step 2: Partition the universe of discourse into isometric (equal-length) intervals $u_0, u_1, u_2, \ldots, u_n$, and let $m_0, m_1, m_2, \ldots, m_n$ denote the corresponding mid-values of the intervals. The partition of $U$ is shown in the next section.
Step 3: Define the fuzzy sets $A = \{A_0, A_1, A_2, \ldots, A_n\}$ as:
$$A_0 = a_{00}/u_0 + a_{01}/u_1 + \cdots + a_{0n}/u_n \qquad (3)$$
$$A_1 = a_{10}/u_0 + a_{11}/u_1 + \cdots + a_{1n}/u_n \qquad (4)$$
$$\vdots$$
$$A_n = a_{n0}/u_0 + a_{n1}/u_1 + \cdots + a_{nn}/u_n \qquad (5)$$
where $a_{i,j}$ denotes the membership degree of interval $u_j$ in the fuzzy set $A_i$, and “+” here indicates which elements are in $A_i$ together with their membership degrees. The value of $a_{i,j}$ is defined as follows:
$$a_{i,j} = \begin{cases} 1, & i = j \\ 0.5, & i = j + 1 \text{ or } i = j - 1 \\ 0, & \text{otherwise} \end{cases}$$
Table 2
Fuzzy logic relationships group (FLRG).

First-order FLRGs

Left-hand side    FLRG

A1                A1 → A2 1, A1 → A5 1
A2                A2 → A1 1, A2 → A2 1, A2 → A4 3, A2 → A5 2
A3                A3 → A2 2, A3 → A3 1
A4                A4 → A2 2, A4 → A3 1, A4 → A4 2
A5                A5 → A1 1, A5 → A2 1, A5 → A3 1

Second-order FLRGs

Left-hand side    FLRG

A1, A2            (A1, A2) → A4 1
A1, A5            (A1, A5) → A2 1
A2, A1            (A2, A1) → A2 1
A2, A2            (A2, A2) → A4 1
A2, A4            (A2, A4) → A4 2
A2, A5            (A2, A5) → A1 1, (A2, A5) → A3 1
A3, A2            (A3, A2) → A5 1, (A3, A2) → A4 1
A3, A3            (A3, A3) → A2 1
A4, A2            (A4, A2) → A1 1, (A4, A2) → A5 1
A4, A3            (A4, A3) → A2 1
A4, A4            (A4, A4) → A2 2
A5, A1            (A5, A1) → A5 1
A5, A2            (A5, A2) → A2 1
A5, A3            (A5, A3) → A3 1

Third-order FLRGs

Left-hand side    FLRG

A4, A3, A2        (A4, A3, A2) → A5 1
A3, A2, A5        (A3, A2, A5) → A1 1
A2, A5, A1        (A2, A5, A1) → A5 1
A5, A1, A5        (A5, A1, A5) → A2 1
A1, A5, A2        (A1, A5, A2) → A2 1
A5, A2, A2        (A5, A2, A2) → A4 1
A2, A2, A4        (A2, A2, A4) → A4 1
A2, A4, A4        (A2, A4, A4) → A2 2
A4, A4, A2        (A4, A4, A2) → A1 1, (A4, A4, A2) → A5 1
A4, A2, A1        (A4, A2, A1) → A2 1
A2, A1, A2        (A2, A1, A2) → A4 1
A1, A2, A4        (A1, A2, A4) → A4 1
A4, A2, A5        (A4, A2, A5) → A3 1
A2, A5, A3        (A2, A5, A3) → A3 1
A5, A3, A3        (A5, A3, A3) → A2 1
A3, A3, A2        (A3, A3, A2) → A4 1

Step 4: Calculate the membership degree of each daily percentage change in every element of $A$, and fuzzify it to the $A_i$ with the largest membership degree. This forms the fuzzy sequence shown in the fifth column of Table 1.
Step 5: Extract the first-order, second-order and third-order fuzzy logic relationships (FLRs) from the fuzzy sequence, and classify them by their left-hand sides to form the corresponding fuzzy logic relationship groups (FLRGs). Each FLR is followed by its frequency, as shown in Table 2. Note that Table 2 shows all of the first-order, second-order, and third-order fuzzy logic relationships in Table 1; in practice, some FLRGs are not used for forecasting. For example, when we forecast the value of 2000-2-1, we use only the first-order FLRG $A_4 \to A_2$, $A_4 \to A_3$, $A_4 \to A_4$ and the second-order FLRG $(A_2, A_4) \to A_4$, because no third-order FLRG with left-hand side $(A_3, A_2, A_4)$ occurs. To extract $k$th-order FLRs, the left-hand side of an FLRG consists of the fuzzy sets of $k$ consecutive days, and the right-hand side is the fuzzy set of the following day.
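The extraction in Step 5 amounts to counting, for every window of $k$ consecutive fuzzy values, which fuzzy value follows it. A minimal Python sketch is given below; the function name `extract_flrgs` and the dictionary-of-counters representation are illustrative choices, and the input is the fuzzy value column of Table 1.

```python
from collections import Counter, defaultdict


def extract_flrgs(fuzzy_seq, order):
    """Map each left-hand side (a tuple of `order` labels) to a Counter of next states."""
    groups = defaultdict(Counter)
    for t in range(order, len(fuzzy_seq)):
        lhs = tuple(fuzzy_seq[t - order : t])  # fuzzy values of the previous `order` days
        groups[lhs][fuzzy_seq[t]] += 1         # frequency of the FLR lhs -> next state
    return dict(groups)


# Fuzzy value column of Table 1 (2000-1-5 to 2000-1-31).
SEQ = ["A4", "A3", "A2", "A5", "A1", "A5", "A2", "A2", "A4", "A4", "A2",
       "A1", "A2", "A4", "A4", "A2", "A5", "A3", "A3", "A2", "A4"]

first = extract_flrgs(SEQ, 1)
second = extract_flrgs(SEQ, 2)
third = extract_flrgs(SEQ, 3)
print(dict(first[("A4",)]))           # first-order FLRG with left-hand side A4
print(dict(second[("A2", "A4")]))     # second-order FLRG with left-hand side (A2, A4)
print(("A3", "A2", "A4") in third)    # no third-order FLRG for the last three days
```

The first-order and second-order groups produced this way can be compared line by line with Table 2.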
Step 6: Build the weighted matrix $W$, and normalize $W$ by row to obtain $W'$. Take the first-order fuzzy logic relationship groups as an example (see Fig. 1). The left-hand sides are shown on the ordinate axis of the weighted matrix and the right-hand sides on the abscissa axis. Each number in $W$ is the frequency of occurrence of the corresponding FLR, and $W'$ is the row-normalized $W$.
Then, the weight of the fuzzy logic relationship $A_i \to A_j$ is $W'_{i,j}$. The second-order and third-order fuzzy logic relationship groups are processed similarly.
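As an illustration of Step 6, the sketch below builds $W$ and its row-normalized version $W'$ from the first-order frequencies listed in Table 2. The names `weighted_matrices`, `FREQ` and `LABELS` are hypothetical, and leaving an empty row as zeros is an assumption, since every left-hand side in this example has at least one FLR.

```python
LABELS = ["A1", "A2", "A3", "A4", "A5"]

# First-order FLR frequencies from Table 2: FREQ[left-hand side][right-hand side].
FREQ = {
    "A1": {"A2": 1, "A5": 1},
    "A2": {"A1": 1, "A2": 1, "A4": 3, "A5": 2},
    "A3": {"A2": 2, "A3": 1},
    "A4": {"A2": 2, "A3": 1, "A4": 2},
    "A5": {"A1": 1, "A2": 1, "A3": 1},
}


def weighted_matrices(freq, labels):
    """Return (W, W'), with rows as left-hand sides and columns as right-hand sides."""
    w = [[freq.get(lhs, {}).get(rhs, 0) for rhs in labels] for lhs in labels]
    w_norm = []
    for row in w:
        total = sum(row)
        # assumption: a row without any observed FLR is left as all zeros
        w_norm.append([count / total if total else 0.0 for count in row])
    return w, w_norm


W, W_NORM = weighted_matrices(FREQ, LABELS)
print(W[3])       # frequencies for left-hand side A4
print(W_NORM[3])  # normalized weights for left-hand side A4
```

Each non-empty row of $W'$ sums to one, so the prediction in Step 7 is a weighted average of interval mid-values.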
Step 7: Fuzzify the testing data in the same way as above, and predict the stock’s future closing price based on the normalized weighted matrix $W'$. For the first-order fuzzy logic relationship groups, the prediction method is as follows:
Fig. 1. Weighted matrixes W and W’.

If the fuzzy value of day $t$ is $A_i$, and the FLRG whose left-hand side is $A_i$ contains $\{A_i \to A_{k_1}, A_i \to A_{k_2}, \ldots, A_i \to A_{k_p}\}$, then the predicted fuzzy value of day $t+1$ is
$$W'_{i,k_1} \times m_{k_1} + W'_{i,k_2} \times m_{k_2} + \cdots + W'_{i,k_p} \times m_{k_p} = \sum_{j=1}^{p} W'_{i,k_j} \times m_{k_j},$$
where $m_{k_j}$ is the mid-value of interval $A_{k_j}$. If there is no FLRG whose left-hand side is $A_i$, then the predicted fuzzy value of day $t+1$ is $m_i$.
For the second-order, we use the fuzzy values of the previous two days for prediction, and the specific method is as follows:
If the fuzzy values of day $t-1$ and day $t$ are $(A_i, A_j)$, and the FLRG whose left-hand side is $(A_i, A_j)$ contains $\{(A_i, A_j) \to A_{k_1}, (A_i, A_j) \to A_{k_2}, \ldots, (A_i, A_j) \to A_{k_p}\}$, then the predicted fuzzy value of day $t+1$ is
$$W'_{(i,j),k_1} \times m_{k_1} + W'_{(i,j),k_2} \times m_{k_2} + \cdots + W'_{(i,j),k_p} \times m_{k_p} = \sum_{l=1}^{p} W'_{(i,j),k_l} \times m_{k_l},$$
where $W'_{(i,j),k_l}$ is the normalized weight of $(A_i, A_j) \to A_{k_l}$ and $m_{k_l}$ is the mid-value of interval $A_{k_l}$. If there is no FLRG whose left-hand side is $(A_i, A_j)$, then we do not use the second-order fuzzy logic relationship groups and use only the first-order ones to predict. In general, the amount of data is very large, so this situation rarely occurs.
For the third-order, we use the fuzzy values of the previous three days to predict; the forecasting method is similar to the above.
Finally, we average the three predicted fuzzy values to obtain the final predicted value.
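The multi-order prediction of Step 7 can be sketched as follows, using the fuzzy sequence of Table 1. Several details here are assumptions made only for illustration: the interval mid-values in `MID` are hypothetical (the actual partition is given in Fig. 3), the average is taken over the orders that have a matching FLRG, and the function names are not from the paper.

```python
from collections import Counter, defaultdict


def extract_flrgs(fuzzy_seq, order):
    """Map each left-hand side tuple of `order` labels to a Counter of next states."""
    groups = defaultdict(Counter)
    for t in range(order, len(fuzzy_seq)):
        groups[tuple(fuzzy_seq[t - order : t])][fuzzy_seq[t]] += 1
    return dict(groups)


def predict_order(groups, recent, mid):
    """Weighted sum of next-state mid-values, or None if no FLRG matches `recent`."""
    next_states = groups.get(tuple(recent))
    if not next_states:
        return None
    total = sum(next_states.values())
    return sum(count / total * mid[label] for label, count in next_states.items())


def forecast(fuzzy_seq, mid, max_order=3):
    """Average the predictions of the orders 1..max_order that have a matching FLRG."""
    predictions = []
    for order in range(1, max_order + 1):
        value = predict_order(extract_flrgs(fuzzy_seq, order), fuzzy_seq[-order:], mid)
        if value is not None:
            predictions.append(value)
    if not predictions:  # no FLRG at all: fall back to the mid-value of the current state
        return mid[fuzzy_seq[-1]]
    return sum(predictions) / len(predictions)


SEQ = ["A4", "A3", "A2", "A5", "A1", "A5", "A2", "A2", "A4", "A4", "A2",
       "A1", "A2", "A4", "A4", "A2", "A5", "A3", "A3", "A2", "A4"]
# Hypothetical interval mid-values (ROC in percent); not taken from the paper.
MID = {"A1": -1.5, "A2": -0.5, "A3": 0.5, "A4": 1.5, "A5": 2.5}
print(forecast(SEQ, MID))  # predicted ROC (percent) for 2000-2-1 under these assumptions
```

With these hypothetical mid-values, the first-order and second-order predictions are combined and the third order is skipped, which mirrors the 2000-2-1 example in Step 5.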

5. A new forecasting method based on technical analysis and genetic algorithm

In this section, a new method for financial forecasting is presented that combines multi-order fuzzy time series, technical analysis, and a genetic algorithm. The model uses genetic algorithm operators such as selection, crossover, and mutation to iteratively search for an optimal domain partition. Each individual has only one chromosome, and each chromosome stores genetic information that reflects one domain partition. Every gene in the chromosome corresponds to an interval of the partition. The model searches for a good domain partition using the training data. Using the fuzzy time series, the model obtains different predicted values with each partition. The model takes the root mean square error between the predicted value and the actual value as the fitness of the corresponding individual.
We again use the Taiwan Weighted Index from January 2000 (Table 1) as an example. The proposed method is presented as follows:
Step 1: Construct the fuzzy time series model.

(1) Encoding. $\mathrm{Group}_k = \{C_j \mid j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, T\}$ is the whole population, where $C_j$ is the $j$th chromosome in the population, $m$ is the size of the population, and $k$ is the current iteration number. Each $C_j$ consists of four genes: $g(\mathrm{ROC})$, $g(\mathrm{MACD})$, $g(K)$ and $g(J)$. Taking $g(\mathrm{ROC})$ as an example, $g(\mathrm{ROC})_{j,i} = (v_{j,i}, v_{j,i+1}]$ is an encoding gene, where $v_{j,i}$ is a break point; two adjacent break points delimit a segment that serves as genetic information.

Domain partition is the key to the population encoding. If there are $n+1$ gene segments, the universe of discourse $U$ is divided into $n+1$ intervals. Let $\{v_i \mid i = 1, 2, \ldots, n\}$ denote the values of the break points; then $U = (-\infty, v_1] \cup (v_1, v_2] \cup \cdots \cup (v_{n-1}, v_n] \cup (v_n, +\infty)$. Therefore, each chromosome has a corresponding domain partition, and the population represents a set of domain partitions. The description of population encoding in the genetic algorithm is shown in Fig. 2. The corresponding partition of the universe of discourse in Table 1 is shown in Fig. 3.

(2) Fitness calculation. Based on the FLRGs and the normalized weighted matrix $W'$, the model obtains the predicted price. The fitness of each individual is the RMSE between the actual value ($\mathrm{Real}_t$) and the predicted value ($\mathrm{Predict}_t$) on the $t$th day. The formula is listed below:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} (\mathrm{Predict}_t - \mathrm{Real}_t)^2}{n}} \qquad (6)$$
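To make the encoding and the fitness in Step 1 concrete, the sketch below maps a value to its interval given a sorted list of break points and computes the RMSE of Eq. (6). The break points in `BREAKS` are hypothetical: they were picked for illustration because they reproduce the fuzzy values of Table 1, whereas the partition actually used is the one in Fig. 3. The function names are also illustrative.

```python
import math
from bisect import bisect_left


def fuzzify(value, breaks):
    """Label of the interval (-inf, v1], (v1, v2], ..., (vn, +inf) that contains value."""
    # bisect_left counts the break points strictly below value, so a value equal
    # to a break point falls into the interval that is closed on the right.
    return f"A{bisect_left(breaks, value) + 1}"


def rmse(predicted, real):
    """Root mean square error between predicted and actual values, as in Eq. (6)."""
    n = len(real)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, real)) / n)


BREAKS = [-1.0, 0.0, 1.0, 2.0]  # hypothetical break points for the ROC (percent)
print([fuzzify(x, BREAKS) for x in (1.07, 0.82, -0.86, 2.91, -1.93)])
print(rmse([1.0, 1.0], [2.0, 2.0]))
```

In the genetic algorithm, each chromosome supplies its own break points, the fuzzy time series model built on that partition produces the predictions, and the resulting RMSE serves as the fitness (lower is better).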
Fig. 2. The description of population encoding.

Fig. 3. The domain partition of U.

Fig. 4. The crossover process.

Step 2: The selection process. The model uses the tournament method to select superior individuals. The method randomly selects K individuals from the population, where K is the tournament size, and chooses the two individuals with the best fitness among them as the parents for the crossover.
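The selection step can be sketched as follows, assuming that the number of contestants is the tournament size K reported in Section 6 and that a lower fitness (RMSE) is better. The function name `tournament_select` and the toy data are hypothetical.

```python
import random


def tournament_select(population, fitness, k, rng=random):
    """Pick k random individuals and return the two with the lowest fitness.

    `fitness[i]` is the RMSE of `population[i]`, so smaller is better.
    """
    contestants = rng.sample(range(len(population)), k)
    contestants.sort(key=lambda idx: fitness[idx])
    return population[contestants[0]], population[contestants[1]]


# Toy population of four labelled individuals with made-up RMSE values.
population = ["a", "b", "c", "d"]
fitness = [3.0, 1.0, 4.0, 2.0]
print(tournament_select(population, fitness, k=3))
```

Sampling without replacement guarantees that the two parents are distinct individuals.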
Step 3: The crossover process. The model uses a single-point crossover operator to recombine two parent individuals. For two parent individuals Parent1 and Parent2, we randomly select a break point from each of them as the crossover point; the crossover points are denoted p1 and p2, respectively. Then, the part of Parent1 before p1 and the part of Parent2 after p2 are merged into a new individual Child1, and the part of Parent2 before p2 and the part of Parent1 after p1 are merged into another new individual Child2. To form two new domain partitions, the break points of Child1 and Child2 need to be sorted in ascending order. The process is illustrated in Fig. 4.
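A minimal sketch of this crossover is given below. It assumes that a chromosome is a list of break points and that the crossover point itself goes to the "after" part; the paper does not state how the cut index or duplicate break points are handled, so both choices here are assumptions. The function name `crossover` is hypothetical.

```python
import random


def crossover(parent1, parent2, rng=random):
    """Single-point crossover on two lists of break points.

    The head of one parent is joined with the tail of the other, and each
    child is sorted so that it again describes a valid domain partition.
    """
    p1 = rng.randrange(len(parent1))  # cut position in parent1
    p2 = rng.randrange(len(parent2))  # cut position in parent2
    child1 = sorted(parent1[:p1] + parent2[p2:])
    child2 = sorted(parent2[:p2] + parent1[p1:])
    return child1, child2


# Hypothetical break points for two parents.
print(crossover([1.0, 3.0, 5.0], [2.0, 4.0, 6.0, 8.0]))
```

Each break point of the parents ends up in exactly one child, so the two children together carry the same genetic material as the two parents.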
Step 4: The mutation process. Mutation is applied with probability PM. There are three mutation strategies, as shown in Fig. 5:

Fig. 5. The three ways of the mutation process.

(1) Insertion. Randomly generate one break point $v_k$ in $[D_{min}, D_{max}]$ and insert it into the chromosome so that the break points remain in ascending order.
(2) Deletion. Randomly delete a break point $v_k$ from the chromosome if the number of break points in the chromosome is greater than two.
(3) Variation. Randomly select a break point $v_k$ and change its value. Then, adjust the position of $v_k$ so that the break points remain in ascending order.
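The three strategies can be sketched as below. Several details are assumptions, because the text does not fix them: PM is applied once per chromosome, one of the three strategies is chosen uniformly at random, and new values are drawn uniformly from [D_min, D_max]. The function name `mutate` is hypothetical.

```python
import random
from bisect import insort


def mutate(chromosome, d_min, d_max, pm, rng=random):
    """Return a mutated copy of an ascending list of break points."""
    points = list(chromosome)
    if rng.random() >= pm:
        return points  # no mutation this time
    strategy = rng.choice(("insertion", "deletion", "variation"))
    if strategy == "insertion":
        # insort keeps the list in ascending order.
        insort(points, rng.uniform(d_min, d_max))
    elif strategy == "deletion":
        # Only delete when more than two break points are present.
        if len(points) > 2:
            points.pop(rng.randrange(len(points)))
    elif points:
        # Variation: replace one break point and restore the order.
        points.pop(rng.randrange(len(points)))
        insort(points, rng.uniform(d_min, d_max))
    return points


print(mutate([1.0, 2.0, 3.0], d_min=0.0, d_max=5.0, pm=1.0))
```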
Step 5: Check whether the termination condition is satisfied. In the experiments, we use two rules as stop conditions:
(1) The number of iterations exceeds the maximum number of iterations T;
(2) The best fitness remains unchanged.
The iterative process of the genetic algorithm stops if either of the two rules is met, and the algorithm goes to Step 6; otherwise, the iterative process continues.
Step 6: Let $C_{best}$ be the best individual evolved from the iterative process of the genetic algorithm, and partition the universe of discourse based on $C_{best}$. Then, fuzzify the training data according to $C_{best}$, extract the fuzzy logic relationship groups (FLRGs) and establish the normalized weighted matrix W′.
Step 7: Fuzzify the testing data according to $C_{best}$, then predict the value according to the FLRGs and W′ of Step 6. Finally, calculate the RMSE between the actual value and the predicted value.

Table 3
FLRG and frequency of occurrence.

FLRG                 Frequency    FLRG                 Frequency    FLRG                 Frequency

(A1, M1) → A1        2            (A1, A2, K1) → A1    4            (A1, J1) → A1        2
(A1, A3, M2) → A2    2            (A1, K2) → A2        4            (A1, A1, J2) → A2    2
(A1, M1) → A2        3            (A1, K1) → A1        5            (A2, J1) → A1        3
(A1, A2, M1) → A2    3            (A2, K2) → A2        5            (A1, J2) → A2        3
(A1, M2) → A1        3

For example, assume that the FLRGs are as in Table 2. The FLRGs and their frequencies of occurrence are shown in Table 3. For day t, the fuzzy values of ROC, MACD, K, and J are A2, M1, K2, and J1, respectively. The matching FLRGs are (A1, A2, M1) → A2 with frequency 3, (A2, K2) → A2 with frequency 5, and (A2, J1) → A1 with frequency 3, so A1 has a total weight of 3 and A2 has a total weight of 3 + 5 = 8. The predicted value of day t + 1 is then (3 × m1 + 8 × m2) / (3 + 8), where m1 and m2 are the mid-values of intervals A1 and A2. Based on the predicted value, we can compute the fitness value (RMSE). ROC is the main variable, and MACD and KDJ are the secondary variables. In the process of extracting FLRGs, MACD and KDJ are combined with ROC to construct the left side of each FLR, and the right side is the corresponding fuzzy set of ROC. Instead of the ordered sequences of k fuzzy sets in Table 2, the left sides of the FLRs in Table 3 summarize all fuzzy sets in k consecutive days. For example, (A1, A3, M2) → A2 denotes that A1, A3, and M2 occur within k consecutive days and that the fuzzy set of ROC on day k + 1 is A2.
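The frequency-weighted prediction in this example can be reproduced with a short sketch. The function name `predict_from_flrgs` and the interval mid-values are hypothetical; only the three matched rules and their frequencies are taken from the example above.

```python
from collections import defaultdict


def predict_from_flrgs(matched, midpoints):
    """Frequency-weighted average of the mid-values of the predicted sets.

    `matched` is a list of (right_hand_side, frequency) pairs for the FLRGs
    whose left side matches the current day; `midpoints` maps a fuzzy set
    label to the mid-value of its interval.
    """
    weights = defaultdict(int)
    for rhs, freq in matched:
        weights[rhs] += freq  # accumulate frequency per predicted set
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no matching FLRG")
    return sum(w * midpoints[rhs] for rhs, w in weights.items()) / total


# Matched rules from the example: A2 (3), A2 (5), A1 (3).
matched = [("A2", 3), ("A2", 5), ("A1", 3)]
midpoints = {"A1": -1.0, "A2": 1.0}  # hypothetical mid-values m1, m2
# Equals (3 * m1 + 8 * m2) / (3 + 8).
print(predict_from_flrgs(matched, midpoints))
```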

6. Experiment results

In this section, our proposed method is used to forecast six well-known financial time series: TAIEX, Dow Jones, NASDAQ, HSI, SP500, and the NTD/USD exchange rates. We implemented the proposed method using the C++ programming language on an Intel Core i5 PC. The parameters of our model were determined through a large number of simulation experiments and were selected according to the forecasting accuracy they provided across the time series considered. The selected parameters of the genetic algorithm are set as follows: the maximum number of iterations T is set as 100; the size of the population m is set as 200; the crossover probability PC is set as 80%; the mutation probability PM is set as 1%; and the number of individuals in the tournament method K is set as 6. Because of the randomness in the genetic algorithm, we executed the proposed method 100 times and used the average values as the results.

6.1. Index forecasting

To compare the proposed method with the most well-known models in this field, the daily closing prices from 1990 to 2004 are used as the evaluation dataset. For each year, the data from January to October are used as the training set, and the data from November to December are used as the testing set to verify the performance of the model. The compared models include not only fuzzy time series models but also neural network models [3,26,29–31,40,46,47,52,53]. RMSE is used to evaluate the performance of the proposed method; the smaller the RMSE, the better the result. We also give the directional accuracy rate (DAR) of the forecast results for each year. Here, the directional accuracy rate is defined as the proportion of forecasts for which the predicted value and the actual value of the next moment move in the same direction relative to the actual value of the previous moment; the formula is as follows:

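\[ \mathrm{DAR} = \frac{\sum_{i=1}^{n} d_i}{n} \times 100\% \tag{7} \]

\[ d_i = \begin{cases} 1, & \text{if } (\mathrm{Predict}_i - \mathrm{Real}_{i-1}) \times (\mathrm{Real}_i - \mathrm{Real}_{i-1}) \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{8} \]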
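Eqs. (7) and (8) can be sketched in code as follows. One detail is an assumption: the first day of the sequence has no previous actual value here, so it is skipped and n counts the remaining days. The function name `dar` and the sample values are hypothetical.

```python
def dar(predicted, actual):
    """Directional accuracy rate in percent.

    `predicted[i]` is the forecast for `actual[i]`. A forecast counts as
    correct when it lies on the same side of the previous actual value as
    the realized value does (a zero product also counts, as in Eq. (8)).
    """
    if len(predicted) != len(actual) or len(actual) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    hits = 0
    for i in range(1, len(actual)):
        predicted_move = predicted[i] - actual[i - 1]
        actual_move = actual[i] - actual[i - 1]
        if predicted_move * actual_move >= 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


# Made-up series: only the first of the three scored forecasts has the
# right direction.
print(dar([10.0, 12.0, 12.0, 9.0], [10.0, 11.0, 10.0, 12.0]))
```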

Table 4 shows the comparison of the RMSEs and the average RMSEs of different methods on forecasting the TAIEX from
1999 to 2004. Table 5 shows the comparison of the RMSEs and the average RMSEs of different methods on forecasting the
TAIEX from 1990 to 1999.
To highlight the effectiveness of the proposed method, we averaged the average RMSEs of the two periods (1999–2004 and 1990–1999) for each model, as shown in Table 6.
From Tables 4 and 5 we can observe that, although our average RMSE from 1999 to 2004 is slightly higher than the result in [53], it is smaller than those of the other existing works in [3,26,29–31,40,46,47,52], and our average RMSE from 1990 to 1999 is smaller than those of all the compared methods in Table 5. Because of the multivariate strategy, our predicted results tend to be conservative: the strategy improves the directional accuracy rate but also increases the deviation of each single value, which is the direct cause of the slightly higher RMSE. However, Table 6 shows that our average RMSE over the two periods is lower than those of all the other methods. Overall, our method is more effective. In addition, we can observe that improving the results becomes increasingly difficult beginning from the fourth model (Chen et al.’s method [31]).
Furthermore, four other well-known financial time series, the Dow Jones, NASDAQ, HSI, and SP500, are used to verify the effectiveness of our method and to compare the generalization ability of different models. For comparison, we did the same

Table 4
A comparison of the RMSEs of different methods on forecasting TAIEX from 1999 to 2004.

Methods 1999 2000 2001 2002 2003 2004 Average RMSEs

Huarng et al.’s method [26] (Use NASDAQ) N/A 158.70 136.49 95.15 65.51 73.57 105.88
Huarng et al.’s method [26] (Use Dow Jones) N/A 165.80 138.25 93.73 72.95 73.49 108.84
Huarng et al.’s method [26] (Use M1b) N/A 169.19 133.26 97.10 75.23 82.01 111.36
Huarng et al.’s method [26] (Use NASDAQ & Dow Jones) N/A 157.64 131.98 93.48 65.51 73.49 104.42
Huarng et al.’s method [26] (Use NASDAQ & M1b) N/A 155.51 128.44 97.15 70.76 73.48 105.07
Huarng et al.’s method [26] (Use NASDAQ & Dow Jones & M1b) N/A 154.42 124.02 95.73 70.76 72.35 103.46
Chen’s fuzzy time series model (U_FTS Model) [3,30,31] 120.00 176.00 148.00 101.00 74.00 84.00 117.40
Univariate conventional regression model (U_R model) [30,31] 164.00 420.00 1070.00 116.00 329.00 146.00 374.20
Univariate neural network model (U_NN model) [30,31] 107.00 309.00 259.00 78.00 57.00 60.00 145.00
Univariate neural network-based fuzzy time series model [30,31,46] 109.00 255.00 130.00 84.00 56.00 116.00 125.00
Univariate neural network-based fuzzy time series model use substitutes (U_NN_FTS_S model) [30,31,46] 109.00 152.00 130.00 84.00 56.00 116.00 107.80
Bivariate conventional regression model (B_R model) [30,31] 103.00 154.00 120.00 77.00 54.00 85.00 98.80
Bivariate neural network model (B_NN model) [30,31] 112.00 274.00 131.00 69.00 52.00 61.00 116.40
Bivariate neural network-based fuzzy time series model [30,31] 108.00 259.00 133.00 85.00 58.00 67.00 118.30
Bivariate neural network-based fuzzy time series model use substitutes (B_NN_FTS_S model) [30,31] 112.00 131.00 130.00 80.00 58.00 67.00 96.40
AR (1) model [40] 116.84 155.12 112.39 97.09 91.67 79.94 108.84
AR (2) model [40] 128.15 142.30 129.84 89.80 66.58 60.33 102.83
Chen and Chang’s method [29] (Use NASDAQ) 123.64 131.10 115.08 73.06 66.36 60.48 94.95
Chen and Chang’s method [29] (Use Dow Jones) 101.97 148.85 113.70 79.81 64.08 82.32 98.46
Chen and Chang’s method [29] (Use M1b) 156.92 142.70 132.76 96.06 90.27 100.10 119.80
Chen and Chang’s method [29] (Use Dow Jones & NASDAQ) 106.34 130.13 113.33 72.33 60.29 68.07 91.75
Chen and Chang’s method [29] (Use NASDAQ & M1b) 116.22 134.63 116.59 76.48 53.51 69.29 94.45
Chen and Chang’s method [29] (Use NASDAQ & Dow Jones & M1b) 111.70 129.42 113.67 66.82 56.10 64.76 90.41
Chen and Chen’s method [30] (Use Dow Jones) 115.47 127.51 121.98 74.65 66.02 58.89 94.09
Chen and Chen’s method [30] (Use NASDAQ) 119.32 129.87 123.12 71.01 65.14 61.94 95.07
Chen and Chen’s method [30] (Use M1b) 120.01 129.87 117.61 85.85 63.10 67.29 97.29
Chen and Chen’s method [30] (Use Dow Jones & NASDAQ) 116.64 123.62 123.85 71.98 58.06 57.73 91.98
Chen and Chen’s method [30] (Use Dow Jones & M1b) 116.59 127.71 115.33 77.96 60.32 65.86 93.96
Chen and Chen’s method [30] (Use NASDAQ & M1b) 114.87 128.37 123.15 74.05 67.83 65.09 95.56
Chen and Chen’s method [30] (Use NASDAQ & Dow Jones & M1b) 112.47 131.04 117.86 77.38 60.65 65.09 94.08
Chen et al.’s method [31] (Use Dow Jones) 102.34 131.25 113.62 65.77 52.23 56.16 86.89
Chen et al.’s method [31] (Use NASDAQ) 102.11 131.30 113.83 66.45 52.83 54.17 86.78
Chen et al.’s method [31] (Use M1b) 103.52 131.36 112.55 66.23 53.20 55.36 87.04
Chen and Kao’s method [46] 87.63 125.34 114.57 76.86 54.29 58.17 86.14
FTSGA model [47] 102.74 126.68 115.79 65.56 57.40 56.10 87.38
ACO-AR model [53] 102.22 131.53 112.59 60.33 51.54 50.33 84.75
Chen et al.’s method [52] (Use Dow Jones) 103.90 127.32 115.37 64.71 52.84 53.36 86.25
Chen et al.’s method [52] (Use NASDAQ) 104.99 124.52 114.66 64.79 53.63 52.96 85.93
Chen et al.’s method [52] (Use M1b) 105.61 127.37 115.46 66.07 53.67 53.3 86.91
The proposed method 101.29 125.42 113.22 63.99 52.99 52.40 84.88

forecast using the models FTSGA in [47] and ACO-AR in [53], respectively. Section 1 introduces the differences among the three models. The experimental results are shown in Tables 7–11.
Table 7 shows the comparison of the RMSEs and the average RMSEs for forecasting the five indexes from 1999 to 2004. Table 8 shows the comparison of the RMSEs and the average RMSEs for forecasting the five indexes from 1990 to 1999. Table 9 shows the comparison of the DARs for forecasting the five indexes from 1999 to 2004. Table 10 shows the comparison of the DARs for forecasting the five indexes from 1990 to 1999.
According to Tables 7 and 8, compared with FTSGA and ACO-AR, our proposed method obtains the best average RMSE for four of the five indexes in each period; the exceptions are TAIEX in Table 7 and HSI in Table 8. From Tables 9 and 10, we can observe that, compared with FTSGA and ACO-AR, our proposed method also obtains the best average DAR for four of the five indexes, both from 1999 to 2004 and from 1990 to 1999. On the whole, our method is more effective and stable.
Similarly, we also report the average RMSEs and DARs of the two periods (1999–2004 and 1990–1999) for the five indexes, as shown in Table 11. From Table 11, compared with FTSGA and ACO-AR, the proposed method obtains five better average

Table 5
A comparison of the RMSEs of different methods on forecasting TAIEX from 1990 to 1999.

1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 Average RMSEs

Conventional models [45] Average-based lengths 220.00 80.00 60.00 110.00 112.00 79.00 54.00 148.00 167.00 149.00 117.90
Distribution-based lengths 270.00 79.00 60.00 105.00 132.00 79.00 52.00 149.00 159.00 159.00 124.40
Weighted models [45] Average-based lengths 227.00 61.00 67.00 105.00 135.00 70.00 54.00 133.00 151.00 142.00 114.50
Distribution-based lengths 266.00 67.00 56.00 105.00 114.00 70.00 52.00 152.00 154.00 145.00 118.10
Chen and Chen’s method [30] Use Dow Jones 172.89 72.87 43.44 103.21 78.63 66.66 59.75 139.68 124.44 115.47 97.70
Use NASDAQ 169.93 66.12 49.61 104.75 75.66 67.01 60.90 140.86 144.13 119.32 99.83
Use Dow Jones & NASDAQ 172.99 74.85 43.78 101.38 78.13 68.14 61.26 139.29 132.94 116.64 98.94
AR (1) model [40] 178.08 77.23 85.34 101.23 90.85 82.38 78.63 146.22 144.53 116.84 110.13
AR (2) model [40] 198.24 65.38 85.30 113.40 97.16 73.63 57.85 174.09 135.21 128.15 112.84
Chen et al.’s method [49] Use Dow Jones 174.35 43.78 43.12 108.02 88.32 53.69 51.02 139.86 113.58 102.34 91.81
Use NASDAQ 176.17 43.16 43.34 106.66 87.95 53.30 51.10 138.41 113.88 102.11 91.61
Chen and Kao’s method [46] 156.47 56.50 36.45 126.45 105.52 62.57 51.50 125.33 104.12 87.63 91.25
FTSGA model [47] 175.80 44.27 42.62 102.33 78.45 56.36 48.80 134.68 112.96 102.86 89.91
ACO-AR model [53] 187.10 39.58 39.37 101.80 76.32 56.05 49.45 123.98 118.41 102.34 89.44
Chen et al.’s method [52] Use Dow Jones 180.36 43.8 43.06 104.89 75.35 55.06 50.06 133.82 112.11 103.9 90.24
Use NASDAQ 174.15 45.04 42.10 104.94 76.40 54.96 50.17 133.45 113.37 104.99 89.96
The proposed method 189.30 41.74 38.46 103.72 61.90 48.85 50.72 115.77 114.21 110.09 87.47

Table 6
A comparison of the average RMSEs of different methods in Tables 4 and 5.

Methods Average RMSEs

AR (1) model [40] 109.49
AR (2) model [40] 107.84
Chen and Chen’s method [30] 96.70
Chen et al.’s method [31] 89.31
Chen et al.’s method [49] 88.83
Chen and Kao’s method [46] 88.70
FTSGA model [47] 88.65
Chen et al.’s method [52] using Dow Jones 88.25
Chen et al.’s method [52] using NASDAQ 87.95
ACO-AR model [53] 87.10
The proposed method 86.18

Table 7
A comparison of the RMSEs of three methods for forecasting five indexes from 1999 to 2004.

Methods 1999 2000 2001 2002 2003 2004 Average RMSEs

TAIEX The proposed method 101.29 125.42 113.22 63.99 52.99 52.40 84.88
FTSGA model [47] 102.74 126.68 115.79 65.56 57.40 56.10 87.38
ACO-AR model [53] 102.22 131.53 112.59 60.33 51.54 50.33 84.75
Dow Jones The proposed method 67.67 105.46 91.53 85.47 48.73 63.28 77.02
FTSGA model [47] 82.20 130.50 96.72 106.55 57.79 64.10 89.64
ACO-AR model [53] 84.29 130.21 90.01 102.70 57.22 55.97 86.73
NASDAQ The proposed method 45.43 105.22 32.46 23.71 19.39 14.02 40.04
FTSGA model [47] 49.14 104.49 35.77 26.63 22.05 16.36 42.41
ACO-AR model [53] 46.04 121.72 31.74 25.41 21.99 15.48 43.73
HSI The proposed method 220.87 240.31 159.27 95.01 127.21 81.98 154.11
FTSGA model [47] 216.87 246.12 163.33 104.28 129.00 101.29 160.15
ACO-AR model [53] 225.24 253.21 168.45 94.97 116.42 102.65 160.16
SP500 The proposed method 9.50 19.19 10.34 9.24 5.58 7.11 10.16
FTSGA model [47] 11.38 19.60 11.39 11.71 6.49 7.14 11.28
ACO-AR model [53] 10.14 19.17 9.945 11.43 6.364 6.413 10.58

RMSEs and DARs in five indexes. The results of RMSEs and DARs in Table 11 are visualized in Figs. 6 and 7, respectively. From Figs. 6 and 7, we can clearly observe that the proposed model outperforms FTSGA and ACO-AR for the five indexes.
From the above comparisons, we can observe that, for the average RMSEs and the average DARs of the five indexes in each period, there is one index on which the proposed method does not perform best. Although our experimental results are not always the best, most of them outperform the results of the latest models in [47] and [53]. Moreover, this phenomenon is very common in stock prediction because a particular prediction method is not necessarily suitable for all types of stocks. Therefore, these additional experiments also offer evidence for the effectiveness of our proposed method. In particular, for the average RMSEs and DARs over the different periods, the proposed method outperforms

Table 8
A comparison of the RMSEs of three methods for forecasting five indexes from 1990 to 1999.

Methods 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 Average RMSEs

TAIEX The proposed method 189.30 41.74 38.46 103.72 61.90 48.85 50.72 115.77 114.21 110.09 87.47
FTSGA model [47] 175.80 44.27 42.62 102.33 78.45 56.36 48.80 134.68 112.96 102.86 89.91
ACO-AR model [53] 187.10 39.58 39.37 101.80 76.32 56.05 49.45 123.98 118.41 102.34 89.44
Dow Jones The proposed method 20.43 30.32 15.99 16.81 26.99 30.98 47.83 79.14 91.66 84.26 44.44
FTSGA model [47] 21.19 29.41 16.30 18.09 27.53 30.93 46.89 82.60 94.51 82.09 44.95
ACO-AR model [53] 21.59 31.84 16.53 15.60 28.09 32.43 46.73 84.01 96.46 87.68 46.10
NASDAQ The proposed method 3.23 6.93 4.04 7.07 6.72 10.54 9.76 22.86 21.07 45.41 13.76
FTSGA model [47] 3.33 6.08 3.89 5.71 5.28 10.87 11.21 20.63 31.23 49.27 14.75
ACO-AR model [53] 3.06 5.86 3.73 4.31 5.24 11.58 9.64 18.44 29.68 46.65 13.82
HSI The proposed method 24.96 36.27 126.58 201.57 129.83 72.98 139.49 251.36 194.28 225.41 140.27
FTSGA model [47] 26.60 38.10 126.99 193.30 133.81 70.52 145.82 248.69 202.66 217.09 140.36
ACO-AR model [53] 24.77 36.49 138.13 192.48 128.84 69.74 136.02 260.03 196.92 217.66 140.11
SP500 The proposed method 2.78 4.094 2.24 2.22 2.69 2.73 6.80 10.44 11.44 9.50 5.49
FTSGA model [47] 2.70 3.93 1.95 2.02 2.74 3.67 5.64 9.64 13.04 11.38 5.67
ACO-AR model [53] 2.82 4.09 2.03 1.76 2.70 3.48 5.70 10.12 13.56 10.87 5.71

Table 9
A comparison of the DARs of three methods for forecasting five indexes from 1999 to 2004.

Methods 1999 2000 2001 2002 2003 2004 Average DARs

TAIEX The proposed method 65.12 75.56 51.22 63.41 48.78 53.49 59.60
FTSGA model [47] 65.24 76.86 52.06 62.35 48.07 55.43 60.00
ACO-AR model [53] 72.22 81.08 52.63 63.16 50.00 52.50 61.93
Dow Jones The proposed method 65.85 69.21 55.36 69.23 61.54 58.54 63.29
FTSGA model [47] 56.47 57.79 57.87 64.41 56.68 60.37 58.93
ACO-AR model [53] 55.26 66.67 52.78 61.11 58.33 60.53 59.11
NASDAQ The proposed method 78.05 69.23 58.95 61.54 51.28 66.95 64.33
FTSGA model [47] 63.15 68.76 56.76 58.83 49.13 51.16 57.97
ACO-AR model [53] 63.16 66.67 58.33 58.33 55.56 52.63 59.11
HSI The proposed method 70.73 58.97 64.10 66.67 61.54 51.79 62.30
FTSGA model [47] 69.00 58.98 61.18 60.30 62.99 44.04 59.42
ACO-AR model [53] 71.05 55.56 58.33 63.89 66.67 48.72 60.70
SP500 The proposed method 63.51 58.97 53.85 64.10 61.54 53.00 59.16
FTSGA model [47] 59.96 60.20 51.04 58.17 52.23 51.86 55.58
ACO-AR model [53] 55.26 66.67 55.56 58.33 58.33 50.00 57.36

Table 10
A comparison of the DARs of three methods for forecasting five indexes from 1990 to 1999.

Methods 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 Average DARs

TAIEX The proposed method 55.56 55.44 70.45 54.34 63.00 53.19 47.92 54.28 60.47 74.42 58.91
FTSGA model [47] 55.31 57.86 60.92 58.12 53.77 58.31 44.86 50.08 59.10 65.31 56.37
ACO-AR model [53] 57.14 54.76 63.41 55.81 51.11 50.24 55.56 46.51 54.05 72.22 56.08
Dow Jones The proposed method 47.36 66.67 55.10 51.22 52.50 61.54 69.95 64.10 69.85 53.66 59.19
FTSGA model [47] 47.69 66.78 55.87 54.00 52.81 61.78 69.43 62.21 67.78 56.54 59.49
ACO-AR model [53] 47.22 66.67 64.86 52.63 56.76 55.56 69.44 66.67 62.16 55.26 59.72
NASDAQ The proposed method 60.10 71.79 72.55 73.17 62.50 58.97 64.10 66.67 75.95 78.05 68.39
FTSGA model [47] 61.91 68.24 76.43 83.83 67.48 55.61 66.08 64.46 49.74 63.14 65.69
ACO-AR model [53] 61.11 69.44 75.68 86.84 70.27 50.00 58.33 63.89 48.65 63.16 64.74
HSI The proposed method 57.41 56.41 68.29 69.05 77.50 76.32 58.97 56.41 43.90 68.29 63.26
FTSGA model [47] 63.59 53.72 66.89 70.71 72.79 75.86 57.49 56.35 44.92 68.97 63.13
ACO-AR model [53] 50.00 55.56 63.16 69.23 78.38 71.43 58.33 52.78 42.11 71.05 61.20
SP500 The proposed method 43.59 58.67 64.93 60.98 62.93 58.97 58.97 61.54 65.00 63.49 59.91
FTSGA model [47] 43.65 55.39 60.26 67.76 68.86 54.68 66.14 57.03 59.33 59.95 59.30
ACO-AR model [53] 41.67 58.33 64.86 65.79 67.57 50.00 75.00 55.56 51.35 55.26 58.54

other existing models. Additionally, the average DARs of the different stock indexes are all around or higher than 60%, which is a good forecasting result in this research.

6.2. NTD/USD exchange rate forecasting

Exchange rate forecasting is a challenging task in financial forecasting; therefore, our method was applied to NTD/USD exchange rate forecasting to show the performance of the proposed method. To compare the experimental results of our proposed method with the methods presented in [48] and [49], the historical data of the NTD/USD exchange rates from March

Table 11
A comparison of the average RMSEs and DARs of three methods in Tables 7–10.

Indexes Methods Average RMSEs Average DARs

TAIEX The proposed method 86.18 59.26


FTSGA model [47] 88.65 58.19
ACO-AR model [53] 87.10 59.01
Dow Jones The proposed method 60.73 61.24
FTSGA model [47] 67.30 59.21
ACO-AR model [53] 66.42 59.42
NASDAQ The proposed method 26.90 66.36
FTSGA model [47] 28.58 61.83
ACO-AR model [53] 28.78 61.93
HSI The proposed method 147.19 62.78
FTSGA model [47] 150.25 61.28
ACO-AR model [53] 150.14 60.95
SP500 The proposed method 7.83 59.54
FTSGA model [47] 8.48 57.44
ACO-AR model [53] 8.15 57.95

Fig. 6. A comparison of the average RMSEs of the three methods on forecasting five indexes. (Legend: the proposed method, FTSGA model [47], ACO-AR model [53]; indexes shown: TAIEX, Dow Jones, NASDAQ, HSI, SP500.)

1, 2006 to March 1, 2007 are used as the verification dataset. The data from March 1, 2006 to October 26, 2006 are used as the training data set, and the data from October 27, 2006 to March 1, 2007 are used as the testing data set to verify the performance of the different models. Moreover, we also applied the proposed method to the three-day, five-day, and seven-day forecasting proposed in [48]. The mean squared error (MSE) is used to evaluate the performance of the proposed method, which is defined as follows:

\[ \mathrm{MSE} = \frac{\sum_{t=1}^{n} (\mathrm{Predict}_t - \mathrm{Real}_t)^2}{n} \tag{9} \]

where n denotes the number of dates to be forecasted.


Table 12 shows the comparison of the average MSEs for different methods on forecasting the NTD/USD exchange rates
from March 1, 2006 to March 1, 2007. Similar to the experiments in index forecasting, we averaged the four predicted values
of different methods, as shown in Table 13.
From Table 12, we can observe that for one-day, three-day, five-day and seven-day forecasting, our proposed method
achieves the best result and outperforms all the other models. This is enough to demonstrate the efficiency of our proposed
method. Moreover, Table 13 shows that our proposed method has the smallest average MSE among all of the methods. That
is, the proposed method has obvious advantages compared with the methods presented in [48] and [49].

7. Conclusions

In this paper, a novel financial forecasting model is presented based on multi-order fuzzy time series, technical analysis, and a genetic algorithm. The proposed model uses a genetic algorithm to iteratively search for a good partition of the universe of discourse. Compared with traditional fuzzy time series models in this field, in addition to the first-order fuzzy time series, we use a hybrid multi-order fuzzy time series to forecast financial time series. Technical analysis is used to construct multivariate fuzzy time series to show the effect of different technical indicators. Exponential

Fig. 7. A comparison of the average DARs of the three methods on forecasting five indexes. (Legend: the proposed method, FTSGA model [47], ACO-AR model [53]; indexes shown: TAIEX, Dow Jones, NASDAQ, HSI, SP500.)

Table 12
A comparison of the average MSEs for different methods on forecasting the exchange rates.

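Methods Average MSEs

One-day forecasting
Random walk (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.011100
Radial basis function neural network (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.035900
Leu et al.’s method (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.006500
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use JPY/USD) 0.004954
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use KRW/USD) 0.005100
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use CNY/USD) 0.004961
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use TAIEX) 0.004869
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Average) 0.004971
The proposed method 0.004670

Three-day forecasting
Random walk (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.037600
Radial basis function neural network (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.077200
Leu et al.’s method (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.028300
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use JPY/USD) 0.005668
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use KRW/USD) 0.004963
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use CNY/USD) 0.004727
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use TAIEX) 0.004759
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Average) 0.005029
The proposed method 0.004664

Five-day forecasting
Random walk (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.058200
Radial basis function neural network (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.087000
Leu et al.’s method (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.050100
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use JPY/USD) 0.004986
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use KRW/USD) 0.004967
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use CNY/USD) 0.004954
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use TAIEX) 0.005493
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Average) 0.005100
The proposed method 0.004666

Seven-day forecasting
Random walk (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.101000
Radial basis function neural network (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.065000
Leu et al.’s method (use the combination of JPY/USD, KRW/USD, CNY/USD and TAIEX) [48] 0.053200
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use JPY/USD) 0.005675
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use KRW/USD) 0.005758
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use CNY/USD) 0.005936
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Use TAIEX) 0.005979
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] (Average) 0.005837
The proposed method 0.004665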

smoothing is used to filter the noise in the time series. By using hybrid multi-order fuzzy time series, our model is shown to be more stable and efficient, and under different evaluation standards it performs well on different stock indexes. Compared with different fuzzy time series prediction models, our experiments demonstrate that the proposed model significantly outperforms other models and has excellent generalization ability.
Our contributions are as follows:

Table 13
A comparison of the average MSEs of the four predicted values for different methods in Table 12.

Methods Average MSEs

Random walk (use the combination of JPY/USD,KRW/USD,CNY/USD and TAIEX) [48] 0.051975
Radial basis function neural network (use the combination of JPY/USD,KRW/USD,CNY/USD and TAIEX) [48] 0.066275
Leu et al.’s method (use the combination of JPY/USD,KRW/USD,CNY/USD and TAIEX) [48] 0.034525
Two-factors second-order fuzzy-trend logical relationship groups and PSO techniques [49] 0.005234
The proposed method 0.004666

1. The directional accuracy rate (DAR) of a forecasting model is selected as one of the performance measures. As far as we know, this is the first work that measures the performance of different models according to both DAR and RMSE.
2. Exponential smoothing is used to eliminate noise in the time series.
3. Multi-order and multivariate fuzzy time series are combined to forecast financial time series.
4. The proposed method is a novel forecasting model and performs very well for different financial time series.

The future work is to find a better partition method of the optimal fuzzy subset and to apply the proposed method for
solving other forecasting problems.

Acknowledgment

The authors thank the anonymous referees for their helpful comments and suggestions, which contributed to the improvement of the presentation and the contents of this paper. This work has been supported by the National Natural Science Foundation of China (grant no. 61272003) and the Major Program of the National Social Science Foundation of China (grant no. 13&ZD148).

References

[1] Q. Song, B.S. Chissom, Fuzzy time series and its models, Fuzzy Sets Syst. 54 (1993) 269–277.
[2] Q. Song, B.S. Chissom, Forecasting enrollments with fuzzy time series, Part I, Fuzzy Sets Syst. 54 (1993) 1–9.
[3] S.M. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets Syst. 81 (1996) 311–319.
[4] J.R. Hwang, S.M. Chen, C.H. Lee, Handling forecasting problems using fuzzy time series, Fuzzy Sets Syst. 100 (1998) 217–228.
[5] M.H. Lee, R. Efendi, Z. Ismail, Modified Weighted for Enrollment Forecasting Based on Fuzzy Time Series, MATEMATIKA 25 (2009) 67–78.
[6] K.H. Huarng, Ratio-based lengths of intervals to improve fuzzy time series forecasting, IEEE Trans. Syst. Man. Cybern. Part B Cybern. 36 (2006)
328–340.
[7] H.J. Teoh, C.H. Cheng, H.H. Chu, Fuzzy time series model based on probabilistic approach and rough set rule induction for empirical research in stock
markets, Data Knowl. Eng. 67 (2008) 103–117.
[8] T.A. Jilani, S.M.A. Burney, A refined fuzzy time series model for stock market forecasting, Physica A 387 (2008) 2857–2862.
[9] H.K. Yu, A refined fuzzy time-series model for forecasting, Physica A, Stat. Theor. Phys. 346 (2005) 657–681.
[10] C.H. Aladag, M.A. Basaran, E. Egrioglu, Forecasting in high order fuzzy time series by using neural networks to define fuzzy relations, Expert Syst. Appl.
36 (2009) 4228–4231.
[11] C.H. Aladag, U. Yolcu, E. Egrioglu, A high order fuzzy time series forecasting model based on adaptive expectation and artificial neural networks, Math.
Comput. Simul 81 (2010) 875–882.
[12] U. Yolcu, E. Egrioglu, V.R. Uslu, A new approach for determining the length of intervals for fuzzy time series, Appl. Soft Comput. 9 (2009) 647–651.
[13] E. Egrioglu, C.H. Aladag, U. Yolcu, A new hybrid approach based on SARIMA and partial high order bivariate fuzzy time series forecasting model, Expert
Syst. Appl. 36 (2009) 7424–7434.
[14] E. Egrioglu, C.H. Aladag, U. Yolcu, Fuzzy time series forecasting method based on Gustafson-Kessel fuzzy clustering, Expert Syst. Appl. 38 (2011) 10355–10357.
[15] S.M. Chen, N.Y. Chung, Forecasting enrollments using high-order fuzzy time series and genetic algorithms, Int. J. Intell. Syst. 21 (2006) 485–501.
[16] T.L. Chen, C.H. Cheng, H.J. Teoh, Fuzzy time-series based on Fibonacci sequence for stock price forecasting, Physica A 380 (2007) 377–390.
[17] C.H. Cheng, T.L. Chen, L.Y. Wei, A hybrid model based on rough sets theory and genetic algorithms for stock price forecasting, Inf. Sci. 180 (2010)
1610–1629.
[18] S.M. Chen, N.Y. Wang, J.S. Pan, Forecasting enrollments using automatic clustering techniques and fuzzy logic relationships, Expert Syst. Appl. 36
(2009) 11070–11076.
[19] S.M. Chen, K. Tanuwijaya, Multivariate fuzzy forecasting based on fuzzy time series and automatic clustering techniques, Expert Syst. Appl. 38 (2011)
10594–10605.
[20] C.H. Cheng, G.W. Cheng, J.W. Wang, Multi-attribute fuzzy time series method based on fuzzy clustering, Expert Syst. Appl. 34 (2008) 1235–1242.
[21] N.Y. Wang, S.M. Chen, Temperature prediction and TAIFEX forecasting based on automatic clustering techniques and two-factor high-order fuzzy time
series, Expert Syst. Appl. 36 (2009) 2143–2154.
[22] S.T. Li, Y.C. Cheng, S.Y. Lin, A FCM-based deterministic forecasting model for fuzzy time series, Comput. Math. Appl. 56 (2008) 3052–3063.
[23] C.D. Kirkpatrick, J.R. Dahlquist, Technical Analysis: The Complete Resource for Financial Market Technicians, Financial Times Press, 2006, ISBN
0-13-153113-1.
[24] M. Stevenson, E.P. John, Fuzzy time series forecasting using percentage change as the universe of discourse, World Acad. Sci. Eng. Technol.
55 (2009) 154–157.
[25] L.W. Lee, L.H. Wang, S.M. Chen, Y.H. Leu, Handling forecasting problems based on two-factors high-order fuzzy time series, IEEE Trans. Fuzzy Syst. 14
(2006) 468–477.
[26] K.H. Huarng, H.K. Yu, Y.W. Hsu, A multivariate heuristic model for fuzzy time-series forecasting, IEEE Trans. Syst. Man Cybern. Part B Cybern. 37 (2007)
836–846.
[27] H.K. Yu, K.H. Huarng, A bivariate fuzzy time series model to forecast the TAIEX, Expert Syst. Appl. 34 (2008) 2945–2952.
[28] H.K. Yu, K.H. Huarng, Corrigendum to "A bivariate fuzzy time series model to forecast the TAIEX", Expert Syst. Appl. 37 (2010) 5529.
[29] S.M. Chen, Y.C. Chang, Multi-variable fuzzy forecasting based on fuzzy clustering and fuzzy interpolation techniques, Inf. Sci. 180 (2010) 4772–4783.
[30] S.M. Chen, C.D. Chen, TAIEX forecasting based on fuzzy time series and fuzzy variation groups, IEEE Trans. Fuzzy Syst. 19 (2011) 1–12.
[31] S.M. Chen, G.M.T. Manalu, S.C. Shih, T.W. Sheu, H.C. Liu, A new method for fuzzy forecasting based on two-factors high-order fuzzy-trend logical relationship
groups and particle swarm optimization techniques, in: Proceedings of 2011 IEEE International Conference on Systems, Man, and Cybernetics,
Anchorage, Alaska, 2011, pp. 2301–2306.
[32] M.J. Kim, S.H. Min, I. Han, An evolutionary approach to the combination of multiple classifiers to predict a stock price index, Expert Syst. Appl. 32
(2006) 241–247.
[33] E. Egrioglu, C.H. Aladag, U. Yolcu, V.R. Uslu, A new approach based on artificial neural networks for high order multivariate fuzzy time series, Expert
Syst. Appl. 36 (2009) 10589–10594.
[34] M. Avazbeigi, S.H.H. Doulabi, B. Karimi, Choosing the appropriate order in fuzzy time series: a new N-factor fuzzy time series for prediction of the
auto industry production, Expert Syst. Appl. 37 (2010) 5630–5639.
[35] J. Park, D.J. Lee, C.K. Song, M.G. Chun, TAIFEX and KOSPI 200 forecasting based on two-factors high-order fuzzy time series and particle swarm
optimization, Expert Syst. Appl. 37 (2010) 959–967.
[36] D.F. Zhang, Q.S. Jiang, X. Li, Application of neural networks in financial data mining, Int. J. Comput. Intell. 1 (2004) 106–109.
[37] R.G. Brown, Exponential Smoothing for Predicting Demand, Arthur D. Little Inc., Cambridge, Massachusetts, 1956, p. 15.
[38] M.T. Leung, H. Daouk, A.S. Chen, Forecasting stock indices: a comparison of classification and level estimation models, Int. J. Forecast. 16 (2000)
173–190.
[39] K. Huarng, H.K. Yu, The application of neural networks to forecast fuzzy time series, Physica A 363 (2006) 481–491.
[40] J. Sullivan, W.H. Woodall, A comparison of fuzzy forecasting and Markov modeling, Fuzzy Sets Syst. 64 (1994) 279–293.
[41] R. Lakshman Naik, D. Ramesh, B. Manjula, D.A. Govardhan, Prediction of stock market index using genetic algorithm, Comput. Eng. Intell. Syst. 3
(2012) 162–172.
[42] Y. Hiemstra, A stock market forecasting support system based on fuzzy logic, in: Proceedings of the Twenty-Seventh Hawaii International Conference
on System Sciences, 3, 1994, pp. 281–287.
[43] A. Kazem, E. Sharifi, F.K. Hussain, M. Saberi, O.K. Hussain, Support vector regression with chaos-based firefly algorithm for stock market price
forecasting, Appl. Soft Comput. 13 (2013) 947–958.
[44] J.J. Wang, J.Z. Wang, Z.G. Zhang, S.P. Guo, Stock index forecasting based on a hybrid model, Omega 40 (2012) 758–766.
[45] H.K. Yu, Weighted fuzzy time-series models for TAIEX forecasting, Physica A 349 (2004) 609–624.
[46] S.M. Chen, P.K. Kao, TAIEX forecasting based on fuzzy time series, particle swarm optimization techniques and support vector machines, Inf. Sci. 247
(2013) 62–71.
[47] Q.S. Cai, D.F. Zhang, B. Wu, C.H. Leung, A novel stock forecasting model based on fuzzy time series and genetic algorithm, Proc. Comput. Sci. 18 (2013)
1155–1162.
[48] Y. Leu, C.P. Lee, Y.Z. Jou, A distance-based fuzzy time series model for exchange rates forecasting, Expert Syst. Appl. 36 (2009) 8107–8114.
[49] S.M. Chen, J.S. Pan, H.C. Liu, Fuzzy forecasting based on two-factors second-order fuzzy-trend logical relationship groups and particle swarm
optimization techniques, IEEE Trans. Cybern. 43 (2013) 1102–1117.
[50] L.Y. Wei, C.H. Cheng, H.H. Wu, A hybrid ANFIS based on n-period moving average model to forecast TAIEX stock, Appl. Soft Comput. 19 (2014) 86–92.
[51] M.Y. Chen, B.T. Chen, A hybrid fuzzy time series model based on granular computing for stock price forecasting, Inf. Sci. 294 (2015) 227–241.
[52] S.M. Chen, S.W. Chen, Fuzzy forecasting based on two-factors second-order fuzzy-trend logical relationship groups and the probabilities of trends of
fuzzy logical relationships, IEEE Trans. Cybern. 45 (2015) 405–417.
[53] Q.S. Cai, D.F. Zhang, W. Zheng, C.H. Leung, A new fuzzy time series forecasting model combined with ant colony optimization and auto-regression,
Knowl. Based Syst. 74 (2015) 61–68.
[54] S.H. Wan, D.F. Zhang, Y.W. Si, Evolutionary computation with multi-variates hybrid multi-order fuzzy time series for stock forecasting, in:
International Conference on Computational Science and Engineering, 2014, pp. 217–223.
