Pawan Lingras
Abstract
This paper describes rough neural networks, which consist of a combination of rough
neurons and conventional neurons. Rough neurons use pairs of upper and lower bounds
as values for input and output. In some practical situations, it is preferable to develop
prediction models that use ranges as values for input and/or output variables. A need to
provide tolerance ranges is one example of such a situation. Inability to record precise
values of the variables is another situation where ranges of values must be used. In the
example used in this study, a number of input values are associated with a single value of
the output variable. Hence, it seems appropriate to represent the input values as ranges.
The predictions obtained using rough neural networks are significantly better than those
obtained using the conventional neural network model.
1. Introduction
The concept of upper and lower bounds has been used in a variety of applications
in artificial intelligence (Shafer, 1976; Pawlak, 1982). In particular, the theory of rough sets
(Pawlak, 1982, 1984) has demonstrated the usefulness of upper and lower bounds in rule
generation. Further developments in rough set theory (Polkowski et al., 1994; Wong, 1994;
Yao et al., 1994) have shown that the general concept of upper and lower bounds provides
a wider framework that may be useful for different types of applications. This paper uses
rough patterns for prediction with neural networks. Each value in a rough pattern is a
pair of upper and lower bounds. Conventional neural network models generally use a
precise input pattern in their estimations. The conventional neural network models need
to be modified to accommodate rough patterns. Rough neurons proposed in this paper
provide an ability to use rough patterns. Each rough neuron stores the upper and lower
bounds of the input and output values. Depending upon the nature of the application, two
rough neurons in the network can be connected to each other using either two or four
connections. A rough neuron can also be connected to a conventional neuron using two
connections. A rough neural network consists of a combination of rough and
conventional neurons connected to each other. The paper outlines procedures for
feedforward and backpropagation in a rough neural network.
The paper also compares two different rough neural network models with a
conventional neural network model for prediction of the design hourly traffic volume
(DHV) for a highway section. The prediction is based on traffic volumes recorded over a
short period of time. The input to the network consists of traffic volumes for each day of
the week, i.e. Sunday, Monday, Tuesday, ..., Saturday, over the given time period. There
are several Mondays in the data collection period. Hence, the traffic volume for a Monday
cannot be a single value but must be a set of values. The conventional neural network
alternative uses the average of all the values for each Monday. A similar argument applies
to the rest of the days of the week. The use of average values tends to ignore some of the
available information. The rough neural network models use a rough input pattern
consisting of upper and lower bounds of daily traffic volumes.
2. Overview
This section briefly reviews some of the essential concepts of neural networks. A
brief description of the highway data collection and analysis program is also provided.
In the testing stage, the network is tested on another set of examples for which the
output from the output layer neurons is known. After the neural net model is tested
successfully, it is used for predictions.
Fig. 1. Connections between two rough neurons r and s: (a) full connection, (b) excitatory partial connection, (c) inhibitory partial connection.
A rough neuron r consists of an upper neuron $\overline{r}$ and a lower neuron
$\underline{r}$; similarly, s consists of $\overline{s}$ and $\underline{s}$. If a rough
neuron r is fully connected to s, then each of $\overline{r}$ and $\underline{r}$ is
connected to both $\overline{s}$ and $\underline{s}$, and there are four connections
from r to s. In Figs. 1(b) and 1(c), there are only two connections from r to s. If
the rough neuron r excites the activity of s (i.e. an increase in the output of r results in an
increase in the output of s), then r will be connected to s as shown in Fig. 1(b). On the
other hand, if r inhibits the activity of s (i.e. an increase in the output of r corresponds to a
decrease in the output of s), then r will be connected to s as shown in Fig. 1(c).
This paper uses the multi-layered, feed-forward backpropagation design outlined
in Section 2.1 to describe the methodology of rough neural networks. Rough neural
networks used in this study consist of one input layer, one output layer and one hidden
layer of rough/conventional neurons. The input layer neurons accept input from the
external environment. The outputs from input layer neurons are fed to the hidden layer
neurons. The hidden layer neurons feed their output to the output layer neurons which
send their output to the external environment. The output of a rough neuron is a pair of
upper and lower bounds, while the output of a conventional neuron is a single value.
The input of a conventional, lower, or upper neuron is calculated using the
weighted sum as:
$$input_i = \sum_{j \,:\, \text{there is a connection from } j \text{ to } i} w_{ji} \times output_j \,, \qquad (1)$$
where i and j are either conventional neurons or the upper/lower neurons of rough
neurons. The outputs of a rough neuron r are calculated using a transfer function as:
$$\overline{output}_r = \max\big(\mathrm{transfer}(\overline{input}_r),\ \mathrm{transfer}(\underline{input}_r)\big), \qquad (2)$$

$$\underline{output}_r = \min\big(\mathrm{transfer}(\overline{input}_r),\ \mathrm{transfer}(\underline{input}_r)\big), \qquad (3)$$

$$\mathrm{transfer}(u) = \frac{1}{1 + e^{-gain \times u}}, \qquad (5)$$
where gain is a system parameter determined by the system designer to specify the slope
of the sigmoid function around an input value of zero. There are several other functions for
determining the output from a neuron. The sigmoid transfer function is chosen because it
produces a continuous value in the 0 to 1 range.
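To make the feedforward computation concrete, the following minimal sketch in Python implements eqs. (1), (2), (3), and (5) for a single rough neuron. The function names and the default gain of 1.0 are illustrative, not taken from the paper.

```python
import math

def transfer(u, gain=1.0):
    # Sigmoid transfer function of eq. (5): maps any input into (0, 1).
    return 1.0 / (1.0 + math.exp(-gain * u))

def weighted_sum(weights, outputs):
    # Weighted-sum input of eq. (1): one term per incoming connection.
    return sum(w * o for w, o in zip(weights, outputs))

def rough_neuron_output(upper_input, lower_input, gain=1.0):
    # Eqs. (2) and (3): the upper output is the larger of the two
    # transferred inputs and the lower output is the smaller, so the
    # pair always forms a valid interval even if the weighted sums cross.
    hi = transfer(upper_input, gain)
    lo = transfer(lower_input, gain)
    return max(hi, lo), min(hi, lo)

# Example: a rough neuron whose upper and lower weighted sums are 0.8 and 0.2.
print(rough_neuron_output(0.8, 0.2))  # (0.690..., 0.549...)
```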
If two rough neurons are partially connected, then the excitatory or inhibitory
nature of the connection is determined dynamically by polling the connection weights.
The network designer can make initial assumptions about the excitatory or inhibitory
nature of the connections. If a partial connection from a rough neuron r to another rough
neuron s is assumed to be excitatory and $w_{\overline{r}\,\overline{s}} < 0$ and
$w_{\underline{r}\,\underline{s}} < 0$, then the connections from $\overline{r}$ to
$\underline{s}$ and from $\underline{r}$ to $\overline{s}$ are enabled instead. On the
other hand, if the neuron r is assumed to have an inhibitory partial connection to s and
$w_{\overline{r}\,\underline{s}} > 0$ and $w_{\underline{r}\,\overline{s}} > 0$, then the
connection between rough neurons r and s is switched to the excitatory configuration.
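As a hedged illustration of this polling step, the sketch below encodes the switching rule as reconstructed from the description above; the exact rule is an assumption, and the function name is illustrative.

```python
def enabled_connections(assumed_excitatory, w1, w2):
    # Poll the two current connection weights between rough neurons r and s.
    # If an assumed-excitatory link has both weights negative, it is behaving
    # inhibitorily, so the inhibitory wiring of Fig. 1(c) is enabled instead
    # (and symmetrically for an assumed-inhibitory link with positive weights).
    if assumed_excitatory and w1 < 0 and w2 < 0:
        return "inhibitory"
    if not assumed_excitatory and w1 > 0 and w2 > 0:
        return "excitatory"
    return "excitatory" if assumed_excitatory else "inhibitory"

print(enabled_connections(True, -0.3, -0.7))  # 'inhibitory': wiring switched
```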
The weights are updated using the standard backpropagation rule

$$w_{ji} \leftarrow w_{ji} + \alpha \times error_i \times \mathrm{transfer}'(input_i) \times output_j \,,$$

where $\mathrm{transfer}'(input_i)$ is the derivative of the transfer function evaluated at
$input_i$ and α is the learning parameter, which represents the speed of learning. For the
sigmoid transfer function used in this study,
$\mathrm{transfer}'(u) = gain \times \mathrm{transfer}(u) \times (1 - \mathrm{transfer}(u))$.
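A minimal sketch of this update follows, assuming the textbook delta rule; the function names and the error term error_i are illustrative, not code from the paper.

```python
def transfer_derivative(output, gain=1.0):
    # Derivative of the sigmoid of eq. (5), written in terms of the
    # neuron's own output: gain * output * (1 - output).
    return gain * output * (1.0 - output)

def updated_weight(w_ji, alpha, error_i, output_i, output_j, gain=1.0):
    # Textbook delta rule for the connection j -> i; error_i is the
    # backpropagated error term (illustrative, not defined in the paper).
    return w_ji + alpha * error_i * transfer_derivative(output_i, gain) * output_j
```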
As mentioned before, in the testing stage, the network is tested on another set of
examples for which the output from the output layer neurons is known. After the neural
net model is tested successfully, it is used for predictions.
Fig. 2. The Conventional Neural Network Model Used in the Estimation of DHV
Fig. 2 shows the conventional neural network model used for the estimation. The
conventional model has seven input neurons, four hidden layer neurons and one output
neuron. Neurons in the input layer are fully connected to neurons in the hidden layer.
Neurons in the hidden layer are fully connected to the neuron in the output layer. The
input to the conventional neural network model consists of the average weekly pattern, i.e.
average daily volumes on Sundays, Mondays, Tuesdays, ..., Saturdays for an object. The
output is the DHV for the object.
The first rough neural network model, shown in Fig. 3, has seven rough input
neurons, eight conventional hidden layer neurons, and one conventional output neuron. Rough
neurons in the input layer are fully connected to conventional neurons in the hidden layer.
Conventional neurons in the hidden layer are fully connected to the conventional neuron
in the output layer. Since the hidden and output layer neurons are conventional neurons,
this network can be easily implemented using existing neural network packages such as
Stuttgart Neural Network Simulator (SNNS) (Zell et al., 1995).
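Since the rough input neurons simply present their two bounds as separate values to the conventional hidden neurons, this first model reduces to an ordinary 14-8-1 feedforward network. A minimal sketch of the input encoding follows; the volume figures are hypothetical.

```python
# Hypothetical normalised daily volumes, Sunday through Saturday.
upper = [0.62, 0.71, 0.69, 0.70, 0.68, 0.73, 0.66]  # upper bounds
lower = [0.41, 0.55, 0.52, 0.54, 0.50, 0.57, 0.44]  # lower bounds

# Each rough input neuron forwards its two bounds to every conventional
# hidden neuron, so a package such as SNNS sees an ordinary vector of
# 14 real inputs for a 14-8-1 network.
pattern = upper + lower
assert len(pattern) == 14
```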
Fig. 3. The Rough Neural Network Model With Hidden Layer Conventional Neurons
Fig. 4. The Rough Neural Network Model With Hidden Layer Rough Neurons
The second rough neural network model, shown in Fig. 4, has seven rough input
neurons, four rough hidden layer neurons, and one conventional output neuron. Rough neurons in
the input layer are fully connected to rough neurons in the hidden layer. Rough neurons in
the hidden layer are fully connected to the conventional neuron in the output layer. The
rough network shown in Fig. 4 can also be implemented using SNNS. However, it was
necessary to add two activation functions to implement eqs. (2) and (3).
The input to both rough neural network models consists of a rough weekly
pattern, i.e. upper and lower bounds of daily volumes on Sundays, Mondays, Tuesdays,
..., Saturdays for an object. The output is the DHV for the object. Since the output is a
single value, the output layer of both models uses a conventional neuron.
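The following self-contained sketch traces one forward pass through this second model, combining eqs. (1), (2), (3), and (5); the weights are random placeholders and the normalised volumes are hypothetical.

```python
import math
import random

def transfer(u, gain=1.0):
    return 1.0 / (1.0 + math.exp(-gain * u))

def forward(upper_in, lower_in, w_hidden, w_out):
    # One forward pass through the second model: 7 rough inputs,
    # 4 rough hidden neurons, 1 conventional output neuron.
    x = upper_in + lower_in                        # 14 input values
    hidden_out = []
    for wu, wl in w_hidden:                        # one rough hidden neuron
        iu = sum(w * v for w, v in zip(wu, x))     # eq. (1), upper neuron
        il = sum(w * v for w, v in zip(wl, x))     # eq. (1), lower neuron
        hi, lo = transfer(iu), transfer(il)
        hidden_out += [max(hi, lo), min(hi, lo)]   # eqs. (2) and (3)
    return transfer(sum(w * v for w, v in zip(w_out, hidden_out)))

random.seed(0)
w_hidden = [([random.uniform(-1, 1) for _ in range(14)],
             [random.uniform(-1, 1) for _ in range(14)]) for _ in range(4)]
w_out = [random.uniform(-1, 1) for _ in range(8)]   # 2 outputs per rough neuron
upper = [0.62, 0.71, 0.69, 0.70, 0.68, 0.73, 0.66]  # hypothetical volumes
lower = [0.41, 0.55, 0.52, 0.54, 0.50, 0.57, 0.44]
print(forward(upper, lower, w_hidden, w_out))       # single DHV estimate
```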
Both the networks were trained using 211 objects from the training set and tested
using 53 objects in the test set. Errors in estimation for the test set may originate from two
sources. One source of error is the sampling process: the number of patterns in the
training and test sets may be too small, or the samples may not provide a good
representation of the universe. The other source of error is the estimation method itself.
In order to get an indication of the errors from these two different sources, the
conventional and rough neural networks were tested on the training set as well as the test
set. Testing the models on the training set indicates how well the training method works
by itself.
The estimated and actual values of DHV are compared using the following
percent difference measure:
$$\Delta = \frac{estimated - actual}{actual} \times 100 \,,$$

where Δ is the percent error, actual is the actual DHV, and estimated is the estimated DHV.
The maximum and average errors for each set are used to compare the results of
estimation. The average error provides a measure of the overall accuracy, while the
maximum error describes the worst case.
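A small sketch of this comparison follows, assuming (as is conventional) that absolute percent differences are aggregated; the sample values are hypothetical.

```python
def percent_errors(estimated, actual):
    # Percent difference of each estimate from the actual DHV; the
    # aggregation below assumes absolute differences are intended.
    deltas = [abs(e - a) / a * 100.0 for e, a in zip(estimated, actual)]
    return sum(deltas) / len(deltas), max(deltas)

avg_err, max_err = percent_errors([980.0, 1210.0], [1000.0, 1150.0])
print(avg_err, max_err)  # average error and worst case
```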
Fig. 5. Sum of Squares of Errors for Conventional and Rough Neural Networks (sum of squared errors plotted against training iterations, 0 to 25,000)
The first rough neural network, with conventional neurons in the hidden layer,
results in somewhat higher errors than the second rough neural network, which uses rough
neurons in the hidden layer. This indicates that the use of rough neurons in place of
conventional neurons, wherever possible, will improve the performance of the neural
networks. Another interesting observation from the experiment is the change in the sum
of squared errors during the training process. Fig. 5 shows the reduction of errors during
the training process for all three networks. The reduction in errors is more dramatic for
the rough neural networks than for the conventional network. The errors for the second
rough neural network are consistently lower than those for the first, confirming the earlier
observation regarding the use of rough neurons in the hidden layer.
References
DeGarmo, E.P., Sullivan, W.G. and Canada, J.R. (1984). Engineering Economy,
Macmillan Publishing Co., New York, N.Y., pp. 264-266.
Garber, N.J. and Hoel, L.A. (1988). Traffic and Highway Engineering, West Publishing
Co., New York, N.Y., pp. 97-118.
Hecht-Nielsen, R. (1990). Neurocomputing, Addison-Wesley Publishing, Don Mills, Ontario.
Lingras, P.J. and Adamo, M. (1995). Estimation of AADT Volume Using Neural
Networks, Computing in Civil and Building Engineering, Pahl & Werner (Eds.),
1355-1362.
Pawlak, Z. (1982). Rough sets, International Journal of Information and Computer
Sciences, 11, pp. 145-172.
Pawlak, Z. (1984). Rough classification, International Journal of Man-Machine Studies,
20, pp. 469-483.
Polkowski, L., Skowron, A. and Zytkow, J. (1994). Rough Foundations for Rough Sets,
Conference Proceeding of Third International Workshop on Rough Sets and Soft
Computing, November 10-12, San Jose, California, pp. 142-149.
Shafer, G. (1976). A Mathematical Theory of Evidence, Princeton University Press,
Princeton, New Jersey.
Sharma, S.C. and Allipuram, R.R. (1993). Duration and Frequency of Seasonal Traffic
Counts, Journal of Transportation Engineering, American Society of Civil
Engineers, 116, 3, pp. 344-359.
Sharma, S.C. and Werner, A. (1991). Improved Method of Grouping Provincewide
Permanent Traffic Counters, Transportation Research Record 815, Transportation
Research Board, Washington, D.C., pp. 13-18.
White, H. (1989). Neural Network Learning and Statistics, AI Expert, 4, 12, pp. 48-52.
Wong, S.K.M. (1994). Rough Sets and Extended Models, invited paper in the Third
International Workshop on Rough Sets and Soft Computing, November 10-12, San
Jose, California.
Yao, Y.Y., Li X., Lin, T.Y. and Liu, Q. (1994). Representation and Classification of
Rough Set Models, Conference Proceeding of Third International Workshop on
Rough Sets and Soft Computing, November 10-12, San Jose, California,
pp. 630-637.
Zell, A. et al. (1995). Stuttgart Neural Network Simulator: User Manual Version 4.0,
University of Stuttgart, Institute of Parallel and Distributed High Performance
Systems, Report No. 6/95.