Radial Basis Network: An Implementation of Adaptive Centers

Nivas Durairaj
Final Project for ECE539
Table of Contents

Table of Contents
List of Figures
Introduction
Background
Methodology & Development of Program
    Adaptation Formulas
Testing & Comparison of Results
    Sinusoid Function Testing
    Piecewise-Linear Function
    Polynomial Function
Conclusion of Results
Appendix
    Manual for rbn_adaptive.m
    Manual for rbn_fixed_selfgen.m
    Derivation of Partial Derivatives (Adaptive RBF Network)
        Linear Weights Partial Derivative Term
        Positions of Centers Partial Derivative Term (hidden layer)
        Spreads of Centers Partial Derivative Term (hidden layer)
    Excel Spreadsheet Data for Sinusoidal, Polynomial, & Piecewise Linear Functions
References
List of Figures
Introduction
What neural network model offers the same benefits as a feedforward neural network? The Radial Basis Function (RBF) network. Like feedforward networks such as the multilayer perceptron trained with backpropagation, the radial basis function network aids us in function approximation, classification, and modeling of dynamic systems. RBF networks have been used to produce results in stock market prediction and speech recognition.
I chose to implement my Intro to Artificial Neural Networks project on RBFs (Radial Basis Functions) because they are still an active research area and there is a lot to be learned from them. These functions were first introduced for solving multivariate interpolation problems, and they are now one of the main fields of research in numerical analysis. Since I was well acquainted with simple feedforward networks, I decided to implement an RBF network with adaptive centers. In addition, I have some interest in economics, and the thought of producing an algorithm that could help predict the stock market was very appealing to me.
Background
In its most basic form, an RBF consists of three layers with entirely different
roles. The input layer is made up of nodes that connect the network to its environment.
The second layer is the hidden layer of neurons. At the input of each neuron, the distance
between the neuron center and the input vector is calculated. By applying the radial basis
function (Gaussian bell function) to this distance, the output of the neuron is formed.
The last layer is the output layer; it is linear and supplies the response of the network to the activation pattern applied to the input layer. The rationale for a nonlinear transformation followed by a linear transformation is given by Cover's theorem on the separability of patterns, discussed in [1]: a pattern-classification problem cast in a high-dimensional space is more likely to be linearly separable than one cast in a low-dimensional space. This is the reason for making the dimension of the hidden space in an RBF network high. It is also important to note that the higher the dimension of the hidden space, the more accurate the smoothed approximation of the input-output mapping will be.
Radial basis function networks can follow different learning strategies. Their linear output-layer weights tend to evolve on a different time scale than the nonlinear hidden-layer activation functions, so it is best to optimize the two layers on different time scales. The learning strategies differ mostly in how the centers of the radial-basis functions of the network are specified. My project is based on the particular learning strategy known as supervised selection of centers; such an RBF network is founded on interpolation theory.
The easiest approach is to assume fixed radial-basis functions when defining the activation functions of the hidden units. However, with additional computation, one can create an RBF network whose function centers undergo a supervised learning process.
The cost function to be minimized is

E = \frac{1}{2} \sum_{j=1}^{N} e_j^2

where the error signal is

e_j = d_j - F^*(x_j), \qquad F^*(x_j) = \sum_{i=1}^{M} w_i \, G(\lVert x_j - t_i \rVert_{C_i})

Here N is the size of the training sample, e_j is the error signal, and \lVert \cdot \rVert_{C_i} is a weighted Euclidean distance or norm.
F^*(x_j) is built from Green's functions. Green's functions play an important role in the solution of linear ordinary and partial differential equations, and they are also a key component in the development of integral-equation methods. In this network, the Green's function takes the Gaussian form

G(\lVert x_j - t_i \rVert_{C_i}) = \exp\left( -(x_j - t_i)^T C_i^T C_i (x_j - t_i) \right)

Substituting \frac{1}{2} \Sigma_i^{-1} = C_i^T C_i, the Green's function becomes

G(\lVert x_j - t_i \rVert_{C_i}) = \exp\left( -0.5 \, (x_j - t_i)^T \Sigma_i^{-1} (x_j - t_i) \right)

where m is the feature dimension of t and x. Thus, the Green's function results in a single number: a 1xm vector times an mxm matrix times an mx1 vector gives a 1x1 number.
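To make those dimensions concrete, here is a small MATLAB sketch (illustrative only; the variable names xj, ti, and covinv_i are mine, not from the project code) showing that the quadratic form inside the exponential reduces to a scalar:

% Illustrative only: evaluate the Gaussian Green's function for one
% sample xj and one center ti with inverse covariance covinv_i (m = 2).
xj       = [0.3 -0.1];               % 1 x m input vector
ti       = [0.1  0.2];               % 1 x m center vector
covinv_i = eye(2);                   % m x m inverse covariance (spread)
q = (xj - ti)*covinv_i*(xj - ti)';   % (1xm)*(mxm)*(mx1) -> 1x1 scalar
g = exp(-0.5*q);                     % Green's function value, a single number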
As seen from the above, we need to find the parameters w_i, t_i, and \Sigma_i^{-1} that minimize the cost function. The adaptation formulas for the linear weights, positions of centers, and spreads of centers of the RBF network are given below; I obtained this information from Haykin, page 303 [1]. The derivations of the partial derivatives are given in the appendix.
Adaptation Formulas

1. Linear weights (output layer)

\frac{\partial E(n)}{\partial w_i(n)} = \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i(n) \rVert_{C_i})

w_i(n+1) = w_i(n) - \eta_1 \frac{\partial E(n)}{\partial w_i(n)}, \qquad i = 1, 2, \ldots, c

2. Positions of centers (hidden layer)

\frac{\partial E(n)}{\partial t_i(n)} = 2 w_i(n) \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i(n) \rVert_{C_i}) \, \Sigma_i^{-1} [x_j - t_i(n)]

t_i(n+1) = t_i(n) - \eta_2 \frac{\partial E(n)}{\partial t_i(n)}, \qquad i = 1, 2, \ldots, c

3. Spreads of centers (hidden layer)

\frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)} = -w_i(n) \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i(n) \rVert_{C_i}) \, [x_j - t_i(n)][x_j - t_i(n)]^T

\Sigma_i^{-1}(n+1) = \Sigma_i^{-1}(n) - \eta_3 \frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)}, \qquad i = 1, 2, \ldots, c
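The spread update code is shown below and the center update appears in the appendix; the linear-weight update follows the same pattern. As an illustration only (a sketch, not the program's actual code), adaptation formula 1 could be coded in the same style as follows:

%Calculation of linear weights (output layer) -- illustrative sketch only,
%written in the same style as the excerpts below; not the original program.
for i=1:c
    weightdiff=0;
    for j=1:n
        g=exp(-0.5*((x(j,:)-t(i,:)))*covinv(:,:,i)*((x(j,:)-t(i,:))'));
        weightdiff=weightdiff + e(j)*g;   %accumulate dE/dw_i over the samples
    end
    w(i)=w(i) - eta1*weightdiff;          %w_i(n+1) = w_i(n) - eta1*dE/dw_i
end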
The positions of the centers were computed in a similar way; however, t_i is a vector that spans R^m, where m is the feature dimension. The spreads of the centers were output in matrix form, as expected, since the inverse covariance being updated is an m-by-m matrix.
%Calculation of Spreads of centers (hidden layer)
spreaddiff=0;
for j=1:n
    %Green's function value for sample j and center i
    g=exp(-0.5*((x(j,:)-t(i,:)))*covinv(:,:,i)*((x(j,:)-t(i,:))'));
    %accumulate e_j * G * (x_j - t_i)(x_j - t_i)^T over all samples
    spreaddiff=spreaddiff + (e(j)*g*(x(j,:)-t(i,:))'*(x(j,:)-t(i,:)));
end
covinv(:,:,i)=covinv(:,:,i) - (eta3*-1*w(i)*spreaddiff); %mxm matrix
Regarding the power of Matlab, I probably should have coded the above using matrix and vector operations, since a for loop in Matlab incurs a lot of overhead. However, since I am more used to C, I implemented it as I would in C to avoid confusion in my calculations. Therefore, I believe this program can be further optimized to make full use of Matlab; one possible vectorized form is sketched below.
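As an illustration of that point, here is one possible vectorized form of the spread update above (this is not part of the original program, and it assumes e is an n-by-1 column vector of errors):

% Hypothetical vectorized spread update for center i (illustrative sketch).
D = x - repmat(t(i,:),n,1);                  % n x m matrix of (x_j - t_i) rows
g = exp(-0.5*sum((D*covinv(:,:,i)).*D,2));   % n x 1 Green's function values
spreaddiff = D'*diag(e.*g)*D;                % m x m sum of weighted outer products
covinv(:,:,i)=covinv(:,:,i) - (eta3*-1*w(i)*spreaddiff); %same update as the loop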
According to Haykin, a few points need to be understood when dealing with an adaptive-center RBF network. The cost function is convex with respect to the linear weights w_i, but it is nonconvex with respect to the centers t_i and the spreads \Sigma_i^{-1}, so the search for the optimum values of t_i and \Sigma_i^{-1} can get stuck in a local minimum.
To prevent infinite values, it is sometimes better to begin the search from a structured initial condition that limits the parameter space to a known area. Before running the RBF network, it may be useful to run the data through a standard pattern classifier first; this reduces the chance of converging on a local minimum.
The algorithm begins with the parameters w, t, and \Sigma^{-1}, which are initialized as described below. It was very important that I set these variables to values that would allow the network to run with minimum error. At first, I initialized w to w=0.005*randn(c,1). Unfortunately, this was not a good way of initializing w: my RBF network produced results that were flagrantly incorrect, and I could not find eta parameters that fixed it. Since I was trying to produce an RBF network comparable to a fixed-center RBF, I decided to set my initial weights to w=pinv(G)*d. This improved my results immensely because my weights were limited to a known area. The vector t was initialized using the k-means algorithm. \Sigma^{-1} was initialized to a stack of identity matrices of size m by m by c, where m is the number of features and c is the number of cluster centers. I thought this was a good starting point since it reduced the chance of getting stuck in a local minimum at initialization itself. A sketch of this initialization is given below.
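For illustration, a minimal sketch of this initialization, assuming the helper routines cinit.m, kmeansf.m, and gauss.m behave as in the appendix listings (this is not the original code):

% Illustrative initialization sketch (assumes x is n x m, d is n x 1, and
% c centers; helper functions cinit.m, kmeansf.m, gauss.m from the project).
[n,m] = size(x);
t = cinit(x,2,c);              % spread initial cluster centers over the range
t = kmeansf(x,t,.0001,50);     % refine the centers with k-means
covinv = zeros(m,m,c);
for i=1:c
    covinv(:,:,i) = eye(m);    % spreads start as identity inverse covariances
end
G = gauss(x,t,covinv);         % n x c Green's (Gaussian) matrix
w = pinv(G)*d;                 % initial weights, as in the fixed-center RBF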
[Figure: Training set samples]

Testing & Comparison of Results

[Figures: RBF network outputs — test samples, approximated curve, train samples, and radial basis centers]
To see if I could reduce the cost of the adaptive-center RBF network, I tried modifying the eta parameters from their starting value of 0.5. My conclusion was that modifying the eta parameters can reduce the costs, but they may not be significantly lower than the costs of a fixed-center RBF network. A few representative settings are shown below.
Eta1    Eta2    Eta3    Cost
0.3     0.3     0.3     0.403
0.2     0.5     0.9     0.403
0.8     0.2     0.3     0.404
Using Dr. Hu's function generator, I was able to generate a few functions to test my RBF networks on. I wanted to see whether a certain type of RBF network would actually perform better in certain situations. The function generator produced training and testing data for three functions: sinusoid, piecewise-linear, and polynomial. I used these three functions to compare the results of the two RBF networks.

Sinusoid Function Testing
[Figure 6: RBF network output for the sinusoid function with 7 radial basis functions — test samples, approximated curve, train samples, radial basis centers]
Testing the radial basis function networks against the sinusoid data, the results seemed to show that with fewer radial basis functions, the adaptive-center RBF network performs slightly better. Beyond that, the fixed-center RBF network achieves results that are similar, if not better. As a side note, we can probably disregard the cost output for two radial basis functions, since two is too few to correctly match the sinusoid function. The data behind this comparison is given in the appendix.
Piecewise-Linear Function
[Figure: RBF with adaptive centers — piecewise-linear function (test samples, approximated curve, train samples, radial basis centers)]

[Figure: RBF network output for the piecewise-linear function (test samples, approximated curve, train samples, radial basis centers)]
For this function, the adaptive-center RBF network performed better until the number of radial basis functions reached 6. After 6, the fixed-center RBF network began to obtain better results. I stopped compiling the cost outputs at 10 radial basis functions, as the differences were on the order of 10^-7. Nevertheless, at 9 radial basis functions, both the adaptive-center and fixed-center network models were providing similar approximations of the piecewise-linear function. At 10 radial basis functions, the adaptive-center RBF network provided the best model, with a cost function output of 3.7823x10^-7. Data for this comparison is given in the appendix.
Polynomial Function
[Figure 11: Adaptive-center RBF network for the polynomial function with 6 radial basis functions (test samples, approximated curve, train samples, radial basis centers)]
The adaptive-center RBF network was clearly the winner in approximating the polynomial function. I ran it a number of times, but I stopped at 6 radial basis functions, as the cost function gave me an output of 4.1883x10^-12. The cost function results were too small for Excel to plot on the chart; however, the relevant data can be found in the appendix.
Conclusion of Results
Depending on the application, RBF networks can gain a lot by adapting the positions of the centers of the radial-basis functions. For example, in speech recognition it was found that when a minimal network was required, it was beneficial to use an RBF network with nonlinear optimization of the parameters defining the hidden-layer activation functions. However, it was also true that a bigger RBF network with more fixed centers could attain similar performance.
From my results, I can say that an RBF network with adaptive centers can perform a little better than a fixed-center RBF network. If fewer radial basis functions are required, then the RBF network with adaptive centers would probably work best. However, an RBF with fixed centers may prove more useful in other cases. With respect to my adaptive-center RBF network program, the fixed-center RBF network computed results faster; my program took longer since it had to update each individual weight, cluster-center vector, and inverse covariance matrix. I also spent a lot of time adjusting the eta values in the adaptive-center model to prevent infinite values, which was a major advantage of the fixed-center RBF network. To optimize the adaptive RBF network program, I would probably have to implement it using matrix and vector operations instead of loops. In conclusion, both RBF network models are important, and one cannot rightly say that a particular model is better unless the situation is known.
I learned a lot from programming the adaptive-center RBF network. Although the programming was not very difficult, I had to understand the equations of the supervised selection of centers algorithm. This took some time, since I sometimes received outputs with incorrect dimensions (e.g., matrices instead of vectors). The project gave me a chance to appreciate the beauty of neural networks, and I enjoyed completing it.
APPENDIX
Manual for rbn_adaptive.m
%
% rbn_adaptive.m - RBF demonstration program of Supervised Selection of
% Centers
% Based on RBNdemo by Dr. Yu Hen Hu
% calls fungenf.m, cinit.m, gauss.m, kmeansf.m
%
% Data points in matrix x (n by k)
% cluster centers in matrix t (v by m)
%
% n: number of samples
% v: size of t
% k, m: dimension of feature space
% c: number of radial basis functions used
% spread of center - spread matrix
% G - Green's matrix
% Specify:
% eta1, eta2, eta3
%
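The script prompts for its inputs interactively. As an illustration only, a session might look like the following; the prompts are inferred from the fixed-center script rbn_fixed_selfgen.m listed later and from the sample sizes used in the appendix data, so the exact wording may differ:

% Hypothetical MATLAB session for rbn_adaptive.m (illustrative only):
% >> rbn_adaptive
% # of training samples = 20
% # of testing samples = 50
% 1. Sinusoids, 2. piecewise linear, or 3. polynomial. Enter choice: 3
% number of radial basis functions used: 6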
%Calculation of Positions of centers (hidden layer)
for j=1:n
    g=exp(-0.5*((x(j,:)-t(i,:)))*covinv(:,:,i)*((x(j,:)-t(i,:))'));
    postdiff = postdiff + (e(j)*g*covinv(:,:,i)*(x(j,:)-t(i,:))');
end
t(i,:)=t(i,:)-(eta2*2*w(i)*postdiff)'; %1 x m update of center i

[c,n]=size(mint);
% note that sigma is n by n by c
% fhat=w(1)*ones(size([x;y]));
% plot the approximation using the saved parameters mint, minw, mincovinv
fhat=gauss([x;y],mint,mincovinv)*minw;
fd=gauss(mint,mint,mincovinv)*minw;
figure(2),%subplot(122)
plot(y,yd,'ob',[x;y],fhat,'+b',x,d,'.r',mint,fd,'dr'),
legend('test samples','approximated curve','train samples','radial basis',0)
title('RBF Network with Adaptive Centers');
end
Manual for rbn_fixed_selfgen.m
%
% rbn_fixed_selfgen.m - RBF network with fixed centers (self-generated data)
%
clear all,
close all;
% generate 2D data trainf, testf
Nr=input('# of training samples = ');
Nt=input('# of testing samples = ');
% generate the training and testing data samples
funtype=input('1. Sinusoids, 2. piecewise linear, or 3. polynomial. Enter choice: ');
switch funtype
case 1 % a sinusoidal signal is to be generated
tp=[.7 -.2]; % y = cos(4*pi*0.7*x + (-.2))
case 2 % piecewise linear function
tp=[-.5 0 -.1 .2 .1 .2 .3 1 .5 0];
case 3 % polynomial specified by roots
tp=[2 -.3 0 0.2];
end
xgen=0;
% only regularly spaced data samples are generated
xorder=2; % training and testing data are evenly interlaced
[trainf,testf]=fungenf(Nr,Nt,xgen,funtype,tp,xorder);
x=trainf(:,1); d=trainf(:,2);
xmean=mean(x); % xmean is 1 by n
y=testf(:,1); yd=testf(:,2);
[k,n]=size(x); % k: # of samples, n: dim of feature space
for type=2:2,
% determine radial basis centers and cluster numbers
if type==1,
xi=x; c=k;
elseif type==2;
% decide # of radial basis functions
%figure(1),subplot(122),plot(x,d,'o'),axis square,drawnow
c=input('number of radial basis functions used: ');
xi=cinit(x,2,c); % spread initial cluster center over entire range
xi=kmeansf(x,xi,.0001,50);
end
% find weights w, and approximated curve fhat
if type==1,
lambda=input('smoothing parameter, lambda (>=0) = ');
elseif type==2,
lambda=0;
[w,xi,sigma, G, G0]=rbn(x,d,xi,lambda,2);
% the rbn.m routine may change the # of clusters!
[c,n]=size(xi);
% note that sigma is n by n by c
% fhat=w(1)*ones(size([x;y]));
fhat=gauss([x;y],xi,sigma)*w;
fd=gauss(xi,xi,sigma)*w;
figure(1),%subplot(122)
plot(y,yd,'ob',[x;y],fhat,'+b',x,d,'.r',xi,fd,'dr'),
legend('test samples','approximated curve','train samples','radial basis',0)
title('RBN with fixed centers')
%Cost function added to evaluate the RBF Network with Fixed Centers
costd=[d;yd];
e=costd-fhat;
cost=0;
for j=1:length(e)   %sum the squared errors over all training and test samples
cost=cost+e(j)^2;
end
%Actual cost function
cost=0.5*cost
end
end
Derivation of Partial Derivatives (Adaptive RBF Network)

Linear Weights Partial Derivative Term
E = \frac{1}{2} \sum_{j=1}^{N} e_j^2, \qquad e_j = d_j - F^*(x_j)

where

F^*(x_j) = \sum_{i=1}^{M} w_i \, G(\lVert x_j - t_i \rVert_{C_i})

Differentiating the cost function with respect to the linear weight w_i leaves only the Green's function term, so

\frac{\partial E(n)}{\partial w_i(n)} = \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i \rVert_{C_i})

Positions of Centers Partial Derivative Term (hidden layer)

\frac{\partial E(n)}{\partial t_i} = \sum_{j=1}^{N} e_j \frac{\partial e_j}{\partial t_i}, \qquad \frac{\partial e_j}{\partial t_i} = -w_i \frac{\partial}{\partial t_i} G(\lVert x_j - t_i \rVert_{C_i})

Differentiating the Green's function with respect to t_i, using

\lVert x_j - t_i \rVert_{C_i}^2 = (x_j - t_i(n))^T \Sigma_i^{-1} (x_j - t_i(n))

produces the factor \Sigma_i^{-1}[x_j - t_i(n)]. Therefore,

\frac{\partial E(n)}{\partial t_i(n)} = 2 w_i(n) \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i \rVert_{C_i}) \, \Sigma_i^{-1} [x_j - t_i(n)]
Spreads of Centers Partial Derivative Term (hidden layer)
\frac{\partial E(n)}{\partial \Sigma_i^{-1}} = \frac{1}{2} \sum_{j=1}^{N} 2 e_j \frac{\partial e_j}{\partial \Sigma_i^{-1}} = \sum_{j=1}^{N} e_j \frac{\partial e_j}{\partial \Sigma_i^{-1}}, \qquad \frac{\partial e_j}{\partial \Sigma_i^{-1}} = -w_i \frac{\partial}{\partial \Sigma_i^{-1}} G(\lVert x_j - t_i \rVert_{C_i})

where

\frac{\partial}{\partial \Sigma_i^{-1}} \lVert x_j - t_i \rVert_{C_i}^2 = [x_j - t_i(n)][x_j - t_i(n)]^T

Therefore,

\frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)} = -w_i(n) \sum_{j=1}^{N} e_j(n) \, G(\lVert x_j - t_i \rVert_{C_i}) \, [x_j - t_i(n)][x_j - t_i(n)]^T
Excel Spreadsheet Data for Sinusoidal, Polynomial, & Piecewise Linear Functions
Polynomial Function
# of Training Samples - 20
# of Testing Samples - 50
Eta parameters were changed a few times to prevent convergence at local minima.
Usually, eta1=eta2=eta3=0.000001
Cost Function Outputs
Piecewise-Linear Function
# of Training Samples - 10
# of Testing Samples - 40
Eta parameters were changed a few times to prevent convergence at local minima.
Usually, eta1=eta2=eta3=0.000001
Cost Function Outputs
References
[1] Haykin, S., Neural Networks: A Comprehensive Foundation, New Jersey, Prentice Hall, 1994.
[2] Hu, Yu Hen, Introduction to Neural Networks and Fuzzy Systems. Retrieved October 15, 2003, from http://www.cae.wisc.edu/~ece539
[3] Mehrotra, K., Mohan, C., and Ranka, S., Elements of Artificial Neural Networks, Cambridge, The MIT Press, 1997.
[4] Orr, Mark, Radial Basis Function Networks, www.anc.ed.ac.uk/~mjo, Edinburgh University, Edinburgh, Scotland, February 2000.
[5] Mathworks, Radial Basis Functions. Retrieved November 25, 2003, from www.mathworks.com
[6] University of Tubingen, Radial Basis Functions (RBFs). Retrieved November 30, 2003, from http://www-ra.informatik.uni-tuebingen.de/SNNS/UserManual/node182.html