Module 1: Introduction to Soft Computing and Neural Network
Dr. Bhakti Palkar
Content
1.1 Concept of computing systems, "Soft" computing versus "Hard"
computing, Characteristics of Soft computing, Some applications of Soft
computing techniques.
1.2 Biological neurons and its working, ANN – Terminologies, Basic Models,
Linearly and non-linearly separable classification, McCulloch Pitts Neuron
Model
Concept of Computing
Important Characteristics of Computing
1. It should provide a precise solution.
Hard Computing
In 1996, Lotfi A. Zadeh (L. A. Zadeh) introduced the term hard computing.
Examples of Hard Computing
Example: solving numerical problems (e.g., systems of equations), searching and sorting — tasks with precisely stated requirements and exact, provable solutions.
Soft Computing
• The term soft computing was proposed by the inventor of fuzzy logic, Lotfi A. Zadeh. He describes it as follows.
Definition: Soft computing is a collection of methodologies that exploit the tolerance for imprecision, uncertainty, partial truth and approximation to achieve tractability, robustness and low solution cost.
Soft Computing
● It does not require an exact mathematical model of the problem being solved.
● It may not yield a precise solution.
● Its algorithms are adaptive (i.e. they can adjust to changes in a dynamic environment).
● It uses biologically inspired methodologies such as genetics, evolution, ant behaviour, particle swarming and the human nervous system.
Soft computing
● Soft Computing is the fusion of methodologies designed to model and enable solutions to real-world problems which are not modelled, or are too difficult to model, mathematically.
● Soft Computing is a term used in computer science to refer to problems whose solutions are unpredictable, uncertain and lie between 0 and 1.
Goals of soft computing
● To develop intelligent machines that provide solutions to real-world problems which are not modelled, or are too difficult to model, mathematically.
● Well suited for real-world problems where ideal solutions do not exist.
● To exploit the tolerance for approximation, uncertainty, imprecision and partial truth in order to achieve close resemblance to human-like decision making.
contd..
● Imprecision: the model features (quantities) are not the same as the real ones, but close to them.
● Uncertainty: we are not sure that the features of the model are the same as those of the entity (belief).
● Approximate reasoning: the model features are similar to the real ones, but not the same.
● The guiding principle of soft computing is to exploit this tolerance to achieve tractability, robustness and low solution cost.
● The role model for soft computing is the human mind.
Applications of SC
[Figure: soft computing at the centre, surrounded by its application areas]
● Handwritten script recognition
● Decision support systems
● Soft-computing-based system architecture
● Automotive systems and manufacturing
● Power system analysis
● Image processing and data compression
● Bioinformatics
● Investment and trading
Difference
Sr. No. | Hard Computing | Soft Computing
1 | It requires a precisely stated analytic model | SC techniques are tolerant of imprecision, uncertainty, partial truth and approximation
5 | It requires exact input data | It can deal with ambiguous and noisy data
Neural Network
DARPA Neural Network Study (1988, AFCEA International Press, p. 60): "...a neural network is a system composed of many simple processing elements operating in parallel whose function is determined by network structure, connection strengths, and the processing performed at computing elements or nodes."
Neural Network
According to Haykin (1994), p. 2: a neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
1. Knowledge is acquired by the network through a learning process.
2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.
Neural Network
According to Nigrin (1993), p. 11:
Artificial neural systems, or neural networks, are physical cellular systems which can
acquire, store and utilize experiential knowledge.
Multidisciplinary view of neural network
Fuzzy Logic
● It was developed in 1965 by Professor Lotfi Zadeh at the University of California, Berkeley. The first application was to perform computer data processing based on natural values.
● In simpler words, a fuzzy logic state can be 0, 1 or any value in between, e.g. 0.17 or 0.54.
● For example, in Boolean logic we may say a glass of hot water (i.e. 1 or high) or a glass of cold water (i.e. 0 or low), but in fuzzy logic we may also say a glass of warm water (neither hot nor cold).
Advantages of Fuzzy Logic
1. A fuzzy logic system is flexible and allows modification of the rules.
2. Even imprecise, distorted and erroneous input information is accepted by the system.
3. The systems can be constructed easily.
4. Since these systems emulate human reasoning and decision making, they are useful in providing solutions to complex problems in different types of applications.
Applications of Fuzzy Logic
● Fuzzy logic systems can be used in automotive systems, for applications like four-wheel steering, automatic gearboxes etc.
● Applications in the field of domestic appliances include microwave ovens, air conditioners, washing machines, televisions, refrigerators, vacuum cleaners etc.
● Other applications include hi-fi systems, photocopiers, humidifiers etc.
Genetic Algorithm
• A genetic algorithm is an adaptive heuristic search algorithm inspired by Darwin's theory of natural evolution: the fittest individuals of a population are selected for reproduction, producing new children, and the process is repeated over various generations.
Genetic Algorithm
● In this way we keep “evolving” better individuals or solutions over generations, till
we reach a stopping criterion.
Applications of Genetic Algorithms
● Data mining and clustering.
● Image processing.
● Wireless sensor network.
● Traveling salesman problem (TSP)
● Vehicle routing problems.
● Mechanical engineering design.
● Manufacturing system.
Hybrid Computing
It is a combination of conventional hard computing and emerging soft computing.
Hybrid Systems
• Hybrid systems enable one to combine various soft computing paradigms and arrive at the best solution. The three major hybrid systems are as follows:
1. Neuro-fuzzy hybrid systems
2. Neuro-genetic hybrid systems
3. Fuzzy-genetic hybrid systems
Neural Network
● The biological nervous system is the most important part of many living things, in particular human beings.
● The brain lies at the centre of the human nervous system.
● In fact, any biological nervous system consists of a large number of interconnected processing units called neurons.
● Each neuron is approximately 10 µm long, and neurons can operate in parallel.
● Typically, a human brain consists of approximately 10^11 neurons communicating with each other with the help of electrical impulses.
Neuron: Basic unit of the nervous system
Neuron and its working
• Dendrite (Input): receives signals from
other neurons
Neuron and its working
• Synapse (weighted connection): the point of contact between the axon of one neuron and the dendrite of another; the strength of the connection (synaptic strength) determines how strongly a signal is passed on.
An artificial neuron
Artificial Neurons
[Figure: the four basic components of a human biological neuron (left) and the components of a basic artificial neuron (right)]
Terminology Relation Between Biological And
Artificial Neuron
Biological Neuron | Artificial Neuron
Cell | Neuron
Axon | Output
Model Of A Neuron
[Figure: inputs X1, X2, X3 with weights Wa, Wb, Wc are summed into the net input µ, and the activation f(µ) gives the output Y. A short Python sketch of this computation follows.]
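A minimal Python sketch of this model (the specific weights and the step activation are illustrative assumptions; the slide does not fix a particular f):

```python
# Model of a neuron: net input µ = sum of weighted inputs, output Y = f(µ).

def neuron(inputs, weights, f):
    mu = sum(x * w for x, w in zip(inputs, weights))  # net input µ
    return f(mu)                                      # output Y = f(µ)

step = lambda mu: 1 if mu >= 0 else 0  # an example activation f (assumed)

# Inputs X1, X2, X3 with weights Wa, Wb, Wc, as in the figure.
print(neuron([1, 0, 1], [0.5, -0.3, 0.8], step))  # µ = 1.3, so output 1
```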
Evolution of neural networks
Year | Neural network | Designer | Description
1943 | McCulloch-Pitts neuron | McCulloch and Pitts | Arrangement of neurons is a combination of logic gates; its unique feature is the threshold
Contd…
Year | Neural network | Designer | Description
1972 | Kohonen self-organizing feature map | Kohonen | Inputs are clustered to obtain a fired output neuron
1982, 1984, 1985, 1986, 1987 | Hopfield network | John Hopfield and Tank | Based on fixed weights; can act as associative memory nets
Basic models of ANN
● Models are based on three entities
○ The model’s synaptic interconnections.
○ The training or learning rules adopted for updating and adjusting the
connection weights.
○ Their activation functions
● The arrangement of neurons to form layers and the connection pattern formed
within and between layers is called the network architecture.
Five types of ANN
1. Single layer feed forward network
2. Multilayer feed-forward network
3. Single node with its own feedback
4. Single-layer recurrent network
5. Multilayer recurrent network
Single layer Feed- Forward Network
● A layer is formed by taking processing elements and combining them with other processing elements.
● The inputs and outputs are linked with each other.
● Inputs are connected to the processing nodes with various weights, resulting in a series of outputs, one per node.
Multilayer feed-forward network
● Formed by the interconnection of several layers.
● The input layer receives input and buffers the input signal.
● The output layer generates the output.
● A layer between input and output is called a hidden layer.
● Hidden layers are internal to the network.
● A network may have zero to several hidden layers.
● The more hidden layers, the greater the complexity of the network, but the more accurate the output that can be produced (a forward-pass sketch follows).
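A minimal sketch of such a forward pass in Python with one hidden layer (the layer sizes, random weights and sigmoid activation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # 3 inputs -> 4 hidden units
W_output = rng.normal(size=(4, 2))   # 4 hidden units -> 2 outputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 0.25])      # input layer buffers the input signal
h = sigmoid(x @ W_hidden)            # hidden layer, internal to the network
y = sigmoid(h @ W_output)            # output layer generates the output
print(y)
```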
Feed back network
● If no neuron in the output layer is an input to a node in the same layer or a preceding layer, the network is a feed-forward network.
● If outputs are directed back as inputs to processing elements in the same layer or a preceding layer, it is a feedback network.
● If the outputs are directed back to inputs of the same layer, it is lateral feedback.
● Recurrent networks are feedback networks with a closed loop.
Fig 2.8 (A): a simple recurrent neural network having a single neuron with feedback to itself.
Fig 2.9: a single-layer network with feedback, where the output can be directed back to the processing element itself, to other processing elements, or both.
Single-layer recurrent network
A processing element's output can be directed back to the processing element itself or to other processing elements in the same layer.
Multilayer recurrent network
A processing element's output can be directed back to the nodes in a preceding layer, forming a multilayer recurrent network.
Learning
The two broad kinds of learning in ANNs are:
i) Parameter learning: updates the connecting weights in a neural net.
ii) Structure learning: focuses on changes in the network structure.
Apart from these, learning in ANNs is classified into three categories:
i) supervised learning
ii) unsupervised learning
iii) reinforcement learning
Supervised learning
In supervised learning, each input vector requires a corresponding target vector, which represents the desired output.
The input vector along with the target vector is called a training pair.
The input vector results in an output vector.
The actual output vector is compared with the desired (target) output vector.
If there is a difference, an error signal is generated by the network.
The error signal is used to adjust the weights until the actual output matches the desired output (a sketch of this loop follows).
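A minimal sketch of this loop in Python (the linear unit, the delta-style update and the learning rate value are illustrative assumptions; the slide does not prescribe a particular update rule):

```python
# Supervised learning: compare actual output with the desired output,
# form an error signal, and adjust weights until they roughly match.

def train(pairs, alpha=0.1, epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in pairs:                  # (input, target) = training pair
            actual = sum(xi * wi for xi, wi in zip(x, w)) + b
            error = target - actual              # error signal
            w = [wi + alpha * error * xi for wi, xi in zip(w, x)]
            b += alpha * error                   # weight/bias adjustment
    return w, b

print(train([([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]))
```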
Unsupervised learning
● Learning is performed without the help of a teacher.
● Example: a tadpole learns to swim by itself.
● In an ANN, during the training process, the network receives input patterns and organizes them to form clusters.
● From the figure it is observed that no feedback is applied from the environment to inform the network what the outputs should be or whether they are correct.
● The network itself discovers patterns, regularities, features/categories from the input data and relations for the input data over the output.
● Exact clusters are formed by discovering similarities and dissimilarities, so this is called self-organizing.
Reinforcement learning
● Similar to supervised learning.
● For example, the network might be told that its actual output is only "50% correct" or so. Thus, only critic information is available, not the exact information.
● Learning based on critic information is called reinforcement learning, and the feedback sent is called the reinforcement signal.
● The network receives some feedback from the environment.
● The feedback is only evaluative.
The external reinforcement signals are processed in the critic signal generator, and the obtained critic signals are sent to the ANN for adjustment of the weights so as to get better critic feedback in future.
Activation functions
● To make the network's work more efficient and to obtain an exact output, some force or activation is applied.
● The activation function is applied over the net input to calculate the output of an ANN. Information processing of a processing element has two major parts: input and output. An integration function (f) is associated with the input of a processing element.
Activation functions
1. Identity function:
○ It is a linear function defined as f(x) = x for all x.
○ The output is the same as the input.
Activation functions
2. Binary step function:
It is defined as
f(x) = 1 if x >= θ; 0 if x < θ
where θ represents the threshold value.
Activation functions
3. Bipolar step function:
It is defined as
f(x) = 1 if x >= θ; -1 if x < θ
where θ represents the threshold value.
Activation functions
4. Sigmoid function: used in back-propagation nets.
a) Binary sigmoid function (logistic sigmoid function or unipolar sigmoid function): it is defined as
f(x) = 1 / (1 + e^(-λx))
where λ is the steepness parameter. Its range is (0, 1), and its derivative is
f'(x) = λ f(x)[1 - f(x)].
Activation functions
b) Bipolar sigmoid function: it is defined as
f(x) = 2 / (1 + e^(-λx)) - 1 = (1 - e^(-λx)) / (1 + e^(-λx))
Its range is (-1, 1), and its derivative is
f'(x) = (λ/2)[1 + f(x)][1 - f(x)].
Activation functions
5. Ramp function: it is defined as
f(x) = 1 if x > 1; x if 0 <= x <= 1; 0 if x < 0
(Python sketches of all the above functions follow.)
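The functions above, sketched in Python (theta is the threshold θ and lam the steepness parameter λ; the default values are assumptions):

```python
import math

def identity(x):
    return x                                   # f(x) = x

def binary_step(x, theta=0.0):
    return 1 if x >= theta else 0              # output in {0, 1}

def bipolar_step(x, theta=0.0):
    return 1 if x >= theta else -1             # output in {-1, 1}

def binary_sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + math.exp(-lam * x))    # range (0, 1)

def bipolar_sigmoid(x, lam=1.0):
    return 2.0 / (1.0 + math.exp(-lam * x)) - 1.0  # range (-1, 1)

def ramp(x):
    return 1 if x > 1 else (x if x >= 0 else 0)    # clipped to [0, 1]

for f in (identity, binary_step, bipolar_step,
          binary_sigmoid, bipolar_sigmoid, ramp):
    print(f.__name__, round(f(0.5), 4))
```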
Common models of neurons
Binary perceptrons
Continuous perceptrons
Weights
● Each neuron is connected to every other neuron by means of directed links.
● Links are associated with weights.
● Weights contain information about the input signal and are represented as a matrix.
● The weight matrix is also called the connection matrix.
Weight matrix
W = [w1^T; w2^T; w3^T; ... ; wn^T] =
[ w11 w12 w13 ... w1m ]
[ w21 w22 w23 ... w2m ]
[ ................... ]
[ wn1 wn2 wn3 ... wnm ]
Weights contd…
● wij is the weight from processing element "i" (source node) to processing element "j" (destination node).
[Figure: inputs X1 ... Xi ... Xn connect to the output unit Yj through weights w1j ... wij ... wnj, with bias bj.]
The net input to Yj is
y_inj = bj + sum(i=1..n) xi*wij = w0j + x1*w1j + x2*w2j + ... + xn*wnj = sum(i=0..n) xi*wij
where x0 = 1 and w0j = bj (a numerical sketch follows).
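A numerical sketch of this net-input formula (the input, weight and bias values are illustrative assumptions):

```python
import numpy as np

x   = np.array([1.0, 0.5, -0.5])   # inputs X1..Xn
w_j = np.array([0.2, -0.4, 0.6])   # weights w1j..wnj into unit Yj
b_j = 0.1                          # bias bj

y_in_j = b_j + x @ w_j             # y_inj = bj + sum_i xi*wij
print(y_in_j)                      # 0.1 + 0.2 - 0.2 - 0.3 = -0.2
```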
Bias
● Bias has an impact in calculating the net input.
● Bias is included by adding a component x0 = 1 to the input vector x.
[Figure: bias input 1 with weight b, and inputs X1, X2 with weights W1, W2, feeding output Y.]
● The net output is calculated by y_in = b + sum(i) xi*wi.
● The bias is of two types:
■ Positive bias: increases the net input.
■ Negative bias: decreases the net input.
Threshold
○ It is a set value based upon which the final output is calculated.
○ The calculated net input and the threshold are compared to get the network output.
○ The activation function based on the threshold θ is defined as
f(y_in) = 1 if y_in >= θ; -1 if y_in < θ
Learning rate
● Denoted by α.
● Used to control the amount of weight adjustment at each step of training
● Learning rate ranging from 0 to 1 determines the rate of learning in each
time step
Other terminologies
● Momentum factor:
○ used for convergence when momentum factor is added to weight updation process.
● Vigilance parameter:
○ Denoted by ρ
○ Used to control the degree of similarity required for patterns to be assigned to the same
cluster
Mcculloch-pitts neuron
● Proposed in 1943.
● Usually called the M-P neuron.
● M-P neurons are connected by directed weighted paths.
● The activation of an M-P neuron is binary, i.e. at any time step the neuron may fire or may not fire.
● Weights associated with the communication links may be excitatory (weights are positive) or inhibitory (weights are negative).
● The threshold plays a major role here. There is a fixed threshold for each neuron, and if the net input to the neuron is greater than the threshold then the neuron fires.
● M-P neurons are widely used for logic functions.
● A simple M-P neuron is shown in the figure.
● Each connection is either excitatory, with weight w (w > 0), or inhibitory, with weight -p (p > 0).
● In the figure, inputs x1 to xn possess excitatory weighted connections and inputs xn+1 to xn+m have inhibitory weighted interconnections.
● Since the firing of the neuron is based on a threshold θ, the activation function is defined as
f(y_in) = 1 if y_in >= θ; 0 if y_in < θ
Mcculloch-pitts neuron
● For inhibition to be absolute, the threshold with the activation function should satisfy the following condition:
θ > nw - p
● The output will fire if it receives k or more excitatory inputs but no inhibitory inputs, where
kw >= θ > (k - 1)w
- The M-P neuron has no particular training algorithm.
- An analysis is performed to determine the weights and the threshold.
- It is used as a building block where any function or phenomenon is modelled based on a logic function (a sketch for AND and OR follows).
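A minimal sketch of an M-P neuron in Python. The weights and thresholds for AND and OR follow the analysis above with w = 1: AND fires only when both excitatory inputs arrive (θ = 2), OR when at least one does (θ = 1).

```python
def mp_neuron(inputs, weights, theta):
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= theta else 0   # fires iff net input reaches threshold

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              "AND:", mp_neuron([x1, x2], [1, 1], theta=2),
              "OR:",  mp_neuron([x1, x2], [1, 1], theta=1))
```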
Geometrical interpretation of the OR function MP-neuron
Geometrical interpretation of the AND function MP-neuron
Geometrical interpretation of the Tautology MP-neuron
Geometrical interpretation of the OR function MP-neuron with 3 inputs
XOR Problem
Thus..
• A single McCulloch-Pitts neuron can be used to represent boolean functions which are linearly separable.
• Linear separability (for boolean functions): there exists a line (plane) such that all inputs which produce a 1 lie on one side of the line (plane) and all inputs which produce a 0 lie on the other side of the line (plane), as the brute-force check below illustrates.
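A brute-force check of this claim (the integer search range is an illustrative assumption): a single threshold unit is found for OR, but no weight/threshold combination in the range reproduces XOR.

```python
from itertools import product

def representable(truth_table):
    # Search integer weights and thresholds for a single threshold unit.
    for w1, w2, theta in product(range(-3, 4), repeat=3):
        if all((1 if w1 * x1 + w2 * x2 >= theta else 0) == t
               for (x1, x2), t in truth_table.items()):
            return (w1, w2, theta)
    return None

OR  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print("OR :", representable(OR))    # some (w1, w2, theta) is found
print("XOR:", representable(XOR))   # None: XOR is not linearly separable
```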
Limitation of MP-neuron
• What about non-boolean (say, real) inputs?
• Do we always need to hand code the threshold?
• Are all inputs equal? What if we want to assign
more importance to some inputs?
• What about functions which are not linearly
separable? Say XOR function.
Linear separability
● It is a concept wherein the separation of the input space into regions is based on whether the network response is positive or negative.
● A decision line is drawn to separate the positive and negative responses.
● The decision line is also called the decision-making line, decision-support line or linearly separable line.
● The net input calculation to the output unit is given as
y_in = b + sum(i) xi*wi
Contd..
• The linear separability of the network is based on the decision-boundary line.
• If there exist weights (and a bias) for which all training input vectors having a positive response (+ve) lie on one side of the line and all those having a negative response (-ve) lie on the other side, we conclude that the problem is linearly separable.
Contd.. Net input of the network:
y_in = b + x1*w1 + x2*w2
The separating line, for which the boundary lies between the values of x1 and x2, is
b + x1*w1 + x2*w2 = 0
If the weight w2 is not equal to 0, then we get
x2 = -(w1/w2)*x1 - b/w2
The net requirement for a positive response is
b + x1*w1 + x2*w2 > 0
During the training process, w1, w2 and b are determined from the training data (a sketch of this boundary computation follows).
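A small sketch of this boundary computation (the weight and bias values are illustrative assumptions; with w1 = w2 = b = 1 the line is x2 = -x1 - 1, the same line derived for the OR function on the next slides):

```python
w1, w2, b = 1.0, 1.0, 1.0              # assumed trained values (w2 != 0)

def x2_on_line(x1):
    return -(w1 / w2) * x1 - b / w2    # from b + x1*w1 + x2*w2 = 0

def net_input(x1, x2):
    return b + x1 * w1 + x2 * w2       # > 0 means positive response

print(x2_on_line(0.0))                 # -1.0: line passes through (0, -1)
print(net_input(1.0, 1.0) > 0)         # True: (1, 1) gets a positive response
```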
Contd..
• Consider a network having a positive response in the first quadrant and a negative response in all other quadrants, with either binary or bipolar data.
• A decision line is drawn separating the two regions as shown in the figure.
• Using bipolar data representation, missing data can be distinguished from mistaken data; hence bipolar data is better than binary data.
• Missing values are represented by 0 and mistakes by reversing the input values from +1 to -1 or vice versa.
Example: Using the linear separability concept, obtain the response for the OR function (take bipolar inputs and bipolar targets).
Solution:
Assuming the coordinates (-1, 0) and (0, -1) as (x1, y1) and (x2, y2), the slope "m" of the straight line can be obtained as
m = (y2 - y1) / (x2 - x1) = (-1 - 0) / (0 - (-1)) = -1
We now calculate c:
c = y1 - m*x1 = 0 - (-1)(-1) = -1
Thus, using y = mx + c, the separating line is x2 = -x1 - 1.
Flowchart
Steps:
○ Step 0: First initialize the weights.
○ Step 1: Steps 2-4 have to be performed for each input training vector and target output pair, s : t.
○ Step 2: Input activations are set. The activation function for the input layer is the identity function:
xi = si for i = 1 to n
○ Step 3: Output activations are set: y = t.
○ Step 4: Weight adjustments and bias adjustments are performed:
wi(new) = wi(old) + xi*y
b(new) = b(old) + y
○ In Step 4, the weight updation formula can be written in vector form as
w(new) = w(old) + xy
○ The change in weight is expressed as
Δw = xy
Hence, w(new) = w(old) + Δw (a compact sketch of these steps follows).
The Hebb rule is used for pattern association, pattern categorization, pattern classification and over a range of other areas.
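The steps above as a compact Python function (a sketch; the variable names are assumptions):

```python
def hebb_train(training_pairs, n):
    w = [0.0] * n                      # Step 0: initialize weights
    b = 0.0
    for s, t in training_pairs:        # Step 1: for each pair s : t
        x = list(s)                    # Step 2: xi = si (identity activation)
        y = t                          # Step 3: output activation y = t
        w = [wi + xi * y for wi, xi in zip(w, x)]  # Step 4: w(new) = w(old) + x*y
        b = b + y                      #          b(new) = b(old) + y
    return w, b
```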
Design a Hebb net to implement OR function
Use bipolar data in place of binary data. Initially the weights and bias are set to zero: w1 = w2 = b = 0.

Inputs (X1 X2 b) | Target y
1 1 1 | 1
1 -1 1 | 1
-1 1 1 | 1
-1 -1 1 | -1
• First input [x1 x2 b] = [1 1 1] and target = 1 (i.e. y = 1). Setting the initial weights as the old weights and applying the Hebb rule, we get
wi(new) = wi(old) + xi*y
w1(new) = w1(old) + x1*y = 0 + 1*1 = 1
w2(new) = w2(old) + x2*y = 0 + 1*1 = 1
b(new) = b(old) + y = 0 + 1 = 1
The weight change here is Δwi = xi*y:
Δw1 = x1*y = 1
Δw2 = x2*y = 1
Δb = y = 1
• Similarly, for the second input [x1 x2 b] = [1 -1 1] and y = 1:
w1(new) = w1(old) + x1*y = 1 + 1*1 = 2, so Δw1 = 1
w2(new) = w2(old) + x2*y = 1 + (-1)*1 = 0, so Δw2 = -1
b(new) = b(old) + y = 1 + 1 = 2, so Δb = 1
• The third and fourth inputs are handled in the same way, giving the table below.
Inputs (X1 X2 b) | y | Weight changes (ΔW1 ΔW2 Δb) | Weights (W1 W2 b)
1 1 1 | 1 | 1 1 1 | 1 1 1
1 -1 1 | 1 | 1 -1 1 | 2 0 2
-1 1 1 | 1 | -1 1 1 | 1 1 3
-1 -1 1 | -1 | 1 1 -1 | 2 2 2
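A self-contained sketch that reproduces the table above for the OR function:

```python
samples = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]

w1 = w2 = b = 0
for (x1, x2), y in samples:
    w1 += x1 * y                       # w1(new) = w1(old) + x1*y
    w2 += x2 * y                       # w2(new) = w2(old) + x2*y
    b  += y                            # b(new)  = b(old)  + y
    print(f"x=({x1:2},{x2:2}) y={y:2} -> w1={w1}, w2={w2}, b={b}")
# Final weights w1=2, w2=2, b=2 match the last row of the table.
```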
Applications of NNs
● Classification
In marketing: consumer spending pattern classification
In defence: radar and sonar image classification
In agriculture & fishing: fruit and catch grading
In medicine: ultrasound and electrocardiogram image classification, EEGs, medical diagnosis
● Recognition and identification
In general computing and telecommunications: speech, vision and handwriting recognition
In finance: signature verification and bank note verification
● Assessment
In engineering: product inspection monitoring and control
In defence: target tracking
In security: motion detection, surveillance image analysis and fingerprint matching
● Forecasting and prediction
In finance: foreign exchange rate and stock market forecasting
In agriculture: crop yield forecasting
In marketing: sales forecasting
In meteorology: weather prediction
Engine Management
• The behaviour of a car engine is influenced by a large number of
parameters
– temperature at various points
– fuel/air mixture
– lubricant viscosity.
• Major companies have used neural networks to dynamically tune
an engine depending on current settings.
Signature Recognition
• Each person's signature is different.
• There are structural similarities which are difficult to quantify.
• One company has manufactured a machine which recognizes
signatures to within a high level of accuracy.
– Considers speed in addition to gross shape.
– Makes forgery even more difficult.
Sonar Target Recognition
• Distinguish mines from rocks on the seabed.
• The neural network is provided with
a large number of parameters which
are extracted from the sonar signal.
• The training set consists of sets of
signals from rocks and mines.
Stock Market Prediction
• “Technical trading” refers to trading based solely on known statistical
parameters; e.g. previous price
• Neural networks have been used to attempt to predict changes in prices.
• Difficult to assess success since companies using these techniques are
reluctant to disclose information.
Classification of Neural Networks