Lecture 2
[Figure: a biological neuron with its soma, dendrites, and axon labeled.]
Modeling the single neuron
Learning in simple neurons
If we have two groups of objects, one group of several written A's and the other of B's, we may want our neuron to tell the A's from the B's, as in the figure.
We want it to output a 1 when an A is presented and a 0 when it sees a B.
Biology analogy
Biological    Artificial
Soma          Node/neuron
Dendrites     Input
Axon          Output
Synapse       Weight
The perceptron
The simplest kind of neural network is a single-layer
perceptron network, which consists of a single layer
of output nodes; the inputs are fed directly to the
outputs via a series of weights. The sum of the
products of the weights and the inputs is calculated in
each node, and if the value is above some threshold
the neuron fires and takes the activated value;
otherwise it takes the deactivated value.
Neurons with this kind of activation function are also
called artificial neurons or linear threshold units.
In the literature the term perceptron often refers to
networks consisting of just one of these units.
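As an illustration of that weighted-sum-and-threshold computation, here is a minimal sketch of a single linear threshold unit; the inputs, weights, and threshold are made-up values, not from the lecture:

```python
# Minimal sketch of a linear threshold unit; all values are illustrative.

def threshold_unit(inputs, weights, threshold):
    """Fire (output 1) if the weighted sum of the inputs exceeds the threshold."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > threshold else 0

# Example with two inputs and hypothetical weights: 1.0*0.7 + 0.5*(-0.2) = 0.6.
print(threshold_unit([1.0, 0.5], [0.7, -0.2], threshold=0.4))  # -> 1
```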
The perceptron (cont’d)
The perceptron is an algorithm for learning a binary classifier called a threshold function: a function that maps its input x (a real-valued vector) to an output value f(x) (a single binary value):

f(x) = 1 if w · x + b > 0, and f(x) = 0 otherwise,

where w is the weight vector and b is a bias term.
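Since the slides describe the perceptron as a learning algorithm, a hedged sketch of the classic update rule w <- w + eta (target - output) x may help; the AND data set and learning rate below are illustrative assumptions, not from the lecture:

```python
# Sketch of the perceptron learning rule; data and parameters are illustrative.

def train_perceptron(samples, epochs=10, eta=0.1):
    n_inputs = len(samples[0][0])
    w = [0.0] * (n_inputs + 1)            # w[0] acts as the bias weight
    for _ in range(epochs):
        for x, target in samples:
            xs = [1.0] + list(x)          # constant 1 input for the bias
            out = 1 if sum(wi * xi for wi, xi in zip(w, xs)) > 0 else 0
            for i in range(len(w)):       # no change when output == target
                w[i] += eta * (target - out) * xs[i]
    return w

# Illustrative example: learn logical AND, which is linearly separable.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(data))
```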
[Figure: a multilayer perceptron with an input layer, a hidden layer, and an output layer of processing elements (PE); each PE forms a weighted sum (S) and passes it through a transfer function (f) to produce an output such as Y1.]
MLP processing
[Figure: (a) a single neuron with inputs x1, x2; (b) a layer of multiple neurons. PE: Processing Element (or neuron).]

(a) Single neuron:
    Y = X1 W1 + X2 W2

(b) Multiple neurons:
    Y1 = X1 W11 + X2 W21
    Y2 = X1 W12 + X2 W22
    Y3 = X2 W23
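The same computations can be written in matrix form, as in this small sketch; the input and weight values are made up for illustration, and the missing X1 -> Y3 connection is encoded as a zero weight:

```python
import numpy as np

# Forward pass for the multiple-neuron case: Y = X @ W, where column j of W
# holds the weights feeding output Yj. All numbers are illustrative.
X = np.array([3.0, 1.0])            # hypothetical inputs X1, X2
W = np.array([[0.2, 0.5, 0.0],      # W11, W12, and no X1 -> Y3 link
              [0.4, 0.1, 0.3]])     # W21, W22, W23
Y = X @ W                           # Y1 = X1*W11 + X2*W21, etc.
print(Y)                            # -> [1.  1.6 0.3]
```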
MLP processing (cont’d)
[Figure: numeric example of the forward computation with sample input values (e.g., X3 = 2).]
Designing the MLP
Before training can begin, the user must decide on the network topology by specifying (see the sketch after this list):
- the number of units in the input layer,
- the number of hidden layers (if more than one),
- the number of units in each hidden layer, and
- the number of units in the output layer.
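As a minimal sketch of what such a specification amounts to (the layer sizes are hypothetical, not from the slides), each pair of consecutive layers determines the shape of one weight matrix:

```python
import numpy as np

# Hypothetical topology: 4 input units, one hidden layer of 3 units, 2 outputs.
layer_sizes = [4, 3, 2]

# One randomly initialized weight matrix per pair of consecutive layers.
rng = np.random.default_rng(seed=0)
weights = [rng.standard_normal((m, n))
           for m, n in zip(layer_sizes, layer_sizes[1:])]
print([w.shape for w in weights])   # -> [(4, 3), (3, 2)]
```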
Normalizing the input values (between 0.0 and 1.0) for each attribute measured in the training tuples will help speed up the learning phase and reduce the risk of exploding gradients.
Discrete-valued attributes may be encoded such
that there is one input unit per domain value.
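Both preprocessing steps can be sketched in a few lines; the attribute values here are invented for illustration:

```python
# Min-max normalization of a continuous attribute into [0.0, 1.0].
def normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Encoding a discrete attribute with one input unit per domain value.
def one_hot(value, domain):
    return [1.0 if value == d else 0.0 for d in domain]

print(normalize([18, 35, 52, 70]))               # -> [0.0, ~0.33, ~0.65, 1.0]
print(one_hot("red", ["red", "green", "blue"]))  # -> [1.0, 0.0, 0.0]
```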
Choice of the transfer function
Common transformation (transfer) functions:
- Linear function
- Sigmoid (logistic activation) function, with range [0, 1]
- Hyperbolic tangent function, with range [-1, 1]
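Minimal sketches of the three transfer functions listed above, using their standard definitions:

```python
import math

def linear(s):
    """Identity transfer: the output is the weighted sum itself."""
    return s

def sigmoid(s):
    """Logistic activation: squashes the sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def tanh(s):
    """Hyperbolic tangent: squashes the sum into (-1, 1)."""
    return math.tanh(s)

print(linear(0.5), sigmoid(0.0), tanh(0.0))  # -> 0.5 0.5 0.0
```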
MLP: Design issues