Logistic regression
and neural networks
Machine Learning
Andrey Filchenkov
08.06.2016
Lecture plan
• Logistic regression
• Single-layer neural network
• Completeness problem of neural
networks
• Multilayer neural networks
• Backpropagation
• Modern neural networks
Derivative of the sigmoid:
$\sigma'(s) = \sigma(s)\,\sigma(-s)$.
Gradient step:
$w^{[t+1]} = w^{[t]} - \mu \nabla Q(w^{[t]}) = w^{[t]} + \mu \sum_{i=1}^{\ell} y_i x_i\, \sigma(-M_i(w))$.
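A minimal NumPy sketch of this gradient step, assuming an object matrix X of shape (ℓ, n), labels y in {−1, +1}, and an illustrative step size mu:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def logistic_gradient_step(w, X, y, mu=0.1):
    """w := w + mu * sum_i y_i x_i sigma(-M_i(w)),
    where M_i(w) = <w, x_i> y_i is the margin of object i."""
    M = (X @ w) * y                      # margins M_i(w)
    return w + mu * X.T @ (y * sigmoid(-M))
```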
Hebb's rule:
if $-\langle w, x_i\rangle y_i > 0$, then $w^{[t+1]} = w^{[t]} + \mu x_i y_i$.
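For comparison, a sketch of Hebb's rule as an online update under the same conventions (the helper name is illustrative); it corrects w only on objects with a negative margin:

```python
def hebb_step(w, x_i, y_i, mu=0.1):
    """If -<w, x_i> y_i > 0 (the object is misclassified), shift w towards it."""
    if (w @ x_i) * y_i < 0:
        w = w + mu * x_i * y_i
    return w
```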
Figure: the threshold loss $[M_i < 0]$ and its smoothed version $\sigma(-M_i)$.
Weka: Logistic
$a(x, T^\ell) = \sigma\left(\sum_{j=1}^{n} w_j f_j(x) - w_0\right)$,
Let $L(a, x_i) = \frac{1}{2}\big(\langle w, x_i\rangle - y_i\big)^2$.
The delta rule for weight learning: for each object $x_i$, update the weight vector
$w^{[t+1]} := w^{[t]} - \eta\,\big(\langle w, x_i\rangle - y_i\big)\,x_i$.
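A minimal sketch of one pass of the delta rule over a sample, assuming NumPy arrays X (ℓ×n) and real-valued targets y; the function name and learning rate eta are illustrative:

```python
import numpy as np

def delta_rule_epoch(w, X, y, eta=0.01):
    """For each object x_i: w := w - eta * (<w, x_i> - y_i) * x_i."""
    for x_i, y_i in zip(X, y):
        w = w - eta * ((w @ x_i) - y_i) * x_i
    return w
```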
Logical AND:
$x_1 \wedge x_2 = [x_1 + x_2 - 3/2 > 0]$
Logical OR:
$x_1 \vee x_2 = [x_1 + x_2 - 1/2 > 0]$
Logical NOT:
$\neg x_1 = [-x_1 + 1/2 > 0]$
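Each of these gates is a single threshold neuron over inputs from {0, 1}; a quick sanity check (helper names are illustrative):

```python
def step(s):
    """Threshold activation [s > 0]."""
    return 1 if s > 0 else 0

AND = lambda x1, x2: step(x1 + x2 - 1.5)
OR  = lambda x1, x2: step(x1 + x2 - 0.5)
NOT = lambda x1: step(-x1 + 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        assert AND(x1, x2) == min(x1, x2)
        assert OR(x1, x2) == max(x1, x2)
    assert NOT(x1) == 1 - x1
```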
Example (Minsky): $x_1 \oplus x_2$ (XOR) is not linearly separable, so no single neuron computes it.
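A two-layer composition of the gates above does compute XOR, e.g. $x_1 \oplus x_2 = [\,(x_1 \vee x_2) - (x_1 \wedge x_2) - 1/2 > 0\,]$; a self-contained sketch with illustrative names:

```python
def step(s):
    return 1 if s > 0 else 0

def XOR(x1, x2):
    """Hidden layer computes OR and AND; the output neuron
    fires when OR is on and AND is off."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)

assert [XOR(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))] == [0, 1, 1, 0]
```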
$w^{[t+1]} = w^{[t]} - \eta\,\nabla L(w^{[t]}, x_i, y_i)$,
$a^k(x) = \sigma\left(\sum_m w_{mk}\, u^m(x)\right);$
$u^m(x) = \sigma\left(\sum_j w_{jm}\, f_j(x)\right);$
Let $L_i(w) = \frac{1}{2}\sum_k \left(a^k(x_i) - y_i^k\right)^2$.
Find the partial derivatives
$\frac{\partial L_i(w)}{\partial a^k}$ and $\frac{\partial L_i(w)}{\partial u^m}$.
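A minimal forward pass of this two-layer network, assuming a NumPy feature vector x of length n, a hidden-layer weight matrix W1 of shape (n, M) and an output-layer matrix W2 of shape (M, K); all names are illustrative:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, W1, W2):
    """u^m(x) = sigma(sum_j w_jm f_j(x)); a^k(x) = sigma(sum_m w_mk u^m(x))."""
    u = sigmoid(x @ W1)   # hidden layer, shape (M,)
    a = sigmoid(u @ W2)   # output layer, shape (K,)
    return u, a
```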
Errors on layers
$\frac{\partial L_i(w)}{\partial a^k} = a^k(x_i) - y_i^k$;
$\varepsilon_i^k = a^k(x_i) - y_i^k$ is the error on the output layer.
$\frac{\partial L_i(w)}{\partial u^m} = \sum_k \left(a^k(x_i) - y_i^k\right)\sigma'_k\, w_{mk} = \sum_k \varepsilon_i^k\, \sigma'_k\, w_{mk}$;
$\varepsilon_i^m = \sum_k \varepsilon_i^k\, \sigma'_k\, w_{mk}$ is the error on the hidden layer.
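Combining the two errors gives one backpropagation step; a sketch under the same illustrative conventions as the forward pass above (eta is an assumed learning rate), using $\sigma'(s) = \sigma(s)\sigma(-s)$, i.e. $a(1-a)$ for a sigmoid output $a$:

```python
def backprop_step(x, y, W1, W2, eta=0.1):
    """One stochastic gradient step on L_i(w) = 1/2 sum_k (a^k(x_i) - y_i^k)^2."""
    u, a = forward(x, W1, W2)
    eps_out = a - y                      # errors on the output layer
    delta_out = eps_out * a * (1 - a)    # through the output sigmoid
    eps_hid = W2 @ delta_out             # errors on the hidden layer
    W2 -= eta * np.outer(u, delta_out)
    W1 -= eta * np.outer(x, eps_hid * u * (1 - u))
    return W1, W2
```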
Advantages:
• efficiency: the gradient is computed in time comparable to a single run of the network itself;
• easily applies to any activation σ and any loss $L_i$;
• suitable for online (dynamic) learning;
• the whole training sample need not be used at once;
• can be parallelized.
Disadvantages:
• does not always converge;
• can get stuck in local optima;
• the number of neurons in the hidden layer must be fixed in advance;
• the more connections, the more likely overfitting is;
• possible "paralysis" of a single neuron or of the whole network.