Artificial neural networks: optimization

Artificial neural networks (ANN) are computational models that learn from complex data using layers of interconnected neurons. The learning process involves adjusting synaptic weights to minimize a cost function, often using optimization algorithms like gradient descent, which has limitations that advanced techniques aim to address. Techniques such as momentum, learning rate scheduling, and adaptive learning rates enhance gradient descent, with optimizers like Adam and RMSprop providing effective solutions for training neural networks.


Artificial neural networks (ANN) are computational models inspired by the functioning of the human brain, capable of learning from complex, nonlinear data. A neural network is composed of elementary units called neurons, which receive input signals from other units or from external sources, process them via an activation function, and transmit the result as output to other units. Neurons are organized into layers, which can be of three types: input layer, hidden layer and output layer. The input layer receives the data to be analyzed, the hidden layers perform the processing operations, and the output layer returns the learning results. A neural network can have one or more hidden layers, depending on its complexity.
The learning process of a neural network consists of modifying the synaptic weights, i.e. the numerical values that regulate the strength of the connection between two neurons. The goal is to minimize a cost function, which measures the discrepancy between the desired output and the actual output of the network. To do this, optimization algorithms are used, which update the synaptic weights according to the gradient of the cost function with respect to the weights themselves. The gradient points in the direction of steepest ascent of the function, so the weights must be moved in the opposite direction to approach a minimum. An example of an optimization algorithm is gradient descent (GD), which computes the gradient over all the training data and updates the weights with a step proportional to the negative gradient.
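
As a concrete illustration, here is a minimal sketch of the batch gradient descent update in Python with NumPy. The quadratic cost, the synthetic data and the hyperparameter values are illustrative assumptions, not taken from the text above.

    import numpy as np

    # Toy least-squares cost: J(w) = ||X @ w - y||^2 / (2 * n).
    def gradient(w, X, y):
        n = X.shape[0]
        return X.T @ (X @ w - y) / n

    # Batch gradient descent: repeatedly step against the gradient, scaled by the learning rate.
    def gradient_descent(X, y, learning_rate=0.1, epochs=200):
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            w -= learning_rate * gradient(w, X, y)   # w <- w - eta * dJ/dw
        return w

    # Example on synthetic data: the recovered weights approach the true ones.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    print(gradient_descent(X, y))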
However, gradient descent has some disadvantages, including:

• The need to choose a fixed value for the learning rate, which can affect the speed and quality of convergence.

• Sensitivity to noise in the data, which can cause oscillations or deviations from the global minimum.

• Difficulty in handling non-convex functions or functions with many local minima.

For these reasons, more advanced optimization techniques have been developed, which are illustrated in the following paragraphs.

Techniques for improving gradient descent

In neural network optimization, there are several techniques aimed at improving gradient descent, the fundamental algorithm for training models. The goal is to make convergence faster and more stable, avoiding problems such as local minima and gradient instability.
Some of the most common techniques to improve gradient descent include:

• Momentum: adding a momentum term to the weight update helps the optimizer escape local minima and accelerates convergence. The momentum term accumulates past gradients and uses this historical information to influence the current update (see the momentum sketch after this list).

• Learning Rate Scheduling: consists of dynamically changing the learning rate during training. This can be done by gradually reducing the learning rate over the epochs, or in response to certain conditions, such as a plateau in the loss function. An example of a scheduling algorithm is ReduceLROnPlateau, which reduces the learning rate when the model stops improving (see the scheduling sketch after this list).

• Adaptive Learning Rate: this technique adjusts the learning rate based on the gradients calculated for each weight. For example, the AdaGrad algorithm adapts the learning rate of each weight based on its gradient history, reducing the rate for weights that have accumulated large gradients and vice versa (see the AdaGrad sketch after this list).

• Batch Normalization: a technique that normalizes the inputs of a layer over each training mini-batch, so that they have zero mean and unit standard deviation. This stabilizes the distribution of the data flowing through the network and accelerates convergence (see the batch normalization sketch after this list).

• Weight Initialization: a correct initialization of the neuron weights is essential for an effective gradient descent. A good practice is to choose random initial weights that satisfy certain properties, such as a variance scaled to the layer size and the breaking of symmetry between neurons (see the initialization sketch after this list).
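
The momentum sketch referenced above, in Python/NumPy; the toy gradient, learning rate and momentum coefficient are illustrative assumptions.

    import numpy as np

    def momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.9):
        # Accumulate an exponentially decaying sum of past gradients...
        velocity = momentum * velocity - learning_rate * grad
        # ...and move the weights along the accumulated direction.
        return w + velocity, velocity

    # Example: one step on a toy cost whose gradient is simply w.
    w = np.array([1.0, -2.0])
    velocity = np.zeros_like(w)
    w, velocity = momentum_step(w, velocity, grad=w)
    print(w, velocity)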
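
The scheduling sketch: a small plateau-based rule in plain Python, in the spirit of ReduceLROnPlateau. The factor, patience and minimum rate are illustrative assumptions; frameworks such as Keras and PyTorch provide their own implementations.

    def reduce_lr_on_plateau(lr, loss_history, factor=0.1, patience=5, min_lr=1e-6):
        # If the best loss of the last `patience` epochs is no better than the best
        # loss seen before them, shrink the learning rate by `factor`.
        if len(loss_history) > patience and min(loss_history[-patience:]) >= min(loss_history[:-patience]):
            lr = max(lr * factor, min_lr)
        return lr

    # Example: the loss has stalled at 0.59, so the rate drops from 0.1 to 0.01.
    losses = [1.0, 0.8, 0.6, 0.59, 0.59, 0.59, 0.59, 0.59, 0.59]
    print(reduce_lr_on_plateau(0.1, losses))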
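
The AdaGrad sketch, in Python/NumPy; epsilon is the usual small constant for numerical stability, and the hyperparameter values are illustrative.

    import numpy as np

    def adagrad_step(w, grad, grad_sq_sum, learning_rate=0.01, eps=1e-8):
        # Accumulate the squared gradients per weight...
        grad_sq_sum = grad_sq_sum + grad ** 2
        # ...so weights with a large gradient history get a smaller effective learning rate.
        w = w - learning_rate * grad / (np.sqrt(grad_sq_sum) + eps)
        return w, grad_sq_sum

    # Example: one step on a toy cost whose gradient is simply w.
    w = np.array([1.0, -2.0])
    w, grad_sq_sum = adagrad_step(w, grad=w, grad_sq_sum=np.zeros_like(w))
    print(w, grad_sq_sum)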
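
The batch normalization sketch, in Python/NumPy: the training-time forward pass with the usual learnable scale (gamma) and shift (beta) parameters; the running statistics used at inference time are omitted.

    import numpy as np

    def batch_norm_forward(x, gamma, beta, eps=1e-5):
        # x has shape (batch_size, features): normalize each feature over the batch...
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)
        # ...then rescale and shift with the learnable parameters.
        return gamma * x_hat + beta

    # Example: a small batch of activations ends up with zero mean and roughly unit deviation.
    x = np.array([[1.0, 200.0], [2.0, 220.0], [3.0, 180.0]])
    out = batch_norm_forward(x, gamma=np.ones(2), beta=np.zeros(2))
    print(out.mean(axis=0), out.std(axis=0))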
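
The initialization sketch, assuming the common Xavier/Glorot scaling as one example of variance-controlled initialization; the layer sizes are illustrative.

    import numpy as np

    def xavier_init(n_in, n_out, rng=None):
        rng = rng if rng is not None else np.random.default_rng(0)
        # Variance scaled to the layer size keeps activations from exploding or vanishing;
        # random, non-identical values also break the symmetry between neurons.
        limit = np.sqrt(6.0 / (n_in + n_out))
        return rng.uniform(-limit, limit, size=(n_in, n_out))

    # Example: initialize a 784 -> 128 layer.
    W = xavier_init(784, 128)
    print(W.shape, round(float(W.std()), 4))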

Optimizers:

• Adam: a widely used optimizer that combines a momentum term with adaptive per-weight learning rates. It is effective in most neural model training problems (see the Adam sketch and the usage sketch after this list).

• RMSprop: an optimizer that is useful for problems with sparse gradients. It adapts the learning rate of each weight based on its gradient history.

• SGD (Stochastic Gradient Descent): a basic optimizer that can be effective with a properly tuned learning rate. It estimates the gradient from a random subset (mini-batch) of the training data.
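
The Adam sketch mentioned above, in Python/NumPy, showing how it combines a momentum-style first moment with an adaptive second moment; the hyperparameter values are the commonly cited defaults and the toy gradient is illustrative.

    import numpy as np

    def adam_step(w, grad, m, v, t, learning_rate=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        # First moment: decaying average of gradients (the momentum part).
        m = beta1 * m + (1 - beta1) * grad
        # Second moment: decaying average of squared gradients (the adaptive part).
        v = beta2 * v + (1 - beta2) * grad ** 2
        # Bias correction for the zero-initialized moments.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Per-weight step: momentum direction divided by the adaptive denominator.
        w = w - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
        return w, m, v

    # Example: the first step (t = 1) on a toy cost whose gradient is simply w.
    w = np.array([1.0, -2.0])
    m, v = np.zeros_like(w), np.zeros_like(w)
    w, m, v = adam_step(w, grad=w, m=m, v=v, t=1)
    print(w)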
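
Finally, the usage sketch: how these optimizers are typically instantiated, assuming PyTorch; the model architecture, learning rates and momentum value are placeholders rather than recommendations from the text.

    import torch
    import torch.nn as nn

    # A tiny placeholder model with one hidden layer.
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

    # The three optimizers discussed above, each wrapping the same parameters.
    sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    rmsprop = torch.optim.RMSprop(model.parameters(), lr=0.001)
    adam = torch.optim.Adam(model.parameters(), lr=0.001)

    # One typical training step, here with Adam.
    x, y = torch.randn(8, 10), torch.randn(8, 1)
    loss = nn.MSELoss()(model(x), y)
    adam.zero_grad()
    loss.backward()
    adam.step()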
These techniques can be combined and adapted according to
the specific needs of model training, allowing for more
efficient and stable gradient descent during neural network
optimization.
