Abstract
The learning process in supervised learning consists of tuning the network parameters (weights and biases) until a certain cost function is minimized. Since the number of parameters is quite large (it can easily run into the thousands), a robust minimization algorithm is needed. This chapter presents a number of minimization algorithms of different flavors and emphasizes their advantages and disadvantages.
Notes
- 1.
This is sometimes called the Euler system of equations.
- 2.
This means that \(F_{|\mathbb {B}(0, \rho )}\) is bijective with both F and its inverse differentiable.
- 3.
This is more transparent in the case of \({\mathbb R}^3\), when \(\langle \nabla f(x^0), v\rangle = \Vert \nabla f(x^0)\Vert \, \Vert v\Vert \cos \theta \). The minimum is realized for \(\theta =\pi \), i.e. when the vectors have opposite directions.
- 4.
This proximity condition can be waived if f is a convex function.
- 5.
For instance, shaking a basket filled with potatoes of different sizes will bring the large ones to the bottom of the basket and the small ones to the top – this corresponds to the state of the system with the smallest gravitational energy.
- 6.
For discrete time steps this can be written equivalently as
$$ v^n = \gamma v^{n-1} + (1- \gamma ) (g_{n-1})^2. $$
- 7.
From the point of view of physics, in this case the substance does not reach a crystalline state but settles into an amorphous structure.
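The recursion in note 6 and the direction argument in note 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: \(g_{n-1}\) is taken to be a single scalar gradient component at step \(n-1\), the initial value \(v^0\) is taken to be 0, and the function names `squared_gradient_average` and `steepest_descent_direction` are hypothetical, not taken from the chapter.

```python
import math


def squared_gradient_average(grads, gamma=0.9, v0=0.0):
    """Iterate v^n = gamma * v^(n-1) + (1 - gamma) * (g_(n-1))^2.

    grads[0] plays the role of g_0; v0 = 0 is an assumed starting value.
    Returns the list [v^1, v^2, ...].
    """
    v = v0
    history = []
    for g in grads:
        v = gamma * v + (1.0 - gamma) * g * g
        history.append(v)
    return history


def steepest_descent_direction(grad):
    """Unit vector opposite to grad, i.e. the case theta = pi in note 3."""
    norm = math.sqrt(sum(c * c for c in grad))
    if norm == 0.0:
        raise ValueError("the gradient is zero, so no descent direction exists")
    return [-c / norm for c in grad]


def inner(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


# Two steps of the moving average with gamma = 0.5 and gradients g_0 = g_1 = 1.
averages = squared_gradient_average([1.0, 1.0], gamma=0.5)

# For the gradient (3, 4) the inner product with the opposite unit vector
# equals minus the gradient norm, the smallest value over all unit vectors.
gradient = [3.0, 4.0]
direction = steepest_descent_direction(gradient)
minimum_rate = inner(gradient, direction)
```

With a constant gradient \(g\) and \(v^0 = 0\), the recursion gives \(v^n = (1-\gamma^n)\, g^2\), so the average approaches \(g^2\) as \(n\) grows; in the second part, any other unit vector \(v\) yields \(\langle \nabla f, v\rangle \ge -\Vert \nabla f\Vert\) by the Cauchy-Schwarz inequality.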
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Calin, O. (2020). Finding Minima Algorithms. In: Deep Learning Architectures. Springer Series in the Data Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-36721-3_4
DOI: https://doi.org/10.1007/978-3-030-36721-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36720-6
Online ISBN: 978-3-030-36721-3
eBook Packages: Mathematics and Statistics (R0)