
Why convexity is the key to optimization

Optimization is easy with convex cost functions

Convex Sets

To simplify things, think of a convex set as a shape in which the line segment joining any two points of the set never leaves the set.
Take a look at the examples below.

It is evident that any line segment joining two points in a circle or a square (the shapes on the extreme left and in the middle) lies entirely within the shape. These are examples of convex sets.
On the other hand, the shape on the extreme right in the figure above has part of a line segment outside the shape. Thus, it is not a convex set.
A convex set C can be represented as follows: for any two points x, y ∈ C and any θ ∈ [0, 1],

θx + (1 − θ)y ∈ C

Convex set condition
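To make this condition concrete, here is a minimal numerical sketch (the indicator functions, the sampling loop, and the helper name looks_convex are my own illustration, not from the article): sample pairs of points inside a set and check that points θx + (1 − θ)y on the segment between them stay inside.

```python
import numpy as np

# Hypothetical indicator functions for two sets in the plane.
def in_disk(p):
    """Unit disk: a convex set."""
    return np.linalg.norm(p) <= 1.0

def in_annulus(p):
    """Ring between radius 0.5 and 1.0: not a convex set."""
    return 0.5 <= np.linalg.norm(p) <= 1.0

def looks_convex(indicator, n_trials=10_000, rng=np.random.default_rng(0)):
    """Sample pairs of points in the set and check that a random point
    theta*x + (1 - theta)*y on the connecting segment stays in the set."""
    for _ in range(n_trials):
        x, y = rng.uniform(-1, 1, size=(2, 2))
        if not (indicator(x) and indicator(y)):
            continue  # only test segments whose endpoints lie in the set
        theta = rng.uniform()
        if not indicator(theta * x + (1 - theta) * y):
            return False  # found a segment that leaves the set
    return True  # no counterexample found (evidence, not a proof)

print(looks_convex(in_disk))     # True
print(looks_convex(in_annulus))  # False
```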

Epigraph

Consider the graph of a function f.


The epigraph of f is the set of points lying on or above the function’s graph: epi f = {(x, t) : t ≥ f(x)}.
Epigraph of a function

Convex Function
Okay, now that you understand what convex sets and epigraphs are, we can talk about convex functions.

A function f is said to be a convex function if its epigraph is a convex set (as seen in the green
figure below on the left).
This means that every chord, i.e. the line segment joining two points on the graph, lies on or above the graph of the function. Pause a minute and check for yourself.
Conversely, a function f is not convex if there exist two points x, y such that the line segment joining (x, f(x)) and (y, f(y)) dips below the curve of the function. This breaks the convexity of the epigraph (as seen in the red figure above on the right).
In that case, not every chord lies on or above the graph of the function; you can verify this by taking points on either side of a bend.
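If you prefer to check this numerically rather than by eye, here is a quick sketch of my own (the interval, tolerance, and helper name are arbitrary choices): sample random chords and verify that f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y).

```python
import numpy as np

def chord_above_graph(f, a=-3.0, b=3.0, n_trials=10_000, rng=np.random.default_rng(0)):
    """Check numerically that f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
    for random x, y in [a, b] and random theta in [0, 1]."""
    for _ in range(n_trials):
        x, y = rng.uniform(a, b, size=2)
        theta = rng.uniform()
        lhs = f(theta * x + (1 - theta) * y)      # point on the graph
        rhs = theta * f(x) + (1 - theta) * f(y)   # point on the chord
        if lhs > rhs + 1e-9:                      # the chord dips below the graph
            return False
    return True

print(chord_above_graph(lambda x: x**2))       # True: x^2 is convex
print(chord_above_graph(lambda x: np.sin(x)))  # False: sin has bends
```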

Testing for convexity


Most cost functions in the case of neural networks are non-convex, so you should know how to test a function for convexity.
A twice-differentiable function f is said to be convex if its second-order derivative is greater than or equal to 0 everywhere.

f″(x) ≥ 0 for all x

Condition for convex functions.


Examples of convex functions: y = eˣ and y = x². Both are twice differentiable, and their second derivatives (eˣ and 2) are non-negative everywhere.
If −f(x) is a convex function, then f is called a concave function.

f″(x) ≤ 0 for all x

Condition for concave functions.


Example of a concave function: y = −eˣ. It is twice differentiable, and its second derivative (−eˣ) is non-positive everywhere.
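As a sanity check, the second-derivative test above can be run symbolically, for example with SymPy (assuming it is installed; the functions are the examples from this section):

```python
import sympy as sp

x = sp.symbols('x', real=True)

# Check the second-derivative test f''(x) >= 0 on the examples above.
for expr in (sp.exp(x), x**2, -sp.exp(x)):
    second = sp.diff(expr, x, 2)  # second-order derivative
    print(expr, " f''(x) =", second, " f'' >= 0 everywhere:", second.is_nonnegative)
```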
Convexity in gradient descent optimization
As said earlier, gradient descent is a first-order iterative optimization algorithm that is used to
minimize a cost function.
To understand how convexity plays a crucial role in gradient descent, let us take the example of
convex and non-convex cost functions.
For a Linear Regression model, we define the cost function as the Mean Squared Error (MSE), which measures the average squared difference between the actual and predicted values. Our goal is to minimize this cost function in order to improve the accuracy of the model. MSE is a convex function of the model's parameters (it is a quadratic, and its second derivative is non-negative everywhere). This means there are no local minima, only the global minimum. Thus gradient descent converges to the global minimum.

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

MSE equation.
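Because MSE is convex in the parameters, plain gradient descent reliably reaches the global minimum. Here is a minimal sketch (the toy data, learning rate, and number of steps are arbitrary choices of mine, not from the article):

```python
import numpy as np

# Toy data roughly following y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X + 1.0 + 0.1 * rng.standard_normal(100)

w, b = 0.0, 0.0  # parameters of the model y_hat = w*x + b
lr = 0.1         # learning rate

for step in range(500):
    y_hat = w * X + b
    error = y_hat - y
    # Gradients of MSE = mean((y_hat - y)^2) with respect to w and b.
    grad_w = 2.0 * np.mean(error * X)
    grad_b = 2.0 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # close to the true values 2 and 1: the global minimum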


Now let us consider a non-convex cost function. In this case, take an arbitrary non-convex function, as plotted below.

Instead of converging to the global minimum, you can see that gradient descent would stop at the local minimum, because the gradient at that point is zero (the slope is 0) and the point is a minimum within its neighborhood. One way to mitigate this issue is to use momentum.
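To see this behaviour, here is a small sketch using an arbitrary non-convex function of my own (the article's plotted function is not reproduced): plain gradient descent settles in whichever basin it starts from, and a classical momentum variant is included for comparison.

```python
import numpy as np

# An arbitrary non-convex function with one local and one global minimum.
f      = lambda x: x**4 - 3 * x**2 + x
grad_f = lambda x: 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent: follows the local slope only."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

def gradient_descent_momentum(x0, lr=0.01, beta=0.9, steps=2000):
    """Classical momentum: the update accumulates past gradients, which can
    help roll through shallow local minima (not guaranteed in general)."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad_f(x)
        x -= lr * v
    return x

# Depending on where it starts, plain gradient descent settles in a different basin:
print(gradient_descent(x0=2.0))    # ~  1.13, a local minimum
print(gradient_descent(x0=-2.0))   # ~ -1.30, the global minimum
```

Momentum accumulates past gradients, which can carry the iterate through shallow local minima, although it still does not guarantee that the global minimum is reached.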

Conclusion

Convex functions play a huge role in optimization, and optimization is at the core of a machine learning model. Understanding convexity is therefore really important, and I hope this article has helped you do exactly that.
https://towardsdatascience.com/understand-convexity-in-optimization-db87653bf920
