
SVM Tutorial 

SVM - Understanding the math - Part 2


November 9, 2014 by Alexandre KOWALCZYK

This is Part 2 of my series of tutorials about the math behind Support Vector Machines.
If you did not read the previous article, you might want to start the series at the beginning
by reading this article: an overview of Support Vector Machine.

In the first part, we saw what the aim of the SVM is: to find the hyperplane
which maximizes the margin.
But how do we calculate this margin?
SVM = Support VECTOR Machine
In Support Vector Machine, there is the word vector.
That means it is important to understand vectors well and how to use them.
Here is a short summary of what we will see today:
What is a vector?
its norm
its direction
How to add and subtract vectors?
What is the dot product?
How to project a vector onto another?
Once we have all these tools in our toolbox, we will then see:
What is the equation of the hyperplane?
How to compute the margin?
What is a vector?
If we define a point A(3, 4) in ℝ² we can plot it like this.

Figure 1: a point

Definition: Any point x = (x1, x2), x ≠ 0, in ℝ² specifies a vector in the plane, namely the vector starting at the origin and ending at x.
This definition means that there exists a vector between the origin and A.
Figure 2 - a vector
If we say that the point at the origin is the point O(0, 0) then the vector above is the vector OA. We could also give it an arbitrary name such as u.
Note: You can notice that we write vectors either with an arrow on top of them, or in bold. In the rest of this text I will use the arrow notation when there are two letters, like OA, and the bold notation otherwise.
Ok so now we know that there is a vector, but we still don't know what IS a vector.

Definition: A vector is an object that has both a magnitude and a


direction.
We will now look at these two concepts.
1) The magnitude
The magnitude or length of a vector x is written ‖x‖ and is called its norm.
For our vector OA, ‖OA‖ is the length of the segment OA.
Figure 3
From Figure 3 we can easily calculate the distance OA using Pythagoras' theorem:
OA² = OB² + AB²
OA² = 3² + 4²
OA² = 25
OA = √25
‖OA‖ = OA = 5
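The computation above can be sketched in a few lines of Python, with a hypothetical `norm` helper (not code from the article):

```python
import math

# Hypothetical helper: the norm of a 2-D vector,
# computed with Pythagoras' theorem as above.
def norm(v):
    return math.sqrt(v[0] ** 2 + v[1] ** 2)

OA = (3, 4)
print(norm(OA))  # 5.0
```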
2) The direction
The direction is the second component of a vector.

Definition: The direction of a vector u(u1, u2) is the vector w(u1/‖u‖, u2/‖u‖).
Where do the coordinates of w come from?

Understanding the definition


To find the direction of a vector, we need to use its angles.
Figure 4 - direction of a vector
Figure 4 displays the vector u(u1, u2) with u1 = 3 and u2 = 4.
We could say that:
Naive definition 1: The direction of the vector u is defined by the angle θ with respect to the horizontal axis, and by the angle α with respect to the vertical axis.
This is tedious. Instead of that we will use the cosine of the angles.
In a right triangle, the cosine of an angle β is defined by:
cos(β) = adjacent / hypotenuse

In Figure 4 we can see that we can form two right triangles, and in both cases the adjacent side will be on one of the axes. This means that the definition of the cosine implicitly contains the axis related to an angle. We can rephrase our naive definition as:
Naive definition 2: The direction of the vector u is defined by the cosine of the angle θ and the cosine of the angle α.

Now if we look at their values:
cos(θ) = u1 / ‖u‖
cos(α) = u2 / ‖u‖
Hence the original definition of the vector w. That's why its coordinates are also called direction cosines.
Computing the direction vector
We will now compute the direction of the vector u from Figure 4:
cos(θ) = u1 / ‖u‖ = 3/5 = 0.6
and
cos(α) = u2 / ‖u‖ = 4/5 = 0.8
The direction of u(3, 4) is the vector w(0.6, 0.8).

If we draw this vector we get Figure 5:

Figure 5: the direction of u


We can see that w indeed has the same look as u, except it is smaller. Something interesting about direction vectors like w is that their norm is equal to 1. That's why we often call them unit vectors.
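We can verify both facts numerically. A minimal sketch with a hypothetical `direction` helper (not code from the article):

```python
import math

def norm(v):
    return math.sqrt(v[0] ** 2 + v[1] ** 2)

# Hypothetical helper: the direction (unit vector) of u,
# i.e. w = (u1/‖u‖, u2/‖u‖)
def direction(u):
    n = norm(u)
    return (u[0] / n, u[1] / n)

u = (3, 4)
w = direction(u)
print(w)        # (0.6, 0.8)
print(norm(w))  # ≈ 1.0, as expected for a unit vector
```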


The sum of two vectors

Figure 6: two vectors u and v


Given two vectors u(u1, u2) and v(v1, v2):
u + v = (u1 + v1, u2 + v2)
Which means that adding two vectors gives us a third vector whose coordinates are the sum of the coordinates of the original vectors.
You can convince yourself with the example below:

Figure 7: the sum of two vectors


The difference between two vectors
The difference works the same way :
u − v = (u1 − v1 , u2 − v2 )

Figure 8: the difference of two vectors


Since the subtraction is not commutative, we can also consider the other case:
v − u = (v1 − u1 , v2 − u2 )

Figure 9: the difference v-u


The last two pictures describe the "true" vectors generated by the difference of u and v.
However, since a vector has a magnitude and a direction, we often consider that parallel translates of a given vector (vectors with the same magnitude and direction but with a different origin) are the same vector, just drawn in a different place in space.
So don't be surprised if you meet the following :

Figure 10: another way to view the difference v-u


and

Figure 11: another way to view the difference u-v


If you do the math, it looks wrong, because the end of the vector u − v is not at the right point, but it is a convenient way of thinking about vectors which you'll encounter often.
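The two operations above can be sketched in Python with hypothetical `add` and `subtract` helpers (the example vectors are mine, not from the article):

```python
# Hypothetical helpers for 2-D vectors:
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def subtract(u, v):
    return (u[0] - v[0], u[1] - v[1])

u, v = (2, 1), (1, 3)
print(add(u, v))       # (3, 4)
print(subtract(v, u))  # (-1, 2)
print(subtract(u, v))  # (1, -2), subtraction is not commutative
```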
The dot product
One very important notion to understand SVM is the dot product.

Definition: Geometrically, it is the product of the Euclidean magnitudes of the two vectors and the cosine of the angle between them.
Which means that if we have two vectors x and y and there is an angle θ (theta) between them, their dot product is:
x ⋅ y = ‖x‖‖y‖cos(θ)

Why?
To understand, let's look at the problem geometrically.

Figure 12
In the definition, they talk about cos(θ); let's see what it is.
By definition we know that in a right-angled triangle:
cos(θ) = adjacent / hypotenuse

In our example, we don't have a right-angled triangle.


However, if we take a different look at Figure 12, we can find two right-angled triangles formed by each vector with the horizontal axis.
Figure 13
and

Figure 14
So now we can view our original schema like this:
Figure 15
We can see that
θ=β−α

So computing cos(θ) is like computing cos(β − α)


There is a special formula called the difference identity for cosine which says that:
cos(β − α) = cos(β)cos(α) + sin(β)sin(α)

(if you want you can read the demonstration here)


Let's use this formula!
cos(β) = adjacent / hypotenuse = x1 / ‖x‖
sin(β) = opposite / hypotenuse = x2 / ‖x‖
cos(α) = adjacent / hypotenuse = y1 / ‖y‖
sin(α) = opposite / hypotenuse = y2 / ‖y‖

So if we replace each term:
cos(θ) = cos(β − α) = cos(β)cos(α) + sin(β)sin(α)
cos(θ) = (x1/‖x‖)(y1/‖y‖) + (x2/‖x‖)(y2/‖y‖)
cos(θ) = (x1y1 + x2y2) / (‖x‖‖y‖)
If we multiply both sides by ‖x‖‖y‖ we get:
‖x‖‖y‖cos(θ) = x1y1 + x2y2
Which is the same as:
‖x‖‖y‖cos(θ) = x ⋅ y

We just found the geometric definition of the dot product !


Eventually, from the last two equations, we can see that:
x ⋅ y = x1y1 + x2y2 = ∑(i=1 to 2) xiyi

This is the algebraic definition of the dot product !
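We can check numerically that the two definitions agree. A minimal sketch; the example vectors are chosen here, not taken from the article:

```python
import math

def norm(v):
    return math.hypot(v[0], v[1])

# algebraic definition: x . y = x1*y1 + x2*y2
def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

x, y = (3, 5), (8, 2)

# geometric definition: ||x|| ||y|| cos(theta), where theta
# is the angle between the two vectors
theta = math.atan2(x[1], x[0]) - math.atan2(y[1], y[0])
geometric = norm(x) * norm(y) * math.cos(theta)

print(dot(x, y))   # 34
print(geometric)   # ≈ 34.0, the same value up to rounding
```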


A few words on notation
The dot product is called like that because we write a dot between the two vectors.
Talking about the dot product x ⋅ y is the same as talking about:
the inner product ⟨x, y⟩ (in linear algebra)
the scalar product, because we take the product of two vectors and it returns a scalar (a real number)
The orthogonal projection of a vector
Given two vectors x and y, we would like to find the orthogonal projection of x onto y.
Figure 16
To do this we project the vector x onto y.
Figure 17
This gives us the vector z.

Figure 18 : z is the projection of x onto y


By definition:
cos(θ) = ‖z‖ / ‖x‖
‖z‖ = ‖x‖cos(θ)
We saw in the section about the dot product that
cos(θ) = (x ⋅ y) / (‖x‖‖y‖)
So we replace cos(θ) in our equation:
‖z‖ = ‖x‖ (x ⋅ y) / (‖x‖‖y‖)
‖z‖ = (x ⋅ y) / ‖y‖
If we define the vector u as the direction of y then
u = y / ‖y‖
and
‖z‖ = u ⋅ x
We now have a simple way to compute the norm of the vector z.
Since this vector is in the same direction as y, it has the direction u:
u = z / ‖z‖
z = ‖z‖u
And we can say:

The vector z = (u ⋅ x)u is the orthogonal projection of x onto y.

Why are we interested in the orthogonal projection? Well, in our example, it allows us to compute the distance between x and the line which goes through y.
Figure 19
We see that this distance is ‖x − z‖:
‖x − z‖ = √((3 − 4)² + (5 − 1)²) = √17
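Putting the projection formula into code (a sketch; the concrete values x = (3, 5) and y = (4, 1) are assumptions inferred from the figure, since only the final distance √17 appears in the text):

```python
import math

def norm(v):
    return math.hypot(v[0], v[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def project(x, y):
    # z = (u . x) u, with u = y / ||y|| the direction of y
    u = (y[0] / norm(y), y[1] / norm(y))
    k = dot(u, x)
    return (k * u[0], k * u[1])

# assumed values, consistent with the distance sqrt(17) above
x, y = (3, 5), (4, 1)
z = project(x, y)
distance = norm((x[0] - z[0], x[1] - z[1]))
print(distance)  # ≈ 4.1231, i.e. sqrt(17)
```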

The SVM hyperplane


Understanding the equation of the hyperplane
You probably learnt that the equation of a line is y = ax + b. However, when reading about hyperplanes, you will often find that the equation of a hyperplane is defined by:
wᵀx = 0
How do these two forms relate?

In the hyperplane equation you can see that the names of the variables are in bold, which means that they are vectors! Moreover, wᵀx is how we compute the inner product of two vectors, and if you recall, the inner product is just another name for the dot product!
Note that
y = ax + b
is the same thing as
y − ax − b = 0
Given two vectors w = (−b, −a, 1) and x = (1, x, y):
wᵀx = −b × 1 + (−a) × x + 1 × y
wᵀx = y − ax − b

The two equations are just different ways of expressing the same thing.
It is interesting to note that w0 is −b, which means that this value determines the
intersection of the line with the vertical axis.
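The equivalence between the two forms is easy to verify numerically. A minimal sketch, using the augmented vectors w = (−b, −a, 1) and x = (1, x, y) described above (the concrete line is my example):

```python
# w^T x with the augmented vectors w = (-b, -a, 1) and x = (1, x, y)
def w_dot_x(a, b, x, y):
    w = (-b, -a, 1)
    v = (1, x, y)
    return sum(wi * vi for wi, vi in zip(w, v))

a, b = 2, 1  # the line y = 2x + 1
# on the line: w^T x = 0, same as y - ax - b = 0
print(w_dot_x(a, b, 3, 7))  # 0
# off the line: w^T x equals y - ax - b
print(w_dot_x(a, b, 3, 9))  # 2
```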
Why do we use the hyperplane equation wᵀx instead of y = ax + b?
For two reasons:
it is easier to work in more than two dimensions with this notation,
the vector w will always be normal to the hyperplane. (Note: I received a lot of questions about this last remark. w will always be normal because we use this vector to define the hyperplane, so by definition it will be normal. As you can see on this page, when we define a hyperplane, we suppose that we have a vector that is orthogonal to the hyperplane.)
And this last property will come in handy to compute the distance from a point to the
hyperplane.
Compute the distance from a point to the hyperplane
In Figure 20 we have a hyperplane, which separates two groups of data.
Figure 20
To simplify this example, we have set w0 = 0.
As you can see in Figure 20, the equation of the hyperplane is:
x2 = −2x1
which is equivalent to
wᵀx = 0
with w = (2, 1) and x = (x1, x2)
Note that the vector w is shown in Figure 20 (w is not a data point).

We would like to compute the distance between the point A(3, 4) and the hyperplane.
This is the distance between A and its projection onto the hyperplane.
Figure 21
We can view the point A as a vector from the origin to A.
If we project it onto the normal vector w

Figure 22 : projection of a onto w


We get the vector p
Figure 23: p is the projection of a onto w
Our goal is to find the distance between the point A(3, 4) and the hyperplane.
We can see in Figure 23 that this distance is the same thing as ‖p‖.
Let's compute this value.
We start with two vectors: w = (2, 1), which is normal to the hyperplane, and a = (3, 4), which is the vector between the origin and A.
‖w‖ = √(2² + 1²) = √5

Let the vector u be the direction of w:
u = (2/√5, 1/√5)
p is the orthogonal projection of a onto w, so:
p = (u ⋅ a)u
p = (3 × 2/√5 + 4 × 1/√5)u
p = (6/√5 + 4/√5)u
p = (10/√5)u
p = (10/√5 × 2/√5, 10/√5 × 1/√5)
p = (20/5, 10/5)
p = (4, 2)
‖p‖ = √(4² + 2²) = 2√5

Compute the margin of the hyperplane


Now that we have the distance ‖p‖ between A and the hyperplane, the margin is
defined by :
margin = 2‖p‖ = 4√5

We did it ! We computed the margin of the hyperplane !
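The whole computation above can be replayed in a few lines, a sketch using the article's values w = (2, 1) and A = (3, 4):

```python
import math

def norm(v):
    return math.hypot(v[0], v[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

w = (2, 1)  # normal to the hyperplane x2 = -2 * x1
a = (3, 4)  # the point A, seen as a vector from the origin

u = (w[0] / norm(w), w[1] / norm(w))  # direction of w
k = dot(u, a)                         # = ||p||
p = (k * u[0], k * u[1])              # projection of a onto w

margin = 2 * norm(p)
print(p)       # ≈ (4.0, 2.0)
print(margin)  # ≈ 8.944, i.e. 4 * sqrt(5)
```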
Conclusion
This ends Part 2 of this tutorial about the math behind SVM.
There was a lot more math this time, but I hope you have been able to follow the article without problems.
What's next?
Now that we know how to compute the margin, we might want to know how to select the best hyperplane; this is described in Part 3 of the tutorial: How to find the optimal hyperplane?

Alexandre KOWALCZYK
I am passionate about machine learning and Support Vector Machine. I like to explain things
simply to share my knowledge with people from around the world.
