This document provides an overview of support vector machines (SVMs) and how they can be used for classification problems in Python. It explains that SVMs find the optimal hyperplane that separates classes by maximizing the margin between them. The document discusses support vectors, kernels, different kernel types (linear, polynomial, radial basis function), and provides an example of using SVMs for image classification with character recognition. It also covers classification reports, precision, recall, F1 score, support, and confusion matrices for evaluating SVM performance.
Support Vector Machine using Python
What is a Support Vector Machine?
• "Support Vector Machine" (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges. However, it is mostly used in classification problems.
• In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a particular coordinate.
• Then, we perform classification by finding the hyperplane that best differentiates the two classes.

Support Vector Machine
• Generally, Support Vector Machines are considered a classification approach, but they can be employed in both classification and regression problems.
• SVMs can easily handle multiple continuous and categorical variables.
• SVM constructs a hyperplane in multidimensional space to separate different classes. SVM generates the optimal hyperplane in an iterative manner, which is used to minimize the error.
• The core idea of SVM is to find a maximum marginal hyperplane (MMH) that best divides the dataset into classes.

Definitions
• Support Vectors – Support vectors are the data points closest to the hyperplane. These points define the separating line by determining the margins, and they are the most relevant to the construction of the classifier.
• Hyperplane – A hyperplane is a decision plane which separates a set of objects having different class memberships.
• Margin – A margin is the gap between the two lines through the closest class points, calculated as the perpendicular distance from the separating line to the support vectors. A larger margin between the classes is considered a good margin; a smaller margin is a bad margin.

How SVM works?
• The main objective is to segregate the given dataset in the best possible way.
• The distance between the nearest points of either class is known as the margin.
• The objective is to select a hyperplane with the maximum possible margin between the support vectors in the given dataset. SVM searches for the maximum marginal hyperplane in two steps:
– Generate hyperplanes which segregate the classes in the best way.
– Select the hyperplane with the maximum separation from the nearest data points of either class.

Non-linear and inseparable planes
• Some problems can't be solved using a linear hyperplane. In such situations, SVM uses a kernel trick to transform the input space into a higher-dimensional space.
• For example, the data points can be plotted on the x-axis and a new z-axis, where z is the squared sum of x and y: z = x^2 + y^2. In this higher-dimensional space, the points can easily be segregated using linear separation (a small sketch of this mapping follows the kernel list below).

SVM Kernels
• The SVM algorithm is implemented in practice using a kernel. A kernel transforms an input data space into the required form.
• SVM uses a technique called the kernel trick: the kernel takes a low-dimensional input space and transforms it into a higher-dimensional space.
• In other words, it converts a non-separable problem into a separable problem by adding more dimensions to it. It is most useful in non-linear separation problems, and the kernel trick helps you build a more accurate classifier.

Kernel Types
• Linear Kernel
• Polynomial Kernel
• Radial Basis Function Kernel
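As a concrete illustration of the z = x^2 + y^2 mapping described above, here is a minimal NumPy sketch; the two toy classes (points near the origin versus points on a ring) are made up for illustration:

```python
# A minimal sketch of the z = x^2 + y^2 kernel-trick mapping.
# The two classes are not linearly separable in (x, y), but become
# separable along the new z axis.
import numpy as np

rng = np.random.default_rng(0)

# Inner class: points near the origin; outer class: points on a radius-3 ring.
inner = rng.normal(0.0, 0.5, size=(50, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=50)
outer = np.column_stack([3.0 * np.cos(angles), 3.0 * np.sin(angles)])

# The mapping from the slide: z = x^2 + y^2.
z_inner = (inner ** 2).sum(axis=1)
z_outer = (outer ** 2).sum(axis=1)

# Along z, a single threshold (a linear separator) splits the classes:
# every inner z falls below every outer z (which is 9 here).
print(z_inner.max(), z_outer.min())
```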
Linear Kernel
• A linear kernel uses the normal dot product of any two given observations. The product between two vectors is the sum of the multiplication of each pair of input values:
K(x, xi) = sum(x * xi)
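A minimal sketch of the linear kernel in code, assuming scikit-learn and a toy make_blobs dataset (not part of the slides):

```python
# Linear kernel: by hand and via scikit-learn's SVC.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=42)

# The kernel by hand for two observations: a plain dot product.
k = np.sum(X[0] * X[1])  # same as np.dot(X[0], X[1])

# The same kernel inside scikit-learn's SVC.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
print(k, len(clf.support_vectors_))
```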
Polynomial Kernel
• A polynomial kernel is a more generalized form of the linear kernel; it can distinguish curved or nonlinear input spaces:
K(x, xi) = (1 + sum(x * xi))^d
• Where d is the degree of the polynomial; d = 1 is similar to the linear transformation. The degree needs to be manually specified in the learning algorithm.
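A similar sketch for the polynomial kernel, again assuming scikit-learn; the make_moons dataset and d = 3 are illustrative choices, and gamma=1.0, coef0=1.0 are set so that scikit-learn's (gamma * sum(x * xi) + coef0)^degree form matches the formula above:

```python
# Polynomial kernel: by hand and via scikit-learn's SVC.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

d = 3  # the degree, specified manually as the slide notes

# The kernel by hand for two observations.
k = (1 + np.sum(X[0] * X[1])) ** d

# gamma=1.0 and coef0=1.0 make scikit-learn's poly kernel,
# (gamma * <x, xi> + coef0)^degree, match the formula above.
clf = SVC(kernel="poly", degree=d, gamma=1.0, coef0=1.0)
clf.fit(X, y)
print(k, clf.score(X, y))
```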
Radial Basis Function Kernel
• The radial basis function (RBF) kernel is a popular kernel function commonly used in support vector machine classification. RBF can map an input space into an infinite-dimensional space:
K(x, xi) = exp(-gamma * sum((x - xi)^2))
• Here gamma is a parameter which typically ranges from 0 to 1. A higher value of gamma fits the training dataset more closely, which can cause over-fitting; gamma = 0.1 is considered a good default value.
• The value of gamma needs to be manually specified in the learning algorithm.
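A small sketch of gamma's effect, assuming scikit-learn and a toy make_moons dataset; the printed scores are illustrative, but a very large gamma should show a clear gap between training and test accuracy (over-fitting):

```python
# How gamma changes an RBF SVM's fit.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A moderate gamma generalizes; a very large gamma memorizes the
# training set, so train and test scores diverge.
for gamma in (0.1, 1.0, 100.0):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_train, y_train)
    print(gamma, clf.score(X_train, y_train), clf.score(X_test, y_test))
```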
Example: Image Processing
• Image processing is a difficult task for many types of machine learning algorithms.
• The relationships linking patterns of pixels to higher concepts are extremely complex and hard to define. For instance, it's easy for a human being to recognize a face, a cat, or the letter "A", but defining these patterns in strict rules is difficult.
• Furthermore, image data is often noisy. There can be many slight variations in how the image was captured, depending on the lighting, orientation, and positioning of the subject.

Example: Data Collection
• When OCR software first processes a document, it divides the paper into a matrix such that each cell in the grid contains a single glyph, which is just a term referring to a letter, symbol, or number.
• Next, for each cell, the software attempts to match the glyph against the set of all characters it recognizes.
• Finally, the individual characters are combined back together into words, which optionally can be spell-checked against a dictionary in the document's language.

The Dataset
• We'll use a dataset donated to the UCI Machine Learning Data Repository (http://archive.ics.uci.edu/ml) by W. Frey and D. J. Slate.
• The dataset contains 20,000 examples of the 26 English capital letters as printed using 20 different randomly reshaped and distorted black and white fonts.
• A figure published by Frey and Slate provides an example of some of the printed glyphs. Distorted in this way, the letters are challenging for a computer to identify, yet are easily recognized by a human being.

The SVC() function
• C : float, optional (default=1.0) – Penalty parameter C of the error term.
• kernel : string, optional (default='rbf') – Specifies the kernel type to be used in the algorithm. It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or a callable. If none is given, 'rbf' will be used.
• random_state : int, RandomState instance or None, optional (default=None) – The seed of the pseudo-random number generator to use when shuffling the data. If an int, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator itself.

Classification report: Precision
• Precision is the ability of a classifier not to label an instance positive that is actually negative.
• For each class it is defined as the ratio of true positives to the sum of true and false positives.
• Said another way: "for all instances classified positive, what percent was correct?"

Classification report: Recall
• Recall is the ability of a classifier to find all positive instances.
• For each class it is defined as the ratio of true positives to the sum of true positives and false negatives.
• Said another way: "for all instances that were actually positive, what percent was classified correctly?"

Classification report: F1 score
• The F1 score is a weighted harmonic mean of precision and recall such that the best score is 1.0 and the worst is 0.0.
• Generally speaking, F1 scores are lower than accuracy measures, as they embed precision and recall into their computation.
• As a rule of thumb, the weighted average of F1 should be used to compare classifier models, not global accuracy.
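A minimal sketch putting SVC() and the classification report together, assuming scikit-learn; the built-in digits dataset stands in for the Frey and Slate letters dataset, for which the slides give no code:

```python
# Train an SVC and print per-class precision, recall, F1, and support.
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=42)

clf = SVC(C=1.0, kernel="rbf", random_state=42)
clf.fit(X_train, y_train)

# One row per class: precision, recall, F1 score, and support.
print(classification_report(y_test, clf.predict(X_test)))
```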
Classification report: Support
• Support is the number of actual occurrences of the class in the specified dataset.
• Imbalanced support in the training data may indicate structural weaknesses in the reported scores of the classifier and could indicate the need for stratified sampling or rebalancing.
• Support doesn't change between models; it instead diagnoses the evaluation process.

Confusion Matrix
• A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one.
• Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa).
• The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling one as another).
• It is a special kind of contingency table, with two dimensions ("actual" and "predicted") and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).
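A matching sketch for the confusion matrix, under the same assumptions as the classification-report example above:

```python
# Compute a confusion matrix for an RBF SVC on the digits dataset.
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=42)

clf = SVC(kernel="rbf").fit(X_train, y_train)

# In scikit-learn's convention, rows are actual classes and columns are
# predicted classes; off-diagonal cells show which classes get confused.
print(confusion_matrix(y_test, clf.predict(X_test)))
```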