Module - 4 - ECE3047 - Machine Learning
Learning Fundamentals
Prepared By
Dr. Rohith G
Assistant Professor (Senior)
School of Electronics Engineering (SENSE), VIT-Chennai
Under the Guidance and Materials mentored by
Dr. Sathiya Narayanan S
Assistant Professor (Senior)
School of Electronics Engineering (SENSE), VIT-Chennai
Prepared by Dr. Rohith G, AP(Senior),VIT Chennai 1
• Module 1: Introduction (to Machine Learning)
• Module 2: Data Preprocessing
• Module 3: Regression
• Module 4: Classification
• Module 5: Clustering
• Module 6: Optimization
• Module 7: Reinforcement Learning
Topics in Module-4
Classification
• Introduction – Hyperplane – Radial Basis Function (RBF) –
Support Vector Machine (SVM) – Support Vector Regression
(SVR) – Random Forest (RF) – Case Study.
• A classification algorithm is a supervised learning technique used to identify the
category of new observations on the basis of training data.
• In classification, a program learns from a given dataset of labeled observations and then
assigns each new observation to one of a number of classes or groups.
• If a patient has a stiff neck (S), what is the probability that he/she has meningitis (M)?
$$P(M \mid S) = \frac{P(S \mid M)\,P(M)}{P(S)} = \frac{0.5 \times 1/50000}{1/20} = 0.0002$$
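The stiff-neck calculation is a direct application of Bayes' rule. A minimal Python sketch reproduces the arithmetic from the three probabilities given in the example; the helper name `posterior` is an illustrative choice, not something from the slides.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(M | S) = P(S | M) * P(M) / P(S)."""
    return likelihood * prior / evidence


p_s_given_m = 0.5        # P(S | M): probability of a stiff neck given meningitis
p_m = 1 / 50000          # P(M): prior probability of meningitis
p_s = 1 / 20             # P(S): probability of a stiff neck

p_m_given_s = posterior(p_s_given_m, p_m, p_s)
print(f"P(M | S) = {p_m_given_s:.4f}")
```

Even though half of all meningitis patients have a stiff neck, the posterior stays small because the prior P(M) is very small compared with P(S).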
Step-1 There are six categories for computing a classification task in the Training samples
[Figure: training samples (marked X) divided by a straight line. The line represents the decision boundary: ax + by − c = 0]
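A decision boundary of the form ax + by − c = 0 splits the plane into two half-planes, and a point is classified by the sign of ax + by − c. The sketch below illustrates this; the function name `side_of_boundary` and the coefficient values a = 1, b = 1, c = 2 are assumptions chosen only for the example.

```python
def side_of_boundary(x, y, a, b, c):
    """Return +1 or -1 for the side of the line a*x + b*y - c = 0
    on which the point (x, y) lies, or 0 if it lies on the line."""
    value = a * x + b * y - c
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


# Hypothetical boundary x + y - 2 = 0 (coefficients are not from the slides).
a, b, c = 1, 1, 2
for point in [(2, 2), (0, 0), (1, 1)]:
    print(point, side_of_boundary(*point, a, b, c))
```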
Types of Support Vector Machine (SVM)
• Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be
classified into two classes by a single straight line, the data is termed
linearly separable.
• Non-linear SVM: Non-linear SVM is used for data that is not linearly separable,
that is, a dataset that cannot be classified by a single straight line.
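The difference between the two cases can be checked on toy data. The sketch below is an illustration under stated assumptions: it uses a simple perceptron (not an SVM solver) only as a test of whether any separating line exists, and the datasets and the extra feature x1·x2 are hypothetical choices. It shows that AND-style labels are linearly separable, that XOR-style labels are not, and that XOR becomes separable after a non-linear feature is added, which is the idea behind using kernels in a non-linear SVM.

```python
def perceptron_separates(points, labels, max_epochs=1000):
    """Return True if a perceptron finds a linear separator within max_epochs.

    Labels must be +1 or -1. For linearly separable data the perceptron is
    guaranteed to converge; for non-separable data it never does.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return True
    return False


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [-1, -1, -1, +1]   # linearly separable
xor_labels = [-1, +1, +1, -1]   # not linearly separable

# Add a non-linear feature x1 * x2 (a hand-made feature map, for illustration).
mapped_points = [(x1, x2, x1 * x2) for x1, x2 in points]

print(perceptron_separates(points, and_labels))
print(perceptron_separates(points, xor_labels))
print(perceptron_separates(mapped_points, xor_labels))
```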
Types of Support Vector Machine (SVM)
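[Figure: an SVM separating "play tennis" samples from "do not play tennis" samples, plotted on Temperature and Humidity axes. The classifier outputs f(x) = +1 on one side of the boundary and f(x) = −1 on the other.]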
Disadvantages:
• If the number of features is much greater than the number
of samples, avoiding over-fitting through the choice of kernel
function and regularization term is crucial.
• SVMs do not directly provide probability estimates; these
are calculated using an expensive five-fold cross-validation.
Support Vector Machine (SVM)-Solved Example#1
Suppose we have positively labeled data points
There are four independent variables (Outlook, Temperature, Humidity, and Wind) that determine
the dependent variable: whether to play football or not.
Decision tree Solved Example
1. Calculate the entropy (which determines how a decision tree chooses to split data) and the
information gain (the difference between the parent entropy and the weighted average entropy of the child nodes).
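Both quantities can be sketched in a few lines of Python. The function names and the toy labels below are hypothetical and are not the data from the solved example. Entropy is computed as the negative sum of p·log2(p) over the class proportions p, and information gain as the parent entropy minus the size-weighted average entropy of the child nodes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the weighted average entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy labels for illustration only (not taken from the slides).
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children mirror the parent

print(entropy(parent))
print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A decision tree would prefer the attribute whose split yields the highest information gain, which here is the split that produces pure child nodes.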