


Machine Learning

B.Tech (CSBS) Vth Semester

By:
Prof. Moumita Pal
B.Tech (CSE), M.Tech (CI), Ph.D (pursuing)
Contact No.: +919952038867
Mail id: moumitafdp@gmail.com
Contents: Unit 1 Lecture 2 & 3

• Machine Learning Models
• Supervised Learning
• Types: Classification
• Types: Regression
• Unsupervised Learning
• Types: Clustering, Association, Dimensionality Reduction
• Comparison
• Applications of ML

Prof. Moumita Pal 2


Machine Learning Models

Supervised Learning (task driven, pre-categorized data): predictions and predictive models.
• Regression (e.g. divide the ties by length): Linear Regression
• Classification (e.g. divide the socks by color): Logistic Regression, Decision Tree, Support Vector Machine, Random Forest, Neural Networks, Naïve Bayes

Unsupervised Learning (data driven, unlabeled data): pattern/structure recognition.
• Clustering: divide by similarity
• Association: identify sequences
• Dimensionality Reduction: compress data based on features


Supervised Learning
In supervised learning we teach or train the machine using well-labeled data, i.e. data that is already tagged with the correct answer. The machine is then given a new set of examples (data), and the supervised learning algorithm analyses the training data (the set of training examples) and produces a correct outcome from the labeled data.
Types: Regression
Linear Regression:
• Linear Regression is a machine learning algorithm based on supervised learning. It performs a regression task.
• Regression models a target prediction value based on independent variables. It is mostly used for finding the relationship between variables and for forecasting.
• Regression models differ in the kind of relationship they assume between the dependent and independent variables, and in the number of independent variables used.
• While training the model we are given:
  x: input training data (univariate – one input variable (parameter))
  y: labels to the data (supervised learning)
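The fit on x and y above can be sketched with the closed-form least-squares solution; the toy data below (y = 2x + 1) is invented for illustration, not taken from the lecture:

```python
# Minimal sketch: univariate linear regression via closed-form
# least squares (toy data, illustrative only).

def fit_linear(x, y):
    """Return slope and intercept minimizing squared error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [1, 2, 3, 4]          # x: input training data (one variable)
y = [3, 5, 7, 9]          # y: labels (here exactly y = 2x + 1)
slope, intercept = fit_linear(x, y)
```

The fitted line can then forecast a label for unseen input, e.g. `slope * 5 + intercept`.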



Decision Tree:
• A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
• It is a graphical representation of all the possible solutions to a problem/decision based on the given conditions.
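The flowchart idea above can be sketched as nested tests; the attribute names ("outlook", "humidity") and thresholds here are invented for illustration:

```python
# Minimal sketch of a decision tree: each internal node tests an
# attribute, each branch is an outcome, each leaf is a class label.

def classify(sample):
    # internal node: test on attribute "outlook"
    if sample["outlook"] == "sunny":
        # internal node: test on attribute "humidity"
        if sample["humidity"] > 70:
            return "no"       # leaf node (class label)
        return "yes"          # leaf node
    return "yes"              # leaf node

print(classify({"outlook": "sunny", "humidity": 85}))   # -> no
```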



Random Forest:
• Each individual decision tree has high variance, but when we combine many of them in parallel the resulting variance is low: every tree is trained on its own sample of the data, so the final output does not depend on one decision tree but on many.
• In the case of a classification problem, the final output is obtained by majority voting over the trees. In the case of a regression problem, the final output is the mean of all the tree outputs.
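The two combination rules above can be sketched directly; the trees themselves are stubbed as fixed predictions here, since only the voting/averaging step is being illustrated:

```python
from collections import Counter

# Minimal sketch of how a random forest combines its trees' outputs.

def majority_vote(predictions):
    """Classification: the most common class label wins."""
    return Counter(predictions).most_common(1)[0][0]

def mean_output(predictions):
    """Regression: the final output is the mean of all tree outputs."""
    return sum(predictions) / len(predictions)

tree_labels = ["cat", "dog", "cat"]   # three trees' class votes
tree_values = [4.0, 5.0, 6.0]         # three trees' regression outputs
```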



Neural Networks:
• Neural networks are artificial systems that were inspired by biological neural networks.
• These systems learn to perform tasks by being exposed to various datasets and examples without any task-specific rules.
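A single artificial neuron, the building block such networks stack and train, can be sketched as a weighted sum passed through an activation; the weights and bias below are arbitrary illustrative values, not learned ones:

```python
import math

# Minimal sketch: one artificial neuron (weighted sum + sigmoid).

def neuron(inputs, weights, bias):
    """Weighted sum of inputs passed through a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

out = neuron([1.0, 0.5], [0.4, -0.2], 0.1)   # a value in (0, 1)
```

A network is many such neurons arranged in layers, with the weights adjusted during training.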



Types: Classification
Logistic Regression:
• Logistic regression builds a regression model to predict the probability that a given data entry belongs to the category numbered "1".
• Where linear regression assumes that the data follow a linear function, logistic regression models the data using the sigmoid function.
• Types: binomial, multinomial, ordinal.
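The sigmoid step can be sketched as follows; the coefficient and intercept are assumed values for illustration, not fitted ones:

```python
import math

# Minimal sketch: logistic regression squashes a linear score
# through the sigmoid to give P(class = 1).

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability that x belongs to the category numbered '1'."""
    return sigmoid(w * x + b)

p = predict_proba(2.0, w=1.5, b=-1.0)   # sigmoid(2.0), about 0.88
```

Thresholding the probability (e.g. p >= 0.5) turns it into a class decision.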



Support Vector Machine:
• SVM finds a hyper-plane that creates a boundary between the types of data. In 2-dimensional space, this hyper-plane is nothing but a line.
• In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of features/attributes in the data, and then find the optimal hyper-plane that separates the data.
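In 2-D the decision rule reduces to which side of the line w·x + b = 0 a point falls on; the weights below are assumed for illustration (a trained SVM would learn them):

```python
# Minimal sketch: classify 2-D points by the sign of w·x + b.

def svm_predict(point, w, b):
    """+1 or -1 depending on which side of the hyper-plane we are."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1

w, b = (1.0, -1.0), 0.0            # the separating line y = x
label_a = svm_predict((2, 1), w, b)   # below the line: class +1
label_b = svm_predict((1, 3), w, b)   # above the line: class -1
```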



Naïve Bayes:
• Naïve Bayes classifiers are a collection of classification algorithms based on Bayes' Theorem.
• It is not a single algorithm but a family of algorithms that share a common principle: every pair of features being classified is independent of each other, given the class.
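The independence principle means per-feature likelihoods simply multiply; the priors and likelihood tables below are invented toy numbers for a two-class spam example:

```python
# Minimal sketch of the Naive Bayes principle: score each class as
# prior * product of per-feature likelihoods (features assumed
# independent given the class). All probabilities are toy values.

prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.2, "meeting": 0.6},
}

def posterior_scores(words):
    scores = {}
    for c in prior:
        p = prior[c]
        for w in words:          # independence: just multiply
            p *= likelihood[c][w]
        scores[c] = p
    return scores

scores = posterior_scores(["offer"])
best = max(scores, key=scores.get)   # spam: 0.28 beats ham: 0.12
```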



Unsupervised Learning
• Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.
• Here the task of the machine is to group unsorted information according to similarities, patterns, and differences, without any prior training on the data.



Types
Clustering:
A clustering problem is where you want to discover the inherent groupings in the data, such
as grouping customers by purchasing behavior.
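Grouping by similarity can be sketched with a tiny hand-rolled k-means on 1-D points (the data and starting centres are invented for illustration):

```python
# Minimal sketch: k-means clustering on 1-D points. Each point is
# assigned to the nearest centre; centres move to their cluster mean.

def kmeans_1d(points, centres, iters=10):
    for _ in range(iters):
        clusters = {i: [] for i in range(len(centres))}
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        centres = [sum(m) / len(m) if m else centres[i]
                   for i, m in clusters.items()]
    return centres

centres = kmeans_1d([1, 2, 3, 10, 11, 12], centres=[0.0, 5.0])
# the two centres converge to 2.0 and 11.0
```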

Association:
• An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
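A rule like "people that buy X also tend to buy Y" is usually scored by support and confidence; the transactions below are invented toy data:

```python
# Minimal sketch: support and confidence of the rule
# "buys bread -> buys butter" over toy transaction data.

transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y):
    """Estimate of P(buys y | buys x)."""
    return support({x, y}) / support({x})

conf = confidence("bread", "butter")   # 2 of 3 bread buyers buy butter
```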



Dimensionality Reduction:
• In machine learning classification problems, there are often too many factors on the basis of which the final classification is done.
• These factors are variables called features. The higher the number of features, the harder it gets to visualize the training set and then work on it.
• Sometimes most of these features are correlated, and hence redundant. This is where dimensionality reduction algorithms come into play.
• Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
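Obtaining a principal variable can be sketched with a hand-rolled 2-D PCA: project the centred data onto the direction of maximum variance. The data below (two perfectly correlated features, y = 2x) is a toy example, and the closed form assumes a nonzero cross-covariance:

```python
import math

# Minimal sketch: 2-D PCA by hand. Two redundant (correlated)
# features collapse into one principal-component score per point.

data = [(1, 2), (2, 4), (3, 6)]          # y = 2x: redundant features

# centre the data
mx = sum(p[0] for p in data) / len(data)
my = sum(p[1] for p in data) / len(data)
centred = [(x - mx, y - my) for x, y in data]

# covariance matrix entries
n = len(data)
sxx = sum(x * x for x, _ in centred) / n
syy = sum(y * y for _, y in centred) / n
sxy = sum(x * y for x, y in centred) / n   # assumed nonzero here

# leading eigenvector of [[sxx, sxy], [sxy, syy]] (2x2 closed form)
lam = (sxx + syy + math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)) / 2
vx, vy = sxy, lam - sxx
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# each 2-D point becomes a single principal-component score
scores = [x * vx + y * vy for x, y in centred]
```

Here the direction found is proportional to (1, 2), so the one retained variable captures all the variance of the original two features.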



Comparison

Supervised learning: task driven; works on pre-categorized (labeled) data; used for predictions and predictive models (classification, regression).
Unsupervised learning: data driven; works on unlabeled data; used for pattern/structure recognition (clustering, association, dimensionality reduction).



Any Questions?



Applications of ML

