Chapter 01: Introduction to Machine Learning
Dr. Shuvendu Rana
SRM University-AP, Amaravati, India
shuvendu.r@srmap.edu.in
Machine Learning
Objectives:
1. Serve as a comprehensive introduction to various topics in machine learning.
2. Introduce algorithms for classification, regression, and clustering.
Marks Distribution
  Conducting                      Final assessment tool            Final marks   Conversion
  End-semester theory exam        Final exam                       100           30
  End-semester practical exam     Final practical / term project   100           20
  Total                                                                          50
What is Learning?
[Figure: a conventional machine maps Input through a Program to an Output; a learning machine is given Input together with the desired Output and produces a Model (Program).]
Machine Learning
Types of Learning
Syllabus: Machine Learning
Books: Machine Learning
Topics
Formal Definition
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Tom Mitchell)
Defining the Learning Task (Well-posed Learning Problem)
Improve on task T, with respect to performance metric P, based on experience E.
Example:
T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself
Example:
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words
Defining the Learning Task (cont.)
Example:
T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver
Example:
T: Categorizing email messages as spam or legitimate
P: Percentage of email messages correctly classified
E: Database of emails, some with human-given labels
Types of Learning
Tasks solved using Supervised Learning
Classification (Examples)
• Assign an object/event to one of a given finite set of categories (a minimal sketch follows this list).
– Medical diagnosis
– Credit card applications or transactions
– Fraud detection in e-commerce
– Worm detection in network packets
– Spam filtering in email
– Recommended articles in a newspaper
– Recommended books, movies, music, or jokes
– Financial investments
– DNA sequences
– Spoken words
– Handwritten letters
– Astronomical images
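To make the classification setting concrete, here is a minimal sketch in Python (not part of the original slides). It assumes scikit-learn is available; the Iris dataset, the logistic-regression model, and the split ratio are purely illustrative choices.

# Minimal classification sketch (illustrative; assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                      # features and class labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)               # hold out 30% for testing

clf = LogisticRegression(max_iter=1000)                # a simple linear classifier
clf.fit(X_train, y_train)                              # learn from labeled examples
print("Test accuracy:", clf.score(X_test, y_test))     # fraction classified correctly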
Tasks solved using Supervised Learning
Regression (Linear/Non-linear)
• Prediction / curve fitting (a minimal sketch follows this list)
– Weather forecasting/prediction
– Predicting the market value of a share
– Predicting the price of real estate
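To illustrate regression as curve fitting, here is a minimal sketch (not from the slides) that fits a straight line to made-up house-price data; the numbers and the scikit-learn model are illustrative assumptions.

# Minimal linear-regression sketch (illustrative; data are made up).
import numpy as np
from sklearn.linear_model import LinearRegression

area = np.array([[50], [80], [100], [120], [150]])     # house area in square metres
price = np.array([110, 170, 205, 245, 300])            # price in arbitrary units

model = LinearRegression()
model.fit(area, price)                                 # fit price ≈ w * area + b
print("Predicted price for 90 m^2:", model.predict([[90]])[0])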
Features
Feature Space (2D)
Supervised Approach (Details)
Hypothesis Space
Inductive Learning
Inductive Learning Hypothesis
Bias and Variance
Overfitting and Underfitting
Representation (Models)
Problem Solving / Planning / Control
• Performing actions in an environment in order to achieve a goal.
– Solving calculus problems
– Playing checkers, chess, or backgammon
– Balancing a pole
– Driving a car or a jeep
– Flying a plane, helicopter, or rocket
– Controlling an elevator
– Controlling a character in a video game
– Controlling a mobile robot
Related Disciplines
• Artificial Intelligence
• Data Mining
• Probability and Statistics
• Information theory
• Numerical optimization
• Computational complexity theory
• Control theory (adaptive)
• Psychology (developmental, cognitive)
• Neurobiology
• Linguistics
• Philosophy
Issues: Designing a Learning System
• Choose the training experience (database).
• Choose exactly what is to be learned (the target function).
• Choose how to represent the target function (models).
• Choose a learning algorithm to infer the target function from the experience.
[Figure: the Environment/Experience feeds the Learner; the Learner produces Knowledge that drives the Performance Element.]
Evaluation of Learning Systems
• Experimental
– Conduct controlled cross-validation experiments to compare various methods on a variety of benchmark datasets.
– Gather data on their performance, e.g. test accuracy, training time, testing time.
– Analyze differences for statistical significance (ANOVA).
• Theoretical
– Analyze algorithms mathematically and prove theorems about their:
• Computational complexity
• Ability to fit training data
• Sample complexity (number of training examples needed to learn an accurate function)
Measuring Performance
How to Evaluate the Classifier’s Generalization Performance?
• Accuracy = (TP + TN) / (P + N)
• Error = (FP + FN) / (P + N)
• Precision = TP / (TP + FP)
• Recall / TP rate = TP / P
• FP rate = FP / N
Confusion matrix (P and N are the total actual positives and negatives; a small worked example follows the table):
               Predicted Pos   Predicted Neg   Total
  Actual Pos   TP              FN              P
  Actual Neg   FP              TN              N
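A small worked example of these formulas, with illustrative counts that are not from the slides:

# Worked example of the metrics above (illustrative counts).
TP, FN, FP, TN = 40, 10, 5, 45
P, N = TP + FN, FP + TN                  # actual positives (50) and negatives (50)
accuracy  = (TP + TN) / (P + N)          # 85 / 100 = 0.85
error     = (FP + FN) / (P + N)          # 15 / 100 = 0.15
precision = TP / (TP + FP)               # 40 / 45  ≈ 0.89
recall    = TP / P                       # 40 / 50  = 0.80
fp_rate   = FP / N                       # 5 / 50   = 0.10
print(accuracy, error, precision, recall, fp_rate)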
How to Estimate the Metrics?
• We can use:
– Training data;
– Independent test data;
– Hold-out method;
– k-fold cross-validation method (see the sketch after this list);
– Leave-one-out method;
– Bootstrap method;
– And many more…
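A minimal k-fold cross-validation sketch (illustrative; assumes scikit-learn, and reuses the Iris data and logistic-regression model from the earlier classification sketch):

# Minimal 5-fold cross-validation sketch (illustrative).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)   # accuracy on each of the 5 held-out folds
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())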
Hold-out Method
• The hold-out method splits the data into training data and test data (usually 2/3 for training and 1/3 for testing). We then build a classifier using the training data and evaluate it on the test data (a minimal sketch follows below).
• The hold-out method is usually used when we have thousands of instances, including several hundred instances from each class.
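A minimal hold-out sketch matching the 2/3 vs. 1/3 split described above (illustrative; assumes scikit-learn; the decision-tree model and Iris data are stand-ins, not from the slides):

# Minimal hold-out evaluation sketch (illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)    # 2/3 for training, 1/3 for testing
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("Hold-out accuracy:", clf.score(X_test, y_test))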
Making the Most of the Data
[Figure: a final test set is held aside and used only once, for the final evaluation of the chosen classifier.]
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of Machine Learning (cont.)
• 2000s
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
• Compilers
• Debugging
• Graphics
• Security (intrusion, virus, and worm detection)
– Email management
– Personalized assistants that learn
– Learning in robotics and vision
Thank You