
CLASSIFICATION METRICS IN MACHINE LEARNING
INTRODUCTION
Choosing the right classification metric is crucial for model evaluation. The Confusion Matrix is a simple yet very powerful classification metric for evaluating the performance of a classification model; it is a performance measurement for machine learning problems where the output can be two or more classes. Similarly, we have Precision, which is defined as the fraction of relevant instances among the retrieved instances; Recall, which is the fraction of the total amount of relevant instances that were actually retrieved; and F-Beta, which is the weighted harmonic mean of Precision and Recall.

We will discuss these in detail in the upcoming sections. Below are the various Classification
metrics that we should use in Machine Learning.

• Confusion Matrix
• Accuracy
• Recall (True Positive Rate, Sensitivity)
• Precision (Positive Prediction Value)
• F-Beta
• Cohen's Kappa
• ROC Curve, AUC Score
• PR Curve
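
As a quick, non-authoritative sketch of how several of these metrics can be computed (assuming scikit-learn is installed; y_true and y_pred below are made-up labels, not the output of any real model):

from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, fbeta_score)

# Made-up actual and predicted labels, for illustration only
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print(confusion_matrix(y_true, y_pred))     # 2x2 error matrix
print(accuracy_score(y_true, y_pred))       # Accuracy
print(precision_score(y_true, y_pred))      # Positive Prediction Value
print(recall_score(y_true, y_pred))         # True Positive Rate / Sensitivity
print(fbeta_score(y_true, y_pred, beta=1))  # F1, i.e. F-Beta with beta = 1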

It is very important to use the correct kind of metrics to find out how good the model is. If we are not using the correct metrics, then it would be really difficult to tell the efficiency of our model.
So, let's understand each metric and see which one best fits which kind of scenario.

Now let's consider a classification problem statement. There are two ways in which we can solve it:

1. PREDICTING THE CLASS LABELS

Suppose we have a binary classification problem with classes A and B. The threshold boundary in this case will by default be 0.5, as we have two classes.

So, if our predicted value is greater than 0.5 the instance will belong to class B, and if it is less than 0.5 it will belong to class A.
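
A minimal sketch of this thresholding (probs is a hypothetical list of model outputs, interpreted as the probability of class B):

probs = [0.12, 0.78, 0.50, 0.91, 0.33]             # hypothetical predicted probabilities

# Default threshold of 0.5: above it -> class B, otherwise -> class A
labels = ["B" if p > 0.5 else "A" for p in probs]
print(labels)                                       # ['A', 'B', 'A', 'B', 'A']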

2. PROBABILITY

In the case of probability outputs we also have to arrive at class labels, this time by selecting the right threshold value.

The threshold value we choose depends on the problem, on a case-by-case basis. Let's say we want to predict whether a person has cancer or not. In this case, choosing the threshold value is very critical and should be done in a proper way.

The probability approach involves the following classification metrics, which we can use for choosing the correct threshold:

i. ROC Curve
ii. AUC Score
iii. PR Curve
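
A hedged sketch of how these could be obtained with scikit-learn (y_true and y_scores are invented example values, not real model output):

from sklearn.metrics import roc_curve, roc_auc_score, precision_recall_curve

y_true   = [0, 0, 1, 1, 0, 1, 1, 0]                     # actual labels (invented)
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5]    # predicted probabilities (invented)

fpr, tpr, roc_thresholds = roc_curve(y_true, y_scores)  # points on the ROC Curve
auc = roc_auc_score(y_true, y_scores)                    # AUC Score
prec, rec, pr_thresholds = precision_recall_curve(y_true, y_scores)  # PR Curve
print(auc)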

Now that we know how we can solve a classification problem, let's understand which metrics should be used for a given dataset.

a. If we have a dataset of 1,000 records split into equal halves between the classes, it is a balanced dataset. In such cases we use Accuracy as the classification metric.
b. If we have an imbalanced dataset, where the class distribution is not equal for the binary classification, then we consider Recall, Precision and F-Beta as the classification metrics.
Now that we have briefly discussed balanced and imbalanced datasets and what type of metrics should be used for each, let's understand each of them in detail.
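
As a small illustrative sketch (the labels list here is hypothetical), one quick way to check whether a dataset is balanced is to count the classes before picking a metric:

from collections import Counter

labels = ["A"] * 500 + ["B"] * 500   # 1000 hypothetical records split into equal halves
print(Counter(labels))               # Counter({'A': 500, 'B': 500}) -> balanced, Accuracy is fine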

1. CONFUSION MATRIX

In the case of binary classification, the Confusion Matrix is a 2x2 matrix: the columns along the top correspond to the actual values and the rows along the left to the predicted values. It is an error matrix which allows visualization of the performance of an algorithm.

i. The first field, corresponding to 1 for the predicted value and 1 for the actual value, is the True Positive (TP) field.
ii. Similarly, the field corresponding to 1 for the predicted value and 0 for the actual value is the False Positive (FP) field, which is also called the Type I error. The related rate is the false positive rate, FPR = FP / (FP + TN).
iii. The field corresponding to 0 for the predicted value and 1 for the actual value is the False Negative (FN) field, which is also called the Type II error. The related rate is the false negative rate, FNR = FN / (FN + TP).
iv. The field corresponding to 0 for the predicted value and 0 for the actual value is the True Negative (TN) field.

One way to remember the formula for FPR is that it compares the false positives (FP) against all the actual negative values (FP + TN).
TP and TN are the correct predictions. Our aim should always be to reduce the Type I error and the Type II error.
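
The sketch below counts the four fields by hand for made-up binary labels; note that if you use scikit-learn's confusion_matrix instead, its rows correspond to the actual labels and its columns to the predictions, i.e. the transpose of the layout described above.

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # made-up actual labels
y_pred = [1, 1, 1, 0, 0, 1, 0, 1]   # made-up predicted labels

TP = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # actual 1, predicted 1
FP = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # actual 0, predicted 1 (Type I)
FN = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # actual 1, predicted 0 (Type II)
TN = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # actual 0, predicted 0

FPR = FP / (FP + TN)   # false positives relative to all actual negatives
print(TP, FP, FN, TN, FPR)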

2. ACCURACY

As we discussed before, if our dataset is a balanced one then we use Accuracy as the
classification metric.

The formula for Accuracy is:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Here TP and TN are the correct predictions out of all the results.

Now, what would happen if our dataset is not balanced and we still use Accuracy as the classification metric? To understand this, let's take an example:

Suppose we have 10K records, with label A making up 9K and label B making up 1K. If we calculate Accuracy, it is obvious that we will get around 90%, because the model predicts most of the records as label A.

Clearly this is not a good way of calculating the efficiency of the model if our dataset is not
balanced.
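
A short sketch of this trap, using synthetic labels and a hypothetical model that always predicts label A:

y_true = ["A"] * 9000 + ["B"] * 1000   # 10K synthetic records: 9K of A, 1K of B
y_pred = ["A"] * 10000                 # a hypothetical model that always predicts A

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                        # 0.9 -> 90% accuracy while never detecting a single B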

So, in such situations we use Recall, Precision and F-Beta as the classification metrics.

3. RECALL (TRUE POSITIVE RATE, SENSITIVITY)

For a classification model, Recall answers the question: out of the total actual positive values, how many positives were we able to predict correctly?

Recall = TP / (TP + FN)

One thing to remember here is that in the case of Recall we deal with the False Negatives.
4. PRECISION (POSITIVE PREDICTION VALUE)

Precision answers the question: out of the total predicted positive results, how many were actually positive?

Precision = TP / (TP + FP)

One thing to remember here is that in the case of Precision we deal with the False Positives.
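
A minimal sketch of both formulas, using hypothetical counts for TP, FP and FN:

TP, FP, FN = 80, 20, 40      # hypothetical confusion-matrix counts

recall = TP / (TP + FN)      # of all actual positives, how many were found
precision = TP / (TP + FP)   # of all predicted positives, how many were correct
print(recall, precision)     # 0.666..., 0.8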

Now let's take a few examples to better understand the scenarios where we could use Precision and Recall.

Precision Example

i. Let's take the use case of Spam Detection. In this case, we mostly have to consider Precision. Let's say we got an email which is originally not spam, but the model detected it as spam, which means it is a False Positive.

In such cases, where the cost of a False Positive is high, our main focus should always be to reduce it to a minimum, so that an important email is not wrongly classified as spam.

Recall Example

Now let's say our model is tasked with predicting whether a person is covid positive or not. Suppose the model predicted the person as not having covid whereas they were actually covid positive, which is a False Negative. This might turn out to be a blunder by the model.

In such cases a False Positive won't be a very big issue, because even if the person is not covid positive but is predicted as positive, he/she can go for another test to verify the result.

But if the person has covid and is predicted as negative (False Negative) then chances are he
might not go for another test which might turn out to be a disaster.

Therefore, it’s important to use Recall in such situations.

NOTE: Our goal should always be to keep both Precision and Recall as high as possible (that is, to reduce False Positives and False Negatives); however:

i. Whenever the False Positive is of more importance with respect to the problem statement,
then use precision
ii. If the False Negative has greater importance with respect to the problem statement, then
use Recall.

Now that we have understood what Precision and Recall are, let's go ahead and understand F-Beta and where we can possibly use it.
5. F-BETA

We will encounter scenarios in which both the False Positives and the False Negatives play an important role in an imbalanced dataset. In such cases we have to consider both Recall and Precision.

So, if we are considering both these metrics, then we have to use the F-Beta score.

If the Beta value is 1, then F-Beta becomes the F1-Score. Similarly, the Beta value can also be 0.5 or 2. The general formula is:

F-Beta = (1 + β²) · (Precision · Recall) / (β² · Precision + Recall)

If β = 1 then,

F1 = 2 · (Precision · Recall) / (Precision + Recall)

This formula is the harmonic mean of Precision and Recall. Now, let's understand when to choose which values of Beta.

Case I:

If both False Positive and False Negative are equally important, then we will select Beta = 1.

Case II:

Suppose the False Positive has more impact than the False Negative; then we need to reduce the Beta value by selecting something between 0 and 1.
Ex: Beta = 0.5

Case III:

Suppose the impact of the False Negative is high (which is what Recall captures); in such cases we increase the Beta value above 1.
Ex: Beta = 2
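
A short sketch of the three cases, using scikit-learn's fbeta_score on made-up labels:

from sklearn.metrics import fbeta_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]   # made-up actual labels
y_pred = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]   # made-up predictions

for beta in (0.5, 1, 2):                  # Case II, Case I, Case III respectively
    print(beta, fbeta_score(y_true, y_pred, beta=beta))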
