CSDS 440: Machine Learning: Soumya Ray

This document provides information about a machine learning course titled "CSDS 440: Machine Learning" taught by Soumya Ray at Case Western Reserve University. It includes the instructor's contact information, zoom meeting details, expectations for student participation in zoom sessions, and an agenda for an upcoming class that will cover evaluation metrics for classification and review optimization topics.


CSDS 440: Machine Learning

https://cwru.zoom.us/rec/share/s-8J2q_vYWRh7RGNOg9HDAoweQgXgpRSbwc5Akph-Y6sBl1fsi-guHYSRPZiOphx.F6GOaUOhJkuUzEuH
Soumya Ray (sray@case.edu)
Office: Zoomlandia 920 0026 0871/ 598943
Office hours: F 9-10am or by appointment
• Please ensure your full name is visible on Zoom.
• Check that your mic is connected by looking for the mic symbol next to your name in the
Participants list. If not, you can configure your mic and speakers using the arrow next to the
mic button in the lower left of the Zoom interface.
• You will be muted on entry (mic symbol has a line through it). Unmute to ask/answer
questions.
• Leave your mic on mute and video off until you are speaking.
• To ask a question, use “Raise Hand” and wait to be called or send your question via chat.
• To answer a question, use “Raise Hand” and wait to be called.
• After asking or answering your question, click “Raise Hand” again to lower your hand.
• This meeting is being recorded. The recording will be made available, likely via Canvas.
• If I drop out/can’t be heard/screen freezes/slides disappear etc., please send a note in the
chat window.
• If I get disconnected completely, I will rejoin asap. Please be patient and wait. If there is a
serious issue and I cannot rejoin, I will send email and the class will be postponed.

9/15/2020 Soumya Ray, Case Western Reserve U. 1


Recap
• What is overfitting?
• We control overfitting in trees through (ES) and (PP).
• How does ES work? Why might it not work well in practice?
• PP uses a ____ set. It iteratively _____ a node and evaluates the
result. It selects a node that ________. It stops when _____.
• What is the geometry of the tree’s decision boundary?
• What are some pros of using decision trees? Cons?
• What is the goal of learning algorithm performance evaluation?
• Given a finite dataset, we want the training set given to an algorithm to
be as ____ as possible. We also want the test sets to be _____.
• These goals are achieved by __-___ ___ ___.
• How does this procedure work?
• What is leave one out?
• What is stratified CV?
Today
• Evaluation Metrics for Classification
• Review of Optimization



Contingency Table
                                   Class according to Target Concept
                                   (Correct Answer)
                                   Positive                  Negative
Class according to      Positive   True Positives (TP)       False Positives (FP)
Learned Classifier                                           (Type I error)
(Predicted Answer)      Negative   False Negatives (FN)      True Negatives (TN)
                                                             (Type II error)
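The four cells above can be counted directly from true and predicted labels. A minimal Python sketch (the labels are made up for illustration, with "+" as the positive class):

```python
def contingency(y_true, y_pred, pos="+"):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == pos and p == pos)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != pos and p == pos)  # Type I errors
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == pos and p != pos)  # Type II errors
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != pos and p != pos)
    return tp, fp, fn, tn

print(contingency(["+", "+", "-", "-", "+"],
                  ["+", "-", "-", "+", "+"]))  # (2, 1, 1, 1)
```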



Accuracy
• Most commonly used measure for comparing
classification algorithms

TP  TN
Accuracy 
TP  TN  FP  FN



Weaknesses of Accuracy
• Does not account for:
– Skewed class distributions
– Differential misclassification costs
– Confidence estimates from learning algorithms



Weighted/Balanced Accuracy
• Corrects for skewed class distributions

WAcc = (1/2) (TP / Allpos + TN / Allneg)
     = (1/2) (TP / (TP + FN) + TN / (TN + FP))

where TP / (TP + FN) is the True Positive Rate and TN / (TN + FP) is the True Negative Rate.
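A sketch showing why this corrects for skew (the counts are hypothetical): with 90 positives and 10 negatives, a classifier that always predicts positive scores 0.9 on plain accuracy but only 0.5 on weighted accuracy.

```python
def weighted_accuracy(tp, tn, fp, fn):
    tpr = tp / (tp + fn)  # true positive rate
    tnr = tn / (tn + fp)  # true negative rate
    return 0.5 * (tpr + tnr)

# "Always predict positive" on a 90-positive / 10-negative dataset:
print(weighted_accuracy(tp=90, tn=0, fp=10, fn=0))  # 0.5
```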



Measuring one class
• Often, just a single class is “interesting”
– Call this the “positive” class
            Positive                 Negative
Positive    True Positives (TP)      False Positives (FP)
                                     (Type I error)
Negative    False Negatives (FN)     True Negatives (TN)
                                     (Type II error)



Precision
• Of the examples the learner predicted
positive, how many were actually positive?

Precision = TP / (TP + FP)



Recall/TP rate/Sensitivity
• Of the examples that were actually positive,
how many did the learner predict correctly?

Recall = TP / (TP + FN) = TP / Allpos



Specificity/TN rate
• Counterpart of recall for the negative class

Specificity = TN / (TN + FP) = TN / Allneg
• So:
WAcc = (1/2) (Sensitivity + Specificity)
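This identity is easy to check numerically; a small Python sketch (the cell counts are made up):

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)  # recall / TP rate

def specificity(tn, fp):
    return tn / (tn + fp)  # TN rate

tp, fn, tn, fp = 40, 10, 30, 20
wacc = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
assert wacc == 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))
print(wacc)  # ≈ 0.7
```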
F1 score
• Combines precision and recall into a single
measure, giving each equal weight
1 1 1 1 
   
F1 2  Precision Recall 
2
F1 
1 1

Precision Recall
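Putting the last three slides together in Python (the counts are illustrative):

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)  # algebraically equal to 2 / (1/p + 1/r)

# Precision 0.8 and recall 0.5 give F1 ≈ 0.615: between the two, but pulled
# toward the smaller value, a property of the harmonic mean.
print(f1(tp=8, fp=2, fn=8))
```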
Beyond point estimates
• Everything above is a “point estimate”
• Because they will be computed on the basis of a sample, we can also
compute variance estimates for each quantity
• Important to show “stability” of solutions, and when comparing across
algorithms (later)
Learning Curves
• Often useful to plot each metric as a function
of training sample size
• Provides insight into how many examples the
algorithm needs to become effective
[Plot: Metric (e.g. Accuracy) on the y axis (0 to 1) vs. Training Sample Size
(100, 200, …, 1000) on the x axis]
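A toy sketch of how such a curve is computed (not from the slides): a majority-class "learner" is trained on growing prefixes of the data and its test accuracy is recorded at each size. A real learner and metric would be substituted in.

```python
import random

def majority_train(train_labels):
    # "Learner": predict whichever class is more common in the training set.
    pos = sum(1 for y in train_labels if y == "+")
    return "+" if pos >= len(train_labels) - pos else "-"

def accuracy_on(pred, test_labels):
    return sum(1 for y in test_labels if y == pred) / len(test_labels)

random.seed(0)
data = [random.choice("++-") for _ in range(1000)]  # roughly 2/3 positive
test = [random.choice("++-") for _ in range(200)]

curve = [(n, accuracy_on(majority_train(data[:n]), test))
         for n in (10, 100, 1000)]
print(curve)  # list of (training sample size, test accuracy) points
```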
Metrics with Confidence Measures
• Many learning algorithms produce classifiers
or models that can provide estimates of how
confident they are about a prediction

• In this case, can plot Precision-Recall (PR) and Receiver Operating
Characteristic (ROC) graphs


Precision-Recall graphs
            True Class   Confidence on +   Recall (x axis)   Precision (y axis)
Example 1        +             0.9
Example 2        −             0.8
Example 3        +             0.4
Example 4        −             0.3


Precision-Recall graphs
            True Class   Confidence on +   Recall (x axis)   Precision (y axis)
Example 1        +             0.9               0.5                1
Example 2        −             0.8               0.5                0.5
Example 3        +             0.4               1                  0.67
Example 4        −             0.3               1                  0.5

[Plot: Precision (y axis, 0 to 1) vs. Recall (x axis, 0 to 1), tracing the
points above]

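The PR points above can be generated mechanically: sort examples by decreasing confidence, sweep the decision threshold down one example at a time, and recompute recall and precision after each. A Python sketch using the slide's four examples:

```python
examples = [("+", 0.9), ("-", 0.8), ("+", 0.4), ("-", 0.3)]  # (true class, confidence)
total_pos = sum(1 for cls, _ in examples if cls == "+")

points, tp, predicted = [], 0, 0
for cls, conf in sorted(examples, key=lambda e: -e[1]):
    predicted += 1          # this example is now predicted positive
    if cls == "+":
        tp += 1
    points.append((tp / total_pos, tp / predicted))  # (recall, precision)

# Matches the table: (0.5, 1), (0.5, 0.5), (1, 0.67), (1, 0.5)
print(points)
```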


ROC graphs
            True Class   Confidence on +   FP Rate (1 − Spec.)   Sens./Recall
                                           (x axis)              (y axis)
Example 1        +             0.9
Example 2        −             0.8
Example 3        +             0.4
Example 4        −             0.3



ROC graphs
            True Class   Confidence on +   FP Rate (x axis)   Sens./Recall (y axis)
Example 1        +             0.9               0                  0.5
Example 2        −             0.8               0.5                0.5
Example 3        +             0.4               0.5                1
Example 4        −             0.3               1                  1

[Plot: TP rate (y axis, 0 to 1) vs. FP rate (x axis, 0 to 1)]
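The ROC points come from the same threshold sweep as the PR points, but tracking FP rate and TP rate instead. A Python sketch on the slide's four examples:

```python
examples = [("+", 0.9), ("-", 0.8), ("+", 0.4), ("-", 0.3)]  # (true class, confidence)
pos = sum(1 for cls, _ in examples if cls == "+")
neg = len(examples) - pos

points, tp, fp = [(0.0, 0.0)], 0, 0   # ROC curves start at the origin
for cls, conf in sorted(examples, key=lambda e: -e[1]):
    if cls == "+":
        tp += 1
    else:
        fp += 1
    points.append((fp / neg, tp / pos))  # (FP rate, TP rate)

# Matches the table: (0,0), (0,0.5), (0.5,0.5), (0.5,1), (1,1)
print(points)
```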



Properties of ROC graphs
• Random guessing is a diagonal line
– Also majority class classifier
– If your classifier is any good, its ROC must lie above the diagonal
• Monotonically increasing
• Often use “AUC”/ “AROC” as comparison
statistic (later)
• Can be misleading if class distribution is too
skewed (use PR graphs instead)
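The AUC/AROC statistic mentioned above can be sketched as the trapezoid-rule area under the ROC points (here using the four-example curve from earlier):

```python
def auc(points):
    # Trapezoid rule over (FP rate, TP rate) points, sorted by FP rate.
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(roc))               # 0.75
print(auc([(0, 0), (1, 1)]))  # 0.5: the random-guessing diagonal
```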
