Lec - 4
Model evaluation aims to define how well the model performs its task.
▪ Cross-validation involves dividing the available data into multiple folds or subsets, using one of these folds as a validation set and training the model on the remaining folds.
Types of Cross-Validation
The main types of cross-validation are:
✓ k-fold cross-validation
✓ leave-one-out cross-validation
✓ holdout validation
✓ stratified cross-validation
The choice of technique depends on the size and nature of the data, as
well as the specific requirements of the modeling problem.
K-Fold Cross-Validation
✓ The available data is divided into k folds of roughly equal size.
✓ During each iteration, one fold is used for testing and the remaining folds for training.
✓ The process is repeated k times, with each fold serving as the test set exactly once.
Leave-One-Out Cross-Validation
✓ The model is trained on n-1 samples and tested on the one omitted sample, repeating this process for each data point in the dataset.
Stratified Cross-Validation
✓ Ensures the CV process maintains the same class distribution in each fold as in the entire dataset.
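A k-fold procedure like this can be sketched with scikit-learn; the iris dataset, the logistic-regression model, and k=5 here are illustrative assumptions, not part of the lecture:

```python
# Minimal k-fold cross-validation sketch (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# k = 5: each fold serves as the test set exactly once.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kf)

print(scores)         # one accuracy score per fold
print(scores.mean())  # average accuracy across the 5 folds
```

`cross_val_score` handles the split/train/test loop internally; swapping `KFold` for `StratifiedKFold` would give the stratified variant described above.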
Advantages of cross validation
Model Selection: used to compare different models and select the one that performs best on average.
Data Efficiency: allows all of the available data to be used for both training and validation.
Disadvantages of cross validation
Computationally Expensive: especially when the number of folds is
large or when the model is complex and requires a long time to train.
Hyperparameter tuning
▪ ML models have hyperparameters, configuration values set before training; they also have parameters, the internal coefficients set by training or optimizing the model on a training dataset.
Grid Search: Define a search space as a grid of hyperparameter values and evaluate
every position in the grid.
Grid search is great for spot-checking combinations that are known to perform well
generally.
Random search is great for discovering and getting hyperparameter combinations that
you would not have guessed intuitively, although it often requires more time to execute.
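A sketch contrasting the two approaches with scikit-learn's GridSearchCV and RandomizedSearchCV; the SVC model, grid values, and sampling distributions are illustrative assumptions:

```python
# Grid search vs. random search over SVC hyperparameters
# (assumes scikit-learn and scipy are installed).
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid search: evaluate every position in a fixed grid of values.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
grid.fit(X, y)

# Random search: sample combinations you might not have guessed intuitively.
rand = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)},
    n_iter=10,
    cv=5,
    random_state=0,
)
rand.fit(X, y)

print(grid.best_params_)
print(rand.best_params_)
```

Both searchers use cross-validation internally to score each candidate, which ties the two halves of this lecture together.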
Evaluation metrics & scoring
▪ Regression metrics
▪ Classification metrics
Metrics for classification (binary & multi-class)
✓ True Positives (TP) – the actual class is yes and the predicted class is also yes.
✓ True Negatives (TN) – the actual class is no and the predicted class is also no.
✓ False positives and false negatives occur when the actual class contradicts the predicted class.
✓ False Positives (FP) – the actual class is no and the predicted class is yes.
✓ False Negatives (FN) – the actual class is yes but the predicted class is no.
1. Accuracy: the ratio of correctly predicted observations to the total number of observations.
Accuracy Score = (TP + TN) / (TP + TN + FP + FN)
2. Precision: the ratio of correctly predicted positive observations to the total predicted positive observations.
Precision = TP / (TP + FP)
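These formulas can be evaluated directly from confusion-matrix counts; the counts below are made-up numbers for illustration:

```python
# Accuracy and precision from confusion-matrix counts (made-up numbers).
TP, TN, FP, FN = 40, 45, 5, 10

precision = TP / (TP + FP)                    # 40 / 45 ≈ 0.889
accuracy = (TP + TN) / (TP + TN + FP + FN)    # 85 / 100 = 0.85

print(precision, accuracy)
```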
Approach to compute precision for a multi-class classification problem:
✓ precision depends on true positives and false positives.
Macro-averaged precision: calculate precision for each class individually and then average them.
Micro-averaged precision: calculate class-wise true positives and false positives, then use those totals to calculate overall precision.
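The difference between the two averages can be seen on a small made-up example, checked with scikit-learn's `precision_score`; the labels and predictions are illustrative:

```python
# Macro- vs. micro-averaged precision on a 3-class toy example
# (assumes scikit-learn is installed; labels are made up).
from sklearn.metrics import precision_score

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Macro: per-class precisions (0.5, 2/3, 1.0) averaged -> 13/18
macro = precision_score(y_true, y_pred, average="macro")
# Micro: total TP / total predicted positives = 4/6
micro = precision_score(y_true, y_pred, average="micro")

print(macro, micro)
```

Macro averaging weights every class equally, while micro averaging weights every prediction equally, so they diverge whenever classes are imbalanced or have unequal error rates.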
3. Recall (Sensitivity): the ratio of correctly predicted positive observations to all observations in the actual positive class.
Recall = TP / (TP + FN)
Approach to compute recall for a multi-class classification problem:
✓ recall depends on true positives and false negatives.
Macro-averaged recall: calculate recall for each class individually and then average them.
Micro-averaged recall: calculate class-wise true positives and false negatives, then use those totals to calculate overall recall.
4. F1 score (F1): the harmonic mean of Precision and Recall.
F1 = 2 * (Precision * Recall) / (Precision + Recall)
5. AUC (Area Under the ROC Curve): the higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s.
Approach to compute the AUC score for a multi-class classification problem:
✓ One-vs-All and the confusion matrix
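A one-vs-rest AUC can be computed with scikit-learn's `roc_auc_score`; the iris dataset and logistic-regression model are illustrative assumptions:

```python
# One-vs-rest ROC AUC for a multi-class problem (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = model.predict_proba(X_test)  # one probability column per class

# multi_class="ovr" treats each class as "one vs. all" and averages the AUCs.
auc = roc_auc_score(y_test, probs, multi_class="ovr")
print(auc)
```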
Reading Assignment
Mean Squared Error (MSE): the average of the squared differences between the expected and predicted values.
MSE = (1/n) * sum((y_i - yhat_i)^2)
Where y_i is the i’th expected value in the dataset and yhat_i is the i’th predicted value.
Root Mean Squared Error (RMSE): the square root of the MSE, so the units of the error score match the units of the target value.
RMSE = sqrt((1/n) * sum((y_i - yhat_i)^2))
Where y_i is the i’th expected value in the dataset, yhat_i is the i’th predicted value, and sqrt() is the square root function.
Mean Absolute Error (MAE): a popular metric because, like RMSE, the units of the error score match the units of the target value that is being predicted.
MAE = (1/n) * sum(abs(y_i - yhat_i))
Where y_i is the i’th expected value in the dataset, yhat_i is the i’th predicted value, and abs() is the absolute value function.
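These three definitions can be computed directly; the expected and predicted values below are made-up numbers:

```python
# MSE, RMSE, and MAE from their definitions (made-up data).
from math import sqrt

y = [3.0, 5.0, 2.0, 7.0]      # expected values y_i
yhat = [2.5, 5.0, 4.0, 8.0]   # predicted values yhat_i
n = len(y)

mse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) / n   # 1.3125
rmse = sqrt(mse)                                           # ≈ 1.146
mae = sum(abs(yi - yh) for yi, yh in zip(y, yhat)) / n     # 0.875

print(mse, rmse, mae)
```

Note how MAE and RMSE stay in the target's units while MSE is in squared units, which is why the text singles the first two out.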
Using evaluation metrics in model selection
▪ Model selection is the process of selecting the best model for a problem in
machine learning.
✓ Selecting models by validation performance rather than training performance helps prevent overfitting.
Robotics and ML
• Cheap robots!
• Cheap sensors
• Moore’s law
Military/Government Robots
• iRobot PackBot
• Remotec Andros
• Many uses…
• Cleaning & Housekeeping
• Humanitarian Demining
• Rehabilitation
• Inspection
• Agriculture & Harvesting
• Lawn Mowers
• Surveillance
• Mining Applications
• Construction
• Automatic Refilling
• Fire Fighters
• Search & Rescue
DaVinci surgical robot by Intuitive Surgical: St. Elizabeth Hospital is one of the local hospitals using this robot. You can see this robot in person during an open house (website).
Japanese health care assistant suit (HAL - Hybrid Assistive Limb)
Also… Mind-controlled wheelchair using NI LabVIEW
Laboratory Applications
https://PapersWithCode.com
https://arXiv.org
https://www.anaconda.com/download/success
https://www.codeconvert.ai/free-converter