Machine Learning
Machine Learning is the study of algorithms that learn patterns from data
rather than being explicitly programmed. The data set consists of samples of
all variables and is divided into training, selection, and test samples. A
sample contains one or more features and possibly a label; samples can be
labeled or unlabeled. Deep Learning, a subfield of machine learning, deals
with algorithms based on multi-layered Artificial Neural Networks (ANNs)
that are inspired by the structure of the human brain.
Evaluating a model ensures that it performs well and meets the desired
objectives. The main steps are:
1. Define the Task and Metrics:-
Classification:-
Accuracy, Precision, Recall, and F1 Score (covered in detail under
Evaluation Metrics below).
Regression:-
Mean Absolute Error (MAE): MAE = (1/n) * Σ|yi - ŷi|. Measures average
absolute errors.
Mean Squared Error (MSE): MSE = (1/n) * Σ(yi - ŷi)². Emphasizes larger
errors.
Root Mean Squared Error (RMSE): RMSE = √MSE. Provides error in
the same units as the target variable.
R-Squared (R²): R² = 1 - (SS_res / SS_tot). Indicates the proportion of
variance explained by the model.
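These regression metrics can be computed directly from their definitions; a minimal pure-Python sketch on hypothetical values:

```python
# Hypothetical true and predicted values
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
n = len(y_true)

mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n    # Mean Absolute Error
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n  # Mean Squared Error
rmse = mse ** 0.5                                            # Root Mean Squared Error
mean_y = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
ss_tot = sum((t - mean_y) ** 2 for t in y_true)              # total sum of squares
r2 = 1 - ss_res / ss_tot                                     # R-squared
```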
Clustering:-
Silhouette Score: Measures how similar each point is to its own cluster
compared with other clusters (ranges from -1 to 1).
Ranking:-
Mean Average Precision (MAP): Averages the precision computed at the
position of each relevant item, then takes the mean across queries.
Normalized Discounted Cumulative Gain (NDCG): Evaluates the
quality of ranked results based on relevance.
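As a sketch, NDCG can be computed from a list of graded relevances in ranked order (the relevance grades below are hypothetical):

```python
import math

def dcg(relevances):
    # Discounted Cumulative Gain: each relevance is discounted by log2(rank + 1)
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending-relevance) ordering
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

score = ndcg([3, 2, 3, 0, 1, 2])  # relevances in the order the system ranked them
```

A perfectly ordered ranking scores 1.0; any misordering of relevant items lowers the score.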
2. Cross-Validation:-
k-Fold Cross-Validation: Split the data into k folds; train on k-1 folds
and validate on the held-out fold, rotating through all k folds and
averaging the scores.
3. Train-Test Split:-
Holdout Method: Split the dataset into training and testing sets (e.g.,
70% training, 30% testing). Evaluate the model on the test set to assess
its generalization ability.
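The holdout method above can be sketched in pure Python (the 70/30 ratio and the fixed seed are illustrative choices):

```python
import random

def holdout_split(X, y, test_ratio=0.3, seed=42):
    # Shuffle indices reproducibly, then carve off test_ratio of them as the test set
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(X) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X = [[float(i)] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = holdout_split(X, y)  # 7 train / 3 test
```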
4. Analyze Results:-
For a classification task, a typical evaluation approach is to train the
model on the training split, predict on the held-out test split, and report
accuracy, precision, recall, and F1 score.
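A minimal pure-Python sketch of such an evaluation (the true and predicted labels below are hypothetical):

```python
def evaluate_binary(y_true, y_pred):
    # Tally the four confusion-matrix cells, then derive the standard metrics
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

metrics = evaluate_binary([1, 1, 1, 0, 0, 0, 1, 0], [1, 0, 1, 0, 0, 1, 1, 0])
```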
2. Regression Models:-
Purpose: To predict continuous numerical values.
Examples:
Linear Regression: Predicts one output variable using one or more input
variables. The representation of linear regression is a linear equation,
which combines a set of input values (x) to produce the predicted
output (y) for that set of input values. With a single input it is
represented in the form of a line:
y = bx + c
where b is the slope and c is the intercept.
Polynomial Regression: Extends linear regression to model non-linear
relationships using polynomial terms.
Support Vector Regression (SVR): Predicts continuous values with a
margin of tolerance.
Output:
Continuous Value: The predicted numerical value (e.g., house price,
temperature).
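As a sketch, the line y = bx + c can be fitted by ordinary least squares; the data points below are hypothetical:

```python
# Hypothetical 1-D training data, roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Closed-form least-squares estimates for slope b and intercept c
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
c = mean_y - b * mean_x

def predict(x):
    return b * x + c
```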
3. Clustering Models:-
Purpose: To group similar data points into clusters without predefined labels.
Examples:
K-Means Clustering: Assigns data points to a fixed number of clusters
(k) based on feature similarity.
Hierarchical Clustering: Builds a tree of clusters, representing data
point relationships at various levels of granularity.
DBSCAN: Identifies clusters of varying shapes and sizes based on
density.
Output:
Cluster Labels: The assigned cluster for each data point.
Cluster Centers: In methods like K-Means, the centroid of each cluster.
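A pure-Python sketch of K-Means (Lloyd's algorithm) on hypothetical 2-D points; real use would typically rely on a library implementation:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # Lloyd's algorithm: assign each point to its nearest centroid, then
    # recompute centroids as cluster means, for a fixed number of iterations.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:
                centroids[j] = tuple(sum(coord) / len(cl) for coord in zip(*cl))
    labels = [min(range(k),
                  key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
              for p in points]
    return labels, centroids

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, 2)  # two well-separated groups
```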
Binary Classification:-
1. Classes:
o Positive Class: One of the two categories the model is predicting.
Often, this is the class of interest.
o Negative Class: The other category, representing cases where the
event or characteristic of interest is absent.
2. Outcomes:
True Positive (TP): The patient is diseased and the model predicts
"diseased".
False Positive (FP): The patient is healthy but the model predicts
"diseased".
True Negative (TN): The patient is healthy and the model predicts
"healthy".
False Negative (FN): The patient is diseased and the model predicts
"healthy".
After obtaining these values, we can compute the accuracy score of the
binary classifier as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
3. Evaluation Metrics:
o Accuracy: The ratio of correctly classified instances to the total
number of instances. While useful, it can be misleading in
imbalanced datasets.
o Precision (Positive Predictive Value): The ratio of true positive
predictions to the sum of true positive and false positive predictions.
It measures how many of the predicted positives are actually
positive.
o Recall (Sensitivity, True Positive Rate): The ratio of true
positive predictions to the sum of true positive and false negative
predictions. It measures how many of the actual positives were
captured by the model.
o F1 Score: The harmonic mean of precision and recall. It balances
the trade-off between precision and recall.
o ROC Curve (Receiver Operating Characteristic Curve): A
graphical plot of the true positive rate against the false positive rate
at various threshold settings.
o AUC (Area Under the ROC Curve): A single value that
summarizes the performance of the model across all classification
thresholds.
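AUC can be sketched via its rank interpretation: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (the scores below are hypothetical):

```python
def auc_score(y_true, scores):
    # Mann-Whitney formulation: count positive/negative pairs where the
    # positive outscores the negative; ties count half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation of the classes.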
Common Uses: Spam detection, medical diagnosis (diseased vs. healthy),
fraud detection, and customer churn prediction.
Example:-
1. Data Collection: Gather labeled data with features (inputs) and binary
labels (0 or 1).
2. Data Preprocessing: Clean and prepare the data by handling missing
values, encoding categorical variables, and normalizing numerical
features.
3. Model Training: Choose a suitable binary classification algorithm and
train the model on the training dataset.
4. Model Evaluation: Assess the model's performance using metrics such
as accuracy, precision, recall, and F1 score. Use a validation set or cross-
validation to tune hyperparameters.
5. Model Testing: Evaluate the model on a test set to estimate its
performance on unseen data.
6. Deployment: Deploy the model to make predictions on new, real-world
data.
7. Monitoring and Maintenance: Continuously monitor the model's
performance and update it as needed based on new data or changes in
the underlying patterns.
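The workflow above can be sketched end to end on a hypothetical synthetic dataset; a simple threshold model stands in for a real algorithm, and preprocessing, deployment, and monitoring are elided:

```python
import random

# Steps 1-5 on hypothetical synthetic data: class 0 clustered near x = 0,
# class 1 near x = 5.
random.seed(0)
data = [(random.gauss(0, 1), 0) for _ in range(50)] + \
       [(random.gauss(5, 1), 1) for _ in range(50)]   # 1. data collection
random.shuffle(data)
train, test = data[:70], data[70:]                     # holdout split (70/30)
# 3. "Training": place the decision threshold halfway between the class means
mean0 = sum(x for x, label in train if label == 0) / sum(1 for _, label in train if label == 0)
mean1 = sum(x for x, label in train if label == 1) / sum(1 for _, label in train if label == 1)
threshold = (mean0 + mean1) / 2
# 4-5. Evaluation on the held-out test set
preds = [1 if x > threshold else 0 for x, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(preds, test)) / len(test)
```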