Machine Learning
1. What is Regression?
Regression is a method for understanding the relationship between independent variables (features) and a dependent variable (outcome). Once that relationship has been estimated, outcomes can be predicted for new data. Regression is a field of study in statistics that forms a key part of forecast models in machine learning. Because it is used to predict continuous outcomes in predictive modelling, it has utility in forecasting and in predicting outcomes from data. Machine learning regression generally involves plotting a line of best fit through the data points; the best fit is the line that minimises the distance between each point and the line.
Regression analysis is a statistical method for modelling the relationship between a dependent (target) variable and one or more independent (predictor) variables. More specifically, regression analysis helps us understand how the value of the dependent variable changes with respect to one independent variable while the other independent variables are held fixed. It predicts continuous/real values such as temperature, age, salary and price.
2. Why do we use Regression Analysis?
Regression is widely used in machine learning and data science. Below are some of the reasons for using regression analysis:
Regression estimates the relationship between the target and the independent variables.
It is used to find trends in data.
It helps to predict real/continuous values.
By performing regression, we can determine the most important factor, the least important factor, and how the factors affect one another.
3. Types of Regression
There are various types of regression used in data science and machine learning. Each type has its own importance in different scenarios, but at the core, all regression methods analyse the effect of the independent variables on the dependent variable. Some important types of regression are given below:
Linear Regression
Logistic Regression
Polynomial Regression
Support Vector Regression
Decision Tree Regression
Random Forest Regression
Ridge Regression
Lasso Regression
MAE
The Mean Absolute Error (MAE) is computed by taking the absolute difference between each prediction and the corresponding actual value, summing these differences, and dividing them by the total number of observations. We aim for a minimum MAE, because MAE is a loss.
RMSE
As is clear from the name itself, RMSE (Root Mean Squared Error) is simply the square root of the mean squared error.
R2 Score
The R2 score (pronounced R-squared score) is a statistical measure that tells us how well our model is making all its predictions, on a scale of zero to one.
As mentioned above, a model cannot be expected to predict the exact actual values in a regression problem (as opposed to a classification problem, which has discrete levels of value). But we can use the R2 score to determine the accuracy of our model in terms of distance, or residual. You can calculate the R2 score using the formula below.
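In its standard form, the R2 score compares the residual sum of squares with the total sum of squares:

\[ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \]

where \(\hat{y}_i\) are the model's predictions and \(\bar{y}\) is the mean of the actual values. Below is a minimal sketch computing all three metrics, assuming scikit-learn is available; the toy values are made up purely for illustration:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 9.0])   # actual values (made up)
y_pred = np.array([2.8, 5.4, 7.0, 9.3])   # model predictions (made up)

mae = mean_absolute_error(y_true, y_pred)            # mean of |y - y_hat|
rmse = np.sqrt(mean_squared_error(y_true, y_pred))   # square root of MSE
r2 = r2_score(y_true, y_pred)                        # 1 - SS_res / SS_tot

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")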
Linear Regression
Linear regression is a type of supervised machine learning algorithm that computes the linear relationship between a dependent variable and one or more independent features. When there is a single independent feature, it is known as univariate (simple) linear regression; with more than one feature, it is known as multivariate (multiple) linear regression. The goal of the algorithm is to find the best linear equation for predicting the value of the dependent variable from the independent variables. The equation provides a straight line that represents the relationship between the dependent and independent variables, and the slope of the line indicates how much the dependent variable changes for a unit change in the independent variable(s).
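Below is a minimal sketch of such a fit, assuming scikit-learn is available; the data points are made up so that y roughly follows 2x + 1:

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy univariate data roughly following y = 2x + 1 (values are made up).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0])         # change in y per unit change in x
print("intercept:", model.intercept_)   # predicted y when x = 0
print("y at x=6:", model.predict(np.array([[6.0]]))[0])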
Signal: the true underlying pattern of the data, which helps the machine learning model learn from the data.
Noise: unnecessary and irrelevant data that reduces the performance of the model.
Bias: a prediction error introduced in the model by oversimplifying the machine learning algorithm; it is the systematic difference between the predicted values and the actual values.
Variance: the error that shows up when the machine learning model performs well with the training dataset but does not perform well with the test dataset.
Overfitting
Overfitting occurs when our machine learning model tries to cover all the data points, or more data points than required, in the given dataset. Because of this, the model starts capturing the noise and inaccurate values present in the dataset, and all these factors reduce its efficiency and accuracy. An overfitted model has low bias and high variance.
The chances of overfitting increase the more training we give our model: the more we train it, the more likely it is to become overfitted. Overfitting is the main problem that occurs in supervised learning.
Example: the concept of overfitting can be seen in a linear regression output whose fitted curve bends to pass through every training point; a code sketch contrasting overfitting with underfitting follows the next subsection.
Underfitting
Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data. To avoid overfitting, the feeding of training data can be stopped at an early stage, but then the model may not learn enough from the training data and may fail to find the best fit for the dominant trend in the data. In the case of underfitting, the model is not able to learn enough from the training data, and hence it reduces the accuracy and produces unreliable predictions.
An underfitted model has high bias and low variance.
Example: we can see both underfitting and overfitting in the output of regression models of increasing complexity, as in the sketch below.
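Below is a minimal sketch of both failure modes, assuming scikit-learn is available: on synthetic data with a sinusoidal signal plus noise, a degree-1 polynomial underfits, a moderate degree fits well, and a very high degree overfits (low training error but high test error):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 40))[:, None]
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(0, 0.2, 40)   # signal + noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):   # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(f"degree {degree:2d}: "
          f"train MSE={mean_squared_error(y_tr, model.predict(X_tr)):.3f}  "
          f"test MSE={mean_squared_error(y_te, model.predict(X_te)):.3f}")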
Types of Data
1. Univariate data
Univariate data consists of observations on only one variable, i.e. a single quantity that changes. It does not deal with causes or relationships, and the main purpose of the analysis is to describe the data and find patterns that exist within it. An example of univariate data is height.
2. Bivariate data
Suppose temperature and ice cream sales are the two variables of a bivariate dataset. Here the relationship is visible from the data: temperature and sales are directly proportional to each other, and thus related, because as the temperature increases, the sales also increase. Bivariate data analysis therefore involves comparisons, relationships, causes and explanations. The two variables are often plotted on the X and Y axes of a graph for a better understanding of the data, and one of the variables is independent while the other is dependent.
3. Multivariate data
When the data involves three or more variables, it is categorised as multivariate. The table below compares the three types:
Univariate | Bivariate | Multivariate
It only summarizes a single variable at a time. | It only summarizes two variables. | It summarizes more than two variables.
It does not contain any dependent variable. | It contains only one dependent variable. | It is similar to bivariate, but contains more than two variables.
The main purpose is to describe. | The main purpose is to explain. | The main purpose is to study the relationship among the variables.
What is Regularization?
Regularization is one of the most important concepts of machine learning. It is a technique that prevents the model from overfitting by adding extra information to it.
Sometimes the machine learning model performs well with the training data but does not perform well with the test data: the model is not able to predict the output when dealing with unseen data, because it has introduced noise into the output, and hence the model is called overfitted. This problem can be dealt with using a regularization technique.
The technique is applied in such a way that it allows us to keep all the variables or features in the model while reducing the magnitude of their coefficients. Hence, it maintains accuracy as well as the generalization of the model.
It mainly regularizes, or reduces, the coefficients of the features toward zero. In simple words: in a regularization technique, we reduce the magnitude of the features while keeping the same number of features.
Ridge Regression: Ridge regression is a type of linear regression in which a small amount of bias is introduced so that we can get better long-term predictions. It is a regularization technique used to reduce the complexity of the model, and is also called L2 regularization. In this technique, the cost function is altered by adding a penalty term to it. The amount of bias added to the model is called the ridge regression penalty; it is calculated by multiplying lambda by the squared weight of each individual feature.
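Writing \(w_j\) for the feature weights and \(\lambda\) for the regularization strength, the altered cost function takes the standard form

\[ J(w) = \sum_{i=1}^{n} \big(y_i - \hat{y}_i\big)^2 + \lambda \sum_{j=1}^{p} w_j^2 \]

so the larger \(\lambda\) is, the more strongly the weights are shrunk toward zero.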
Lasso Regression: Lasso regression is another regularization technique used to reduce the complexity of the model. It stands for Least Absolute Shrinkage and Selection Operator. It is similar to ridge regression, except that the penalty term contains only the absolute values of the weights instead of their squares; this is called L1 regularization. Because the absolute-value penalty can shrink some coefficients exactly to zero, lasso regression also performs feature selection.
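Below is a minimal sketch contrasting the two penalties, assuming scikit-learn is available (its alpha parameter plays the role of lambda). The data are synthetic, with only the first two of five features actually relevant:

import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data with 5 features; only the first two actually matter.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks all weights
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: can zero weights out

print("ridge coefficients:", np.round(ridge.coef_, 2))   # small but nonzero
print("lasso coefficients:", np.round(lasso.coef_, 2))   # irrelevant ones -> 0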
Bias-variance dilemma
The bias-variance dilemma is relevant for supervised machine learning. It is a way to diagnose an algorithm's performance by breaking down its prediction error. There are three types of prediction error: bias, variance, and irreducible error. Machine learning is a branch of Artificial Intelligence that allows machines to perform data analysis and make predictions. However, if the machine learning model is not accurate, it makes prediction errors, and these prediction errors are usually known as bias and variance. In machine learning, these errors will always be present, as there is always a slight difference between the model's predictions and the actual values. The main aim of ML/data science analysts is to reduce these errors in order to get more accurate results.
Errors in Machine Learning
In machine learning, an error is a measure of how accurately an algorithm can make predictions for a previously unseen dataset. On the basis of these errors, the machine learning model that performs best on the particular dataset is selected. There are mainly two types of errors in machine learning:
Reducible errors: errors that can be reduced to improve the model's accuracy; bias and variance are reducible errors.
Irreducible errors: errors that will always be present in the model, caused by noise and unknown influences in the data.
Low Bias: A model with low bias makes fewer assumptions about the form of the target function.
High Bias: A model with high bias makes more assumptions and becomes unable to capture the important features of our dataset. A high-bias model also cannot perform well on new data.
Generally, a linear algorithm has high bias, since its strong assumptions make it learn fast; the simpler the algorithm, the more bias it is likely to introduce. A nonlinear algorithm, by contrast, often has low bias.
Some examples of machine learning algorithms with low bias are Decision Trees, k-Nearest Neighbours and Support Vector Machines. Algorithms with high bias include Linear Regression, Linear Discriminant Analysis and Logistic Regression.
Ways to reduce High Bias:
High bias mainly occurs when the model is too simple. Below are some ways to reduce high bias:
Increase the input features, as the model is underfitted.
Decrease the regularization term.
Use more complex models, such as including some polynomial features (see the sketch below).
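Below is a minimal sketch of the last idea, assuming scikit-learn is available: a plain linear model underfits a quadratic signal (high bias), while adding polynomial features captures the curve. The data are synthetic.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = np.linspace(-2, 2, 50)[:, None]
y = X.ravel() ** 2 + rng.normal(0, 0.1, size=50)   # quadratic signal

linear = LinearRegression().fit(X, y)   # too simple for this data: high bias
quadratic = make_pipeline(PolynomialFeatures(degree=2),
                          LinearRegression()).fit(X, y)

print("linear    R^2:", round(linear.score(X, y), 3))      # poor fit
print("quadratic R^2:", round(quadratic.score(X, y), 3))   # captures the curve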
High Variance: A model with high variance learns too much from the training dataset, including its noise; as a result, such a model gives good results with the training dataset but shows high error rates on the test dataset.
Since, with high variance, the model learns too much from the dataset, it leads to overfitting of the model. A model with high variance has the following problems:
A high-variance model leads to overfitting.
It increases model complexity.
Usually, nonlinear algorithms, which have a lot of flexibility to fit the data, have high variance.
Some examples of machine learning algorithms with low variance are Linear Regression, Logistic Regression, and Linear Discriminant Analysis. Algorithms with high variance include Decision Trees, Support Vector Machines, and k-Nearest Neighbours.
Ways to Reduce High Variance:
Reduce the number of input features or parameters, as the model is overfitted.
Do not use an overly complex model.
Increase the training data.
Increase the regularization term (see the sketch below).
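Below is a minimal sketch of the last idea, assuming scikit-learn is available: with few samples and many features the model overfits, and increasing the regularization term (alpha in scikit-learn) narrows the gap between training and test performance. The data are synthetic.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Few samples, many features: a recipe for high variance (synthetic data).
rng = np.random.RandomState(2)
X = rng.normal(size=(30, 20))
y = X[:, 0] + rng.normal(0, 0.5, size=30)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for alpha in (0.001, 1.0, 100.0):   # stronger penalty -> lower variance
    m = Ridge(alpha=alpha).fit(X_tr, y_tr)
    print(f"alpha={alpha:7.3f}  train R^2={m.score(X_tr, y_tr):.2f}  "
          f"test R^2={m.score(X_te, y_te):.2f}")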
Different Combinations of Bias-Variance
There are four possible combinations of bias and variance:
Low-Bias, Low-Variance: the combination of low bias and low variance gives an ideal machine learning model; however, it is not practically possible.
Low-Bias, High-Variance: with low bias and high variance, model predictions are inconsistent but accurate on average. This case occurs when the model learns a large number of parameters and hence leads to overfitting.
High-Bias, Low-Variance: with high bias and low variance, predictions are consistent but inaccurate on average. This case occurs when a model does not learn well from the training dataset or uses only a few parameters; it leads to underfitting problems in the model.
High-Bias, High-Variance: with high bias and high variance, predictions are inconsistent and also inaccurate on average.
Bias-Variance Trade-Off