ML Using Python Unit3
Regression: Regression algorithms are used if there is a relationship between the input
variable and the output variable.
Regression is a process of finding the correlations between dependent and independent
variables.
It helps in predicting continuous variables, such as market trends, house prices,
weather forecasts, etc.
The task of the Regression algorithm is to find the mapping function to map the input
variable(x) to the continuous output variable(y).
Example: Suppose we want to do weather forecasting; for this, we can use a
regression algorithm. In weather prediction, the model is trained on past data, and once
training is completed, it can predict the weather for future days.
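As a minimal sketch of this idea, the snippet below fits a linear regression on made-up past temperature readings using scikit-learn (the days and temperatures are illustrative assumptions, not real data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up past data: day index -> recorded temperature
days = np.array([[1], [2], [3], [4], [5], [6], [7]])
temps = np.array([30.1, 30.5, 31.0, 31.2, 31.8, 32.1, 32.5])

model = LinearRegression()
model.fit(days, temps)          # "train" on past data

# Predict the temperature for a future day (day 8)
print(model.predict(np.array([[8]])))
```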
Types of Regression Algorithm:
1. Simple Linear Regression
2. Multiple Linear Regression
3. Polynomial Regression
4. Support Vector Regression
5. Decision Tree Regression
6. Random Forest Regression
7. Ridge Regression
8. Lasso Regression
2)Explain Linear Regression?
Linear regression is one of the easiest and most popular Machine Learning algorithms.
It is a statistical method that is used for predictive analysis. Linear regression makes
predictions for continuous/real or numeric variables such as sales, salary, age, product price,
etc.
The linear regression algorithm shows a linear relationship between a dependent (y) and one or
more independent (x) variables, hence it is called linear regression.
Since linear regression shows a linear relationship, it finds how the value of
the dependent variable changes according to the value of the independent variable.
The linear regression model provides a sloped straight line representing the relationship
between the variables.
Linear Regression Line:
A straight line showing the relationship between the dependent and independent variables is
called a regression line. A regression line can show two types of relationship:
Positive Linear Relationship: If the dependent variable increases on the Y-axis as the
independent variable increases on the X-axis, then such a relationship is termed a
positive linear relationship.
Negative Linear Relationship: If the dependent variable decreases on the Y-axis as the
independent variable increases on the X-axis, then such a relationship is called a negative
linear relationship.
Types of Linear Regression:
a) Simple Linear Regression
b) Multiple Linear Regression
a) Simple Linear Regression: If a single independent variable is used to predict the value of
a numerical dependent variable, then such a Linear Regression algorithm is called Simple
Linear Regression.
Simple Linear Regression is a type of regression algorithm that models the relationship
between a dependent variable and a single independent variable. The relationship shown by a
Simple Linear Regression model is linear (a sloped straight line), hence it is called Simple
Linear Regression.
The key point in Simple Linear Regression is that the dependent variable must be a
continuous/real value. The independent variable, however, can be continuous or
categorical.
The Simple Linear Regression model can be represented as:

y = a0 + a1x + ε

Where,
a0 = the intercept of the regression line (can be obtained by putting x = 0)
a1 = the slope of the regression line, which tells whether the line is increasing or decreasing
ε = the error term (for a good model it will be negligible)
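As a quick illustration of a0, a1, and ε, the sketch below fits a simple linear regression on synthetic data generated from a known line plus noise (the true values 2 and 3 are assumptions of this example):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(100, 1))
# Synthetic data from y = 2 + 3x + noise, so the fitted
# intercept (a0) and slope (a1) should come out near 2 and 3.
y = 2 + 3 * x.ravel() + rng.normal(0, 0.5, size=100)

model = LinearRegression().fit(x, y)
print("a0 (intercept):", model.intercept_)
print("a1 (slope):   ", model.coef_[0])
```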
b) Multiple Linear Regression: If more than one independent variable is used to predict the
value of a numerical dependent variable, then such a Linear Regression algorithm is called
Multiple Linear Regression.
Ex: Prediction of CO2 emission based on engine size and the number of cylinders in a car.
The Multiple Linear Regression model can be represented as:

Y = b0 + b1x1 + b2x2 + ... + bnxn + ε

Where,
Y = output/response variable
b0, b1, ..., bn = coefficients of the model
x1, x2, ..., xn = independent/input variables
ε = the error term
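Following the CO2 example above, a minimal sketch with scikit-learn on hypothetical values (the engine sizes, cylinder counts, and emission figures are all made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: [engine size (L), number of cylinders] -> CO2 emission (g/km)
X = np.array([[1.4, 4], [2.0, 4], [2.5, 6], [3.0, 6], [3.5, 6], [4.4, 8]])
y = np.array([110, 135, 160, 180, 200, 245])   # made-up emission values

model = LinearRegression().fit(X, y)
print("b0 (intercept):", model.intercept_)
print("b1, b2 (coefficients):", model.coef_)

# Predict CO2 for a hypothetical 2.8 L, 6-cylinder engine
print(model.predict([[2.8, 6]]))
```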
Logistic Regression:
Note: Logistic regression uses the same concept of predictive modeling as regression, which is
why it is called logistic regression; however, it is used to classify samples, so it falls under
classification algorithms.
Logistic Function (Sigmoid Function):
o The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
o It maps any real value into another value within a range of 0 and 1.
o The output of logistic regression must be between 0 and 1 and cannot go beyond
this limit, so it forms an "S"-shaped curve. The S-form curve is called the sigmoid
function or the logistic function.
o In logistic regression, we use the concept of a threshold value, which decides between
the classes 0 and 1: values above the threshold tend toward 1, and values below the
threshold tend toward 0 (see the sketch after this list).
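A minimal sketch of the sigmoid and the thresholding step in plain NumPy (the example inputs are arbitrary):

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
probs = sigmoid(z)
print(probs)                    # values squeezed into (0, 1), S-shaped

# Threshold at 0.5: probabilities above it map to class 1, below to class 0
print((probs >= 0.5).astype(int))
```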
Assumptions for Logistic Regression:
o The dependent variable must be categorical in nature.
o The independent variable should not have multi-collinearity.
Logistic Regression Equation:
The Logistic regression equation can be obtained from the Linear Regression equation. The
mathematical steps to get Logistic Regression equations are given below:
o We know the equation of the straight line can be written as:

y = b0 + b1x1 + b2x2 + ... + bnxn

o In logistic regression, y can be between 0 and 1 only, so we divide the above
equation by (1 - y):

y / (1 - y); 0 for y = 0, and infinity for y = 1

o But we need a range between -infinity and +infinity; taking the logarithm of the equation,
it becomes:

log[y / (1 - y)] = b0 + b1x1 + b2x2 + ... + bnxn
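As a rough end-to-end illustration, the sketch below fits scikit-learn's LogisticRegression on a tiny made-up binary dataset (the feature values and labels are assumptions for demonstration only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary data: one feature, class 0 for small values, class 1 for large
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# predict_proba applies the sigmoid to b0 + b1*x and returns P(y=0), P(y=1)
print(clf.predict_proba([[2.0]]))
print(clf.predict([[2.0]]))     # thresholded at 0.5 by default
```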
Maximum Likelihood Estimation (MLE):
The likelihood of a given set of observations is the probability of obtaining that particular set
of data, given the chosen probability distribution model.
MLE is carried out by writing an expression known as the likelihood function for a set of
observations. This expression contains an unknown parameter of the model, say θ. We obtain
the value of this parameter that maximizes the likelihood of the observations; this value is
called the maximum likelihood estimate.
Think of likelihood as the flip side of probability: while the probability function gives the
probability of observing a sample for given parameter values, the likelihood function measures
how plausible the parameter values are given the observed sample.
Maximum likelihood estimators become unbiased minimum-variance estimators as the sample
size increases, and with increasing sample size they have approximately normal distributions.
Deriving the Likelihood Function:
Assuming a random sample x1, x2, x3, ..., xn drawn independently from a distribution with
density f(x; θ), the joint probability density of the sample, viewed as a function of θ, is the
likelihood function, denoted by:

L(θ) = f(x1; θ) · f(x2; θ) · ... · f(xn; θ)
So the question is: what value of θ maximizes the likelihood of the given observations? This
can be found by maximizing this product using calculus methods, which are not covered in this
lesson.
Log Likelihood:
Maximizing the likelihood function derived above can be a complex operation. To work
around this, we can use the fact that the logarithm is an increasing function, so maximizing
the logarithm of the likelihood function is equivalent to maximizing the likelihood function
itself.
This is given as:

log L(θ) = log f(x1; θ) + log f(x2; θ) + ... + log f(xn; θ)
The value of θ that maximizes this function is known as the 'maximum likelihood estimate'
for the given function.
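As a small numerical illustration, the sketch below finds the maximum likelihood estimate of the success probability p for made-up Bernoulli (coin-flip) observations by minimizing the negative log-likelihood with SciPy (the data values are assumptions of this example):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Made-up coin-flip observations (1 = heads); true p is unknown
x = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])

def neg_log_likelihood(p):
    # Bernoulli log-likelihood: sum of log f(x_i; p), negated for minimization
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Maximizing the log-likelihood = minimizing its negative
result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("MLE of p:", result.x)    # for Bernoulli data this matches x.mean()
```

For Bernoulli data the MLE has a closed form (the sample mean), so the numerical result can be checked directly against x.mean().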
Applications of Maximum Likelihood Estimation:
MLE can be applied in different statistical models, including:
• linear and generalized linear models,
• exploratory and confirmatory analysis,
• communication systems,
• econometrics and signal detection.