UNIT 3 Regression


Regression in machine learning

Regression is a statistical approach that analyzes the relationship between dependent and independent variables, enabling predictions through various regression models.
What is Regression?
Regression is a statistical approach used to analyze the relationship
between a dependent variable (target variable) and one or more
independent variables (predictor variables). The objective is to
determine the most suitable function that characterizes the connection
between these variables.
It seeks to find the best-fitting model, which can be utilized to make
predictions or draw conclusions.
Regression in Machine Learning
Regression is a supervised machine learning technique used to predict the value of
the dependent variable for new, unseen data. It models the relationship
between the input features and the target variable, allowing for the
estimation or prediction of numerical values.
A problem is treated as a regression problem when the output variable is a real or continuous value, such as "salary" or "weight". Many different models can be used; the simplest is linear regression, which tries to fit the data with the best hyperplane (a straight line in the single-feature case) through the points.
Terminologies Related to Regression Analysis in Machine Learning
● Response Variable: The primary factor to predict or understand in
regression, also known as the dependent variable or target variable.
● Predictor Variable: Factors influencing the response variable, used
to predict its values; also called independent variables.
● Outliers: Observations with unusually low or high values compared to the rest of the data; they can distort results and should be identified and handled carefully.
● Multicollinearity: High correlation among independent variables, which
can complicate the ranking of influential variables.
● Underfitting and Overfitting: Overfitting occurs when a model performs well on training data but poorly on test data, while underfitting means poor performance on both (illustrated in the sketch below).
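To make the overfitting/underfitting distinction concrete, here is a minimal sketch, assuming Python with scikit-learn (the notes do not prescribe a library) and synthetic data. It fits a degree-1 and a degree-15 polynomial to the same small dataset and compares training and test scores:

```python
# A minimal sketch of overfitting vs. underfitting (assumes Python with
# scikit-learn and NumPy; the data below is synthetic).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = X.ravel() + rng.normal(0, 1.0, 30)  # true relationship is linear

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    # R^2 on training vs. test data: the degree-15 model scores near 1.0
    # on training but much worse on test data -- the signature of overfitting
    print(degree, model.score(X_tr, y_tr), model.score(X_te, y_te))
```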
Regression Types
The main types of regression are:
● Simple Regression
o Used to predict a continuous dependent variable based on a
single independent variable.

o Simple linear regression should be used when there is only a
single independent variable.
● Multiple Regression
o Used to predict a continuous dependent variable based on
multiple independent variables.
o Multiple linear regression should be used when there are
multiple independent variables.
● Nonlinear Regression
o Relationship between the dependent variable and
independent variable(s) follows a nonlinear pattern.
o Provides flexibility in modeling a wide range of functional forms.
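As a concrete illustration of the simple/multiple distinction above, here is a minimal sketch using scikit-learn (an assumed library choice) on synthetic house-price data:

```python
# A minimal sketch contrasting simple and multiple linear regression
# (assumes scikit-learn; the house-price data below is synthetic).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
size = rng.uniform(50, 200, 100)   # single predictor, e.g. floor area
age = rng.uniform(0, 30, 100)      # second predictor, e.g. building age
price = 3.0 * size - 1.5 * age + rng.normal(0, 5, 100)

# Simple regression: one independent variable
simple = LinearRegression().fit(size.reshape(-1, 1), price)

# Multiple regression: several independent variables stacked as columns
multiple = LinearRegression().fit(np.column_stack([size, age]), price)

print("simple coef:", simple.coef_)       # close to 3.0
print("multiple coefs:", multiple.coef_)  # close to [3.0, -1.5]
```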
Regression Algorithms
There are many different types of regression algorithms, but some of the
most common include:
● Linear Regression
o Linear regression is one of the simplest and most widely
used statistical models. This assumes that there is a linear
relationship between the independent and dependent
variables. This means that the change in the dependent
variable is proportional to the change in the independent
variables.
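Here is a minimal sketch of linear regression, assuming Python with scikit-learn and synthetic data generated from y = 3x + 5 plus noise; the fitted slope and intercept recover that proportional relationship:

```python
# A minimal sketch of linear regression (scikit-learn assumed;
# synthetic data follows y = 3x + 5 plus Gaussian noise).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 1.0, 100)

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0])         # recovered close to 3.0
print("intercept:", model.intercept_)   # recovered close to 5.0
print("prediction at x=4:", model.predict([[4.0]])[0])  # about 17
```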
● Polynomial Regression
o Polynomial regression is used to model nonlinear
relationships between the dependent variable and the
independent variables. It adds polynomial terms to the linear
regression model to capture more complex relationships.
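A common way to realize this, sketched below under the assumption of scikit-learn, is to expand the inputs with polynomial terms and then fit an ordinary linear model on the expanded features:

```python
# A minimal sketch of polynomial regression: polynomial feature expansion
# followed by linear regression (synthetic quadratic data assumed).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 2.0 * X.ravel() ** 2 - X.ravel() + 1.0 + rng.normal(0, 0.5, 100)

# degree=2 adds x and x^2 columns, so the linear model can fit a parabola
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print("prediction at x=2:", model.predict([[2.0]])[0])  # true value is 7
```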
● Support Vector Regression (SVR)
o Support vector regression (SVR) is a regression algorithm based on the support vector machine (SVM). SVM is primarily a classification algorithm, but it can also be adapted for regression. SVR works by finding a function whose predictions stay within a margin of tolerance (epsilon) around the actual values, penalizing only the points that fall outside this tube.
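The sketch below, assuming scikit-learn and synthetic data, shows the two key SVR parameters: epsilon sets the width of the tolerance tube, and C controls the penalty for points outside it:

```python
# A minimal sketch of support vector regression with an RBF kernel
# (scikit-learn assumed; the sine-wave data below is synthetic).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

# epsilon: width of the tube where errors are ignored; C: penalty weight
model = SVR(kernel="rbf", C=1.0, epsilon=0.1)
model.fit(X, y)
print("prediction at x=1.5:", model.predict([[1.5]])[0])  # close to sin(1.5) ≈ 1.0
```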
● Decision Tree Regression
o Decision tree regression is a type of regression algorithm that
builds a decision tree to predict the target value. A decision
tree is a tree-like structure that consists of nodes and
branches. Each node represents a decision, and each branch
represents the outcome of that decision. The goal of
decision tree regression is to build a tree that can accurately
predict the target value for new data points.
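A minimal sketch, assuming scikit-learn and a synthetic step-shaped dataset, is shown below; limiting max_depth caps how many splits the tree can make, which helps control overfitting:

```python
# A minimal sketch of decision tree regression (scikit-learn assumed;
# the data is a synthetic step function plus noise).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.where(X.ravel() < 5, 2.0, 8.0) + rng.normal(0, 0.3, 200)

# max_depth limits the number of splits to avoid memorizing the noise
model = DecisionTreeRegressor(max_depth=3, random_state=0)
model.fit(X, y)
print(model.predict([[2.0], [7.0]]))  # roughly [2.0, 8.0]
```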

● Random Forest Regression
o Random forest regression is an ensemble method that
combines multiple decision trees to predict the target value.
Ensemble methods are a type of machine learning algorithm
that combines multiple models to improve the performance of
the overall model. Random forest regression works by
building a large number of decision trees, each of which is
trained on a different subset of the training data. The final
prediction is made by averaging the predictions of all of the
trees.
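Here is a minimal sketch under the same scikit-learn assumption: n_estimators trees are each trained on a bootstrap sample, and their predictions are averaged:

```python
# A minimal sketch of random forest regression (scikit-learn assumed;
# synthetic two-feature data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 2))
y = X[:, 0] * 2.0 + np.sin(X[:, 1]) + rng.normal(0, 0.2, 300)

# 100 trees, each fit on a bootstrap sample; predictions are averaged
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
print("prediction:", model.predict([[5.0, 1.0]])[0])  # roughly 10 + sin(1) ≈ 10.8
```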
Regularized Linear Regression Techniques
● Ridge Regression
o Ridge regression is a type of linear regression used to prevent overfitting, which occurs when the model learns the training data too well and fails to generalize to new data. Ridge adds an L2 penalty (the sum of squared coefficients) to the loss function, shrinking the weights toward zero without eliminating them.
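A minimal sketch, assuming scikit-learn and synthetic data with some irrelevant features, is below; the alpha parameter scales the L2 penalty, and larger values shrink the coefficients more strongly:

```python
# A minimal sketch of ridge regression (scikit-learn assumed; two of the
# five synthetic features have true coefficient zero).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 3.0]) + rng.normal(0, 0.5, 100)

# alpha scales the L2 penalty: larger alpha -> stronger shrinkage
model = Ridge(alpha=1.0)
model.fit(X, y)
print("coefficients:", model.coef_)  # shrunk toward, but not exactly, zero
```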
● Lasso Regression
o Lasso regression is another type of linear regression used to prevent overfitting. It adds an L1 penalty (the sum of absolute coefficient values) to the loss function, which shrinks some weights and sets others exactly to zero, effectively performing feature selection.
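The sketch below mirrors the ridge example on the same assumed synthetic data; unlike ridge, the L1 penalty drives the irrelevant coefficients exactly to zero:

```python
# A minimal sketch of lasso regression (scikit-learn assumed; same
# synthetic data as the ridge example, with two irrelevant features).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 3.0]) + rng.normal(0, 0.5, 100)

# The L1 penalty zeroes out weak coefficients, selecting features
model = Lasso(alpha=0.1)
model.fit(X, y)
print("coefficients:", model.coef_)  # irrelevant features end up at exactly 0.0
```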
Characteristics of Regression
Here are the key characteristics of regression:
● Continuous Target Variable: Regression deals with predicting
continuous target variables that represent numerical values.
Examples include predicting house prices, forecasting sales figures, or
estimating patient recovery times.
● Error Measurement: Regression models are evaluated based on their
ability to minimize the error between the predicted and actual
values of the target variable. Common error metrics include mean
absolute error (MAE), mean squared error (MSE), and root mean
squared error (RMSE).
● Model Complexity: Regression models range from simple linear
models to more complex nonlinear models. The choice of model
complexity depends on the complexity of the relationship between the
input features and the target variable.
● Overfitting and Underfitting: Regression models are susceptible to both, so model complexity is usually controlled with techniques such as regularization and cross-validation.
● Interpretability: The interpretability of regression models varies depending on the algorithm used. Simple linear models are highly interpretable, while more complex models may be more difficult to interpret.

Examples
Which of the following is a regression task?
● Predicting the age of a person
● Predicting the nationality of a person
● Predicting whether the stock price of a company will increase tomorrow
● Predicting whether a document is related to sightings of UFOs
Solution: Predicting the age of a person, because age is a real value. Predicting nationality is categorical, whether a stock price will increase is a discrete yes/no answer, and whether a document is related to UFOs is again a discrete yes/no answer.
Regression Evaluation Metrics
Here are some of the most popular evaluation metrics for regression (a computation sketch follows the list):
● Mean Absolute Error (MAE): The average absolute difference between
the predicted and actual values of the target variable.
● Mean Squared Error (MSE): The average squared difference between
the predicted and actual values of the target variable.
● Root Mean Squared Error (RMSE): The square root of the mean
squared error.
● Huber Loss: A hybrid loss function that transitions from MAE to MSE
for larger errors, providing balance between robustness and MSE’s
sensitivity to outliers.
● Root Mean Squared Logarithmic Error (RMSLE): The square root of the mean squared difference between the logarithms of the predicted and actual values; useful when relative errors matter more than absolute ones.
● R2 Score: The proportion of variance in the target explained by the model. Higher values indicate a better fit; it typically ranges from 0 to 1, and can be negative when the model fits worse than simply predicting the mean.
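Here is a minimal sketch computing these metrics with scikit-learn (an assumed library choice) on small illustrative arrays:

```python
# A minimal sketch of the regression metrics above (scikit-learn assumed;
# y_true and y_pred are small illustrative arrays, not real results).
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_squared_log_error, r2_score)

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                                       # RMSE
rmsle = np.sqrt(mean_squared_log_error(y_true, y_pred))   # RMSLE (needs non-negative values)
r2 = r2_score(y_true, y_pred)

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} RMSLE={rmsle:.3f} R2={r2:.3f}")
```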


Applications of Regression
● Predicting prices: For example, a regression model could be used to
predict the price of a house based on its size, location, and other
features.
● Forecasting trends: For example, a regression model could be used to
forecast the sales of a product based on historical sales data and
economic indicators.
● Identifying risk factors: For example, a regression model could be
used to identify risk factors for heart disease based on patient data.
● Making decisions: For example, a regression model could be used to
recommend which investment to buy based on market data.
Advantages of Regression
● Easy to understand and interpret

● Robust variants exist (e.g., tree-based regressors or Huber loss) that reduce sensitivity to outliers
● Can handle both linear and nonlinear relationships.
Disadvantages of Regression
● Linear variants assume a linear relationship between features and the target
● Sensitive to multicollinearity
● May not be suitable for highly complex relationships
