Data Science

Regression analysis is a statistical method used to model relationships between variables. It allows prediction of continuous target variables from inputs and understanding how the target changes with each input. There are different types including linear, logistic, polynomial, support vector, decision tree, and random forest regression. Each type is suited to different problem domains like trends analysis, forecasting, and classification.


Regression Analysis

Regression analysis is a statistical method for modeling the relationship between a
dependent (target) variable and one or more independent (predictor) variables.
More specifically, regression analysis helps us understand how the value of the
dependent variable changes with respect to one independent variable while the
other independent variables are held fixed. It predicts continuous/real values such as
temperature, age, salary, price, etc.

Some examples of regression are:

 Prediction of rainfall using temperature and other factors
 Determining market trends
 Prediction of road accidents due to rash driving

Terminologies Related to Regression Analysis:

Dependent Variable: The main factor in regression analysis that we want to
predict or understand is called the dependent variable. It is also called the target
variable.
Independent Variable: The factors that affect the dependent variable, or that are
used to predict its value, are called independent variables, also known as
predictors.
Outliers: An outlier is an observation with either a very low or very high value
compared with the other observed values. An outlier may distort the results, so it
should be handled carefully.
Multicollinearity: If the independent variables are highly correlated with each
other, the condition is called multicollinearity. It should not be present in the
dataset, because it creates problems when ranking the variables by how strongly
they affect the target.
Underfitting and Overfitting: If our model works well on the training dataset
but not on the test dataset, the problem is called overfitting. If the model does
not perform well even on the training dataset, the problem is called underfitting.

Below are some other reasons for using regression analysis:

 Regression estimates the relationship between the target and the independent
variables.
 It is used to find trends in data.
 It helps to predict real/continuous values.
 By performing regression, we can determine the most important factor, the
least important factor, and how each factor affects the target.

Types of Regression
There are various types of regression used in data science and machine
learning. Each type has its own importance in different scenarios, but at the core,
all regression methods analyze the effect of the independent variables on the
dependent variable.

Here we are discussing some important types of regression which are given below:
 Linear Regression
 Logistic Regression
 Polynomial Regression
 Support Vector Regression
 Decision Tree Regression
 Random Forest Regression
 Ridge Regression
 Lasso Regression
Linear Regression:
 Linear regression is a statistical regression method used for
predictive analysis.
 It is one of the simplest regression algorithms, and it models the relationship
between continuous variables.
 It is used for solving regression problems in machine learning.
Below is the mathematical equation for linear regression:
Y = aX + b
Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (slope and intercept).
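As a minimal sketch (using made-up toy data), the coefficients of Y = aX + b can be estimated with the ordinary least-squares formulas a = cov(X, Y) / var(X) and b = mean(Y) - a * mean(X):

```python
import numpy as np

# Toy data (assumed for illustration): Y grows roughly linearly with X.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([30.0, 35.2, 39.9, 45.1, 50.0])

# Least-squares estimates for Y = aX + b:
# slope a = cov(X, Y) / var(X), intercept b = mean(Y) - a * mean(X)
a = np.cov(X, Y, bias=True)[0, 1] / np.var(X)
b = Y.mean() - a * X.mean()

print(round(a, 2), round(b, 2))
```

The same fit can be obtained with any regression library; the closed form above just makes the two coefficients explicit.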

Some popular applications of linear regression are:

o Analyzing trends and sales estimates
o Salary forecasting
o Real estate prediction
o Arriving at ETAs in traffic

Logistic Regression:
Logistic regression is another supervised learning algorithm, used to solve
classification problems. In classification problems, the dependent variable is in
a binary or discrete format, such as 0 or 1.
The logistic regression algorithm works with categorical variables such as 0 or 1, Yes
or No, True or False, Spam or Not Spam, etc.
There are three types of logistic regression:
 Binary (0/1, pass/fail)
 Multinomial (cats, dogs, lions)
 Ordinal (low, medium, high)
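As an illustrative sketch (toy data assumed), binary logistic regression can be fitted with plain gradient descent on the log loss, then thresholded at 0.5 to get class labels:

```python
import numpy as np

# Toy binary data (assumed): label is 1 when the feature exceeds ~2.5.
x = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

w, b = 0.0, 0.0   # parameters of the model P(y=1) = sigmoid(w*x + b)
lr = 0.5
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # predicted P(y=1)
    # Gradient of the log loss with respect to w and b:
    w -= lr * np.mean((p - y) * x)
    b -= lr * np.mean(p - y)

preds = (1.0 / (1.0 + np.exp(-(w * x + b))) > 0.5).astype(int)
print(preds.tolist())
```

In practice a library implementation (e.g. with regularization and a proper solver) would be used; the loop above only shows the mechanics.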

Polynomial Regression:
Polynomial regression is a type of regression that models a non-linear dataset
using a model that is still linear in its coefficients.
It is similar to multiple linear regression, but it fits a non-linear curve between the
values of x and the corresponding conditional values of y.
Suppose a dataset consists of datapoints arranged in a non-linear fashion; in such
a case, linear regression will not fit those datapoints well. To cover such
datapoints, we need polynomial regression.

Support Vector Regression:

Support Vector Machine (SVM) is a supervised learning algorithm that can be used for
regression as well as classification problems. When it is used for regression
problems, it is termed Support Vector Regression (SVR).
Kernel: A function used to map lower-dimensional data into a higher-dimensional
space.
Hyperplane: In SVM classification, it is the separating boundary between two classes;
in SVR, it is the line that predicts the continuous variable and covers most of
the datapoints.
Boundary lines: The two lines drawn on either side of the hyperplane, which
create a margin for the datapoints.
Support vectors: The datapoints nearest to the hyperplane that define the
margin.

Decision Tree Regression:

Decision Tree is a supervised learning algorithm that can be used for solving both
classification and regression problems.
It can handle both categorical and numerical data.
Decision tree regression builds a tree-like structure in which each internal node
represents a "test" on an attribute, each branch represents an outcome of the test, and
each leaf node represents the final decision or result.
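A depth-1 regression tree (a "stump") shows the core mechanic in a few lines: try every candidate split, and keep the one that minimizes the squared error of predicting each side's mean. This is a toy sketch with made-up data, not a full tree learner:

```python
import numpy as np

# Fit a one-split regression tree: each leaf predicts the mean of
# the targets that fall into it; the split minimizes total squared error.
def fit_stump(x, y):
    best = (np.inf, None, None, None)
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]   # (threshold, left-leaf value, right-leaf value)

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([5.0, 5.2, 4.8, 20.0, 19.8, 20.2])
thr, left_val, right_val = fit_stump(x, y)
print(thr, round(left_val, 2), round(right_val, 2))
```

A real decision tree applies this split search recursively to each side until a stopping criterion (depth, minimum samples) is reached.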

Ridge Regression:
Ridge regression is a robust version of linear regression in which a
small amount of bias is introduced so that we can get better long-term predictions.
The amount of bias added to the model is known as the ridge regression penalty.
This penalty term is computed by multiplying lambda by the squared weight of
each individual feature (an L2 penalty).
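As a sketch (synthetic data assumed), ridge regression has the closed form w = (XᵀX + λI)⁻¹Xᵀy, and increasing λ visibly shrinks the coefficients relative to ordinary least squares:

```python
import numpy as np

# Synthetic data: y depends linearly on three features, plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

# Closed-form ridge solution: w = (X^T X + lambda * I)^-1 X^T y.
def ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_ols = ridge(X, y, lam=0.0)    # lambda = 0 recovers ordinary least squares
w_reg = ridge(X, y, lam=10.0)   # the L2 penalty shrinks the weights
print(np.linalg.norm(w_reg) < np.linalg.norm(w_ols))
```

The shrinkage trades a little bias for lower variance, which is the "better long-term predictions" effect described above.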

Lasso Regression:
Lasso regression is another regularization technique used to reduce the complexity of the
model.
It is similar to ridge regression, except that the penalty term contains the
absolute values of the weights instead of their squares (an L1 penalty).
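One practical consequence of the L1 penalty is that it can set small coefficients exactly to zero, unlike ridge, which only shrinks them. This shows up in the soft-thresholding operator used by coordinate-descent lasso solvers; a minimal sketch with assumed example weights:

```python
import numpy as np

# Soft-thresholding: the per-coordinate update inside coordinate-descent
# lasso. Coefficients with magnitude below lam become exactly zero,
# so lasso also performs feature selection.
def soft_threshold(w, lam):
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([2.5, -0.3, 0.05, -1.2])
print(soft_threshold(w, lam=0.5).tolist())
```

Here the two coefficients smaller than lam in magnitude are zeroed out, while the larger ones are shrunk toward zero by lam.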
