
Assignment 12'

Uploaded by

llp936186
Copyright
© All Rights Reserved

UNIVERSITY OF MALAKAND
Name: Aizaz khan
Roll No: 33
Department: Tourism & Hotel Management
Semester: 3rd

Assignment
Subject: Statistics
Topic: Correlation & regression
Submitted to: Sir Habib Ullah

Correlation and Regression


Correlation and regression are the two most commonly used techniques for
investigating the relationship between quantitative variables. Here regression refers to
linear regression. Correlation measures the strength and direction of the relationship
between the variables, whereas linear regression uses an equation to express this
relationship.

Correlation and regression are used to describe some form of association between
quantitative variables that are assumed to have a linear relationship. In this article, we
will learn more about these topics, look at the difference between correlation and
regression, and see some associated examples.

What are Correlation and Regression?


Correlation and regression are statistical techniques used to describe the
relationship between two variables. For example, if a person drives an expensive car,
it is commonly assumed that she is financially well off. Correlation and regression are
used to quantify such a relationship numerically.

Correlation
Definition: Correlation is a measure used to quantify the relationship between
variables. If an increase (or decrease) in one variable is accompanied by a
corresponding increase (or decrease) in the other, the two variables are said to be
directly correlated. Similarly, if an increase in one is accompanied by a decrease in the
other, or vice versa, the variables are said to be indirectly (inversely) correlated. If a
change in one variable is not accompanied by any systematic change in the other, they
are uncorrelated. Thus, correlation can be positive (direct correlation), negative
(indirect correlation), or zero. The strength and direction of this relationship are given
by the correlation coefficient. Correlation by itself does not show that one variable
causes the other to change.

Regression
Definition: Regression is a technique used to quantify how a change in one variable
(the independent variable) is associated with a change in another (the dependent
variable). Regression describes and predicts this relationship; on its own it does not
prove cause and effect. Linear regression is the most commonly used type of regression
because it is easier to analyze than the rest. It finds the line that best fits the data in
order to establish a relationship between the variables.

Correlation and Regression Analysis



Both correlation and regression analysis are done to quantify the strength of
the relationship between two variables by using numbers. Graphically,
correlation and regression analysis can be visualized using scatter plots.

Correlation analysis is done to determine whether there is a relationship between
the variables that are being tested. Furthermore, a correlation coefficient such as
Pearson's correlation coefficient gives a signed numeric value that depicts the strength
as well as the direction of the correlation. On a scatter plot, each data point is plotted
at its (x, y) coordinates, so the correlation between the two variables x and y shows up
as the overall trend of the points.

Regression analysis is used to determine the relationship between two variables
such that the value of the unknown variable can be estimated using the knowledge of
the known variables. The goal of linear regression is to find the best-fitted line through
the data points. For two variables, x and y, the regression analysis can be visualized as
a straight line drawn through the scatter plot of the data points.

Pearson's Correlation Coefficient:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
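
As a minimal sketch, Pearson's coefficient can be computed directly from its definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the three small data sets are illustrative assumptions, chosen to show a direct, an inverse, and a zero correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator parts: sums of squared deviations of each sample.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one sample is constant")
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_direct = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises
r_none = pearson_r(x, [2, 1, 0, 1, 2])      # symmetric pattern, no linear trend
print(r_direct, r_inverse, r_none)
```

The third data set has r equal to zero even though y clearly depends on x, which is a reminder that the coefficient only measures linear association.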

Ordinary Least Squares (OLS) Linear Regression:


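The straight line equation is given as y = α + βx

$$\beta = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$$\beta = r_{xy} \, \frac{\sigma_y}{\sigma_x}$$

$$\alpha = \bar{y} - \beta \bar{x}$$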
Here, $\bar{x}$ is the mean and $\sigma_x$ is the standard deviation of the first data
set, where each data point is represented by $x_i$. Similarly, $\bar{y}$ is the mean and
$\sigma_y$ is the standard deviation of the second data set, and n is the number of data
points in each data set.
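
The following sketch computes the slope and intercept from the deviation sums and then checks that the second expression for the slope (the correlation coefficient times the ratio of standard deviations) gives the same number. The function name `ols_fit` and the sample data are assumptions made for illustration. The sample standard deviation (`statistics.stdev`) is used here; the population version would give the same ratio as long as the same kind is used for both variables.

```python
from math import sqrt
from statistics import mean, stdev


def ols_fit(xs, ys):
    """Return (alpha, beta) of the least-squares line y = alpha + beta * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx               # slope from the deviation sums
    alpha = y_bar - beta * x_bar   # the line passes through (x_bar, y_bar)
    return alpha, beta


# Illustrative data (an assumption, not from the assignment).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

alpha, beta = ols_fit(x, y)

# Second form of the slope: r * (sigma_y / sigma_x).
x_bar, y_bar = mean(x), mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / sqrt(sxx * syy)
beta_from_r = r * stdev(y) / stdev(x)

print(f"alpha={alpha:.3f}, beta={beta:.3f}, beta_from_r={beta_from_r:.3f}")
```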

Difference between Correlation and Regression


Correlation and regression are both used as statistical measurements to get a good
understanding of the relationship between variables. If the correlation coefficient is
negative (or positive) then the slope of the regression line will also be negative (or
positive). The table below highlights the key differences between correlation and
regression.
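
The agreement in sign between the correlation coefficient and the slope follows from the slope formula given earlier, assuming neither variable is constant so that both standard deviations are strictly positive:

$$\beta = r_{xy} \, \frac{\sigma_y}{\sigma_x}, \qquad \sigma_x > 0, \ \sigma_y > 0 \quad \Longrightarrow \quad \operatorname{sign}(\beta) = \operatorname{sign}(r_{xy})$$

The slope is the correlation coefficient multiplied by a positive number, so the two always share the same sign, and the slope is zero exactly when the correlation is zero.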

| Correlation | Regression |
|---|---|
| Correlation is used to determine whether variables are related or not. | Regression is used to numerically describe how a dependent variable changes with a change in an independent variable. |
| Correlation tries to establish a linear relationship between variables. | It finds the best-fitted regression line to estimate an unknown variable on the basis of the known variable. |
| The variables can be used interchangeably. | The variables cannot be interchanged. |
| Correlation uses a signed numerical value to estimate the strength of the relationship between the variables. | Regression is used to show the impact of a unit change in the independent variable on the dependent variable. |
| The Pearson's coefficient is the best measure of correlation. | The least-squares method is the best technique to determine the regression line. |
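
The row about interchanging the variables can be checked numerically. In this sketch (the helper names `deviation_sums`, `pearson_r`, `slope` and the data are illustrative assumptions), swapping x and y leaves the correlation coefficient unchanged but gives a different regression slope.

```python
from math import sqrt


def deviation_sums(xs, ys):
    """Return (sxy, sxx, syy): the sums of deviation products and squares."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    sxy, sxx, syy = deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def slope(independent, dependent):
    """Least-squares slope of `dependent` regressed on `independent`."""
    sxy, sxx, _ = deviation_sums(independent, dependent)
    return sxy / sxx


# Illustrative data (an assumption, not from the assignment).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)          # same value: correlation is symmetric
slope_y_on_x = slope(x, y)      # x independent, y dependent
slope_x_on_y = slope(y, x)      # roles swapped: a different line

print(r_xy, r_yx, slope_y_on_x, slope_x_on_y)
```

The second slope is not simply the reciprocal of the first, which is why the choice of dependent and independent variable matters in regression but not in correlation.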
