
Multiple Linear Regression

• Most regression problems involve more than one independent
  variable (predictor).
• If each independent variable varies in a linear manner with Y, the
  estimated regression function is:

  ŷᵢ = b₀ + b₁X₁ᵢ + b₂X₂ᵢ + ⋯ + bₖXₖᵢ

• The optimal values for the bᵢ can again be found by minimizing the
  sum of squared errors (SSE), as the sketch below illustrates.
• The resulting function fits a hyperplane to our sample data.
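To make the SSE minimization concrete, here is a minimal Python sketch (not part of the original slides); the data are simulated purely for illustration:

```python
# Minimal sketch: estimating b0..bk by minimizing SSE with NumPy.
# The data here are simulated purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(38, 3))                     # three predictors
y = 5 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=38)

A = np.column_stack([np.ones(len(X)), X])        # add intercept column
b, sse, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(b)    # least-squares estimates [b0, b1, b2, b3]
print(sse)  # sum of squared errors at the optimum
```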

Example Dataset: IQ scores
• Dependent variable (y): Performance IQ scores (PIQ) from the revised
Wechsler Adult Intelligence Scale. This variable served as the investigator’s
measure of the individual's intelligence.
• Potential independent variable (x1): Brain size based on the count
obtained from MRI scans (given as count/10,000)
• Potential independent variable (x2): Height in inches
• Potential independent variable (x3): Weight in pounds
• Potential independent variable (x4): Gender (categorical variable):
  0 = male, 1 = female

Let's start with some descriptive statistics
• Analyze → Descriptive Statistics→ Descriptives
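For readers working outside SPSS, a rough pandas equivalent of this menu path; the file name iq_brain.csv and the column names PIQ, Brain, Height, Weight, Gender are assumptions for illustration:

```python
# Rough pandas equivalent of Analyze → Descriptive Statistics → Descriptives.
# File name and column names are assumptions.
import pandas as pd

df = pd.read_csv("iq_brain.csv")  # columns: PIQ, Brain, Height, Weight, Gender
print(df[["PIQ", "Brain", "Height", "Weight"]].describe())
```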

Descriptive statistics SPSS output

Descriptive Statistics

                       N    Minimum   Maximum   Mean      Std. Deviation
Performance IQ score   38   72        150       111.34    22.598
Brain (MRI)            38   79.06     107.95    90.6758   7.25628
Height                 38   62.0      77.0      68.421    3.9938
Weight                 38   106       192       151.05    23.479
Valid N (listwise)     38
For the categorical variable
• Analyze → Descriptive Statistics→ Frequencies
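A rough pandas equivalent of the Frequencies procedure; file and column names are the same assumptions as above:

```python
# Rough pandas equivalent of Analyze → Descriptive Statistics → Frequencies.
# File name and column name are assumptions.
import pandas as pd

df = pd.read_csv("iq_brain.csv")
counts = df["Gender"].value_counts()
print(pd.DataFrame({"Frequency": counts,
                    "Percent": 100 * counts / counts.sum()}))
```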

Frequencies SPSS output
Gender

               Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Male    19          50.0      50.0            50.0
       Female  19          50.0      50.0            100.0
       Total   38          100.0     100.0
Correlation analysis
• Analyze → Correlate→ Bivariate
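A sketch of the same bivariate analysis in Python; scipy's pearsonr returns the correlation with its two-tailed p-value, mirroring the SPSS output. File and column names are assumptions:

```python
# Pearson correlations with two-tailed p-values, mirroring SPSS's
# Bivariate output. File and column names are assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("iq_brain.csv")
r, p = stats.pearsonr(df["PIQ"], df["Brain"])
print(f"r = {r:.3f}, p = {p:.3f}")   # the slide reports r = .378, p = .019
print(df[["PIQ", "Brain", "Height", "Weight"]].corr())  # full r matrix
```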

Correlation analysis SPSS output
Correlations

                                   Performance   Brain (MRI)   Height
                                   IQ score
Brain (MRI)   Pearson Correlation  .378*
              Sig. (2-tailed)      .019
              N                    38
Height        Pearson Correlation  -.093         .588**
              Sig. (2-tailed)      .578          .000
              N                    38            38
Weight        Pearson Correlation  .003          .513**        .700**
              Sig. (2-tailed)      .988          .001          .000
              N                    38            38            38

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
Regression Model Building
Three tables should be included in the regression analysis:
1- Table of Coefficients (Estimates).
2- Fit Measure (adjusted R squared).
3- ANOVA Table.

Let’s start with including all independent
variables (Full Regression Model)
• Analyze → Regression→ linear
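A statsmodels sketch of the same full model, with all four predictors entered at once; file and column names are assumptions:

```python
# Fitting the full model, mirroring Analyze → Regression → Linear with
# all four predictors entered. File and column names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("iq_brain.csv")
model = smf.ols("PIQ ~ Brain + Height + Weight + Gender", data=df).fit()
print(model.summary())   # coefficients, t-tests, R-squared, ANOVA F-test
```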

Table of Coefficients
Coefficientsᵃ

                   Unstandardized          Standardized
                   Coefficients            Coefficients
Model              B          Std. Error   Beta           t        Sig.
1  (Constant)      107.217    62.052                      1.728    .093
   Brain (MRI)     2.199      .563         .706           3.907    <.001
   Height          -2.644     1.212        -.467          -2.182   .036
   Weight          -.064      .199         -.066          -.321    .750
   Gender          -9.460     6.544        -.212          -1.446   .158
a. Dependent Variable: Performance IQ score
Table of Coefficients
The estimated regression equation can be written as:

ŷ = 107.2 + 2.2(Brain) − 2.64(Height) − 0.06(Weight) − 9.46(Gender)

The coefficients may be interpreted as follows:
• b₀ (Constant) = 107.2 : The predicted performance IQ score (PIQ) is
  107.2 when brain size, height, and weight are all zero and gender is
  male. (This has no practical meaning.)
• b₁ (Brain) = 2.2 : When brain size increases by one unit, PIQ
  increases by 2.2 units, holding the other variables constant.
• b₂ (Height) = −2.64 : When height increases by one inch, PIQ
  decreases by 2.64 units, holding the other variables constant.
Table of Coefficients
The estimated regression equation can be written as:

ŷ = 107.2 + 2.2(Brain) − 2.64(Height) − 0.06(Weight) − 9.46(Gender)

The coefficients may be interpreted as follows:
• b₃ (Weight) = −0.06 : When weight increases by one pound, PIQ
  decreases by 0.06 units, holding the other variables constant.
• b₄ (Gender) = −9.46 : Females (coded as 1) on average score 9.46
  points lower than males (coded as 0), holding the other variables
  constant.
• However, the p-values for both weight and gender are higher than
  the 5% significance level. Thus, they should be removed from the
  final regression model and the regression equation re-estimated. (A
  worked prediction from the full equation is sketched below.)
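As a worked example of using the estimated equation, the predictor values below (Brain = 90, Height = 68 in, Weight = 150 lb, female) are hypothetical, chosen only to illustrate the arithmetic:

```python
# Worked prediction from the estimated full-model equation for a
# hypothetical female; the predictor values are illustrative only.
b0, b1, b2, b3, b4 = 107.2, 2.2, -2.64, -0.06, -9.46
piq_hat = b0 + b1 * 90 + b2 * 68 + b3 * 150 + b4 * 1
print(round(piq_hat, 2))   # 107.22
```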

Multicollinearity
• Multicollinearity is a high correlation between two independent
  variables such that the two variables contribute redundant
  information to the model. When highly correlated independent
  variables are included in the regression model, they can adversely
  affect the regression results.
Multicollinearity
• One method of measuring and detecting multicollinearity is known as
the variance inflation factor (VIF).
• A VIF equal to 1.0 for a given independent variable indicates that this
independent variable is not correlated with the remaining
independent variables in the model.
• The greater the multicollinearity, the larger the VIF.

Multicollinearity
• Generally, if VIF <5 for a particular independent variable, then we do
not consider multicollinearity a problem for that variable.
• VIF ≥ 5 implies that the correlation among the independent variables
  is too strong and should be dealt with by dropping variables from the
  model, as in the sketch below.
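A statsmodels sketch for computing the VIFs of the full model; file and column names are assumptions:

```python
# Computing VIFs for the full model with statsmodels. The constant must
# be included in the design matrix but is skipped when reporting.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("iq_brain.csv")   # file and column names assumed
X = sm.add_constant(df[["Brain", "Height", "Weight", "Gender"]])
for i, name in enumerate(X.columns[1:], start=1):
    print(name, round(variance_inflation_factor(X.values, i), 3))
```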

SPSS VIF
• Analyze → Regression→ linear → Statistics → Collinearity diagnostics

SPSS VIF output

Collinearity Statistics

Model            Tolerance   VIF
1  (Constant)
   Brain (MRI)   .615        1.626
   Height        .438        2.282
   Weight        .470        2.129
   Gender        .933        1.072
Selecting the Model
• We want to identify the simplest model that adequately accounts for
the systematic variation in the Y variable.
• Arbitrarily using all the independent variables may result in
overfitting. We want to avoid overfitting the data.
• As additional independent variables are added to a model:
- The R2 statistic can only increase.
- The Adjusted-R2 statistic can increase or decrease.
• The R2 statistic can be artificially inflated by adding any independent
  variable to the model. We can compare adjusted R2 values as a
  heuristic to tell whether adding an additional independent variable
  really helps (see the check below).
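The adjusted R2 penalizes R2 for the number of predictors k. A quick arithmetic check against the values reported later in the Model Summary (n = 38, k = 2, R2 = .295 for the final stepwise model):

```python
# Adjusted R-squared = 1 - (1 - R2) * (n - 1) / (n - k - 1).
# n, k, and R2 are taken from the final stepwise model on the slides.
n, k, r2 = 38, 2, 0.295
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 3))   # 0.255, matching the SPSS Model Summary
```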
Stepwise Regression
• One option in regression analysis is to bring all possible independent
variables into the model in one step (Full regression). This is what we
have done in the previous sections.
• Another option for developing a regression model is called stepwise
regression.
• Stepwise regression is the step-by-step iterative construction of a
regression model that involves the selection of independent
variables to be used in a final model.
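A minimal forward-selection sketch of the idea; statsmodels has no built-in stepwise procedure, so this loop is an illustration under a 0.05 entry criterion, not SPSS's exact algorithm (which can also remove previously entered variables). File and column names are assumptions:

```python
# Forward selection with a 0.05 entry p-value criterion (illustrative).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("iq_brain.csv")
candidates = ["Brain", "Height", "Weight", "Gender"]
chosen = []
while candidates:
    # p-value of each candidate when added to the current model
    pvals = {v: smf.ols("PIQ ~ " + " + ".join(chosen + [v]),
                        data=df).fit().pvalues[v]
             for v in candidates}
    best = min(pvals, key=pvals.get)
    if pvals[best] >= 0.05:
        break                      # no remaining variable enters
    chosen.append(best)
    candidates.remove(best)
print(chosen)   # per the slides, this should select Brain then Height
```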

Stepwise Regression in SPSS
• Analyze → Regression→ linear → Chose Method: Stepwise

Stepwise Regression in SPSS Output
Coefficientsᵃ

                 Unstandardized        Standardized                   Collinearity
                 Coefficients          Coefficients                   Statistics
Model            B         Std. Error  Beta          t        Sig.    Tolerance  VIF
1  (Constant)    4.652     43.712                    .106     .916
   Brain (MRI)   1.177     .481        .378          2.448    .019    1.000      1.000
2  (Constant)    111.276   55.867                    1.992    .054
   Brain (MRI)   2.061     .547        .662          3.770    .001    .654       1.529
   Height        -2.730    .993        -.482         -2.749   .009    .654       1.529
a. Dependent Variable: Performance IQ score
Stepwise Regression in SPSS Output
• The final estimated regression equation can be written as:
ŷ = 111.3 + 2.06(Brain) − 2.73(Height)
The coefficients may be interpreted as follows:
• 𝑏1 (𝐵𝑟𝑎𝑖𝑛) = 2.06 : When the brain size increases by one unit, PIQ
increases by 2.06 units holding other variables constant.
• 𝑏2 𝐻𝑒𝑖𝑔ℎ𝑡 = −2.73 : When the height increases by one inch, PIQ
decreases by 2.73 units holding other variables constant.

Testing the parameters (regression coefficients)
We can test the significance of each parameter β0, β1 and β2 using a
t-test, as follows:
H0: β0 = 0    H1: β0 ≠ 0
P-value = 0.054
Since the P-value for the t-test in the table is slightly above 0.05, strictly we
fail to reject H0 at the 5% level; the constant is only borderline significant. In
practice, the constant is usually retained in the model regardless.
H0: β1 = 0    H1: β1 ≠ 0
P-value < 0.001
Since the P-value for the t-test in the table is less than 0.05, we reject H0.
This suggests that the slope parameter for Brain (β1) is significant, which
means that β1 is significantly different from 0.
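These t statistics can be recovered directly from the table: t = b / SE(b), compared against a t distribution with n − k − 1 degrees of freedom. A check for Brain, using the values reported on the slides:

```python
# Recovering the t statistic and p-value for Brain in the final model
# from the reported estimate and standard error (slide values).
from scipy import stats

b1, se1 = 2.061, 0.547
df_resid = 38 - 2 - 1              # n - k - 1 = 35
t = b1 / se1
p = 2 * stats.t.sf(abs(t), df_resid)
print(round(t, 2), round(p, 4))    # ≈ 3.77, p ≈ 0.0006 (reported as .001)
```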

Testing the parameters (regression coefficients)
We can test the significance of each parameter β0, β1 and β2 using a
t-test, as follows:
H0: β2 = 0    H1: β2 ≠ 0
P-value = 0.009
Since the P-value for the t-test is less than 0.05, we reject H0. This
suggests that the slope parameter for Height (β2) is significant, which
means that β2 is significantly different from 0.
Model Fit
Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .378ᵃ   .143       .119                21.212
2       .543ᵇ   .295       .255                19.510
a. Predictors: (Constant), Brain (MRI)
b. Predictors: (Constant), Brain (MRI), Height

• The adjusted R-square suggests that 25.5% of the variation in PIQ is
  explained by brain size and height. The remaining 74.5% is left
  unexplained by the regression model.
Testing the overall model (ANOVA)
ANOVAᵃ

Model           Sum of Squares   df   Mean Square   F       Sig.
1  Regression   2697.094         1    2697.094      5.994   .019ᵇ
   Residual     16197.459        36   449.929
   Total        18894.553        37
2  Regression   5572.741         2    2786.371      7.321   .002ᶜ
   Residual     13321.811        35   380.623
   Total        18894.553        37
a. Dependent Variable: Performance IQ score
b. Predictors: (Constant), Brain (MRI)
c. Predictors: (Constant), Brain (MRI), Height
Testing the overall model (ANOVA)
H0: β1 = β2 = 0 (model is not significant)
H1: At least one βᵢ ≠ 0 (model is significant)
• The P-value for the ANOVA F-test of the final model is 0.002, which is
  less than the 0.05 significance level. So, we reject H0.
• This means that the overall regression model is significant and can be
  used for prediction.
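The F statistic itself is the ratio of the mean squares from the ANOVA table, with df = (k, n − k − 1). A check using the final-model values reported above:

```python
# Reproducing the overall F test for the final model from the ANOVA
# table: F = MS(Regression) / MS(Residual).
from scipy import stats

ms_reg, ms_res = 2786.371, 380.623
F = ms_reg / ms_res
p = stats.f.sf(F, 2, 35)          # df = (k, n - k - 1) = (2, 35)
print(round(F, 3), round(p, 3))   # ≈ 7.321, p ≈ 0.002
```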

Linear Regression Assumptions
It is essential to check the assumptions of the linear regression model.
If the assumptions are valid, the regression results will be reliable.
There are several assumptions; the most important are:
1- Linearity: The relationship between the X’s and Y should be
linear.
2- Multicollinearity: There should be no (or little) multicollinearity.
3- Homoscedasticity: The variance of residual (error term) is the
same for any value of X.
4- Normality: The residuals (error terms) should be normally
distributed.

1- Linearity
• Linearity can be checked by plotting the residuals (on the vertical
  axis) against each independent variable (on the horizontal axis): the
  points should scatter randomly, with no systematic curvature.
• Alternatively, linearity can be checked by plotting the outcome
  variable against each independent (predictor) variable: the pattern
  should be approximately linear.
• A curved pattern suggests that a linear model may not be the best fit
  and that a more complex model (for example, one with a quadratic
  term) may be needed.
Linearity Checking SPSS
• Analyze → Regression→ linear → Save → Check unstandardized
residuals
Note: SPSS will save the residuals of the model and add a new variable
(RES_1) to the data set, which contains the calculated residuals.
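A Python sketch of the same step: save the residuals and plot them against each predictor. File and column names are assumptions:

```python
# Save residuals and plot them against each predictor to check linearity.
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("iq_brain.csv")   # file and column names assumed
fit = smf.ols("PIQ ~ Brain + Height", data=df).fit()
df["RES_1"] = fit.resid            # mirrors SPSS's saved RES_1 variable

for var in ["Brain", "Height"]:
    plt.scatter(df[var], df["RES_1"])
    plt.xlabel(var)
    plt.ylabel("Residual")
    plt.show()                     # look for a random, patternless cloud
```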

Scatter plot of the residuals against the independent variables

Since the residuals show random patterns against each independent
variable, the linearity assumption is satisfied.
2- Multicollinearity
• Multicollinearity can be checked by computing the Variance Inflation
Factor (VIF) discussed earlier.
• Generally, if VIF <5 for a particular independent variable, then we do
not consider multicollinearity a problem for that variable.

3- Homoscedasticity
• Residual plots also can be used to determine whether the residuals
have a constant variance.
• When we have developed a multiple regression model, we can
  analyze the equal variance assumption by plotting the residuals
  against the fitted ŷ values.
• Analyze → Regression→ linear → Save → Check unstandardized
residuals AND Check unstandardized Predicted values
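A Python sketch of this plot; a roughly constant vertical spread of points suggests homoscedasticity. File and column names are assumptions:

```python
# Residuals against fitted values for the equal-variance check.
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("iq_brain.csv")   # file and column names assumed
fit = smf.ols("PIQ ~ Brain + Height", data=df).fit()
plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, linestyle="--")     # reference line at zero residual
plt.xlabel("Fitted PIQ")
plt.ylabel("Residual")
plt.show()
```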

Scatter plot of the residuals against the fitted values

The variance of the residuals stays approximately constant over the
range of the fitted values, so the equal-variance assumption is
satisfied.
4- Normality
• The need for normally distributed model errors occurs when we want
to test a hypothesis about the regression model.
• Small departures from normality do not cause serious problems.
• However, if the model errors depart dramatically from a normal
distribution, there is cause for concern.
• Examining the residuals will allow us to detect such dramatic
departures.
• One method for graphically analyzing the residuals is to form a
  frequency histogram of the residuals to determine whether their
  general shape is normal, or to use a Q-Q plot, as sketched below.
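A Python sketch of the Q-Q plot using statsmodels; points near the reference line support the normality assumption. File and column names are assumptions:

```python
# Q-Q plot of the residuals against a normal distribution.
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("iq_brain.csv")   # file and column names assumed
fit = smf.ols("PIQ ~ Brain + Height", data=df).fit()
sm.qqplot(fit.resid, line="45", fit=True)   # standardize, then 45° line
plt.show()
```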

SPSS Q-Q plot
• Analyze → Descriptive Statistics → Q-Q Plots
Normal Q-Q plot of unstandardized residuals
If the points lie close to the 45-degree line, normality is achieved.
Otherwise, the normality assumption is violated.