Lecture 3

The document discusses model misspecification and its implications in regression analysis, particularly focusing on heteroskedasticity, serial correlation, and multicollinearity. It explains the consequences of these violations on statistical inference and the reliability of regression coefficients, emphasizing the importance of correcting for these issues through robust standard errors and other methods. Additionally, it outlines testing methods for detecting these violations and suggests potential solutions for multicollinearity.


Model Misspecification
• Model specification refers to the set of variables included in the regression and
the regression equation’s functional form.
• When estimating a regression, we assume it has the correct functional form, an
assumption that can fail in different ways, as shown in Exhibit 2.
Violations of Regression Assumptions: Heteroskedasticity
• An important assumption underlying linear regression is that the variance of
errors is constant across observations (errors are homoskedastic).
• Residuals in financial model estimations, however, are often heteroskedastic,
meaning the variance of the residuals differs across observations.
• Heteroskedasticity may arise from model misspecification, including omitted
variables, incorrect functional form, and incorrect data transformations,
as well as from extreme values of independent variables.
Consequences of Heteroskedasticity

• There are two broad types of heteroskedasticity: unconditional and conditional.


• Unconditional heteroskedasticity occurs when the error variance is not
correlated with the regression’s independent variables.
• Although it violates a linear regression assumption, this form of
heteroskedasticity creates no major problems for statistical inference.
• Conditional heteroskedasticity occurs when the error variance is correlated with (conditional on) the values of the independent variables.
• This type of heteroskedasticity is more problematic and may lead to mistakes in statistical inference.
• When errors are conditionally heteroskedastic, the F-test for the overall regression significance is unreliable because the MSE becomes a biased estimator of the true population variance.
• Moreover, t-tests of individual regression coefficients are unreliable because
heteroskedasticity introduces bias into estimators of the standard error of
regression coefficients.
• Thus, in regressions with financial data, the most likely impacts of conditional
heteroskedasticity are that standard errors will be underestimated, so t-statistics
will be inflated.
• If there is conditional heteroskedasticity in the estimated model, we tend to find significant relationships where none actually exist and commit more Type I errors (rejecting the null hypothesis when it is actually true).
Testing for Conditional Heteroskedasticity
• The Breusch–Pagan (BP) test is widely used in financial analysis to diagnose
potential conditional heteroskedasticity and is best understood via the three-
step process shown in Exhibit 4

If conditional heteroskedasticity is present in the initial regression, the independent variables will explain a significant portion of the variation in the squared residuals in Step 2.
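A minimal sketch of the BP test is shown below; the simulated data, variable names, and the use of Python's statsmodels library are illustrative assumptions, not part of the lecture.

```python
# Illustrative sketch only: simulated data and statsmodels' het_breuschpagan.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=(n, 2))
# Error variance depends on the first regressor -> conditional heteroskedasticity.
errors = rng.normal(scale=np.exp(0.5 * x[:, 0]))
y = 1.0 + 0.8 * x[:, 0] - 0.3 * x[:, 1] + errors

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()        # Step 1: estimate the initial regression

# Steps 2-3: regress the squared residuals on the independent variables and
# compare n*R^2 with a chi-square distribution (k degrees of freedom).
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, X)
print(f"BP LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
```

A small p-value for the LM statistic suggests rejecting the null of homoskedasticity.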
Correcting for Heteroskedasticity

• It is important to note that in efficient markets, heteroskedasticity should generally not be observed in financial data.
• However, if heteroskedasticity is detected, for example, in the form of
volatility clustering—where large (small) changes tend to be followed by large
(small) changes—then it presents an opportunity to forecast asset returns
that should be exploited to generate alpha.
• So, analysts should not only correct problems in their models due to
heteroskedasticity but also understand the underlying processes in their
data and capitalize on them.
• The easiest method to correct for the effects of conditional heteroskedasticity in
linear regression is to compute robust standard errors, which adjust the
standard errors of the regression’s estimated coefficients to account for the
heteroskedasticity.
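As a minimal sketch (assuming statsmodels and simulated heteroskedastic data like the example above, neither of which comes from the lecture), robust standard errors can be requested when fitting the regression; the coefficient estimates are unchanged, only the standard errors and t-statistics are adjusted.

```python
# Illustrative sketch only: robust (White-type) standard errors in statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
errors = rng.normal(scale=np.exp(0.5 * x[:, 0]))    # conditionally heteroskedastic
y = 1.0 + 0.8 * x[:, 0] - 0.3 * x[:, 1] + errors
X = sm.add_constant(x)

plain_fit = sm.OLS(y, X).fit()                   # conventional standard errors
robust_fit = sm.OLS(y, X).fit(cov_type="HC1")    # heteroskedasticity-robust SEs

print(plain_fit.bse)     # standard errors, likely understated here
print(robust_fit.bse)    # robust standard errors
```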
VIOLATIONS OF REGRESSION ASSUMPTIONS: SERIAL
CORRELATION

• A common and serious problem in multiple linear regression is violation of the assumption that regression errors are uncorrelated across observations.
• When regression errors are correlated across observations, they are serially
correlated.
The Consequences of Serial Correlation
• The main problem caused by serial correlation in linear regression is an
incorrect estimate of the regression coefficients’ standard errors.
• If none of the regressors is a previous value—a lagged value—of the dependent
variable, then the estimated parameters themselves will be consistent and need
not be adjusted for the effects of serial correlation.
• But if one of the independent variables is a lagged value of the dependent
variable, serial correlation in the error term causes all parameter estimates to be
inconsistent—that is, invalid estimates of the true parameters.
• Positive serial correlation is present when a positive residual for one
observation increases the chance of a positive residual in a subsequent
observation, resulting in a stable pattern of residuals over time.

• Positive serial correlation also means a negative residual for one observation
increases the chance of a negative residual for another observation.

• Conversely, negative serial correlation has the opposite effect, so a positive residual for one observation increases the chance of a negative residual for another observation, and so on.

• We examine positive serial correlation because it is the most common type and assume first-order serial correlation, or correlation between adjacent observations. In a time series, this means the sign of the residual tends to persist from one period to the next (a short simulation after this list illustrates the pattern).
• Positive serial correlation does not affect the consistency of regression
coefficients, but it does affect statistical tests.

• First, the F-statistic may be inflated because the MSE will tend to
underestimate the population error variance.

• Second, positive serial correlation typically causes standard errors to be underestimated, so t-statistics are inflated, which (as with heteroskedasticity) leads to more Type I errors.
• Importantly, if a time series exhibits serial correlation, this means that there
is some degree of predictability to it.

• In the case of asset prices, if these prices were to exhibit a pattern, investors
would likely discern this pattern and exploit it to capture alpha, thereby
eliminating such a pattern.

• This idea follows directly from the efficient market hypothesis.


• Consequently, assuming market efficiency (even weak form), we should not observe serial correlation in financial market data.
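The short simulation below (illustrative only; the AR(1) coefficient of 0.8 and the variable names are assumptions, not from the lecture) shows positive first-order serial correlation, where the sign of one residual tends to carry over to the next period.

```python
# Illustrative sketch only: AR(1) errors with positive serial correlation.
import numpy as np

rng = np.random.default_rng(7)
n, rho = 200, 0.8
shocks = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + shocks[t]    # e_t = rho * e_(t-1) + shock_t

# With positive first-order serial correlation, adjacent residuals tend to
# share the same sign and the lag-1 autocorrelation is well above zero.
same_sign = np.mean(np.sign(e[1:]) == np.sign(e[:-1]))
lag1_corr = np.corrcoef(e[1:], e[:-1])[0, 1]
print(f"share of adjacent residuals with the same sign: {same_sign:.2f}")
print(f"lag-1 autocorrelation of the residuals: {lag1_corr:.2f}")
```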
Testing for Serial Correlation
• There are a variety of tests for serial correlation, but the most common are the Durbin–Watson (DW) test and the Breusch–Godfrey (BG) test.

• The DW test is a measure of autocorrelation and compares the squared differences of successive residuals with the sum of the squared residuals.

• However, the DW test is limited because it applies only to testing for first-order serial correlation.

• The BG test is more robust because it can detect autocorrelation up to a pre-designated order p, where the error in period t is correlated with the error in period t – p.
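A minimal sketch of both tests follows (assuming statsmodels and a simulated regression with AR(1) errors; none of these choices comes from the lecture).

```python
# Illustrative sketch only: Durbin-Watson and Breusch-Godfrey tests.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(1)
n, rho = 300, 0.7
x = rng.normal(size=n)
shocks = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + shocks[t]        # serially correlated errors
y = 2.0 + 0.5 * x + e

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()

# DW = sum((e_t - e_(t-1))^2) / sum(e_t^2); values near 2 suggest no
# first-order serial correlation, values well below 2 suggest positive.
print(f"Durbin-Watson statistic: {durbin_watson(results.resid):.2f}")

# BG test for autocorrelation up to a pre-designated order p (here p = 4).
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=4)
print(f"BG LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
```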
Correcting for Serial Correlation
• The most common “fix” for a regression with significant serial correlation is to
adjust the coefficient standard errors to account for the serial correlation.

• Methods for adjusting standard errors are standard in many software packages.
The corrections are known by various names, including serial-correlation
consistent standard errors, serial correlation and heteroskedasticity adjusted
standard errors, Newey–West standard errors, and robust standard errors.

• An advantage of these methods is that they also correct for conditional heteroskedasticity.

• The robust standard errors, for example, use heteroskedasticity- and autocorrelation-consistent (HAC) estimators of the variance–covariance matrix in the regression estimation.
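A minimal sketch of Newey–West (HAC) standard errors is given below; statsmodels and the lag choice of 4 are illustrative assumptions, and many other software packages offer equivalent corrections.

```python
# Illustrative sketch only: Newey-West (HAC) standard errors in statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, rho = 300, 0.7
x = rng.normal(size=n)
shocks = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + shocks[t]        # serially correlated errors
y = 2.0 + 0.5 * x + e
X = sm.add_constant(x)

plain_fit = sm.OLS(y, X).fit()
# HAC covariance adjusts the standard errors for serial correlation and
# conditional heteroskedasticity; coefficient estimates are unchanged.
hac_fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})

print(plain_fit.bse)    # conventional standard errors (often too small here)
print(hac_fit.bse)      # Newey-West (HAC) standard errors
```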
VIOLATIONS OF REGRESSION ASSUMPTIONS:
MULTICOLLINEARITY
• An assumption of multiple linear regression is that there is no exact linear
relationship between two or more independent variables.

• However, multicollinearity may occur when two or more independent variables are highly correlated or when there is an approximate linear relationship among independent variables.

• With multicollinearity, the regression can be estimated, but interpretation of the role and significance of the independent variables is problematic.

• Multicollinearity is a serious concern because approximate linear relationships among economic and financial variables are common.
Consequences of Multicollinearity

• Multicollinearity does not affect the consistency of regression coefficient estimates, but it makes these estimates imprecise and unreliable.

• Moreover, it becomes impossible to distinguish the individual impacts of the independent variables on the dependent variable.

• These consequences are reflected in inflated standard errors and diminished t-statistics, so t-tests of coefficients have little power (ability to reject the null hypothesis).
Detecting Multicollinearity
• Except in the case of exactly two independent variables, using the magnitude
of pairwise correlations among the independent variables to assess
multicollinearity is generally inadequate.

• With more than two independent variables, high pairwise correlations are
not a necessary condition for multicollinearity.

• For example, despite low pairwise correlations, there may be approximate linear combinations among several independent variables (combinations that are not directly observable) that are themselves highly correlated.

• The classic symptom of multicollinearity is a high R² and significant F-statistic but t-statistics for the individual estimated slope coefficients that are not significant due to inflated standard errors.
• Fortunately, we can use the variance inflation factor (VIF) to quantify
multicollinearity issues.

• In a multiple regression, a VIF exists for each independent variable.

• Suppose we have k independent variables, X1, ..., Xk.
• By regressing one independent variable (Xj) on the remaining k – 1 independent variables, we obtain Rj² for the regression—the variation in Xj explained by the other k – 1 independent variables—from which the VIF for Xj is VIFj = 1/(1 – Rj²).
• For a given independent variable, Xj, the minimum VIFj is 1, which occurs when Rj² is 0, that is, when there is no correlation between Xj and the remaining independent variables.

• VIF increases as the correlation increases; the higher the VIF, the more likely a
given independent variable can be accurately predicted from the remaining
independent variables, making it increasingly redundant.

• The following are useful rules of thumb:

• VIFj > 5 warrants further investigation of the given independent variable.

• VIFj > 10 indicates serious multicollinearity requiring correction.
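A minimal sketch of the VIF calculation and the rules of thumb above is shown below; the simulated collinear regressors and the use of statsmodels' variance_inflation_factor are illustrative assumptions, not part of the lecture.

```python
# Illustrative sketch only: VIF_j = 1 / (1 - R_j^2) for each regressor.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
n = 400
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.10 * rng.normal(size=n)   # x2 is nearly a linear function of x1
x3 = rng.normal(size=n)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# Column 0 is the intercept, so VIFs are computed for columns 1..3.
for j, name in enumerate(["x1", "x2", "x3"], start=1):
    vif = variance_inflation_factor(X, j)
    flag = "serious" if vif > 10 else ("investigate" if vif > 5 else "ok")
    print(f"VIF({name}) = {vif:6.2f} -> {flag}")
```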


Correcting for Multicollinearity

• Possible solutions to multicollinearity include

• excluding one or more of the regression variables,

• using a different proxy for one of the variables,

• increasing the sample size.
