Lesson 4 Linear Assumptions

This document discusses checking assumptions in linear models. It covers checking for normality, constant variance, and nonlinearity through graphical methods like QQ plots and residual plots. It also discusses transforming both the response and predictor variables using approaches like Box-Cox transformations and polynomials to better meet linear model assumptions. Polynomials can be replaced with orthogonal polynomials or regression splines to avoid issues like collinearity. Splines use B-spline basis functions defined over intervals separated by knots to allow for flexible nonlinear relationships.


Linear and Generalized Linear Models (4433LGLM6Y)

Checking assumptions in Linear Model


Meeting 7

Vahe Avagyan
Biometris, Wageningen University and Research
Overview

• Checking assumptions in Linear Model (Fox, 12.1–12.2)
• Transformations (Fox, 12.3–12.4)
• Polynomials and splines (Faraway, 8.2–8.4)
Example: Survey of Labour and Income Dynamics

• SLID data
• wages: Composite hourly wage rate ($/hour).
• age: in years.
• sex: dummy variable (1 = male, 0 = female).
• education: Completed years of education.
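
As a concrete starting point, here is a minimal sketch of fitting this model in R, assuming the SLID data shipped with the carData package (there, sex is a factor that lm() turns into the dummy above). Later snippets reuse fit and slid from this sketch.

library(carData)
slid <- na.omit(SLID[, c("wages", "age", "sex", "education")])  # drop rows with missing values
fit <- lm(wages ~ age + sex + education, data = slid)           # the SLID regression
summary(fit)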
Model assumptions

• For a linear model to be a good model, there are four conditions that need to be fulfilled.

• Independence: The residuals are independent of each other.

• Linearity: The relationship between the variables can be described by a linear equation (also
called additivity)

• Equal variance: The residuals have equal variance (also called homoskedasticity)

• Normality: The distribution of the residuals is normal


Graphical check of normality

• A quantile comparison plot can give us a sense of which observations depart from normality.

• QQ-plot of residuals
• Plot studentized residuals E_i* against the normal or t(n−k−2) distribution.
• The difference between the two matters in small samples.
• In larger samples, internally studentized or raw residuals give the same impression.
• The QQ-plot is effective in displaying tail behavior.

• A histogram or smoothed histogram can also be used.
• The skew may help to choose a transformation.
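
A sketch of these checks in R, using the fit from the earlier snippet (car::qqPlot compares studentized residuals of an lm fit to the t reference by default):

library(car)
qqPlot(fit, id = FALSE)       # QQ-plot of studentized residuals vs t(n-k-2)
plot(density(rstudent(fit)))  # smoothed histogram of studentized residuals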
Graphical check of normality: Examples: SLID regression

• (a) QQ-plot and (b) smoothed histogram of studentized residuals E_i*.

• First row, not transformed.

• Second row, after the log-transformation.


Nonconstant Error Variance

• Error variance:

V(ε) = V(Y | x_1, …, x_k) = σ_ε²

• Heteroscedasticity = nonconstant error variance; homoscedasticity = constant error variance.

• Note: the LS estimator b remains unbiased and consistent even with nonconstant variance.
• Its efficiency is impaired (we can do better), and the usual formulas for standard errors are inaccurate.
• The harm produced by heteroscedasticity is relatively mild; worry if the largest error variance exceeds 4 times the smallest (i.e., the sd of the errors varies by more than a factor of 2).
Graphical check of constant variance

• Most important plot: the residual plot of e versus ŷ.


Graphical check of constant variance: Example

• Plot residuals E_i against fitted values Ŷ_i (not Y_i).
• Check for constant variance in the vertical direction; the scatter should be symmetric vertically about 0.
• Plot residuals against each X (whether included in or excluded from the model).
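
A minimal sketch of these plots in R, reusing fit and slid from above:

plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)  # scatter should be symmetric about this line
plot(slid$education, resid(fit), xlab = "Education", ylab = "Residuals")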
Graphical check of constant variance: Example

• Plot studentized residuals E_i*.
• Ordinary residuals have unequal variances, even when the error variance is constant.
• A pattern of changing spread is more easily seen by plotting |E_i*| or E_i*² against Ŷ.
• First row: not transformed.
• Second row: log-transformed.
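
A sketch of such a spread plot, reusing fit from above:

plot(fitted(fit), abs(rstudent(fit)),
     xlab = "Fitted values", ylab = "|Studentized residual|")  # look for a trend in spread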


Graphical check of constant variance: Example

• R provides default residual diagnostics via the plot() function:

1. E_i versus Ŷ_i
2. Normal QQ-plot of E_i′
3. √|E_i′| versus Ŷ_i
4. E_i′ versus leverage h_i
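
For the fit from the earlier sketch:

par(mfrow = c(2, 2))  # arrange the four default diagnostic plots in a grid
plot(fit)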
Nonlinearity

• E(ε) = 0 implies that the regression surface accurately reflects the dependency E(Y | X_1, …, X_k).

• The regression surface is generally high-dimensional.

• Focus on certain patterns of departure from linearity.

• Graphical diagnostics: plotting the point cloud of Y versus each X_i.

• Residual-based plots may be more informative.

• Nonlinearity can be monotone or non-monotone.


Component-Plus-Residual Plots (partial residual plots)

• Partial residual for the j-th regressor (E_i is the residual of the main regression model):

E_i^(j) = E_i + B_j X_ij,

i.e., add back the linear component of the partial relationship between Y and X_j to E_i.

• Plot E^(j) versus X_j.

• The multiple-regression coefficient B_j is the slope of the simple regression of E^(j) on X_j.

• Nonlinearity may be apparent in the plot.


Component-Plus-Residual Plots: Example

• SLID regression: the solid lines show the lowess smooths; the broken lines are least-squares fits.
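
These plots can be reproduced with car (a sketch, restricted to the numeric predictors of the earlier fit):

library(car)
crPlots(fit, terms = ~ age + education)  # smooth vs least-squares line in each panel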
Overview

• Checking assumptions in Linear Model


• Transformations
• Polynomials and splines
Transformations (response variable)

• Variable transformation may help to address possible violations of the assumptions:

• Log transformation: Y → ln Y

• Power transformation: Y → Y^λ (λ is the transformation parameter)

• We can obtain the MLE λ̂ using maximum likelihood estimation.

• Which λ_0 means no transformation?

• Check the hypothesis H_0: λ = λ_0 using a likelihood ratio, Wald, or score test.
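
A sketch of this machinery in car, using the earlier fit (summary() reports the ML estimate of λ with LR tests of λ = 0 and λ = 1):

library(car)
pt <- powerTransform(fit)
summary(pt)                      # MLE of lambda, Wald CI, LR tests
testTransform(pt, lambda = 0.5)  # test any other H0: lambda = lambda0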


Transformations: Box-Cox

• The aim of the Box-Cox transformation is to ensure the usual Linear Model assumptions hold:

Y_i^(λ) = β_0 + β_1 X_i1 + ⋯ + β_k X_ik + ε_i,

where

Y_i^(λ) = (Y_i^λ − 1)/λ   for λ ≠ 0
Y_i^(λ) = ln Y_i          for λ = 0

• Log-transformation is a particular case of Box-Cox (i.e., 𝜆 = 0).

• Which 𝜆 means no transformation?


Maximum Likelihood Estimation

• SLID regression: we can see a strong reason to transform (why?).

[Figure: profile of the log-likelihood over λ, with the ML estimate and CI_0.95(λ) marked.]
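
A sketch producing this plot with MASS, for the earlier fit (the λ grid is an assumption):

library(MASS)
boxcox(fit, lambda = seq(-0.5, 1, by = 0.05))  # dashed lines mark CI_0.95(lambda)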
Overview

• Checking assumptions in Linear Model


• Transformations
• Polynomials and splines
Transformations (predictors)

• Generalizing the Xβ part of the model by adding polynomial terms (e.g., the one-predictor case):

Y = β_0 + β_1 X + ⋯ + β_d X^d + ε

• Selection of d (a sketch follows this list):
1. Keep adding terms until the added term is not statistically significant.
2. Start with a large d and eliminate non-significant terms, starting with the highest-order term.

• Polynomial regression allows for a more flexible relationship.

• Principle of marginality: do not remove lower-order terms from the model, even if they are not statistically significant.
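
A sketch of strategy 1 with raw polynomials in age, reusing slid (the predictor choice is illustrative):

fit2 <- lm(wages ~ age + I(age^2), data = slid)
fit3 <- lm(wages ~ age + I(age^2) + I(age^3), data = slid)
anova(fit2, fit3)  # F-test: is the cubic term statistically significant?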
Polynomial regression
Orthogonal polynomials

• When a term is removed from or added to the model, the coefficients change and the model needs to be refitted.

• High-order polynomial models may be numerically unstable.

• Orthogonal polynomials may help: replace the old set of predictors X, X², X³, … by a new, orthogonal set of predictors Z_1, Z_2, Z_3, …

Z_1 = a_1 + b_1 X
Z_2 = a_2 + b_2 X + c_2 X²
Z_3 = a_3 + b_3 X + c_3 X² + d_3 X³

such that Z_i′ Z_j = 0 for i ≠ j and Z_i′ Z_i = 1.


Orthogonal polynomials: Example

• The poly() function constructs orthogonal polynomials.

• We arrive at the same fit.

• The orthogonal polynomials Z_i are indeed orthogonal.
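
A sketch of this comparison, reusing slid (a cubic in age is an illustrative choice):

fit_raw  <- lm(wages ~ poly(age, 3, raw = TRUE), data = slid)
fit_orth <- lm(wages ~ poly(age, 3), data = slid)
all.equal(fitted(fit_raw), fitted(fit_orth))  # TRUE: the two bases give the same fit
Z <- poly(slid$age, 3)
round(crossprod(Z), 10)  # identity matrix: Z_i'Z_j = 0, Z_i'Z_i = 1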
Regression splines

• A spline is a piecewise polynomial with a certain level of smoothness. Splines fix the disadvantages of polynomial regression by combining it with segmented regression (more on this in the practical session).

• Splines use B-spline basis functions.

• Define a cubic spline S(X) over [a, b], built from cubic B-spline basis functions, using knots at t_1, …, t_k.

• S(X), S′(X), and S″(X) are continuous on [a, b].

• Given the partition a = t_0 < t_1 < ⋯ < t_k = b, the function S(X) is cubic on each subinterval [t_i, t_{i+1}], i.e.,

S_i(X) = a_{0,i} + a_{1,i} X + a_{2,i} X² + a_{3,i} X³

• How many unknowns are there? (A count is sketched below.)

Regression splines: Example

• Suppose we know the true model is:

y = sin³(2πx³) + ε,  with ε ~ N(0, 0.1²)
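
A sketch of simulating from this model, assuming x on a grid in [0, 1]:

set.seed(1)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x^3)^3 + rnorm(200, sd = 0.1)
plot(x, y)
lines(x, sin(2 * pi * x^3)^3, col = "red")  # the true curve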


Regression splines: Example

• Now, let's use splines with 12 basis functions.

• Hint: place more knots where the function might vary rapidly and fewer knots where it seems more stable.
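
A sketch of the fit with splines::bs, continuing the simulation above (df = 12 requests 12 basis functions; pass knots = c(...) instead to place knots by hand):

library(splines)
fit_bs <- lm(y ~ bs(x, df = 12))
plot(x, y)
lines(x, fitted(fit_bs), col = "blue")  # cubic spline fit with 12 B-spline basis functions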
