Applied Statistics II-SLR
Applied Statistics - II
Qinlu (Claire) Wang
Statistician
1. Introduction
2. Assumptions
3. Diagnostics
4. Prediction
Simple Linear Regression
Simple linear regression is a statistical method that allows us to summarize and study
relationships between two continuous (quantitative) variables:
• One variable, denoted x, is regarded as the predictor, explanatory, or independent variable.
• The other variable, denoted y, is regarded as the response, outcome, or dependent variable.
$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
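The slope formula can be computed directly. The following is a minimal sketch in plain Python; the function name `fit_slr` and the toy data are made up for illustration, and the intercept uses the standard least-squares result (intercept = mean of y minus slope times mean of x), which is not shown on the slide.

```python
def fit_slr(x, y):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar  # standard least-squares intercept (assumption, see lead-in)
    return b0, b1

# Hypothetical toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_slr(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```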
The method of maximum likelihood finds the parameter values $\hat{\beta}_0$ and $\hat{\beta}_1$ that maximize the joint density of the independent responses $Y_1, \dots, Y_n$ evaluated at the observed values $y_1, \dots, y_n$.
In this (special) case, the method of maximum-likelihood gives the same parameter estimates
as the method of least-squares.
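A short derivation sketch of why the two methods agree, assuming independent normal errors with constant variance $\sigma^2$ (the assumptions listed in the next section). The likelihood notation $L$ is introduced here for illustration.

```latex
L(\beta_0, \beta_1, \sigma^2)
  = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left( -\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2} \right)

\ln L(\beta_0, \beta_1, \sigma^2)
  = -\frac{n}{2} \ln(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2
```

For any fixed $\sigma^2$, only the last term depends on $\beta_0$ and $\beta_1$, so maximizing $\ln L$ over them is the same as minimizing the sum of squared deviations $\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$.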
Assumptions
Linearity of the data. The relationship between the predictor (x) and the outcome (y) is assumed to be linear: $E(Y_i) = \beta_0 + \beta_1 x_i$ for $i = 1, \dots, n$.
Normality of residuals. The residual errors $\epsilon_1, \dots, \epsilon_n$ are assumed to be normally distributed.
Homogeneity of residual variance. The residuals are assumed to have a constant variance: $\mathrm{Var}(\epsilon_i) = \sigma^2$ for $i = 1, \dots, n$.
Independence of residual error terms. $\epsilon_1, \dots, \epsilon_n$ are independent.
$$\epsilon_i \sim \mathrm{NID}(0, \sigma^2), \quad i = 1, \dots, n$$
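To make the assumptions concrete, here is a small sketch that simulates data satisfying all four of them and then recovers the slope. The function names, the parameter values (intercept 1, slope 2, error standard deviation 1) and the seed are arbitrary choices for illustration.

```python
import random

def simulate_slr(x, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * x_i + eps_i, with independent eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

def slope(x, y):
    """Least-squares slope, used here only to sanity-check the simulation."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

x = [i / 20 for i in range(200)]  # 200 equally spaced predictor values
y = simulate_slr(x, beta0=1.0, beta1=2.0, sigma=1.0)
print(f"true slope = 2.0, estimated slope = {slope(x, y):.3f}")
```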
How to check assumptions?
1. Linearity of the data
If not?
A simple approach is to use non-linear transformations of the predictors, such as log(x), sqrt(x) and x^2, in the regression model.
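As an illustration of the transformation idea, the sketch below fits the same response once against x and once against log(x). The data are constructed to be exactly linear in log(x), an artificial assumption, so the transformed fit is perfect; `r_squared` is a hypothetical helper name.

```python
import math

def r_squared(x, y):
    """Fit y = b0 + b1 * x by least squares and return R^2 = 1 - SSE/SST."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical data: y is exactly linear in log(x), not in x
x = [1, 2, 4, 8, 16, 32]
y = [1.0 + 2.0 * math.log(xi) for xi in x]

r2_raw = r_squared(x, y)                          # predictor left as is
r2_log = r_squared([math.log(xi) for xi in x], y)  # predictor transformed
print(f"R^2 with x: {r2_raw:.3f}, R^2 with log(x): {r2_log:.3f}")
```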
2. Normality of residuals
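A common way to check normality is a normal Q-Q plot of the residuals. The sketch below only computes the plot coordinates, using the standard library; the residual values are made up, and the plotting positions (i - 0.5)/n are one common convention among several.

```python
from statistics import NormalDist, mean, pstdev

def qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    m, s = mean(residuals), pstdev(residuals)
    standardized = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    # Plotting positions (i - 0.5) / n: one common convention (assumption)
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))

residuals = [-1.2, -0.4, -0.1, 0.0, 0.3, 0.5, 0.9]  # hypothetical residuals
points = qq_points(residuals)
for t, s in points:
    print(f"{t:7.3f} {s:7.3f}")
```

If the residuals are close to normal, the pairs fall near the line through the origin with slope 1.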
3. Homogeneity of residual variance
Ideally, a plot of the residuals against the fitted values shows a roughly horizontal trend line with equally spread points around it.
If not? A possible solution is to use a log or square root transformation of the response (y).
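The effect of transforming the response can be sketched with simulated data. The example assumes multiplicative errors (so the spread of y grows with its level), which is an illustrative choice, and compares the residual spread in the upper and lower halves of the data before and after taking logs. All names and parameter values are hypothetical.

```python
import math
import random

def residuals(x, y):
    """Residuals from the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def spread_ratio(x, y):
    """RMS residual in the upper half of x divided by that in the lower half."""
    res = residuals(x, y)
    half = len(x) // 2  # x is assumed sorted in increasing order
    return rms(res[half:]) / rms(res[:half])

rng = random.Random(1)  # fixed seed for reproducibility
x = [i / 40 for i in range(200)]
# Multiplicative errors: the spread of y increases with its mean
y = [math.exp(1.0 + 0.5 * xi + rng.gauss(0.0, 0.3)) for xi in x]

raw_ratio = spread_ratio(x, y)
log_ratio = spread_ratio(x, [math.log(yi) for yi in y])
print(f"spread ratio, raw y: {raw_ratio:.2f}; log(y): {log_ratio:.2f}")
```

A ratio near 1 is what constant variance would produce; a ratio well above 1 suggests the spread increases with the fitted values.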
4. Outliers and high leverage points
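For simple linear regression, the leverage of observation i has the closed form h_i = 1/n + (x_i - x̄)^2 / Σ(x_j - x̄)^2. The sketch below computes it for made-up data and flags points above 2p/n with p = 2 coefficients, a widely used rule of thumb that is an assumption here, not something stated in the slides.

```python
def leverages(x):
    """Leverage h_i of each observation in a simple linear regression on x."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

# Hypothetical predictor values; the last one lies far from the rest
x = [1.0, 2.0, 3.0, 4.0, 20.0]
h = leverages(x)

p = 2  # number of regression coefficients (intercept and slope)
threshold = 2 * p / len(x)  # rule-of-thumb cutoff (assumption)
flagged = [i for i, hi in enumerate(h) if hi > threshold]
print("leverages:", [round(hi, 3) for hi in h])
print("high-leverage indices:", flagged)
```

The leverages always sum to p, so a single value close to 1 means that observation largely determines its own fitted value.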
$$R^2 = 1 - \frac{SSE}{SST}$$
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
$$MST = \frac{SST}{n - 1}$$
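These quantities can be computed in a few lines. The sketch uses made-up toy data; the adjusted R^2 at the end, 1 - MSE/MST with MSE = SSE/(n - 2), is the usual place where MST appears and is included as an assumption about how the slide uses it.

```python
# Hypothetical toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least-squares fit
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
r2 = 1 - sse / sst
mst = sst / (n - 1)

# Adjusted R^2 for two coefficients (assumption, see lead-in)
mse = sse / (n - 2)
adj_r2 = 1 - mse / mst

print(f"SSE = {sse:.4f}, SST = {sst:.4f}, R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```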
3. Standard Error and F-Statistic
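A minimal sketch of these two quantities for simple linear regression, using the standard definitions: residual standard error sqrt(SSE/(n - 2)), and F = (SST - SSE)/1 divided by SSE/(n - 2). The toy data are made up. With a single predictor, F equals the square of the slope's t-statistic, which the code checks numerically.

```python
import math

# Hypothetical toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least-squares fit
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)

mse = sse / (n - 2)             # error mean square, n - 2 degrees of freedom
rse = math.sqrt(mse)            # residual standard error
se_b1 = rse / math.sqrt(s_xx)   # standard error of the slope
t_stat = b1 / se_b1             # t-statistic for H0: slope = 0
f_stat = (sst - sse) / mse      # F-statistic with 1 and n - 2 degrees of freedom

print(f"RSE = {rse:.4f}, se(slope) = {se_b1:.4f}, t = {t_stat:.2f}, F = {f_stat:.2f}")
```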
4. AIC and BIC
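$$AIC = 2k - 2\ln(\hat{L})$$
$$BIC = k\ln(n) - 2\ln(\hat{L})$$
Here $k$ is the number of estimated parameters, $n$ is the number of observations, and $\hat{L}$ is the maximized value of the likelihood. For model comparison, the model with the lowest AIC or BIC score is preferred.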
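A sketch of such a comparison for two Gaussian models: an intercept-only model and the simple linear regression. It assumes normal errors, so the maximized log-likelihood is -(n/2)(ln(2π·SSE/n) + 1), and it counts the error variance as a parameter (k = 2 and k = 3). The data and function names are made up for illustration.

```python
import math

def gaussian_loglik(sse, n):
    """Maximized log-likelihood of a normal-error model with residual sum of squares sse."""
    sigma2_hat = sse / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)

def aic(loglik, k):
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    return k * math.log(n) - 2 * loglik

# Hypothetical toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

sse_null = sum((yi - y_bar) ** 2 for yi in y)                      # intercept-only model
sse_slr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # simple linear regression

# k counts the mean parameters plus the error variance (assumption, see lead-in)
aic_null, bic_null = aic(gaussian_loglik(sse_null, n), 2), bic(gaussian_loglik(sse_null, n), 2, n)
aic_slr, bic_slr = aic(gaussian_loglik(sse_slr, n), 3), bic(gaussian_loglik(sse_slr, n), 3, n)

print(f"intercept only: AIC = {aic_null:.2f}, BIC = {bic_null:.2f}")
print(f"with predictor: AIC = {aic_slr:.2f}, BIC = {bic_slr:.2f}")
```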
Prediction
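A minimal sketch of prediction from a fitted line. It returns the point prediction at a new value x0 together with the standard error for predicting a single new response, using the standard formula s·sqrt(1 + 1/n + (x0 - x̄)^2 / Sxx). The function name `predict` and the data are made up. A prediction interval would also need a t quantile with n - 2 degrees of freedom, which is not computed here.

```python
import math

def predict(x0, x, y):
    """Return (y_hat, se_pred) for a new observation at x0 from the least-squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    y_hat = b0 + b1 * x0
    # Standard error for predicting one new response at x0
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y_hat, se_pred

# Hypothetical toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

y_hat_6, se_6 = predict(6.0, x, y)  # outside the observed range of x
y_hat_3, se_3 = predict(3.0, x, y)  # at the mean of x
print(f"x0 = 6: prediction = {y_hat_6:.3f}, se = {se_6:.3f}")
print(f"x0 = 3: prediction = {y_hat_3:.3f}, se = {se_3:.3f}")
```

The standard error is smallest at the mean of x and grows as x0 moves away from it, so predictions far outside the observed range are less reliable.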
Regression is much more than this! Examples:
• Robust regression
• Multiple Linear Regression
• Non-linear Regression
• Logistic Regression
The materials of this training: Mali - BCBB - 2020