7SSMM700 Lecture 8
Week 8
L. Leonida - M. Dolfin (King’s College London) M.Sc. Banking and Finance Week 8 1 / 33
The big data era is creating plenty of opportunities for new developments
in econometrics, economics, and finance.
Tools to manage Big Data
Introduction to Regularization
Theoretical approach
Ridge Regression
Lasso Regression
y = Xβ + e (1)

where y ∈ R^N, β ∈ R^k, and X ∈ R^{N×k}.
The Gauss-Markov assumptions are often plausible in smaller datasets, which makes OLS a very powerful tool for statisticians and scientists.
One of the most common issues with the OLS method is the tendency of the model to overfit the data when there is too much noise caused by correlated variables. This can happen in many different situations.

The most extreme case occurs when the number of regressors exceeds the number of observations, p > n. In this case the normal equations have no unique solution and linear regression fails to produce accurate coefficient estimates.

When overfitting, the coefficients can have high standard errors and low levels of significance despite a high R^2 value.
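The p > n failure can be seen directly in a minimal numpy sketch (simulated data, not from the lecture): with more regressors than observations, X'X is rank-deficient, so it cannot be inverted and the OLS normal equations have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50                      # more regressors than observations
X = rng.normal(size=(n, p))

# X'X is p x p but its rank is at most n < p, so it is singular
# and the normal equations (X'X) b = X'y have no unique solution.
rank = np.linalg.matrix_rank(X.T @ X)
print(rank)   # 20, far below p = 50
```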
Ridge Regression and LASSO (Least Absolute Shrinkage and Selection Operator) improve the overall accuracy of ordinary least squares regression by adding a bias that shrinks the model's coefficients, greatly reducing the variance of the coefficient estimates.
Regularization or shrinkage
Some notation for the LASSO and Ridge penalties

L1 = ∑_i |β_i|

L2 = √( ∑_i β_i^2 )
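As a quick illustration, both penalties can be computed directly (a minimal numpy sketch with an arbitrary coefficient vector):

```python
import numpy as np

beta = np.array([3.0, -4.0, 0.0])

l1 = np.sum(np.abs(beta))          # L1 penalty: sum of absolute values
l2 = np.sqrt(np.sum(beta ** 2))    # L2 penalty: Euclidean norm

print(l1, l2)   # 7.0 5.0
```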
Ridge Regression

Ridge regression minimizes the residual sum of squares (RSS) plus an L2 penalty on the coefficients:

(y − Xβ)T (y − Xβ) + λβT β = RSS + λ ∑_j β_j^2 (2)
Ridge estimator

From equation (2) we can derive the ridge coefficient estimate:

(y − Xβ)T (y − Xβ) + λβT β
= (yT − βT XT)(y − Xβ) + λβT β
= yT y − yT Xβ − βT XT y + βT XT Xβ + λβT β
= yT y − 2yT Xβ + βT XT Xβ + λβT β

Setting the derivative with respect to β equal to zero:

∂/∂β = −2XT y + 2XT Xβ + 2λβ = 0

which gives the closed-form estimator β̂_ridge = (XT X + λI)^{-1} XT y.
The selection of λ, and thus of the optimal model, will not be discussed here. However, an existence theorem states that there always exists a λ > 0 such that the MSE of the ridge estimator is less than that of the least squares estimate (λ = 0). A proof of the theorem can be found in Hoerl and Kennard (1970).
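The closed-form estimator implied by the first-order condition, β̂_ridge = (X'X + λI)⁻¹X'y, can be checked in a short numpy sketch on simulated data (all names and values below are illustrative, not the lecture's data):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 5
X = rng.normal(size=(n, k))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

lam = 1.0
# Closed-form ridge estimator from the first-order condition:
# beta_ridge = (X'X + lambda*I)^{-1} X'y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

# Setting lambda = 0 recovers the OLS estimator
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# The penalty shrinks the coefficient vector toward zero relative to OLS
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta_ols))  # True
```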
Advantages of Ridge
Disadvantages of Ridge
LASSO estimator

The LASSO model can be written in a similar form to the Ridge:

(y − Xβ)T (y − Xβ) + λ ∑_{j=1}^{p} |β_j| (4)
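Unlike ridge, the LASSO objective has no closed-form solution; a standard way to minimize (4) is cyclic coordinate descent with soft-thresholding. The sketch below (simulated data; a pedagogical implementation, not production code) shows the key property that the L1 penalty sets irrelevant coefficients exactly to zero:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO via cyclic coordinate descent, minimizing
    (y - Xb)'(y - Xb) + lam * sum_j |b_j|."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding regressor j
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # soft-thresholding: exact zero whenever |rho| <= lam/2
            beta[j] = np.sign(rho) * max(abs(rho) - lam / 2.0, 0.0) / z
    return beta

rng = np.random.default_rng(2)
n, p = 200, 8
X = rng.normal(size=(n, p))
# only the first two regressors actually drive y
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

beta = lasso_cd(X, y, lam=60.0)
print(beta.round(2))   # coefficients 2..7 are exactly zero
```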
Comparing Ridge and LASSO estimators
Alternatively, one can show that the Ridge and LASSO models solve these
equations respectively:
Ridge equation
LASSO equation
This means that for every value of λ in Ridge and LASSO, there exists a t such that (2) and (5) yield the same coefficient estimates, and likewise (4) and (6). When p = 2, (5) shows that ridge regression finds the smallest RSS over all points that lie within the disk defined by β1^2 + β2^2 ≤ t, while (6) shows that LASSO does the same over the points that lie within the diamond defined by |β1| + |β2| ≤ t.
Graphical representation
1 Tibshirani, R. (1996).
Graphical representation: comment
In the above figure, β̂ is the least squares solution, and the circle and the diamond portray the ridge and LASSO constraints given in (5) and (6). The ellipses around β̂ are contours of constant RSS.

Equations (5) and (6) show that the ridge and LASSO coefficient estimates lie at the first point where the ellipses touch the constraint regions.
Since the constraint region of ridge is a circle, the probability that the intersection occurs on an axis is zero. In contrast, the diamond constraint region has corners on each axis, so the intersection of the ellipses and the constraint region will often occur on an axis, setting the corresponding coefficient exactly to zero.
Instabilities in the Ridge regression
Figure: Moving along the "ridge", large changes in the parameter estimates cause only small changes in the error, generating instabilities.
Elastic Net Regression

λ = 0: OLS.
α = 1: Ridge regression.
α = 0: LASSO regression.
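A minimal sketch of the combined penalty, assuming the slide's convention that α = 1 gives the pure ridge (L2) penalty and α = 0 the pure LASSO (L1) penalty (note that some software, e.g. glmnet, uses the opposite convention for α):

```python
import numpy as np

def elastic_net_penalty(beta, lam, alpha):
    """Elastic Net penalty under the slide's convention:
    alpha = 1 -> ridge (L2) penalty, alpha = 0 -> LASSO (L1) penalty."""
    l1 = np.sum(np.abs(beta))
    l2 = np.sum(beta ** 2)
    return lam * (alpha * l2 + (1.0 - alpha) * l1)

beta = np.array([3.0, -4.0])
print(elastic_net_penalty(beta, lam=1.0, alpha=1.0))  # 25.0 (pure ridge)
print(elastic_net_penalty(beta, lam=1.0, alpha=0.0))  # 7.0  (pure LASSO)
```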
All regressors must be standardized, i.e. each must have zero mean and unit variance, since the penalties are not scale-invariant.
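Standardization is straightforward (a minimal numpy sketch on simulated regressors):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Standardize each regressor: zero mean and unit variance, so the
# penalty treats all coefficients on a common scale.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.allclose(X_std.mean(axis=0), 0.0))  # True
print(np.allclose(X_std.std(axis=0), 1.0))   # True
```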
Application to returns in the Forex market
Data: intra-day data, five years (January 2007 - January 2011) of one-minute returns on the currency pair EUR/USD.
Application to returns in the Forex market
Set of regressors: recent returns and technical indicators
Recent returns
N-Minute returns
over windows lengths of 5, 10, 15, 20, 25, 30, 60 mins
Regressand
OLS estimate
Back-testing
Trading with the model. Take a very simple strategy (realistically, we would require some minimum predicted return before we actually trade): whenever the predicted return is greater than 0, go long; whenever the predicted return is less than 0, go short.
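The sign-based strategy can be sketched as follows, on hypothetical simulated returns and predictions (not the slides' EUR/USD data):

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical one-minute returns and a noisy model prediction of them
returns = rng.normal(scale=1e-4, size=1000)
predictions = returns + rng.normal(scale=2e-4, size=1000)

# long (+1) when the predicted return is positive, short (-1) otherwise
positions = np.where(predictions > 0, 1, -1)
strategy_returns = positions * returns
cumulative = np.cumsum(strategy_returns)

print(cumulative[-1])   # positive here, since the signal is informative
```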
Figure: Backtesting the OLS estimate
Repeat this process using a stepwise algorithm to discard terms and retain only the predictors with the most predictive power.
The algorithm removes the factors that are least statistically significant, returning a more compact model.
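One simple stepwise variant (backward elimination on t-statistics; the lecture does not specify which variant is used) can be sketched in numpy on simulated data:

```python
import numpy as np

def backward_stepwise(X, y, t_min=2.0):
    """Backward elimination sketch: repeatedly drop the regressor with
    the smallest absolute t-statistic until all survivors exceed t_min."""
    cols = list(range(X.shape[1]))
    while cols:
        Xc = X[:, cols]
        n, k = Xc.shape
        beta = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)
        resid = y - Xc @ beta
        sigma2 = resid @ resid / (n - k)
        se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xc.T @ Xc)))
        t = np.abs(beta / se)
        worst = int(np.argmin(t))
        if t[worst] >= t_min:
            break          # every remaining regressor is significant
        cols.pop(worst)    # drop the least significant regressor
    return cols

rng = np.random.default_rng(5)
n = 500
X = rng.normal(size=(n, 6))
# only regressors 0 and 3 actually drive y
y = 2.0 * X[:, 0] + 1.5 * X[:, 3] + rng.normal(size=n)

kept = backward_stepwise(X, y)
print(kept)   # the truly relevant regressors 0 and 3 survive
```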
Stepwise estimate
Backtesting the OLS and stepwise estimates
Comparison with the Elastic Net regularization
Conclusions
Readings
Tibshirani, R. (1996) “Regression Shrinkage and Selection via the Lasso.”
Journal of the Royal Statistical Society. Series B, Vol. 58, No. 1, 267–288.
Zou, H., and T. Hastie. (2005) “Regularization and Variable Selection via
the Elastic Net.” Journal of the Royal Statistical Society. Series B, Vol.
67, No. 2, 301–320.
THANK YOU