K Kiran Kumar
IIM Indore
Recap….
Regression objective
To explain the variation in the dependent variable with the help of independent variables
To assess the relative contribution of each independent variable in explaining the variation in
dependent variable
Reasons for including Error term
Data unavailability; Poor proxy of data; vagueness in theory; intrinsic randomness; to keep the
equation simple.
Matrix representation of regression equation
Assumptions on Errors
Estimation of BETA – Minimizing Error Sum of Squares
Sensitivity of assumptions on errors
Expected value and Variance of Estimated BETA
Sensitivity of assumptions on errors
Linearity of Regression Model
Goodness of Fit – Adjusted R-square
Regression problem sets
R Lab session 1 – R Studio
Multicollinearity -- Model specification bias
Dummy variables as Independent variables
Projects: Is Gold safe haven?
R Lab Session 2 – Intraday Trades data set + Regression
Dummy Variables as Independent variables
Intercept dummy variables
In the presence of an intercept, include one less dummy than the number of
categories. The estimated coefficients of the dummy variables are the
incremental intercepts (vis-à-vis the base category, which takes zero).
If the intercept is absent, then include as many dummies as the number of categories.
The estimated coefficients of the dummy variables are the marginal (level) intercepts
for each category.
Slope (interactive) dummy variables
Interactive term = quantitative independent variable (i.e., the slope variable) × categorical
variable (i.e., the dummy variable)
If the slope variable is present, then include one less than the number of categories as
interactive terms. The estimated coefficients are the incremental slopes.
If the slope variable is absent, then include as many interactive terms as the number of
categories; the coefficients are the slopes for each category.
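A quick numerical sketch of the intercept-dummy case (Python, with made-up category values): in a regression of Y on an intercept plus category dummies alone, OLS fits the group means, so the dummy coefficients are exactly the incremental intercepts relative to the base category.

```python
# Illustrative data: three categories A (base), B, C with hypothetical Y values.
data = {"A": [10, 12, 14], "B": [20, 22, 24], "C": [5, 7, 9]}

# In Y = b0 + b_B*D_B + b_C*D_C + u, OLS recovers:
means = {g: sum(v) / len(v) for g, v in data.items()}
b0 = means["A"]         # intercept = mean of the base category (both dummies zero)
b_B = means["B"] - b0   # incremental intercept for B vs. A
b_C = means["C"] - b0   # incremental intercept for C vs. A
print(b0, b_B, b_C)     # 12.0 10.0 -5.0
```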
Seasonality
To account for seasonality effects in the dependent variable
Seasonal effects in financial markets have been widely observed and are often termed
“calendar anomalies”
One way to cope with this is the inclusion of dummy variables – e.g., for
quarterly data, we could have 4 dummy variables.
How many dummy variables do we need? We need one less than the
"seasonality" of the data when an intercept is included – e.g., for a quarterly series,
consider what happens if we use all 4 dummies together with the intercept: perfect
multicollinearity (the dummy variable trap).
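What goes wrong with all 4 dummies can be checked directly (a minimal sketch with a hypothetical run of quarters): the four quarterly dummy columns sum to the intercept column, so the regressors are perfectly collinear.

```python
# Two years of quarterly observations (hypothetical ordering of quarters).
quarters = [1, 2, 3, 4, 1, 2, 3, 4]
# Four dummy columns D1..D4, one per quarter.
D = [[1 if q == k else 0 for q in quarters] for k in (1, 2, 3, 4)]
ones = [1] * len(quarters)                 # the intercept column
row_sums = [sum(vals) for vals in zip(*D)] # D1t + D2t + D3t + D4t at each t
print(row_sums == ones)                    # True: exact linear dependence with the intercept
```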
Seasonalities in South East Asian Stock Returns
Brooks and Persand (2001) examine the evidence for a day-of-the-week effect in
five Southeast Asian stock markets: South Korea, Malaysia, the Philippines,
Taiwan and Thailand.
The data are on a daily close-to-close basis for all weekdays (Mondays to
Fridays) falling in the period 31 December 1989 to 19 January 1996 (a total of
1581 observations).
They use daily dummy variables for the day of the week effects in the
regression:
rt = γ1D1t + γ2D2t + γ3D3t + γ4D4t + γ5D5t + ut
Then the coefficients can be interpreted as the average return on each day of the
week.
(p. 157 of the paper)
It is possible that the different returns on different days of the week could be a
result of different levels of risk on different days.
To allow for this, Brooks and Persand re-estimate the model allowing for
different betas on different days of the week using slope dummies:
rt = Σi=1..5 (γiDit + δiDitRWMt) + ut
where Dit is the ith dummy variable taking the value 1 for day t = i and zero
otherwise, and RWMt is the return on the world market index.
Now both risk and return are allowed to vary across the days of the week.
When Dummy Variables may be used as Dependent variables…
There are numerous examples of instances where this may arise, for example where we
want to model:
Why firms choose to list their shares on the NASDAQ rather than the NYSE
Why some stocks pay dividends while others do not
What factors affect whether countries default on their sovereign debt
Why firms choose to issue new stock to finance an expansion while others issue bonds
Why some firms choose to engage in stock splits while others do not.
It is fairly easy to see in all these cases that the appropriate form for the dependent
variable would be a 0-1 dummy variable since there are only two possible outcomes.
There are, of course, also situations where it would be more useful to allow the
dependent variable to take on other values.
The Linear Probability Model
We will first examine a simple and obvious, but unfortunately flawed, method
for dealing with binary dependent variables, known as the linear probability
model.
It is based on the assumption that the probability of an event occurring, Pi, is
linearly related to a set of explanatory variables:
Pi = p(yi = 1) = β1 + β2x2i + β3x3i + … + βkxki + ui
The slope estimates for the linear probability model can be interpreted as the
change in the probability that the dependent variable will equal 1 for a one-
unit change in a given explanatory variable, holding the effect of all other
explanatory variables fixed.
Suppose, for example, that we wanted to model the probability that firm i
will pay a dividend, p(yi = 1), as a function of its market capitalisation (x2i,
measured in millions of US dollars), and we fit the following line:
P̂i = −0.3 + 0.012x2i
For any firm whose value is less than $25m, the model-predicted probability
of dividend payment is negative, while for any firm worth more than $108m,
the probability is greater than one.
An obvious response is to truncate the fitted probabilities at zero and one; however, there are at least two reasons why this is still not adequate.
The process of truncation will result in too many observations for which the
estimated probabilities are exactly zero or one.
More importantly, it is simply not plausible to suggest that the firm's probability
of paying a dividend is either exactly zero or exactly one. Are we really certain
that very small firms will definitely never pay a dividend and that large firms will
always make a payout?
Probably not, and so a different kind of model is usually used for binary
dependent variables: either a logit or a probit specification.
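The contrast can be sketched numerically (Python). The LPM line below is the fitted dividend equation from the example above; the logistic curve reuses the same coefficients purely for illustration (they are not estimates from any model):

```python
import math

def lpm(x):     # fitted LPM from the dividend example: P-hat = -0.3 + 0.012 * cap ($m)
    return -0.3 + 0.012 * x

def logistic(x):  # same coefficients pushed through a logistic function (illustrative only)
    return 1 / (1 + math.exp(-(-0.3 + 0.012 * x)))

print(lpm(10), lpm(150))          # negative, and greater than one: outside [0, 1]
print(logistic(10), logistic(150))  # both strictly between 0 and 1
```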
Disadvantages of the Linear Probability Model
The LPM also suffers from a couple of more standard econometric problems that
we have examined in previous chapters.
Since the dependent variable takes only one of two values, for given (fixed in
repeated samples) values of the explanatory variables, the disturbance term will
also take only one of two values.
Hence the error term cannot plausibly be assumed to be normally distributed.
Since the disturbance term changes systematically with the explanatory variables,
it will also be heteroscedastic.
It is therefore essential that heteroscedasticity-robust standard errors are always used
in the context of limited dependent variable models.
Logit and Probit: Better Approaches
With the logistic model, 0 and 1 are asymptotes to the function, and thus the
probabilities will never actually fall to exactly zero or rise to one, although
they may come infinitesimally close. Under the logit model,
Pi = 1 / (1 + e^(−zi)), where zi = β1 + β2x2i + … + βkxki.
The logit model is not linear (and cannot be made linear by a transformation)
and thus is not estimable using OLS; it is estimated by maximum likelihood.
If your outcome is categorical, you need to use:
Binomial logistic regression analysis (dichotomous outcome)
Multinomial logistic regression analysis (polytomous outcome)
Ordinal logistic regression analysis (ordinal outcome)
If you have more predictors than you can deal with: create taxonomies of fitted
models and compare, or form composites of the indicators of any common construct.
If your outcome vs. predictor relationship is non-linear: transform the outcome or
predictor, or use non-linear regression analysis.
If time is a predictor, you need discrete-time survival analysis.
Logistic Regression in R
setwd("D:/pgp/201819/FAuR/data")
admit=read.csv("binary.csv")
head(admit)
tail(admit)
summary(admit)
sapply(admit,sd) # sapply is to apply a function to all columns in a dataset..here sd is stdev
admit$rank=factor(admit$rank) # converting integer variable into categorical variable
xtabs(~admit+rank,data=admit) #contingency table
mylogit <- glm(admit ~ gre + gpa + rank, data = admit, family = "binomial") # estimate the logistic regression
summary(mylogit)
For every one-unit increase in gre, the log odds of admission (versus non-admission) increases by 0.002.
For a one-unit increase in gpa, the log odds of being admitted to graduate school increases by 0.804.
The indicator variables for rank have a slightly different interpretation. For example, having attended an
undergraduate institution with a rank of 2, versus an institution with a rank of 1, changes the log odds of
admission by −0.675.
coef1=exp(coef(mylogit)) # odds ratios
coef2=coef1/(1+coef1) # odds converted to probabilities (exact only for the intercept term)
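The odds-ratio conversion done by `coef1` above can be illustrated by hand (Python, using the coefficients quoted in the text):

```python
import math

b_gpa, b_rank2 = 0.804, -0.675  # log-odds coefficients from the fitted model above
or_gpa = math.exp(b_gpa)        # ~2.23: odds of admission multiply by about 2.2 per unit of gpa
or_rank2 = math.exp(b_rank2)    # ~0.51: rank-2 schools have about half the odds of rank-1
print(round(or_gpa, 2), round(or_rank2, 2))
```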
Time Series Econometrics
Time Series Data?
What is a time series?
Anything observed sequentially (by time?)
Returns, volatility, interest rates, exchange rates, bond yields, …
Trade and Order book
How is it different?
The observations are not independent.
There is correlation from observation to observation / time to time.
Observations are equally spaced in time, with no missing values
How do we determine correlation in time series data?
Autocorrelation at lag h: ρ(h) = Cov(X(t), X(t+h)) / Var(X(t))
Substitute sample estimates of the covariance between X(t) and X(t+h). Note: we do not
have "n" pairs but "n−h" pairs.
Substitute the sample estimate of the variance.
[Figure: a quarterly economic time series, Jan 1960 to Jan 1980]
[Figures: a sample series with its sample ACF plots, lags 0–15]
Correlogram
Plot of Autocorrelation against its lag
[Figure: plots of residuals ut over time]
Positive autocorrelation: an "attracting" pattern – successive residuals cluster on the same side of zero.
No autocorrelation: residuals are scattered randomly around zero.
Negative autocorrelation: a "reversing" pattern – residuals tend to alternate in sign.
What do our day-to-day economic series look like?
[Figures: share price and exchange rate series trending over time rather than fluctuating around a fixed level]
Normal (and t, F, etc.) distributions have a constant mean
If the variables in the regression model are not stationary, then it can be proved
that the standard assumptions for asymptotic analysis will not be valid. In other
words, the usual “t-ratios” will not follow a t-distribution, so we cannot validly
undertake hypothesis tests about the regression parameters.
Market Efficiency – Speed of convergence to Mkt Efficiency
Testing for Market Efficiency
Significance of past lag values
Yes – Market is inefficient
No – Market is Efficient
How fast does the market become efficient?
Run the regressions at daily, hourly, 30-, 15- and 5-min frequency
Check the significance of past lag values
Running multiple regressions at a go
library(quantmod)
getSymbols("INFY.NS",from='2015-01-01')
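A minimal sketch of the lag-significance idea on simulated returns (Python; the lab itself uses quantmod with real Nifty data): an "efficient" i.i.d. return series shows a first-lag slope near zero, while a sluggishly adjusting series shows a sizeable one.

```python
import random
random.seed(1)

def lag1_slope(r):
    """OLS slope of r_t on r_(t-1): significance of the past lag."""
    x, y = r[:-1], r[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

iid = [random.gauss(0, 1) for _ in range(5000)]   # "efficient": returns unpredictable
slow = [0.0]
for _ in range(4999):                             # sluggish adjustment: AR(1) with slope 0.4
    slow.append(0.4 * slow[-1] + random.gauss(0, 1))

print(round(lag1_slope(iid), 2), round(lag1_slope(slow), 2))  # near 0 vs. near 0.4
```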
•A time series Xt is said to be stationary if its expected value and population variance are independent of
time and if the population covariance between its values at time t and time t+s depends on s but not on t.
• Called Covariance Stationary or Weak Stationary
• Any example?
Xt = β2Xt-1 + εt,  −1 < β2 < 1
An example of a stationary time series is an AR(1) process Xt = β2Xt-1 + εt, provided that −1 < β2 < 1,
where εt is a random variable with mean 0 and constant variance and not subject to
autocorrelation.
STATIONARY PROCESSES: AR(1) series
Xt = β2Xt-1 + εt,  −1 < β2 < 1
Xt-1 = β2Xt-2 + εt-1
This can easily be demonstrated. If the relationship is valid for time period t, it is also valid for
time period t-1.
STATIONARY PROCESSES: AR(1) series
Xt = β2(β2Xt-2 + εt-1) + εt = β2²Xt-2 + β2εt-1 + εt
Substituting for Xt-1 in the first equation, one obtains Xt in terms of Xt-2, εt, and εt-1.
STATIONARY PROCESSES: AR(1) series
Xt = β2^t X0 + β2^(t-1)ε1 + … + β2εt-1 + εt
Continuing this process of lagging and substituting, one obtains Xt in terms of X0 and the
innovations ε1, ..., εt.
STATIONARY PROCESSES
E(Xt) = β2^t X0
The expected value of each innovation is 0. Hence the expected value of Xt is β2^t X0, which
tends to 0 as t increases. Thus E(Xt) is ultimately independent of t.
STATIONARY PROCESSES
Var(Xt) = Var(β2^t X0 + β2^(t-1)ε1 + … + β2εt-1 + εt)
β2^t X0 is an additive constant and therefore does not affect the variance.
The innovations are assumed to be generated independently of each other, and hence their
population covariances are 0, so the variance of the sum is the sum of the variances.
Var(Xt) = β2^(2(t-1))σε² + … + β2²σε² + σε²
The β2 factors are squared when taken out of the variance terms.
STATIONARY PROCESSES
Var(Xt) = σε²(1 + β2² + β2⁴ + … + β2^(2(t-1))) = σε²(1 − β2^2t) / (1 − β2²)
As t increases, Var(Xt) tends to σε² / (1 − β2²).
The term β2^2t in the numerator tends to 0 as t increases, so we have demonstrated that the
variance is also independent of time.
STATIONARY PROCESSES
Next we will consider the population covariance between Xt and Xt+s. It is convenient to
start by writing Xt+s in terms of Xt and the innovations εt+1, ..., εt+s. This is done by lagging
and substituting, as before:
Xt+s = β2^s Xt + β2^(s-1)εt+1 + … + εt+s
Now Xt is fixed at time t and is therefore independent of the innovations after time t. Hence the
population covariance of Xt and Xt+s reduces to the population covariance of Xt and β2^s Xt. Thus it
is equal to the population variance of Xt multiplied by β2^s, which depends on s but not on t.
STATIONARY PROCESSES
[Figure: a simulated AR(1) series fluctuating around zero]
Here is a series generated by this process with β2 = 0.7 and random numbers for the innovations.
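The variance derivation above can be checked by simulation (a Python sketch): for β2 = 0.7 and σε = 1, the variance of Xt should settle near σε²/(1 − β2²) ≈ 1.96.

```python
import random
random.seed(42)

beta2, sigma = 0.7, 1.0
n, reps = 200, 2000
# Simulate many AR(1) paths X_t = beta2*X_(t-1) + eps_t and look at Var(X_n).
endpoints = []
for _ in range(reps):
    x = 0.0
    for _ in range(n):
        x = beta2 * x + random.gauss(0, sigma)
    endpoints.append(x)

m = sum(endpoints) / reps
sample_var = sum((e - m) ** 2 for e in endpoints) / reps
print(round(sample_var, 2), round(sigma ** 2 / (1 - beta2 ** 2), 2))  # both close
```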
Non-Stationary Time Series
Random walk
Xt = Xt-1 + εt
Xt = X0 + ε1 + … + εt-1 + εt
The condition −1 < β2 < 1 was crucial for stationarity. If β2 is equal to 1, the series becomes a
nonstationary process known as a random walk.
It will be assumed, as before, that the innovations are generated independently from a
fixed distribution with mean 0 and population variance σε².
If the process starts at X0 at time 0, its value at time t is given by X0 plus the sum of the
innovations in periods 1 to t.
NONSTATIONARY PROCESSES
Random walk
E(Xt) = X0 + E(ε1) + … + E(εt) = X0
If expectations are taken at time 0, the expected value at any future time t is fixed at X0 because the
expected values of the future innovations are all 0. Thus E(Xt) is independent of t and the first
condition for stationarity remains satisfied.
NONSTATIONARY PROCESSES
However, the condition that the variance of Xt be independent of time is not satisfied.
Var(Xt) = Var(X0 + ε1 + … + εt) = Var(ε1) + … + Var(εt)
The variance of Xt is equal to the sum of the variances of the innovations. X0 may be
dropped from the expression because it is an additive constant.
NONSTATIONARY PROCESSES
Var(Xt) = tσε²
The variance of each innovation is equal to σε², by assumption. Hence the population variance of
Xt is directly proportional to t. Its distribution becomes wider and flatter, the further one looks into
the future.
NONSTATIONARY PROCESSES
[Figure: a simulated random walk wandering away from zero]
The chart shows a typical random walk. If it were a stationary process, there would be a
tendency for the series to return to 0 periodically. Here there is no such tendency.
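The contrast with the stationary case can also be simulated (Python sketch): for a random walk with σ = 1, Var(Xt) grows in proportion to t instead of settling down.

```python
import random
random.seed(0)

reps = 3000
for t in (25, 100):                  # Var(X_t) = t * sigma^2 should be ~25 and ~100
    finals = []
    for _ in range(reps):
        x = 0.0
        for _ in range(t):
            x += random.gauss(0, 1)  # X_t = X_(t-1) + eps_t
        finals.append(x)
    m = sum(finals) / reps
    var = sum((f - m) ** 2 for f in finals) / reps
    print(t, round(var, 1))
```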
Some nonstationary series
TESTING FOR NONSTATIONARITY
For an AR(1) process Xt = β2Xt-1 + εt, the theoretical autocorrelation at lag k is ρk = β2^k.
[Figure: correlogram of an AR(1) process with β2 = 0.8, lags 1–19]
For stationary processes the autocorrelation coefficients tend to 0 quite quickly as k
increases. The figure shows the correlogram for an AR(1) process with β2 = 0.8.
Higher order AR(p) processes will exhibit more complex behavior, but if they are stationary,
the coefficients will eventually decline to 0.
TESTING FOR NONSTATIONARITY
Xt = Xt-1 + εt
[Figure: correlogram of a random walk – sample autocorrelations decline only slowly across lags 1–19]
In the case of nonstationary processes, the theoretical autocorrelation coefficients are not defined
but one may be able to obtain an expression for E(rk), the expected value of the sample
autocorrelation coefficients. For long time series, these coefficients decline slowly.
Hence time series analysts can make an initial judgment as to whether a time series is
nonstationary or not by computing its sample correlogram and seeing how quickly the coefficients
decline.
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
A more formal method of detecting nonstationarity is often described as testing for unit roots.
The early and pioneering work on testing for a unit root in time series was done by Dickey and
Fuller (Dickey and Fuller 1979, Fuller 1976). The basic objective of the test is to test the null
hypothesis that β2=1 against the alternative β2<1
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
H0: β2 = 1    H1: β2 < 1
In practice, there will be just two possibilities: β2 = 1, and -1 < β2 < 1. If β2 = 1, the process is
nonstationary because its variance increases with t. If β2 lies between -1 and 1, the variance is
fixed and the series is stationary.
The test is intended to discriminate between the two possibilities. The null hypothesis is that the
process is nonstationary. We need a specific value of β2 when we define the null hypothesis, so
we make H0: β2 = 1. The alternative hypothesis is then H1: β2 < 1.
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
Xt − Xt-1 = β1 + (β2 − 1)Xt-1 + εt
ΔXt = β1 + (β2 − 1)Xt-1 + εt
H0: β2 − 1 = 0
H1: β2 − 1 < 0
Before performing the test, it is convenient to rewrite the model, subtracting Xt-1 from both sides.
To perform the test, we regress ΔXt on Xt-1 and test whether the slope coefficient is significantly
different from 0.
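The rewritten regression can be sketched on simulated series (Python): for a random walk the estimated slope on Xt-1 is near 0 (β2 − 1 = 0), while for a stationary AR(1) with β2 = 0.7 it is near −0.3.

```python
import random
random.seed(7)

def df_slope(x):
    """Slope from regressing delta-X_t on X_(t-1): estimates beta2 - 1."""
    lag = x[:-1]
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    ml, md = sum(lag) / len(lag), sum(dx) / len(dx)
    cov = sum((a - ml) * (b - md) for a, b in zip(lag, dx))
    return cov / sum((a - ml) ** 2 for a in lag)

rw, ar = [0.0], [0.0]
for _ in range(4000):
    rw.append(rw[-1] + random.gauss(0, 1))        # beta2 = 1: random walk
    ar.append(0.7 * ar[-1] + random.gauss(0, 1))  # beta2 = 0.7: stationary
print(round(df_slope(rw), 2), round(df_slope(ar), 2))  # near 0 vs. near -0.3
```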
TESTING FOR NONSTATIONARITY
[Figure: NSE Nifty daily series over the last two years]
Here is the NSE Nifty daily series from the last two years. It is clearly nonstationary, but nevertheless we
will perform a formal test.
In EViews you can perform a test of nonstationarity by clicking on the name of the series to be tested,
clicking on the View tab in the window that opens, and then clicking on Unit Root Test on the menu that
appears. The next slide shows the output of the test for Nifty.
ΔXt = β1 + β4t + (β2 + β3 − 1)Xt-1 − β3ΔXt-1 + εt
The key items are the coefficient of Xt-1, here Nifty(-1), and its t statistic. The coefficient is close
to 0, as it would be under the null hypothesis of nonstationarity.
The t statistic is reproduced at the top of the output, where it is described as the Augmented
Dickey-Fuller test statistic.
EViews calculates the critical values for you. In this case you would not reject the null hypothesis
that Nifty is a nonstationary series. The test result thus corroborates the conclusion we drew
looking at the graph.
TESTING FOR NONSTATIONARITY
[Figure: DNIF, the first differences of the Nifty daily series]
Here is the series of first differences of the Nifty daily series. Does it look stationary or
nonstationary?
The coefficient is far from 0 and it has a high t statistic. We can reject the null hypothesis of
nonstationarity at the 1 percent level. Thus it would appear that Nifty can be rendered stationary by
differencing once, and hence it is I(1).
Different Time-Series Processes
Autoregressive Model
Yt depends only on its own past values Yt-1, Yt-2, Yt-3, …
A common representation: Yt depends on p of its past values; called an AR(p) model
Yt = α + φ1Yt-1 + φ2Yt-2 + φ3Yt-3 + … + φpYt-p + εt
Example: This year's income depends on past incomes
Moving Average Model
Yt depends only on the random error terms, which follow a white noise process
A common representation: Yt depends on q of its past error terms; called an MA(q) model
Yt = μ + εt − θ1εt-1 − θ2εt-2 − … − θqεt-q
Example: The number of patients discharged on a day depends on how many patients were admitted
and stayed for one day, two days or three days.
Autoregressive Moving Average Model
The time-series may be represented as a mix of both AR and MA model referred as ARMA
(p,q)
general form of such a time-series model, which depends on p of its own past values and q
past values of random disturbances
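The theoretical autocorrelation functions make the AR/MA distinction concrete (a Python sketch with assumed parameter values): an AR(1) ACF decays geometrically, while an MA(1) ACF cuts off after lag 1.

```python
phi, theta = 0.8, 0.5                              # assumed AR(1) and MA(1) parameters
ar_acf = [phi ** k for k in range(1, 6)]           # rho_k = phi^k: geometric decay
ma_acf = [-theta / (1 + theta ** 2)] + [0.0] * 4   # Y_t = mu + e_t - theta*e_(t-1)
print([round(r, 2) for r in ar_acf])               # [0.8, 0.64, 0.51, 0.41, 0.33]
print([round(r, 2) for r in ma_acf])               # [-0.4, 0.0, 0.0, 0.0, 0.0]
```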
ImpliedVolatility
Implicit in the market price of an option
Market’s forecast of volatility of the underlying asset over the life of
option
Best guess of volatility, if the market is efficient and the pricing model is
appropriate
Problems with Implied volatility
ARE THESE THE SAME?
Return (%)   Portfolio #1 prob. (%)   Portfolio #2 prob. (%)
-15          5                        0
-10          8                        0
-5           12                       25
0            16                       35
5            18                       10
10           16                       7
15           12                       9
20           8                        5
25           5                        3
30           0                        3
35           0                        3
Notice that the expected return for both of these portfolios is 5%.
1. Variance
As seen earlier, this is the traditional measure of risk, calculated as the probability-weighted sum of
squared deviations of the potential returns from the expected return of the distribution.
2. Semi-variance
The semi-variance adjusts the variance by considering only those potential outcomes that fall
below the expected return. The semi-standard deviations can be derived as the square roots of the
semi-variances.
Notice that although Portfolio #2 has a higher standard deviation than Portfolio #1, its semi-
standard deviation is smaller.
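The comparison can be verified directly from the two distributions (Python sketch; probabilities read off the table above, as fractions):

```python
rets = [-15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35]
p1 = [0.05, 0.08, 0.12, 0.16, 0.18, 0.16, 0.12, 0.08, 0.05, 0.00, 0.00]
p2 = [0.00, 0.00, 0.25, 0.35, 0.10, 0.07, 0.09, 0.05, 0.03, 0.03, 0.03]

def moments(probs):
    mean = sum(p * r for p, r in zip(probs, rets))
    var = sum(p * (r - mean) ** 2 for p, r in zip(probs, rets))
    semi = sum(p * (r - mean) ** 2 for p, r in zip(probs, rets) if r < mean)
    return mean, var, semi

m1, v1, s1 = moments(p1)
m2, v2, s2 = moments(p2)
print(m1, m2)   # identical expected returns of 5%
print(v1, v2)   # Portfolio #2 has the larger variance...
print(s1, s2)   # ...but the smaller semi-variance
```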
Regression Model
Yi = β0 + β1X1i + β2X2i + Ui
Homoskedasticity: Var(Ui) = σ²
Heteroskedasticity: Var(Ui) = σi²
A larger variance when values of some Xi (or the Yi's themselves)
are large (or small)
What's the problem with the aforementioned remedies?
For finance people, variance is a measure of risk, and risk takes a 'central' part in any procedure.
Changes in risk are almost the order of the day, so explicitly modeling how risk (volatility/variance)
changes is of utmost need.
ARCH(1): ht = α0 + α1u²t-1
Since the conditional variance needs to be nonnegative, the conditions α0 > 0 and α1 ≥ 0 have to be met. If α1 = 0,
then the conditional variance is constant and the series is conditionally homoscedastic.
Can be generalized to 'p' lags: the ARCH(p) model
Then what is GARCH?
Due to Bollerslev (1986): GENERALIZED ARCH
GARCH(1,1): ht = ω + α1u²t-1 + β1ht-1
Today's variance is a weighted average of:
the long-run average variance
yesterday's variance forecast (β1, the persistence coefficient)
the news, i.e., yesterday's squared shock (α1, the information-absorption coefficient)
Non-negativity constraints: ω > 0; α1 ≥ 0; β1 ≥ 0
Stationarity constraint: α1 + β1 < 1
GARCH(1,1) is a parsimonious representation of ARCH(p)
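A minimal GARCH(1,1) recursion with illustrative coefficient values (Python sketch) shows the three ingredients and the implied long-run variance ω/(1 − α1 − β1):

```python
import random
random.seed(3)

w, alpha1, beta1 = 0.1, 0.1, 0.8     # illustrative values; alpha1 + beta1 < 1
long_run = w / (1 - alpha1 - beta1)  # unconditional (long-run) variance = 1.0 here
h, u = long_run, 0.0
for _ in range(5):
    h = w + alpha1 * u ** 2 + beta1 * h  # h_t = w + alpha1*u_(t-1)^2 + beta1*h_(t-1)
    u = random.gauss(0, 1) * h ** 0.5    # new shock drawn with conditional variance h
    print(round(h, 3))
```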
1. First estimate the mean equation and obtain the residuals.
2. Then square the residuals, and examine the autocorrelation in the squared residuals.
3. If the variance of the residuals is constant, then we should not see any autocorrelation in
the squared residuals.
4. If the GARCH model is appropriate, then the squared residuals obtained after fitting the
GARCH model should not have any autocorrelation.
Generalised ARCH (GARCH) Models
σt² = α0 + Σi=1..q αi u²t-i + Σj=1..p βj σ²t-j
Covariance stationary when Σαi + Σβj < 1
Estimation is by maximum likelihood: the method works by finding the most likely values of the
parameters given the actual data.
GJR-GARCH Model
(Glosten, Jagannathan, and Runkle)
σt² = α0 + α1u²t-1 + β1σ²t-1 + γu²t-1It-1
It-1 = 1 if ut-1 < 0 (bad news)
It-1 = 0 otherwise (good news)
[Figure: news impact curves for Nifty returns using coefficients from GARCH and GJR model
estimates – value of the conditional variance plotted against the value of the lagged shock, from -1 to 1]
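The asymmetry in the news impact curve can be reproduced with hypothetical coefficients (Python sketch): the GJR term γu²t-1It-1 raises the conditional variance only for negative shocks.

```python
w, alpha1, beta1, gamma = 0.01, 0.05, 0.9, 0.1  # hypothetical coefficients
h_lag = 0.05                                    # lagged variance held fixed for the curve

def garch_nic(u):   # symmetric GARCH news impact curve
    return w + alpha1 * u ** 2 + beta1 * h_lag

def gjr_nic(u):     # GJR adds gamma*u^2 only for bad news (u < 0)
    return w + alpha1 * u ** 2 + (gamma * u ** 2 if u < 0 else 0.0) + beta1 * h_lag

print(gjr_nic(-0.5), gjr_nic(0.5))  # a bad-news shock raises variance more
```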
High Risk – High Return : GARCH-in-Mean model
This type of model introduces the conditional variance (or standard
deviation) into the mean equation.
These are often used in asset return equations, where both return and
risk are to be considered.
If the coefficient on this risk variable is positive and significant, it
shows that increased risk leads to a higher return.
R̂Bt = 0.7 + 0.3σt-1
      (0.7)  (0.1)
σ̂t² = 0.8 + 0.5u²t-1 + 0.4σ²t-1
      (0.4)  (0.3)     (0.1)
(standard errors in parentheses)
The positive sign and significant t-statistic on σt-1 indicate that the risk of the
bond leads to a higher return.
Comovement of volatility – GARCH-X
RVt = α + β·VFt + εt
When can you say VF is an "unbiased" estimate of realized/actual volatility? In this regression:
when α = 0 and β = 1.
[Figure: historical fluctuations of India VIX and Nifty 50, 4-Mar-2008 to 31-Aug-2015]
Date         Nifty 50 change (%)   India VIX change (%)
24-08-2015   -6.10                 64.36
22-09-2011   -4.17                 22.44
27-08-2013   -3.51                 11.52
24-02-2011   -3.26                 6.21
27-01-2010   -3.14                 12.01
06-01-2015   -3.04                 23.09
19-05-2010   -2.94                 20.72
20-06-2013   -2.90                 3.95
So, does India VIX (India VIX futures were introduced by NSE on 26 February 2014)
provide an excellent hedge in extreme market conditions?
India VIX is a forward-looking volatility index – so does India VIX predict Nifty
movements?
Intraday Volatility Pattern – looks like ‘U’ shape
Converting irregular time-series data to equi-distant time-series data