K Kiran Kumar
IIM Indore
Recap….
Regression objective
To explain the variation in the dependent variable with the help of independent variables
To assess the relative contribution of each independent variable in explaining the variation in
dependent variable
Reasons for including Error term
Data unavailability; Poor proxy of data; vagueness in theory; intrinsic randomness; to keep the
equation simple.
Matrix representation of regression equation
Assumptions on Errors
Estimation of BETA – Minimizing Error Sum of Squares
Sensitivity of assumptions on errors
Expected value and Variance of Estimated BETA
Sensitivity of assumptions on errors
Linearity of Regression Model
Goodness of Fit – Adjusted R-square
Regression problem sets
R Lab session 1 – R Studio
Multicollinearity -- Model specification bias
Dummy variables as Independent variables
Projects: Is Gold safe haven?
R Lab Session 2 – Intraday Trades data set + Regression
Dummy Variables as Independent variables
Intercept dummy variables
In the presence of an intercept, include one less dummy than the number of
categories. The estimated coefficients of the dummy variables are the
incremental intercepts (vis-à-vis the base category, which takes zero).
If the intercept is absent, then include as many dummies as the number of categories.
The estimated coefficients of the dummy variables are the marginal (level) intercepts
for each category.
Slope (interactive) dummy variables
Interactive term = quantitative independent variable (i.e., the slope variable) × categorical
variable (i.e., the dummy variable)
If the slope variable is present, then include one less than the number of categories as
interactive terms. The estimated coefficients are the incremental slopes.
If the slope variable is absent, then include as many interactive terms as the number of
categories; the coefficients are the slopes for each category.
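A quick numerical sketch of the intercept-dummy case (Python, with made-up category values): in a regression of Y on an intercept plus category dummies alone, OLS fits the group means, so the dummy coefficients are exactly the incremental intercepts relative to the base category.

```python
# Illustrative data: three categories A (base), B, C with hypothetical Y values.
data = {"A": [10, 12, 14], "B": [20, 22, 24], "C": [5, 7, 9]}

# In Y = b0 + b_B*D_B + b_C*D_C + u, OLS recovers:
means = {g: sum(v) / len(v) for g, v in data.items()}
b0 = means["A"]         # intercept = mean of the base category (both dummies zero)
b_B = means["B"] - b0   # incremental intercept for B vs. A
b_C = means["C"] - b0   # incremental intercept for C vs. A
print(b0, b_B, b_C)     # 12.0 10.0 -5.0
```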
Seasonality
To account for seasonality effects in the dependent variable
Seasonal effects in financial markets have been widely observed and are often termed
“calendar anomalies”
One way to cope with this is the inclusion of dummy variables – e.g., for
quarterly data, we could have 4 dummy variables.
How many dummy variables do we need? We need one less than the
"seasonality" of the data when an intercept is included – e.g., for a quarterly series,
consider what happens if we use all 4 dummies together with the intercept: perfect
multicollinearity (the dummy variable trap).
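What goes wrong with all 4 dummies can be checked directly (a minimal sketch with a hypothetical run of quarters): the four quarterly dummy columns sum to the intercept column, so the regressors are perfectly collinear.

```python
# Two years of quarterly observations (hypothetical ordering of quarters).
quarters = [1, 2, 3, 4, 1, 2, 3, 4]
# Four dummy columns D1..D4, one per quarter.
D = [[1 if q == k else 0 for q in quarters] for k in (1, 2, 3, 4)]
ones = [1] * len(quarters)                 # the intercept column
row_sums = [sum(vals) for vals in zip(*D)] # D1t + D2t + D3t + D4t at each t
print(row_sums == ones)                    # True: exact linear dependence with the intercept
```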
Seasonalities in South East Asian Stock Returns
Brooks and Persand (2001) examine the evidence for a day-of-the-week effect in
five Southeast Asian stock markets: South Korea, Malaysia, the Philippines,
Taiwan and Thailand.
The data are on a daily close-to-close basis for all weekdays (Mondays to
Fridays) falling in the period 31 December 1989 to 19 January 1996 (a total of
1581 observations).
They use daily dummy variables for the day of the week effects in the
regression:
rt = γ1D1t + γ2D2t + γ3D3t + γ4D4t + γ5D5t + ut
Then the coefficients can be interpreted as the average return on each day of the
week.
(p. 157 of the paper)
It is possible that the different returns on different days of the week could be a
result of different levels of risk on different days.
To allow for this, Brooks and Persand re-estimate the model allowing for
different betas on different days of the week using slope dummies:
rt = Σi=1..5 (γiDit + δiDitRWMt) + ut
where Dit is the ith dummy variable taking the value 1 for day t = i and zero
otherwise, and RWMt is the return on the world market index.
Now both risk and return are allowed to vary across the days of the week.
When Dummy Variables may be used as Dependent variables…
There are numerous examples of instances where this may arise, for example where we
want to model:
Why firms choose to list their shares on the NASDAQ rather than the NYSE
Why some stocks pay dividends while others do not
What factors affect whether countries default on their sovereign debt
Why firms choose to issue new stock to finance an expansion while others issue bonds
Why some firms choose to engage in stock splits while others do not.
It is fairly easy to see in all these cases that the appropriate form for the dependent
variable would be a 0-1 dummy variable since there are only two possible outcomes.
There are, of course, also situations where it would be more useful to allow the
dependent variable to take on other values.
The Linear Probability Model
We will first examine a simple and obvious, but unfortunately flawed, method
for dealing with binary dependent variables, known as the linear probability
model.
It is based on the assumption that the probability of an event occurring, Pi, is
linearly related to a set of explanatory variables:
Pi = p(yi = 1) = β1 + β2x2i + β3x3i + … + βkxki + ui
The slope estimates for the linear probability model can be interpreted as the
change in the probability that the dependent variable will equal 1 for a one-
unit change in a given explanatory variable, holding the effect of all other
explanatory variables fixed.
Suppose, for example, that we wanted to model the probability that firm i
will pay a dividend, p(yi = 1), as a function of its market capitalisation (x2i,
measured in millions of US dollars), and we fit the following line:
P̂i = −0.3 + 0.012x2i
For any firm whose value is less than $25m, the model-predicted probability
of dividend payment is negative, while for any firm worth more than $108m,
the probability is greater than one.
An obvious response is to truncate the fitted probabilities at zero and one; however, there are at least two reasons why this is still not adequate.
The process of truncation will result in too many observations for which the
estimated probabilities are exactly zero or one.
More importantly, it is simply not plausible to suggest that the firm's probability
of paying a dividend is either exactly zero or exactly one. Are we really certain
that very small firms will definitely never pay a dividend and that large firms will
always make a payout?
Probably not, and so a different kind of model is usually used for binary
dependent variables: either a logit or a probit specification.
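The contrast can be sketched numerically (Python). The LPM line below is the fitted dividend equation from the example above; the logistic curve reuses the same coefficients purely for illustration (they are not estimates from any model):

```python
import math

def lpm(x):     # fitted LPM from the dividend example: P-hat = -0.3 + 0.012 * cap ($m)
    return -0.3 + 0.012 * x

def logistic(x):  # same coefficients pushed through a logistic function (illustrative only)
    return 1 / (1 + math.exp(-(-0.3 + 0.012 * x)))

print(lpm(10), lpm(150))          # negative, and greater than one: outside [0, 1]
print(logistic(10), logistic(150))  # both strictly between 0 and 1
```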
Disadvantages of the Linear Probability Model
The LPM also suffers from a couple of more standard econometric problems that
we have examined in previous chapters.
Since the dependent variable takes only one of two values, for given (fixed in
repeated samples) values of the explanatory variables, the disturbance term will
also take only one of two values.
Hence the error term cannot plausibly be assumed to be normally distributed.
Since the disturbance term changes systematically with the explanatory variables,
it will also be heteroscedastic.
It is therefore essential that heteroscedasticity-robust standard errors are always used
in the context of limited dependent variable models.
Logit and Probit: Better Approaches
With the logistic model, 0 and 1 are asymptotes to the function, and thus the
probabilities will never actually fall to exactly zero or rise to one, although
they may come infinitesimally close. Under the logit model,
Pi = 1 / (1 + e^(−zi)), where zi = β1 + β2x2i + … + βkxki.
The logit model is not linear (and cannot be made linear by a transformation)
and thus is not estimable using OLS; it is estimated by maximum likelihood.
If your outcome is categorical, you need to use:
Binomial logistic regression analysis (dichotomous outcome)
Multinomial logistic regression analysis (polytomous outcome)
Ordinal logistic regression analysis (ordinal outcome)
If you have more predictors than you can deal with: create taxonomies of fitted
models and compare, or form composites of the indicators of any common construct.
If your outcome vs. predictor relationship is non-linear: transform the outcome or
predictor, or use non-linear regression analysis.
If time is a predictor, you need discrete-time survival analysis.
Logistic Regression in R
setwd("D:/pgp/201819/FAuR/data")
admit=read.csv("binary.csv")
head(admit)
tail(admit)
summary(admit)
sapply(admit,sd) # sapply is to apply a function to all columns in a dataset..here sd is stdev
admit$rank=factor(admit$rank) # converting integer variable into categorical variable
xtabs(~admit+rank,data=admit) #contingency table
mylogit <- glm(admit ~ gre + gpa + rank, data = admit, family = "binomial") # estimate the logistic regression
summary(mylogit)
For every one-unit increase in gre, the log odds of admission (versus non-admission) increases by 0.002.
For a one-unit increase in gpa, the log odds of being admitted to graduate school increases by 0.804.
The indicator variables for rank have a slightly different interpretation. For example, having attended an
undergraduate institution with a rank of 2, versus an institution with a rank of 1, changes the log odds of
admission by −0.675.
coef1=exp(coef(mylogit)) # odds ratios
coef2=coef1/(1+coef1) # odds converted to probabilities (exact only for the intercept term)
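The odds-ratio conversion done by `coef1` above can be illustrated by hand (Python, using the coefficients quoted in the text):

```python
import math

b_gpa, b_rank2 = 0.804, -0.675  # log-odds coefficients from the fitted model above
or_gpa = math.exp(b_gpa)        # ~2.23: odds of admission multiply by about 2.2 per unit of gpa
or_rank2 = math.exp(b_rank2)    # ~0.51: rank-2 schools have about half the odds of rank-1
print(round(or_gpa, 2), round(or_rank2, 2))
```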
Time Series Econometrics
Time Series Data?
What is a time series?
Anything observed sequentially (by time?)
Returns, volatility, interest rates, exchange rates, bond yields, …
Trade and Order book
How is it different?
The observations are not independent.
There is correlation from observation to observation / time to time.
Observations are equally spaced in time, with no missing values
How do we determine correlation in time series data?
Autocorrelation at lag h: ρ(h) = Cov(X(t), X(t+h)) / Var(X(t))
Substitute sample estimates of the covariance between X(t) and X(t+h). Note: we do not
have "n" pairs but "n−h" pairs.
Substitute the sample estimate of the variance.
[Figure: a quarterly economic time series, Jan 1960 to Jan 1980]
[Figures: a sample series with its sample ACF plots, lags 0–15]
Correlogram
Plot of Autocorrelation against its lag
[Figure: plots of residuals ut over time]
Positive autocorrelation: an "attracting" pattern – successive residuals cluster on the same side of zero.
No autocorrelation: residuals are scattered randomly around zero.
Negative autocorrelation: a "reversing" pattern – residuals tend to alternate in sign.
What do our day-to-day economic series look like?
[Figures: share price and exchange rate series trending over time rather than fluctuating around a fixed level]
Normal (and t, F, etc.) distributions have a constant mean
If the variables in the regression model are not stationary, then it can be proved
that the standard assumptions for asymptotic analysis will not be valid. In other
words, the usual “t-ratios” will not follow a t-distribution, so we cannot validly
undertake hypothesis tests about the regression parameters.
Market Efficiency – Speed of convergence to Mkt Efficiency
Testing for Market Efficiency
Significance of past lag values
Yes – Market is inefficient
No – Market is Efficient
How fast does the market become efficient?
Run the regressions at daily, hourly, 30-, 15- and 5-min frequency
Check the significance of past lag values
Running multiple regressions at a go
library(quantmod)
getSymbols("INFY.NS",from='2015-01-01')
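A minimal sketch of the lag-significance idea on simulated returns (Python; the lab itself uses quantmod with real Nifty data): an "efficient" i.i.d. return series shows a first-lag slope near zero, while a sluggishly adjusting series shows a sizeable one.

```python
import random
random.seed(1)

def lag1_slope(r):
    """OLS slope of r_t on r_(t-1): significance of the past lag."""
    x, y = r[:-1], r[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

iid = [random.gauss(0, 1) for _ in range(5000)]   # "efficient": returns unpredictable
slow = [0.0]
for _ in range(4999):                             # sluggish adjustment: AR(1) with slope 0.4
    slow.append(0.4 * slow[-1] + random.gauss(0, 1))

print(round(lag1_slope(iid), 2), round(lag1_slope(slow), 2))  # near 0 vs. near 0.4
```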
•A time series Xt is said to be stationary if its expected value and population variance are independent of
time and if the population covariance between its values at time t and time t+s depends on s but not on t.
• Called Covariance Stationary or Weak Stationary
• Any example?
Xt = β2Xt-1 + εt,  −1 < β2 < 1
An example of a stationary time series is an AR(1) process Xt = β2Xt-1 + εt, provided that −1 < β2 < 1,
where εt is a random variable with mean 0 and constant variance and not subject to
autocorrelation.
STATIONARY PROCESSES: AR(1) series
Xt = β2Xt-1 + εt,  −1 < β2 < 1
Xt-1 = β2Xt-2 + εt-1
This can easily be demonstrated. If the relationship is valid for time period t, it is also valid for
time period t-1.
STATIONARY PROCESSES: AR(1) series
Xt = β2(β2Xt-2 + εt-1) + εt = β2²Xt-2 + β2εt-1 + εt
Substituting for Xt-1 in the first equation, one obtains Xt in terms of Xt-2, εt, and εt-1.
STATIONARY PROCESSES: AR(1) series
Xt = β2^t X0 + β2^(t-1)ε1 + … + β2εt-1 + εt
Continuing this process of lagging and substituting, one obtains Xt in terms of X0 and the
innovations ε1, ..., εt.
STATIONARY PROCESSES
E(Xt) = β2^t X0
The expected value of each innovation is 0. Hence the expected value of Xt is β2^t X0, which
tends to 0 as t increases. Thus E(Xt) is ultimately independent of t.
STATIONARY PROCESSES
Var(Xt) = Var(β2^t X0 + β2^(t-1)ε1 + … + β2εt-1 + εt)
β2^t X0 is an additive constant and therefore does not affect the variance.
The innovations are assumed to be generated independently of each other, and hence their
population covariances are 0, so the variance of the sum is the sum of the variances.
Var(Xt) = β2^(2(t-1))σε² + … + β2²σε² + σε²
The β2 factors are squared when taken out of the variance terms.
STATIONARY PROCESSES
Var(Xt) = σε²(1 + β2² + β2⁴ + … + β2^(2(t-1))) = σε²(1 − β2^2t) / (1 − β2²)
As t increases, Var(Xt) tends to σε² / (1 − β2²).
The term β2^2t in the numerator tends to 0 as t increases, so we have demonstrated that the
variance is also independent of time.
STATIONARY PROCESSES
Next we will consider the population covariance between Xt and Xt+s. It is convenient to
start by writing Xt+s in terms of Xt and the innovations εt+1, ..., εt+s. This is done by lagging
and substituting, as before:
Xt+s = β2^s Xt + β2^(s-1)εt+1 + … + εt+s
Now Xt is fixed at time t and is therefore independent of the innovations after time t. Hence the
population covariance of Xt and Xt+s reduces to the population covariance of Xt and β2^s Xt. Thus it
is equal to the population variance of Xt multiplied by β2^s, which depends on s but not on t.
STATIONARY PROCESSES
[Figure: a simulated AR(1) series fluctuating around zero]
Here is a series generated by this process with β2 = 0.7 and random numbers for the innovations.
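The variance derivation above can be checked by simulation (a Python sketch): for β2 = 0.7 and σε = 1, the variance of Xt should settle near σε²/(1 − β2²) ≈ 1.96.

```python
import random
random.seed(42)

beta2, sigma = 0.7, 1.0
n, reps = 200, 2000
# Simulate many AR(1) paths X_t = beta2*X_(t-1) + eps_t and look at Var(X_n).
endpoints = []
for _ in range(reps):
    x = 0.0
    for _ in range(n):
        x = beta2 * x + random.gauss(0, sigma)
    endpoints.append(x)

m = sum(endpoints) / reps
sample_var = sum((e - m) ** 2 for e in endpoints) / reps
print(round(sample_var, 2), round(sigma ** 2 / (1 - beta2 ** 2), 2))  # both close
```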
Non-Stationary Time Series
Random walk
Xt = Xt-1 + εt
Xt = X0 + ε1 + … + εt-1 + εt
The condition −1 < β2 < 1 was crucial for stationarity. If β2 is equal to 1, the series becomes a
nonstationary process known as a random walk.
It will be assumed, as before, that the innovations are generated independently from a
fixed distribution with mean 0 and population variance σε².
If the process starts at X0 at time 0, its value at time t is given by X0 plus the sum of the
innovations in periods 1 to t.
NONSTATIONARY PROCESSES
Random walk
E(Xt) = X0 + E(ε1) + … + E(εt) = X0
If expectations are taken at time 0, the expected value at any future time t is fixed at X0 because the
expected values of the future innovations are all 0. Thus E(Xt) is independent of t and the first
condition for stationarity remains satisfied.
NONSTATIONARY PROCESSES
However, the condition that the variance of Xt be independent of time is not satisfied.
Var(Xt) = Var(X0 + ε1 + … + εt) = Var(ε1) + … + Var(εt)
The variance of Xt is equal to the sum of the variances of the innovations. X0 may be
dropped from the expression because it is an additive constant.
NONSTATIONARY PROCESSES
Var(Xt) = tσε²
The variance of each innovation is equal to σε², by assumption. Hence the population variance of
Xt is directly proportional to t. Its distribution becomes wider and flatter, the further one looks into
the future.
NONSTATIONARY PROCESSES
[Figure: a simulated random walk wandering away from zero]
The chart shows a typical random walk. If it were a stationary process, there would be a
tendency for the series to return to 0 periodically. Here there is no such tendency.
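The contrast with the stationary case can also be simulated (Python sketch): for a random walk with σ = 1, Var(Xt) grows in proportion to t instead of settling down.

```python
import random
random.seed(0)

reps = 3000
for t in (25, 100):                  # Var(X_t) = t * sigma^2 should be ~25 and ~100
    finals = []
    for _ in range(reps):
        x = 0.0
        for _ in range(t):
            x += random.gauss(0, 1)  # X_t = X_(t-1) + eps_t
        finals.append(x)
    m = sum(finals) / reps
    var = sum((f - m) ** 2 for f in finals) / reps
    print(t, round(var, 1))
```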
Some nonstationary series
TESTING FOR NONSTATIONARITY
For an AR(1) process Xt = β2Xt-1 + εt, the theoretical autocorrelation at lag k is ρk = β2^k.
[Figure: correlogram of an AR(1) process with β2 = 0.8, lags 1–19]
For stationary processes the autocorrelation coefficients tend to 0 quite quickly as k
increases. The figure shows the correlogram for an AR(1) process with β2 = 0.8.
Higher order AR(p) processes will exhibit more complex behavior, but if they are stationary,
the coefficients will eventually decline to 0.
TESTING FOR NONSTATIONARITY
Xt = Xt-1 + εt
[Figure: correlogram of a random walk – sample autocorrelations decline only slowly across lags 1–19]
In the case of nonstationary processes, the theoretical autocorrelation coefficients are not defined
but one may be able to obtain an expression for E(rk), the expected value of the sample
autocorrelation coefficients. For long time series, these coefficients decline slowly.
Hence time series analysts can make an initial judgment as to whether a time series is
nonstationary or not by computing its sample correlogram and seeing how quickly the coefficients
decline.
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
A more formal method of detecting nonstationarity is often described as testing for unit roots.
The early and pioneering work on testing for a unit root in time series was done by Dickey and
Fuller (Dickey and Fuller 1979, Fuller 1976). The basic objective of the test is to test the null
hypothesis that β2=1 against the alternative β2<1
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
H0: β2 = 1    H1: β2 < 1
In practice, there will be just two possibilities: β2 = 1, and -1 < β2 < 1. If β2 = 1, the process is
nonstationary because its variance increases with t. If β2 lies between -1 and 1, the variance is
fixed and the series is stationary.
The test is intended to discriminate between the two possibilities. The null hypothesis is that the
process is nonstationary. We need a specific value of β2 when we define the null hypothesis, so
we make H0: β2 = 1. The alternative hypothesis is then H1: β2 < 1.
TESTING FOR NONSTATIONARITY
Xt = β1 + β2Xt-1 + εt
Xt − Xt-1 = β1 + (β2 − 1)Xt-1 + εt
ΔXt = β1 + (β2 − 1)Xt-1 + εt
H0: β2 − 1 = 0
H1: β2 − 1 < 0
Before performing the test, it is convenient to rewrite the model, subtracting Xt-1 from both sides.
To perform the test, we regress ΔXt on Xt-1 and test whether the slope coefficient is significantly
different from 0.
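The rewritten regression can be sketched on simulated series (Python): for a random walk the estimated slope on Xt-1 is near 0 (β2 − 1 = 0), while for a stationary AR(1) with β2 = 0.7 it is near −0.3.

```python
import random
random.seed(7)

def df_slope(x):
    """Slope from regressing delta-X_t on X_(t-1): estimates beta2 - 1."""
    lag = x[:-1]
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    ml, md = sum(lag) / len(lag), sum(dx) / len(dx)
    cov = sum((a - ml) * (b - md) for a, b in zip(lag, dx))
    return cov / sum((a - ml) ** 2 for a in lag)

rw, ar = [0.0], [0.0]
for _ in range(4000):
    rw.append(rw[-1] + random.gauss(0, 1))        # beta2 = 1: random walk
    ar.append(0.7 * ar[-1] + random.gauss(0, 1))  # beta2 = 0.7: stationary
print(round(df_slope(rw), 2), round(df_slope(ar), 2))  # near 0 vs. near -0.3
```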
TESTING FOR NONSTATIONARITY
[Figure: NSE Nifty daily series over the last two years]
Here is the NSE Nifty daily series from the last two years. It is clearly nonstationary, but nevertheless we
will perform a formal test.
In EViews you can perform a test of nonstationarity by clicking on the name of the series to be tested,
clicking on the View tab in the window that opens, and then clicking on Unit Root Test on the menu that
appears. The next slide shows the output of the test for Nifty.
ΔXt = β1 + β4t + (β2 + β3 − 1)Xt-1 − β3ΔXt-1 + εt
The key items are the coefficient of Xt-1, here Nifty(-1), and its t statistic. The coefficient is close
to 0, as it would be under the null hypothesis of nonstationarity.
The t statistic is reproduced at the top of the output, where it is described as the Augmented
Dickey-Fuller test statistic.
EViews calculates the critical values for you. In this case you would not reject the null hypothesis
that Nifty is a nonstationary series. The test result thus corroborates the conclusion we drew
looking at the graph.
TESTING FOR NONSTATIONARITY
[Figure: DNIF, the first differences of the Nifty daily series]
Here is the series of first differences of the Nifty daily series. Does it look stationary or
nonstationary?
The coefficient is far from 0 and it has a high t statistic. We can reject the null hypothesis of
nonstationarity at the 1 percent level. Thus it would appear that Nifty can be rendered stationary by
differencing once, and hence it is I(1).
Different Time-Series Processes
Autoregressive Model
Yt depends only on its own past values Yt-1, Yt-2, Yt-3, …
A common representation: Yt depends on p of its past values; called an AR(p) model
Yt = α + φ1Yt-1 + φ2Yt-2 + φ3Yt-3 + … + φpYt-p + εt
Example: This year's income depends on past incomes
Moving Average Model
Yt depends only on the random error terms, which follow a white noise process
A common representation: Yt depends on q of its past error terms; called an MA(q) model
Yt = μ + εt − θ1εt-1 − θ2εt-2 − … − θqεt-q
Example: The number of patients discharged on a day depends on how many patients were admitted
and stayed for one day, two days or three days.
Autoregressive Moving Average Model
The time-series may be represented as a mix of both AR and MA model referred as ARMA
(p,q)
general form of such a time-series model, which depends on p of its own past values and q
past values of random disturbances
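The theoretical autocorrelation functions make the AR/MA distinction concrete (a Python sketch with assumed parameter values): an AR(1) ACF decays geometrically, while an MA(1) ACF cuts off after lag 1.

```python
phi, theta = 0.8, 0.5                              # assumed AR(1) and MA(1) parameters
ar_acf = [phi ** k for k in range(1, 6)]           # rho_k = phi^k: geometric decay
ma_acf = [-theta / (1 + theta ** 2)] + [0.0] * 4   # Y_t = mu + e_t - theta*e_(t-1)
print([round(r, 2) for r in ar_acf])               # [0.8, 0.64, 0.51, 0.41, 0.33]
print([round(r, 2) for r in ma_acf])               # [-0.4, 0.0, 0.0, 0.0, 0.0]
```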
ImpliedVolatility
Implicit in the market price of an option
Market’s forecast of volatility of the underlying asset over the life of
option
Best guess of volatility, if the market is efficient and the pricing model is
appropriate
Problems with Implied volatility
ARE THESE THE SAME?
Return (%)   Portfolio #1 prob. (%)   Portfolio #2 prob. (%)
-15          5                        0
-10          8                        0
-5           12                       25
0            16                       35
5            18                       10
10           16                       7
15           12                       9
20           8                        5
25           5                        3
30           0                        3
35           0                        3
Notice that the expected return for both of these portfolios is 5%.
1. Variance
As seen earlier, this is the traditional measure of risk, calculated as the probability-weighted sum of
squared deviations of the potential returns from the expected return of the distribution.
2. Semi-variance
The semi-variance adjusts the variance by considering only those potential outcomes that fall
below the expected return. The semi-standard deviations can be derived as the square roots of the
semi-variances.
Notice that although Portfolio #2 has a higher standard deviation than Portfolio #1, its semi-
standard deviation is smaller.
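The comparison can be verified directly from the two distributions (Python sketch; probabilities read off the table above, as fractions):

```python
rets = [-15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35]
p1 = [0.05, 0.08, 0.12, 0.16, 0.18, 0.16, 0.12, 0.08, 0.05, 0.00, 0.00]
p2 = [0.00, 0.00, 0.25, 0.35, 0.10, 0.07, 0.09, 0.05, 0.03, 0.03, 0.03]

def moments(probs):
    mean = sum(p * r for p, r in zip(probs, rets))
    var = sum(p * (r - mean) ** 2 for p, r in zip(probs, rets))
    semi = sum(p * (r - mean) ** 2 for p, r in zip(probs, rets) if r < mean)
    return mean, var, semi

m1, v1, s1 = moments(p1)
m2, v2, s2 = moments(p2)
print(m1, m2)   # identical expected returns of 5%
print(v1, v2)   # Portfolio #2 has the larger variance...
print(s1, s2)   # ...but the smaller semi-variance
```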
Regression Model
Yi = β0 + β1X1i + β2X2i + Ui
Homoskedasticity: Var(Ui) = σ²
Heteroskedasticity: Var(Ui) = σi²
A larger variance when values of some Xi (or the Yi's themselves)
are large (or small)
What's the problem with the aforementioned remedies?
For finance people, variance is a measure of risk, and risk takes a 'central' part in any procedure.
Changes in risk are almost the order of the day, so explicitly modeling how risk (volatility/variance)
changes is of utmost need.
ARCH(1): ht = α0 + α1u²t-1
Since the conditional variance needs to be nonnegative, the conditions α0 > 0 and α1 ≥ 0 have to be met. If α1 = 0,
then the conditional variance is constant and the series is conditionally homoscedastic.
Can be generalized to 'p' lags: the ARCH(p) model
Then what is GARCH?
Due to Bollerslev (1986): GENERALIZED ARCH
GARCH(1,1): ht = ω + α1u²t-1 + β1ht-1
Today's variance is a weighted average of:
the long-run average variance
yesterday's variance forecast (β1, the persistence coefficient)
the news, i.e., yesterday's squared shock (α1, the information-absorption coefficient)
Non-negativity constraints: ω > 0; α1 ≥ 0; β1 ≥ 0
Stationarity constraint: α1 + β1 < 1
GARCH(1,1) is a parsimonious representation of ARCH(p)
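A minimal GARCH(1,1) recursion with illustrative coefficient values (Python sketch) shows the three ingredients and the implied long-run variance ω/(1 − α1 − β1):

```python
import random
random.seed(3)

w, alpha1, beta1 = 0.1, 0.1, 0.8     # illustrative values; alpha1 + beta1 < 1
long_run = w / (1 - alpha1 - beta1)  # unconditional (long-run) variance = 1.0 here
h, u = long_run, 0.0
for _ in range(5):
    h = w + alpha1 * u ** 2 + beta1 * h  # h_t = w + alpha1*u_(t-1)^2 + beta1*h_(t-1)
    u = random.gauss(0, 1) * h ** 0.5    # new shock drawn with conditional variance h
    print(round(h, 3))
```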
1. First estimate the mean equation and obtain the residuals.
2. Then square the residuals, and examine the autocorrelation in the squared residuals.
3. If the variance of the residuals is constant, then we should not see any autocorrelation in
the squared residuals.
4. If the GARCH model is appropriate, then the squared residuals obtained after fitting the
GARCH model should not have any autocorrelation.
Generalised ARCH (GARCH) Models
σt² = α0 + Σi=1..q αi u²t-i + Σj=1..p βj σ²t-j
Covariance stationary when Σαi + Σβj < 1
Estimation is by maximum likelihood: the method works by finding the most likely values of the
parameters given the actual data.
GJR-GARCH Model
(Glosten, Jagannathan, and Runkle)
σt² = α0 + α1u²t-1 + β1σ²t-1 + γu²t-1It-1
It-1 = 1 if ut-1 < 0 (bad news)
It-1 = 0 otherwise (good news)
[Figure: news impact curves for Nifty returns using coefficients from GARCH and GJR model
estimates – value of the conditional variance plotted against the value of the lagged shock, from -1 to 1]
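The asymmetry in the news impact curve can be reproduced with hypothetical coefficients (Python sketch): the GJR term γu²t-1It-1 raises the conditional variance only for negative shocks.

```python
w, alpha1, beta1, gamma = 0.01, 0.05, 0.9, 0.1  # hypothetical coefficients
h_lag = 0.05                                    # lagged variance held fixed for the curve

def garch_nic(u):   # symmetric GARCH news impact curve
    return w + alpha1 * u ** 2 + beta1 * h_lag

def gjr_nic(u):     # GJR adds gamma*u^2 only for bad news (u < 0)
    return w + alpha1 * u ** 2 + (gamma * u ** 2 if u < 0 else 0.0) + beta1 * h_lag

print(gjr_nic(-0.5), gjr_nic(0.5))  # a bad-news shock raises variance more
```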
High Risk – High Return : GARCH-in-Mean model
This type of model introduces the conditional variance (or standard
deviation) into the mean equation.
These are often used in asset return equations, where both return and
risk are to be considered.
If the coefficient on this risk variable is positive and significant, it
shows that increased risk leads to a higher return.
R̂Bt = 0.7 + 0.3σt-1
      (0.7)  (0.1)
σ̂t² = 0.8 + 0.5u²t-1 + 0.4σ²t-1
      (0.4)  (0.3)     (0.1)
(standard errors in parentheses)
The positive sign and significant t-statistic on σt-1 indicate that the risk of the
bond leads to a higher return.
Comovement of volatility – GARCH-X
RVt = α + β·VFt + εt
When can you say VF is an "unbiased" estimate of realized/actual volatility? In this regression:
when α = 0 and β = 1.
[Figure: historical fluctuations of India VIX and Nifty 50, 4-Mar-2008 to 31-Aug-2015]
Date         Nifty 50 change (%)   India VIX change (%)
24-08-2015   -6.10                 64.36
22-09-2011   -4.17                 22.44
27-08-2013   -3.51                 11.52
24-02-2011   -3.26                 6.21
27-01-2010   -3.14                 12.01
06-01-2015   -3.04                 23.09
19-05-2010   -2.94                 20.72
20-06-2013   -2.90                 3.95
So, does India VIX (India VIX futures were introduced by NSE on 26 February 2014)
provide an excellent hedge in extreme market conditions?
India VIX is a forward-looking volatility index – so does India VIX predict Nifty
movements?
Intraday Volatility Pattern – looks like ‘U’ shape
Converting irregular time-series data to equi-distant time-series data