Markovchain Package
Giorgio Alfredo Spedicato, Tae Seung Kang, Sai Bhargav Yalamanchi, Deepak Yadav
Abstract
The markovchain package aims to fill a gap within the R framework by providing S4
classes and methods for easily handling discrete time Markov chains (homogeneous and
simple inhomogeneous ones) as well as continuous time Markov chains. The S4 classes
for handling and analysing discrete and continuous time Markov chains are presented, as
well as functions and methods for performing probabilistic and statistical analysis. Finally,
some examples in which the package's functions are applied to Economics, Finance and
Natural Sciences topics are shown.
Keywords: discrete time Markov chains, continuous time Markov chains, transition matrices,
communicating classes, periodicity, first passage time, stationary distributions.
1. Introduction
Markov chains represent a class of stochastic processes of great interest for a wide spectrum
of practical applications. In particular, discrete time Markov chains (DTMC) allow modelling
the transition probabilities between discrete states with the aid of matrices. Various R packages
deal with models that are based on Markov chains.
Nevertheless, the R statistical environment (R Core Team 2013) seems to lack a simple package
that coherently defines S4 classes for discrete Markov chains and allows one to perform probabilistic
analysis, statistical inference and applications. For the sake of completeness, markovchain
is the second package specifically dedicated to DTMC analysis, DTMCPack (Nicholson 2013)
being the first one. Nonetheless, the markovchain package (Spedicato 2017) aims to
offer more flexibility in handling DTMC than other existing solutions, providing S4 classes
for both homogeneous and non-homogeneous Markov chains as well as methods suited to
perform statistical and probabilistic analysis.
The markovchain package depends on the following R packages: expm (Goulet, Dutang, Maechler,
Firth, Shapira, Stadelmann, and expm-developers@lists.R-forge.R-project.org 2013) to
perform efficient matrix powers; igraph (Csardi and Nepusz 2006) to perform pretty plotting
of markovchain objects; and matlab (Roebuck 2011), which contains functions for matrix
management and calculations that emulate those within the MATLAB environment. Moreover,
other scientific software provides functions specifically designed to analyze DTMC, e.g.,
Mathematica 9 (Wolfram Research 2013b).
The paper is structured as follows: Section 2 briefly reviews mathematics and definitions
regarding DTMC, Section 3 discusses how to handle and manage Markov chain objects within
the package, Section 4 and Section 5 show how to perform probabilistic and statistical
modelling, while Section 6 presents some applied examples from various fields analyzed by
means of the markovchain package.
The set of possible states S = {s1 , s2 , ..., sr } of Xn can be finite or countable and it is named
the state space of the chain.
The chain moves from one state to another (this change is named either ‘transition’ or ‘step’)
and the probability pij to move from state si to state sj in one step is named transition
probability:
• A state $x$ is transient if $P(T^{x \to x} < +\infty) < 1$ (equivalently $P(T^{x \to x} = +\infty) > 0$).
It is possible to analyze the timing to reach a certain state. The first passage time (or hitting
time) from state $s_i$ to state $s_j$ is the number $T_{ij}$ of steps taken by the chain until it arrives
for the first time at state $s_j$, given that $X_0 = s_i$. The probability distribution of $T_{ij}$ is defined
by Equation 5

$$h_{ij}^{(n)} = Pr(T_{ij} = n) = Pr(X_n = s_j, X_{n-1} \neq s_j, \ldots, X_1 \neq s_j \,|\, X_0 = s_i) \quad (5)$$

and can be found recursively using Equation 6, given that $h_{ij}^{(1)} = p_{ij}$:

$$h_{ij}^{(n)} = \sum_{k \in S - \{s_j\}} p_{ik}\, h_{kj}^{(n-1)}. \quad (6)$$

A commonly used quantity related to $h$ is its average value, i.e. the mean first passage time
(also expected hitting time), namely $\bar{h}_{ij} = \sum_{n=1}^{\infty} n\, h_{ij}^{(n)}$.
If in the definition of the first passage time we let $s_i = s_j$, we obtain the first return time
$T_i = \inf\{n \geq 1 : X_n = s_i \,|\, X_0 = s_i\}$. A state $s_i$ is said to be recurrent if it is visited infinitely
often, i.e., $Pr(T_i < +\infty \,|\, X_0 = s_i) = 1$. Conversely, $s_i$ is called transient if there is a
positive probability that the chain will never return to $s_i$, i.e., $Pr(T_i = +\infty \,|\, X_0 = s_i) > 0$.
Given a time homogeneous Markov chain with transition matrix $P$, a stationary distribution
$z$ is a stochastic row vector such that $z = z \cdot P$, where $0 \leq z_j \leq 1\ \forall j$ and $\sum_j z_j = 1$.
If a DTMC $\{X_n\}$ is irreducible and aperiodic, then it has a limit distribution and this distribution
is stationary. As a consequence, if $P$ is the $k \times k$ transition matrix of the chain and
$z = (z_1, \ldots, z_k)$ is the eigenvector of $P$ such that $\sum_{i=1}^{k} z_i = 1$, then we get

$$\lim_{n \to \infty} P^n = Z, \quad (7)$$

where $Z$ is the matrix having all rows equal to $z$. The stationary distribution of $\{X_n\}$ is
represented by $z$.
Given two absorbing states $s_A$ (source) and $s_B$ (sink), the committor probability $q_j^{(AB)}$ is the
probability that a process starting in state $s_j$ is absorbed in state $s_B$ (rather than $s_A$) (Noé,
Schütte, Vanden-Eijnden, Reich, and Weikl 2009). It can be computed via
In $P$, $p_{11} = 0.5$ is the probability that $X_1 = s_1$ given that we observed $X_0 = s_1$, and so
on. It is easy to see that the chain is irreducible since all the states communicate (it is made
of one communicating class only).
Suppose that the current state of the chain is $X_0 = s_2$, i.e., $x^{(0)} = (0\ 1\ 0)$; then the probability
distribution of states after 1 and 2 steps can be computed as shown in Equations (10) and
(11).

$$x^{(1)} = (0\ 1\ 0) \begin{pmatrix} 0.5 & 0.2 & 0.3 \\ 0.15 & 0.45 & 0.4 \\ 0.25 & 0.35 & 0.4 \end{pmatrix} = (0.15\ 0.45\ 0.4). \quad (10)$$

$$x^{(2)} = x^{(1)} P = (0.15\ 0.45\ 0.4) \begin{pmatrix} 0.5 & 0.2 & 0.3 \\ 0.15 & 0.45 & 0.4 \\ 0.25 & 0.35 & 0.4 \end{pmatrix} = (0.2425\ 0.3725\ 0.385). \quad (11)$$
If, for example, we are interested in the probability of being in state $s_3$ in the second step, then
$Pr(X_2 = s_3 \,|\, X_0 = s_2) = 0.385$.
R> library("markovchain")
The markovchain and markovchainList S4 classes (Chambers 2008) are defined within the
markovchain package as displayed:
The first class has been designed to handle homogeneous Markov chain processes, while
the latter (which is itself a list of markovchain objects) has been designed to handle
non-homogeneous Markov chain processes.
Any element of the markovchain class comprises the following slots:
1. states: a character vector, listing the states for which transition probabilities are
defined.
2. byrow: a logical element, indicating whether transition probabilities are shown by row
or by column.
3. transitionMatrix: the probabilities of the transition matrix.
4. name: optional character element to name the DTMC.
The markovchain objects can be created either in a long way, as the following code shows
The quicker way to create markovchain objects is made possible thanks to the implemented
initialize S4 method, which checks the validity of the input (e.g., that the transition matrix is
square and that its entries are probabilities summing to one along the appropriate dimension).
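As a sketch of this quicker approach, the mcWeather chain used throughout the examples below can be built directly from its transition matrix (which appears in the show output later in this section); the exact layout of the call is an assumption consistent with the slots listed above:

R> weatherStates <- c("sunny", "cloudy", "rain")
R> weatherMatrix <- matrix(data = c(0.70, 0.20, 0.10,
R+                                  0.30, 0.40, 0.30,
R+                                  0.20, 0.45, 0.35),
R+                         byrow = TRUE, nrow = 3,
R+                         dimnames = list(weatherStates, weatherStates))
R> mcWeather <- new("markovchain", states = weatherStates,
R+                  byrow = TRUE, transitionMatrix = weatherMatrix,
R+                  name = "Weather")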
Method    Purpose
*         Direct multiplication for transition matrices.
[         Direct access to the elements of the transition matrix.
==        Equality operator between two transition matrices.
as        Operator to convert markovchain objects into data.frame and table objects.
dim       Dimension of the transition matrix.
names     Equal to states.
names<-   Change the states name.
name      Get the name of the markovchain object.
name<-    Change the name of the markovchain object.
plot      plot method for markovchain objects.
print     print method for markovchain objects.
show      show method for markovchain objects.
sort      sort method for markovchain objects.
states    Name of the transition states.
t         Transposition operator (which switches the byrow slot value and modifies the transition matrix coherently).
R> round(after7Days, 3)
A similar answer could have been obtained by defining the vector of probabilities as a column
vector. A column-defined probability matrix could be set up either by creating a new matrix
or by transposing an existing markovchain object thanks to the t method.
[,1]
sunny 0.390
cloudy 0.355
rain 0.255
R> round(after7Days, 3)
[,1]
sunny 0.462
cloudy 0.319
rain 0.219
The initial state vector previously shown need not necessarily be a probability vector, as the
code that follows shows:
R> fvals <- function(mchain, initialstate, n) {
R+   out <- data.frame()
R+   names(initialstate) <- names(mchain)
R+   for (i in 0:n) {
R+     iteration <- initialstate * mchain^(i)   # state vector after i steps
R+     out <- rbind(out, iteration)
R+   }
R+   out <- cbind(out, i = seq(0, n))
R+   out <- out[, c(4, 1:3)]                    # put the step counter first
R+   return(out)
R+ }
R> fvals(mchain = mcWeather, initialstate = c(90, 5, 5), n = 4)
Basic methods have been defined for markovchain objects to quickly get states and transition
matrix dimension.
R> states(mcWeather)
R> names(mcWeather)
R> dim(mcWeather)
[1] 3
Methods are available to set and get the name of a markovchain object.
R> name(mcWeather)
[1] "Weather"
R> markovchain:::sort(mcWeather)
New Name
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.40 0.30 0.3
rain 0.45 0.35 0.2
sunny 0.20 0.10 0.7
[1] 0.3
R> mcWeather[2,3]
[1] 0.3
The transition matrix of a markovchain object can be displayed using the print or show methods
(the latter being less verbose). Similarly, the underlying transition probability diagram can
be plotted by means of the plot method (as shown in Figure 1), which is based on the igraph
package (Csardi and Nepusz 2006). The plot method for markovchain objects is a wrapper of
plot.igraph for igraph S4 objects defined within the igraph package. Additional parameters
can be passed to the plot function to control the network graph layout. The diagram
and DiagrammeR packages are also available for plotting, as shown in Figure 2. The plot function also uses the
communicatingClasses function to separate out states of different communicating classes.
All states that belong to one class have the same colour.
R> print(mcWeather)
       sunny cloudy rain
sunny    0.7   0.20 0.10
cloudy   0.3   0.40 0.30
rain     0.2   0.45 0.35
R> show(mcWeather)
New Name
A 3 - dimensional discrete Markov Chain defined by the following states:
sunny, cloudy, rain
The transition matrix (by rows) is defined as follows:
sunny cloudy rain
sunny 0.7 0.20 0.10
cloudy 0.3 0.40 0.30
rain 0.2 0.45 0.35
If one would like to use the MmgraphR package (Adamopoulou 2018) to plot the transition
matrix, the following code shows how to do so:
R> library("MmgraphR")
[Figure: MmgraphR transition plot of mcWeather, showing the states between times t and t + 1.]
Import and export from some specific classes is possible, as shown in Figure 3 and in the
following code.
t0 t1 prob
1 sunny sunny 0.70
2 sunny cloudy 0.20
3 sunny rain 0.10
4 cloudy sunny 0.30
5 cloudy cloudy 0.40
6 cloudy rain 0.30
7 rain sunny 0.20
8 rain cloudy 0.45
9 rain rain 0.35
R> require(msm)
R> library(etm)
R> data(sir.cont)
R> sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
R> for (i in 2:nrow(sir.cont)) {
R+ if (sir.cont$id[i]==sir.cont$id[i-1]) {
R+ if (sir.cont$time[i]==sir.cont$time[i-1]) {
R+ sir.cont$time[i-1] <- sir.cont$time[i-1] - 0.5
R+ }
R+ }
R+ }
R> tra <- matrix(ncol=3,nrow=3,FALSE)
R> tra[1, 2:3] <- TRUE
R> tra[2, c(1, 3)] <- TRUE
R> tr.prob <- etm(sir.cont, c("0", "1", "2"), tra, "cens", 1)
R> tr.prob
Possible transitions:
from to
0 1
0 2
1 0
1 2
The coerce from matrix method, as the code below shows, represents another approach to create
a markovchain object starting from a given square probability matrix.

[Figure 3: classes from/to which a markovchain object can be coerced: data.frame, igraph, sparseMatrix and table.]
Non-homogeneous Markov chains can be created with the aid of the markovchainList object.
The example that follows arises from health insurance, where the costs associated with patients
in a Continuous Care Health Community (CCHC) are modelled by a non-homogeneous Markov
chain, since the transition probabilities change by year. Methods explicitly written for
markovchainList objects are: print, show, dim and [.
Markovchain 2
state t1
A 3 - dimensional discrete Markov Chain defined by the following states:
H, I, D
The transition matrix (by rows) is defined as follows:
H I D
H 0.5 0.3 0.2
I 0.0 0.4 0.6
D 0.0 0.0 1.0
Markovchain 3
state t2
A 3 - dimensional discrete Markov Chain defined by the following states:
H, I, D
The transition matrix (by rows) is defined as follows:
H I D
H 0.3 0.2 0.5
I 0.0 0.2 0.8
D 0.0 0.0 1.0
Markovchain 4
state t3
A 3 - dimensional discrete Markov Chain defined by the following states:
H, I, D
The transition matrix (by rows) is defined as follows:
H I D
H 0 0 1
I 0 0 1
D 0 0 1
R> mcCCRC[[1]]
state t0
A 3 - dimensional discrete Markov Chain defined by the following states:
H, I, D
The transition matrix (by rows) is defined as follows:
H I D
H 0.7 0.2 0.1
I 0.1 0.6 0.3
D 0.0 0.0 1.0
R> dim(mcCCRC)
[1] 4
The markovchain package contains some data found in the literature related to DTMC models
(see Section 6). Table 2 lists datasets and tables included within the current release of the
package.
Dataset          Description
blanden          Mobility across income quartiles, Jo Blanden and Machin (2005).
craigsendi       CD4 cells, B. A. Craig and A. A. Sendi (2002).
kullback         Raw transition matrices for testing homogeneity, Kullback et al. (1962).
preproglucacon   Preproglucacon DNA bases, P. J. Avery and D. A. Henderson (1999).
rain             Alofi Island rains, P. J. Avery and D. A. Henderson (1999).
holson           Individual state trajectories.
sales            Sales of six beverages in Hong Kong, Ching et al. (2008).
Finally, Table 3 lists the demos included in the demo directory of the package.
The markovchain package provides methods to perform probabilistic analysis of DTMC, such as
identifying absorbing and transient states. Many of these methods come from MATLAB
listings that have been ported into R. For a full description of the underlying theory and
algorithms the interested reader can consult the original MATLAB listings, Feres (2007) and
Montgomery (2009).
Table 4 shows methods that can be applied on markovchain objects to perform probabilistic
analysis.
Method                 Returns
absorbingStates        the absorbing states of the transition matrix, if any.
steadyStates           the vector(s) of steady state(s) in matrix form.
meanFirstPassageTime   matrix or vector of mean first passage times.
committorAB            committor probabilities.
communicatingClasses   list of communicating classes.
firstPassage           the probability density of the first passage time to state sj, given actual state si.
canonicForm            the transition matrix into canonic form.
is.accessible          checks whether a state j is reachable from state i.
is.irreducible         checks whether a DTMC is irreducible.
period                 the period of an irreducible DTMC.
recurrentClasses       list of recurrent classes.
summary                DTMC summary.
transientStates        the transient states of the transition matrix, if any.
A steady state vector $\pi$ satisfies

$$0 \leq \pi_j \leq 1, \qquad \sum_{j \in S} \pi_j = 1, \qquad \pi \cdot P = \pi. \quad (12)$$
Steady states are associated with eigenvalues of $P$ equal to one; therefore the steady state vector
can be identified accordingly.
Numerical issues (negative values) can arise when the Markov chain contains several closed classes.
If negative values are found in the initial solution, the above-described algorithm is performed
on the submatrices of $P$ corresponding to the recurrent classes. Another vignette in the package
focuses on this issue.
The result is returned in matrix form.
R> steadyStates(mcWeather)
It is possible for a Markov chain to have more than one stationary distribution, as the gambler's
ruin example shows.
R> steadyStates(mcGR4)
0 1 2 3 4
[1,] 0 0 0 0 1
[2,] 1 0 0 0 0
R> absorbingStates(mcGR4)
R> absorbingStates(mcWeather)
character(0)
The key function used within Feres (2007) (and markovchain's derived functions) is .commclassKernel,
which is shown below.
R+ a <- d
R+ }
R+ T[i,] <- b
R+ i <- i+1 }
R+ F <- t(T)
R+ C <- (T > 0)&(F > 0)
R+ v <- (apply(t(C) == t(T), 2, sum) == m)
R+ colnames(C) <- stateNames
R+ rownames(C) <- stateNames
R+ names(v) <- stateNames
R+ out <- list(C = C, v = v)
R+ return(out)
R+ }
The .commclassKernel function takes a transition matrix of dimension n and returns a list of
two items:
1. C, an adjacency matrix showing for each state sj (in the row) which states lie in the
same communicating class as sj (flagged TRUE).
2. v, a logical vector indicating whether the state sj is transient (FALSE) or not (TRUE).
These functions are used by two other internal functions on which the summary method for
markovchain objects works.
The example matrix used in Feres (2007) well exemplifies the purpose of the function.
$C
a b c d e f g h i j
a TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
b FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
c TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
d FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
e FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
f FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
g FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
h FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
i FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
j FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
$v
a b c d e f g h i j
TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE FALSE
R> summary(probMc)
All states that pertain to a transient class are named “transient” and a specific method has
been written to elicit them.
R> transientStates(probMc)
Listings from Feres (2007) have been adapted into the canonicForm method, which turns a Markov
chain into its canonical form.
Probability MC
A 10 - dimensional discrete Markov Chain defined by the following states:
a, b, c, d, e, f, g, h, i, j
The transition matrix (by rows) is defined as follows:
a b c d e f g h i
a 0.5 0.0000000 0.50 0.0000000 0.0000000 0 0.0000000 0.00 0.0000000
R> probMcCanonic
Probability MC
A 10 - dimensional discrete Markov Chain defined by the following states:
a, c, b, g, i, f, d, e, h, j
The transition matrix (by rows) is defined as follows:
a c b g i f d e h
a 0.5 0.50 0.0000000 0.0000000 0.0000000 0 0.0000000 0.0000000 0.00
c 1.0 0.00 0.0000000 0.0000000 0.0000000 0 0.0000000 0.0000000 0.00
b 0.0 0.00 0.3333333 0.6666667 0.0000000 0 0.0000000 0.0000000 0.00
g 0.0 0.00 0.0000000 0.2500000 0.7500000 0 0.0000000 0.0000000 0.00
i 0.0 0.00 1.0000000 0.0000000 0.0000000 0 0.0000000 0.0000000 0.00
f 0.0 0.00 0.0000000 0.0000000 0.0000000 1 0.0000000 0.0000000 0.00
d 0.0 0.00 0.0000000 0.0000000 0.0000000 0 0.0000000 1.0000000 0.00
e 0.0 0.00 0.0000000 0.0000000 0.3333333 0 0.3333333 0.3333333 0.00
h 0.0 0.25 0.0000000 0.0000000 0.0000000 0 0.2500000 0.0000000 0.25
j 0.0 0.00 0.3333333 0.0000000 0.0000000 0 0.0000000 0.3333333 0.00
j
a 0.0000000
c 0.0000000
b 0.0000000
g 0.0000000
i 0.0000000
f 0.0000000
d 0.0000000
e 0.0000000
h 0.2500000
j 0.3333333
The function is.accessible permits one to investigate whether a state sj is accessible from state
si, that is, whether the probability of eventually reaching sj starting from si is greater than zero.
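A minimal sketch on the probMc chain analysed above; the chosen from/to states are assumptions consistent with the two outputs that follow:

R> is.accessible(object = probMc, from = "a", to = "c")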
[1] TRUE
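And similarly for a state in the closed class {b, g, i} (again an assumed example):

R> is.accessible(object = probMc, from = "g", to = "c")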
[1] FALSE
In Section 2.2 we observed that, if a DTMC is irreducible, all its states share the same
periodicity. Then, the period function returns the periodicity of the DTMC, provided that
it is irreducible. The example that follows shows how to find if a DTMC is reducible or
irreducible by means of the function is.irreducible and, in the latter case, the method
period is used to compute the periodicity of the chain.
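A sketch of the irreducibility check (mcE is the example chain, whose definition is not reproduced here):

R> is.irreducible(mcE)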
[1] TRUE
R> period(mcE)
[1] 2
The example Markov chain found on the Mathematica website (Wolfram Research 2013a) has
been used, and is plotted in Figure 4.
R> require(matlab)
R> mathematicaMatr <- zeros(5)
R> mathematicaMatr[1,] <- c(0, 1/3, 0, 2/3, 0)
R> mathematicaMatr[2,] <- c(1/2, 0, 0, 0, 1/2)
R> mathematicaMatr[3,] <- c(0, 0, 1/2, 1/2, 0)
R> mathematicaMatr[4,] <- c(0, 0, 1/2, 1/2, 0)
R> mathematicaMatr[5,] <- c(0, 0, 0, 0, 1)
R> statesNames <- letters[1:5]
R> mathematicaMc <- new("markovchain", transitionMatrix = mathematicaMatr,
R+ name = "Mathematica MC", states = statesNames)
[Figure 4: transition diagram of the Mathematica MC, with recurrent classes {c, d} and {e} and transient states a and b.]
We conclude that the probability for the first rainy day to be the third one, given that the
current state is sunny, is given by:
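This can be sketched with the package's firstPassage function (the object name firstPassagePdF is an assumption):

R> firstPassagePdF <- firstPassage(object = mcWeather, state = "sunny", n = 10)
R> firstPassagePdF[3, "rain"]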
[1] 0.121
To compute the mean first passage times, i.e. the expected number of days before it rains
given that today is sunny, we can use the meanFirstPassageTime function:
R> meanFirstPassageTime(mcWeather)
indicating, e.g., that the average number of days of sun or cloud before rain is 6.67 if we start
counting from a sunny day, and 5 if we start from a cloudy day. Note that we can also specify
one or more destination states:
R> meanFirstPassageTime(mcWeather,"rain")
sunny cloudy
6.666667 5.000000
The implementation follows the matrix solutions of Grinstead and Snell (2006). We can check
the result by averaging the first passage probability density function:
[1] 6.666664
The committorAB method computes the committor probabilities introduced in Section 2:

R> committorAB(mcWeather,3,1)
5. Statistical analysis
Table 5 lists the functions and methods implemented within the package which help to fit,
simulate and predict DTMC.
Function Purpose
markovchainFit Function to return fitted Markov chain for a given sequence.
predict Method to calculate predictions from markovchain or
markovchainList objects.
rmarkovchain Function to sample from markovchain or markovchainList objects.
5.1. Simulation
Simulating a random sequence from an underlying DTMC is quite easy thanks to the function
rmarkovchain. The following code generates a year of weather states according to the
underlying stochastic process of mcWeather.
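rmarkovchain also works on markovchainList objects, returning an iteration column alongside the simulated values; a sketch of such a call on the mcCCRC list defined earlier (the number of iterations and the output format are assumptions):

R> rmarkovchain(n = 2, object = mcCCRC, what = "data.frame")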
iteration values
1 1 H
2 1 I
3 1 D
4 1 D
5 1 D
6 2 H
7 2 H
8 2 H
9 2 H
10 2 D
Two advanced parameters are available in the rmarkovchain method which help decide
which implementation to use. There are four options available: R, R in parallel, C++
and C++ in parallel. Two boolean parameters, useRcpp and parallel, decide which
implementation will be used. The default is useRcpp = TRUE and parallel = FALSE, i.e., the C++
implementation. The C++ implementation is generally faster than the R implementation. If
you have a multicore processor you can take advantage of the parallel parameter by setting
it to TRUE. When both useRcpp = TRUE and parallel = TRUE, the parallelization is carried
out using the RcppParallel package (Allaire, Francois, Ushey, Vandenbrouck, Geelnard, and Intel
2016).
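A sketch of the two variants (the t0 argument and the sample size are assumptions):

R> weathersOfDays <- rmarkovchain(n = 365, object = mcWeather, t0 = "sunny")
R> # C++ implementation run in parallel on multicore machines
R> weathersOfDaysPar <- rmarkovchain(n = 365, object = mcWeather, t0 = "sunny",
R+                                   useRcpp = TRUE, parallel = TRUE)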
5.2. Estimation
A time homogeneous Markov chain can be fit from given data. Four methods have been
implemented within the current version of the markovchain package: maximum likelihood, maximum
likelihood with Laplace smoothing, bootstrap, and maximum a posteriori.
Equation 13 shows the maximum likelihood estimator (MLE) of the $p_{ij}$ entry, where $n_{ij}$
is the number of sequences $(X_t = s_i, X_{t+1} = s_j)$ found in the sample, that is

$$\hat{p}_{ij}^{MLE} = \frac{n_{ij}}{\sum_{u=1}^{k} n_{iu}}, \quad (13)$$

while Equation 14 shows the standard error of the estimate:

$$SE_{ij} = \frac{\hat{p}_{ij}^{MLE}}{\sqrt{n_{ij}}}. \quad (14)$$
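A sketch of the fitting call producing the estimate below (the data vector weathersOfDays is the simulated weather sequence used throughout this section):

R> weatherFittedMLE <- markovchainFit(data = weathersOfDays, method = "mle",
R+                                    name = "Weather MLE")
R> weatherFittedMLE$estimate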
Weather MLE
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.3478261 0.3304348 0.3217391
rain 0.5125000 0.3125000 0.1750000
sunny 0.1952663 0.1065089 0.6982249
R> weatherFittedMLE$standardError
The Laplace smoothing approach is a variation of the MLE, where $n_{ij}$ is substituted by
$n_{ij} + \alpha$ (see Equation 15), $\alpha$ being an arbitrary positive stabilizing parameter:

$$\hat{p}_{ij}^{LS} = \frac{n_{ij} + \alpha}{\sum_{u=1}^{k} (n_{iu} + \alpha)}. \quad (15)$$
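A sketch of the corresponding call (the laplacian value is an assumption):

R> weatherFittedLAPLACE <- markovchainFit(data = weathersOfDays,
R+                                        method = "laplace", laplacian = 0.01,
R+                                        name = "Weather LAPLACE")
R> weatherFittedLAPLACE$estimate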
Weather LAPLACE
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.3478223 0.3304355 0.3217422
rain 0.5124328 0.3125078 0.1750594
sunny 0.1952908 0.1065491 0.6981601
(NOTE: the confidence interval option is enabled by default; remove this option to speed up
computations.) Both the MLE and Laplace approaches are based on the createSequenceMatrix
function, which returns the raw counts transition matrix.
stringchar could contain NA values, and the transitions containing NA would be ignored.
An issue occurs when the sample contains only one realization of a state (say $X_\beta$) which is
located at the end of the data sequence, since it yields a row of zeros (no sample to estimate
the conditional distribution of the transitions). In this case the estimated transition matrix is
corrected assuming $p_{\beta,j} = 1/k$, $k$ being the number of possible states.
createSequenceMatrix can also be used to obtain raw count transition matrices from a given
$n \times 2$ matrix, as the following example shows:
R> myMatr<-matrix(c("a","b","b","a","a","b","b","b","b","a","a","a","b","a"),ncol=2)
R> createSequenceMatrix(stringchar = myMatr,toRowProbs = TRUE)
a b
a 0.6666667 0.3333333
b 0.5000000 0.5000000
A bootstrap estimation approach has been developed within the package in order to provide
an indication of the variability of the $\hat{p}_{ij}$ estimates. The bootstrap approach implemented within
the markovchain package follows these steps:
1. bootstrap the data sequences following the conditional distributions of states estimated
from the original one. The default number of bootstrap samples is 10, as specified in the nboot
parameter of the markovchainFit function.
2. apply MLE estimation on the bootstrapped data sequences, which are saved in the bootStrapSamples
slot of the returned list.
3. the $\hat{p}_{ij}^{BOOTSTRAP}$ is the average of all $\hat{p}_{ij}^{MLE}$ across the bootStrapSamples list, normalized
by row. A standardError of the bootstrap estimate is provided as well.
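A sketch of the call producing the estimate below (the nboot value is an assumption; results vary across bootstrap runs):

R> weatherFittedBOOT <- markovchainFit(data = weathersOfDays,
R+                                     method = "bootstrap", nboot = 20)
R> weatherFittedBOOT$estimate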
BootStrap Estimate
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.3473521 0.3351076 0.3175402
rain 0.5174037 0.3084494 0.1741469
sunny 0.1891193 0.1095270 0.7013537
R> weatherFittedBOOT$standardError
The bootstrapping process can be performed in parallel thanks to the RcppParallel package (Allaire
et al. 2016). The parallelized implementation is definitely suggested when the data sample size
or the required number of bootstrap runs is high.
The parallel bootstrapping uses all the available cores on a machine by default. However, it is
also possible to tune the number of threads used. Note that this should be done in R before
calling the markovchainFit function. For example, the following code sets the number of
threads to 2.
R> RcppParallel::setNumThreads(2)
The log-likelihood of the fitted chain, $\sum_{i,j} n_{ij} \log(p_{ij})$, is returned as well, where $n_{ij}$ is the entry of the frequency
matrix and $p_{ij}$ is the entry of the transition probability matrix.
R> weatherFittedMLE$logLikelihood
[1] -343.7674
R> weatherFittedBOOT$logLikelihood
[1] -343.8035
Confidence matrices of the estimated parameters (parametric for MLE, non-parametric for
bootstrap) are available as well. The confidenceInterval is provided with two matrices:
lowerEndpointMatrix and upperEndpointMatrix. The confidence level (CL) is 0.95 by
default and can be given as an argument of the markovchainFit function; it is used to
obtain the standard score (z-score). Equations 17 and 18 (Skuriat-Olechnowska 2005) show
the confidenceInterval of a fitting. Note that each entry of the matrices is bounded between
0 and 1.
R> weatherFittedMLE$confidenceInterval
NULL
R> weatherFittedBOOT$confidenceInterval
$confidenceLevel
[1] 0.95
$lowerEndpointMatrix
cloudy rain sunny
cloudy 0.3290363 0.3148453 0.2993677
rain 0.4980677 0.2867110 0.1575858
sunny 0.1793644 0.1023107 0.6898798
$upperEndpointMatrix
cloudy rain sunny
cloudy 0.3656680 0.3553700 0.3357127
rain 0.5367397 0.3301878 0.1907080
sunny 0.1988742 0.1167433 0.7128276
R> multinomialConfidenceIntervals(transitionMatrix =
R+ weatherFittedMLE$estimate@transitionMatrix,
R+ countsTransitionMatrix = createSequenceMatrix(weathersOfDays))
$confidenceLevel
[1] 0.95
$lowerEndpointMatrix
cloudy rain sunny
cloudy 0.2521739 0.23478261 0.2260870
rain 0.4125000 0.21250000 0.0750000
sunny 0.1301775 0.04142012 0.6331361
$upperEndpointMatrix
cloudy rain sunny
cloudy 0.4491348 0.4317435 0.4230479
rain 0.6361202 0.4361202 0.2986202
sunny 0.2652502 0.1764928 0.7682088
The functions for fitting DTMC have mostly been rewritten in C++ using Rcpp (Eddelbuettel
2013) since version 0.2.
It is also possible to fit a DTMC object from matrix or data.frame objects as shown in
following code.
R> data(holson)
R> singleMc<-markovchainFit(data=holson[,2:12],name="holson")
R> mcListFit<-markovchainListFit(data=holson[,2:6],name="holson")
R> mcListFit$estimate
Markovchain 2
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
1, 2, 3
The transition matrix (by rows) is defined as follows:
1 2 3
1 0.9323410 0.0676590 0.0000000
2 0.2551724 0.5103448 0.2344828
3 0.0000000 0.0862069 0.9137931
Markovchain 3
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
1, 2, 3
The transition matrix (by rows) is defined as follows:
1 2 3
1 0.94765840 0.04820937 0.004132231
2 0.26119403 0.66417910 0.074626866
3 0.01428571 0.13571429 0.850000000
Markovchain 4
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
1, 2, 3
The transition matrix (by rows) is defined as follows:
1 2 3
1 0.9172414 0.07724138 0.005517241
2 0.1678322 0.60839161 0.223776224
3 0.0000000 0.03030303 0.969696970
Finally, given a list object, it is possible to fit a markovchain object or to obtain the raw
transition matrix.
R> c1<-c("a","b","a","a","c","c","a")
R> c2<-c("b")
R> c3<-c("c","a","a","c")
R> c4<-c("b","a","b","a","a","c","b")
R> c5<-c("a","a","c",NA)
R> c6<-c("b","c","b","c","a")
R> mylist<-list(c1,c2,c3,c4,c5,c6)
R> mylistMc<-markovchainFit(data=mylist)
R> mylistMc
$estimate
MLE Fit
A 3 - dimensional discrete Markov Chain defined by the following states:
a, b, c
The transition matrix (by rows) is defined as follows:
a b c
a 0.4 0.2000000 0.4000000
b 0.6 0.0000000 0.4000000
c 0.5 0.3333333 0.1666667
$standardError
a b c
a 0.2000000 0.1414214 0.2000000
b 0.3464102 0.0000000 0.2828427
c 0.2886751 0.2357023 0.1666667
$confidenceLevel
[1] 0.95
$lowerEndpointMatrix
a b c
a 0.07102927 0 0.07102927
b 0.03020599 0 0.00000000
c 0.02517166 0 0.00000000
$upperEndpointMatrix
a b c
a 0.7289707 0.4326174 0.7289707
b 1.0000000 0.0000000 0.8652349
c 0.9748283 0.7210291 0.4408089
R> markovchainListFit(data=mylist)
$estimate
list of Markov chain(s)
Markovchain 1
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
a, b, c
Markovchain 2
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
a, b, c
The transition matrix (by rows) is defined as follows:
a b c
a 0.3333333 0.3333333 0.3333333
b 1.0000000 0.0000000 0.0000000
c 0.0000000 1.0000000 0.0000000
Markovchain 3
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
a, b, c
The transition matrix (by rows) is defined as follows:
a b c
a 0.5000000 0.0000000 0.5000000
b 0.5000000 0.0000000 0.5000000
c 0.3333333 0.3333333 0.3333333
Markovchain 4
Unnamed Markov chain
A 2 - dimensional discrete Markov Chain defined by the following states:
a, c
The transition matrix (by rows) is defined as follows:
a c
a 0.5 0.5
c 1.0 0.0
Markovchain 5
Unnamed Markov chain
A 2 - dimensional discrete Markov Chain defined by the following states:
a, c
The transition matrix (by rows) is defined as follows:
a c
a 0 1
c 0 1
Markovchain 6
Unnamed Markov chain
A 3 - dimensional discrete Markov Chain defined by the following states:
a, b, c
The transition matrix (by rows) is defined as follows:
a b c
a 0.3333333 0.3333333 0.3333333
b 0.3333333 0.3333333 0.3333333
c 0.5000000 0.5000000 0.0000000
If any transition contains NA, it will be ignored in the results, as the above example shows.
5.3. Prediction
The n-step forward predictions can be obtained using the predict methods explicitly written
for markovchain and markovchainList objects. The prediction is the mode of the conditional
distribution of $X_{t+1}$ given $X_t = s_j$, $s_j$ being the last realization of the DTMC (homogeneous
or non-homogeneous).
The prediction stops at the end of the time sequence, since the underlying non-homogeneous Markov
chain has a length of four. In order to continue five years ahead, setting the continue=TRUE
parameter makes the predict method keep using the last markovchain in the sequence list.
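A sketch of both calls on objects used above (the newdata values and horizons are assumptions):

R> predict(object = weatherFittedMLE$estimate, newdata = c("cloudy", "sunny"),
R+         n.ahead = 3)
R> predict(object = mcCCRC, newdata = c("H", "H"), n.ahead = 5, continue = TRUE)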
The package also provides functions to test the Markov property (verifyMarkovProperty), to
assess the order (assessOrder) and the stationarity (assessStationarity) of an empirical sequence,
and the divergence test for empirically estimated transition matrices (divergenceTest). Most
of these tests are based on the $\chi^2$ statistic. Relevant references are Kullback et al. (1962)
and Anderson and Goodman (1957).
All such tests have been designed for small samples, since it is easy to detect departures
from the Markov property as the sample size increases. In addition, the accuracy of
the statistical inference functions has been questioned and will be thoroughly investigated in
future versions of the package.
R> sample_sequence<-c("a", "b", "a", "a", "a", "a", "b", "a", "b", "a",
R+ "b", "a", "a", "b", "b", "b", "a")
R> verifyMarkovProperty(sample_sequence)
Table 6: Contingency table to assess the order for the present state a.
Using the table, the function performs the χ2 test by calling the chisq.test function. This
test returns a list of the chi-squared value and the p-value. If the p-value is greater than the
given significance level, we cannot reject the hypothesis that the sequence is of first order.
R> data(rain)
R> assessOrder(rain$rain)
For each possible state, we construct a contingency table of the estimated transition proba-
bilities over time as shown in Table 7.
Using the table, the function performs the χ2 test by calling the chisq.test function. This
test returns a list of the chi-squared value and the p-value. If the p-value is greater than the
given significance level, we cannot reject the hypothesis that the sequence is stationary.
R> sequence<-c(0,1,2,2,1,0,0,0,0,0,0,1,2,2,2,1,0,0,1,0,0,0,0,0,0,1,1,
R+ 2,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,2,1,0,
R+ 0,2,1,0,0,0,0,0,0,1,1,1,2,2,0,0,2,1,1,1,1,2,1,1,1,1,1,1,1,1,1,0,2,
R+ 0,1,1,0,0,0,1,2,2,0,0,0,0,0,0,2,2,2,1,1,1,1,0,1,1,1,1,0,0,2,1,1,
R+ 0,0,0,0,0,2,2,1,1,1,1,1,2,1,2,0,0,0,1,2,2,2,0,0,0,1,1)
R> mc=matrix(c(5/8,1/4,1/8,1/4,1/2,1/4,1/4,3/8,3/8),byrow=TRUE, nrow=3)
R> rownames(mc)<-colnames(mc)<-0:2; theoreticalMc<-as(mc, "markovchain")
R> verifyEmpiricalToTheoretical(data=sequence,object=theoreticalMc)
$statistic
0
6.551795
$dof
[1] 6
$pvalue
0
0.3642899
R> data(kullback)
R> verifyHomogeneity(inputList=kullback,verbose=TRUE)
$statistic
[1] 275.9963
$dof
[1] 35
$pvalue
[1] 0
Intro
The markovchain package provides functionality for continuous time Markov chains (CTMCs).
CTMCs are a generalisation of discrete time Markov chains (DTMCs) in that we allow time
to be continuous. We assume a finite state space $S$ (an infinite state space wouldn't fit in
memory). We can think of CTMCs as Markov chains in which state transitions can happen
at any time.
More formally, we would like our CTMCs to satisfy the following two properties:
• The Markov property: let $F_{X(s)}$ denote the information about $X$ up to time $s$. Let
$j \in S$ and $s \leq t$. Then, $P(X(t) = j \,|\, F_{X(s)}) = P(X(t) = j \,|\, X(s))$.
• Time homogeneity: $P(X(t) = j \,|\, X(s) = k) = P(X(t - s) = j \,|\, X(0) = k)$.
Stationary Distributions
The following theorem guarantees the existence of a unique stationary distribution for CTMCs.
Note that $X(t)$ being irreducible and recurrent is the same as its embedded chain $X_n$ being
irreducible and recurrent.
Suppose that $X(t)$ is irreducible and recurrent. Then $X(t)$ has an invariant measure $\eta$, which
is unique up to multiplicative factors. Moreover, for each $k \in S$, we have

$$\eta_k = \frac{\pi_k}{\lambda(k)}$$

where $\pi$ is the unique invariant measure of the embedded discrete time Markov chain $X_n$.
Finally, $\eta$ satisfies

$$0 < \eta_j < \infty, \quad \forall j \in S,$$

and if $\sum_i \eta_i < \infty$ then $\eta$ can be normalised to get a stationary distribution.
Estimation
Let the data set be D = {(s0 , t0 ), (s1 , t1 ), ..., (sN −1 , tN −1 )} where N = |D|. Each si is a state
from the state space S and during the time [ti , ti+1 ] the chain is in state si . Let the parameters
be represented by θ = {λ, P } where λ is the vector of holding parameters for each state and
P the transition matrix of the embedded discrete time Markov chain.
Then the probability is given by
$$Pr(D|\theta) \propto \lambda(s_0) e^{-\lambda(s_0)(t_1 - t_0)} Pr(s_1|s_0) \cdot \ldots \cdot \lambda(s_{N-2}) e^{-\lambda(s_{N-2})(t_{N-1} - t_{N-2})} Pr(s_{N-1}|s_{N-2})$$
Let $n(j|i)$ denote the number of $i \to j$ transitions in $D$, and $n(i)$ the number of times $s_i$ occurs
in $D$. Let $t(s_i)$ denote the total time the chain spends in state $s_i$.
Then the MLEs are given by

$$\hat{\lambda}(s) = \frac{n(s)}{t(s)}, \qquad \hat{Pr}(j|i) = \frac{n(j|i)}{n(i)}.$$
Probability at time t
The package provides a function probabilityatT to calculate the probability of every state
according to a given ctmc object. Here we use Kolmogorov's backward equation $P(t) = P(0)e^{tQ}$
for $t \geq 0$ and $P(0) = I$, where $P(t)$ is the transition function at time $t$. The value $P(t)[i][j]$
describes the probability that the state at time $t$ equals $j$, given that it was equal
to $i$ at time $t = 0$. The function takes care of the case when the ctmc object has a generator represented
by columns. If the initial state is not provided, the function returns the whole transition matrix
$P(t)$.
Examples
To create a CTMC object, you need to provide a valid generator matrix, say Q. The CTMC
object has the following slots - states, generator, byrow, name (look at the documentation
object for further details). Consider the following example in which we aim to model the
transition of a molecule from the σ state to the σ ∗ state. When in the former state, if it
absorbs sufficient energy, it can make the jump to the latter state and remains there for some
time before transitioning back to the original state. Let us model this by a CTMC:
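A sketch of such a model (the generator rates 3 and 1 are assumptions, chosen to be consistent with the steady state shown further below):

R> energyStates <- c("sigma", "sigma_star")
R> byRow <- TRUE
R> gen <- matrix(data = c(-3, 3,
R+                        1, -1),
R+               nrow = 2, byrow = byRow,
R+               dimnames = list(energyStates, energyStates))
R> molecularCTMC <- new("ctmc", states = energyStates, byrow = byRow,
R+                      generator = gen, name = "Molecular Transition Model")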
To generate random CTMC transitions, we provide an initial distribution of the states. This
must be in the same order as the dimnames of the generator. The output can be returned
either as a list or a data frame.
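A sketch of the call producing the data frame below (the initial distribution values are assumptions):

R> statesDist <- c(0.8, 0.2)
R> rctmc(n = 3, ctmc = molecularCTMC, initDist = statesDist,
R+       out.type = "df", include.T0 = FALSE)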
states time
1 sigma_star 0.319699317028095
2 sigma 1.08757360349614
3 sigma_star 1.22777987434733
n represents the number of samples to generate. There is an optional argument T for rctmc:
it represents the time at which the simulation terminates. To use this feature, set n to a very
high value, say Inf (since we do not know the number of transitions beforehand), and set T
accordingly.
[[1]]
[1] "sigma" "sigma_star" "sigma" "sigma_star" "sigma"
[6] "sigma_star" "sigma" "sigma_star"
[[2]]
[1] 0.0000000 0.3245419 0.6881124 1.0515503 1.4466669 1.4591308 1.5751001
[8] 1.8251498
R> steadyStates(molecularCTMC)
sigma sigma_star
[1,] 0.25 0.75
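The probabilityatT function described earlier can be sketched on this chain (the time point t = 1 is an assumption):

R> probabilityatT(molecularCTMC, 1)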
For fitting, use the ctmcFit function. It returns the MLE values for the parameters along
with the confidence intervals.
R> data <- list(c("a", "b", "c", "a", "b", "a", "c", "b", "c"),
R+ c(0, 0.8, 2.1, 2.4, 4, 5, 5.9, 8.2, 9))
R> ctmcFit(data)
$estimate
An object of class "ctmc"
Slot "states":
[1] "a" "b" "c"
Slot "byrow":
[1] TRUE
Slot "generator":
a b c
a -0.9090909 0.6060606 0.3030303
b 0.3225806 -0.9677419 0.6451613
c 0.3846154 0.3846154 -0.7692308
Slot "name":
[1] ""
$errors
$errors$dtmcConfidenceInterval
$errors$dtmcConfidenceInterval$confidenceLevel
[1] 0.95
$errors$dtmcConfidenceInterval$lowerEndpointMatrix
a b c
a 0 0 0
b 0 0 0
c 0 0 0
$errors$dtmcConfidenceInterval$upperEndpointMatrix
a b c
a 0.0000000 1 0.8816179
b 0.8816179 0 1.0000000
c 1.0000000 1 0.0000000
$errors$lambdaConfidenceInterval
$errors$lambdaConfidenceInterval$lowerEndpointVector
[1] 0.04576665 0.04871934 0.00000000
$errors$lambdaConfidenceInterval$upperEndpointVector
[1] 1 1 1
One approach to obtain the generator matrix is to apply the logm function from the expm
package to a transition matrix. Numerical issues may arise; see Israel, Rosenthal, and Wei (2001).
For example, applying the standard method ('Higham08') on mcWeather raises an error, whilst
the alternative method (eigenvalue decomposition) works. The following code estimates the
generator matrix of the mcWeather transition matrix.
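A sketch of that estimation (the object name mcWeatherQ is an assumption):

R> require(expm)
R> mcWeatherQ <- expm::logm(mcWeather@transitionMatrix, method = "Eigen")
R> mcWeatherQ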
The ctmcd package (Pfeuffer 2017) provides various functions to estimate the generator matrix
(GM) of a CTMC process using different methods. The following code provides a way to join
markovchain and ctmcd computations.
R> require(ctmcd)
R> require(expm)
The following code returns the pseudo-Bayesian estimate of the transition matrix:
s1 s2
s1 0.17758007 -0.17758007
s2 -0.06885246 0.06885246
R+ apriori = aprioriMc@transitionMatrix
R+ ) - trueMc@transitionMatrix
s1 s2
s1 -0.02298889 0.02298889
s2 -0.04711818 0.04711818
s1 s2
s1 -0.01138052 0.01138052
s2 -0.01003458 0.01003458
$$D = s_0 s_1 \ldots s_{N-1}, \quad s_t \in A$$

$$\theta = \{p(s|u) : s \in A, u \in A\}$$

where $\sum_{s \in A} p(s|u) = 1$ for each $u \in A$.
Our objective is to find $\theta$ which maximises the posterior. That is, if our solution is denoted
by $\hat{\theta}$, then

$$\hat{\theta} = \underset{\theta}{\operatorname{argmax}}\ P(\theta|D)$$

where the search space is the set of right stochastic matrices of dimension $|A| \times |A|$.
$n(u, s)$ denotes the number of times the word $us$ occurs in $D$, and $n(u) = \sum_{s \in A} n(u, s)$. The
hyperparameters are similarly denoted by $\alpha(u, s)$ and $\alpha(u)$ respectively.
Methods
Given $D$, its likelihood conditioned on the observed initial state in $D$ is given by

$$P(D|\theta) = \prod_{u \in A} \prod_{s \in A} p(s|u)^{n(u,s)}$$
Conjugate priors are used to model the prior $P(\theta)$. The reasons are twofold:
1. Exact expressions can be derived for the MAP estimates, expectations and even variances.
2. Model order selection/comparison can be implemented easily (available in a future release
of the package).
The hyperparameters determine the form of the prior distribution, which is a product of
Dirichlet distributions:

$$P(\theta) = \prod_{u \in A} \left\{ \frac{\Gamma(\alpha(u))}{\prod_{s \in A} \Gamma(\alpha(u, s))} \prod_{s \in A} p(s|u)^{\alpha(u,s)-1} \right\}$$

where $\Gamma(\cdot)$ is the Gamma function. The hyperparameters are specified using the hyperparam
argument of the markovchainFit function. If this argument is not specified, then a default
value of 1 is assigned to each hyperparameter, resulting in the prior distribution of each chain
parameter being uniform over $[0, 1]$.
Given the likelihood and the prior as described above, the evidence $P(D)$ is simply given by

$$P(D) = \int P(D|\theta) P(\theta)\, d\theta$$

which simplifies to

$$P(D) = \prod_{u \in A} \left\{ \frac{\Gamma(\alpha(u))}{\prod_{s \in A} \Gamma(\alpha(u, s))} \cdot \frac{\prod_{s \in A} \Gamma(n(u, s) + \alpha(u, s))}{\Gamma(\alpha(u) + n(u))} \right\}.$$
Using Bayes' theorem, the posterior now becomes (thanks to the choice of conjugate priors)

$$P(\theta|D) = \prod_{u \in A} \left\{ \frac{\Gamma(n(u) + \alpha(u))}{\prod_{s \in A} \Gamma(n(u, s) + \alpha(u, s))} \prod_{s \in A} p(s|u)^{n(u,s)+\alpha(u,s)-1} \right\}.$$
The square root of this quantity is the standard error, which is returned by the function.
The confidence intervals are constructed by computing the inverse of the beta integral.
Predictive distribution
Given the old data set, the probability of observing new data is $P(D'|D)$, where $D'$ is the new
data set. Let $m(u, s)$, $m(u)$ denote the corresponding counts for the new data. Then,

$$P(D'|D) = \int P(D'|\theta) P(\theta|D)\, d\theta.$$

We already know the expressions for both quantities in the integral, and it turns out to be
similar to evaluating the evidence:

$$P(D'|D) = \prod_{u \in A} \left\{ \frac{\Gamma(\alpha(u))}{\prod_{s \in A} \Gamma(\alpha(u, s))} \cdot \frac{\prod_{s \in A} \Gamma(n(u, s) + m(u, s) + \alpha(u, s))}{\Gamma(\alpha(u) + n(u) + m(u))} \right\}.$$
holds, the function accepts as input the belief matrix as well as a scaling vector (which serves as a
proxy for $\alpha(\cdot)$) and proceeds to compute $\alpha(\cdot, \cdot)$.
Alternatively, the function accepts a data sample and infers the hyperparameters from it.
Since the mode of a parameter (with respect to the prior distribution) is proportional to one
less than the corresponding hyperparameter, we set
α(u, s) − 1 = m(u, s)
where m(u, s) is the u → s transition count in the data sample. This is regarded as a ‘fake
count’ which helps α(u, s) to reflect knowledge of the data sample.
For the purpose of this section, we shall continue to use the weather of days example intro-
duced in the main vignette of the package (reproduced above for convenience).
Let us invoke the fit function to estimate the MAP parameters with 92% confidence bounds
and hyperparameters as shown below, based on the first 200 days of the weather data. Ad-
ditionally, let us find out what the probability is of observing the weather data for the next
165 days. The usage would be as follows
R> hyperMatrix<-matrix(c(1, 1, 2,
R+ 3, 2, 1,
R+ 2, 2, 3),
R+ nrow = 3, byrow = TRUE,
R+ dimnames = list(weatherStates,weatherStates))
R> markovchainFit(weathersOfDays[1:200], method = "map",
R+ confidencelevel = 0.92, hyperparam = hyperMatrix)
$estimate
Bayesian Fit
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.4126984 0.2698413 0.3174603
rain 0.4634146 0.3902439 0.1463415
sunny 0.1553398 0.0776699 0.7669903
$expectedValue
cloudy rain sunny
cloudy 0.4090909 0.27272727 0.3181818
rain 0.4545455 0.38636364 0.1590909
sunny 0.1603774 0.08490566 0.7547170
$standardError
[,1] [,2] [,3]
$confidenceInterval
$confidenceInterval$confidenceLevel
[1] 0.92
$confidenceInterval$lowerEndpointMatrix
[,1] [,2] [,3]
[1,] 0.32117517 0.1801276 0.2266730
[2,] 0.35036388 0.2810536 0.0000000
[3,] 0.08492395 0.0000000 0.6946682
$confidenceInterval$upperEndpointMatrix
[,1] [,2] [,3]
[1,] 0.5508869 0.3696812 0.4276070
[2,] 1.0000000 0.5612194 0.2404126
[3,] 0.2143174 0.1249744 1.0000000
$logLikelihood
[1] -170.3136
R> predictiveDistribution(weathersOfDays[1:200],
R+ weathersOfDays[201:365],hyperparam = hyperMatrix)
[1] -163.8482
The results should not change after permuting the dimensions of the matrix.
$estimate
Bayesian Fit
A 3 - dimensional discrete Markov Chain defined by the following states:
cloudy, rain, sunny
The transition matrix (by rows) is defined as follows:
cloudy rain sunny
cloudy 0.4126984 0.2698413 0.3174603
rain 0.4634146 0.3902439 0.1463415
sunny 0.1553398 0.0776699 0.7669903
$expectedValue
cloudy rain sunny
cloudy 0.4090909 0.27272727 0.3181818
rain 0.4545455 0.38636364 0.1590909
sunny 0.1603774 0.08490566 0.7547170
$standardError
[,1] [,2] [,3]
[1,] 0.06006657 0.05440960 0.05690292
[2,] 0.07422696 0.07258509 0.05452441
[3,] 0.03547494 0.02694693 0.04159431
$confidenceInterval
$confidenceInterval$confidenceLevel
[1] 0.92
$confidenceInterval$lowerEndpointMatrix
[,1] [,2] [,3]
[1,] 0.32117517 0.1801276 0.2266730
[2,] 0.35036388 0.2810536 0.0000000
[3,] 0.08492395 0.0000000 0.6946682
$confidenceInterval$upperEndpointMatrix
[,1] [,2] [,3]
[1,] 0.5508869 0.3696812 0.4276070
[2,] 1.0000000 0.5612194 0.2404126
[3,] 0.2143174 0.1249744 1.0000000
$logLikelihood
[1] -170.3136
R> predictiveDistribution(weathersOfDays[1:200],
R+ weathersOfDays[201:365],hyperparam = hyperMatrix2)
[1] -163.8482
Note that the predictive probability is very small. However, this can be useful when comparing
model orders. Suppose we have an idea of the (prior) transition matrix corresponding to the
expected value of the parameters, and have a data set from which we want to deduce the
MAP estimates. We can infer the hyperparameters from this known transition matrix itself,
and use this to obtain our MAP estimates.
$scaledInference
cloudy rain sunny
cloudy 4 3 3
rain 4 4 2
sunny 2 1 7
$dataInference
cloudy sunny
cloudy 3 4
sunny 3 8
Now we can safely use hyperMatrix3 and hyperMatrix4 with markovchainFit (in the
hyperparam argument).
If we do not provide any hyperparameters, the prior is uniform, and the MAP estimate coincides
with the maximum likelihood one.
R> data(preproglucacon)
R> preproglucacon <- preproglucacon[[2]]
R> MLEest <- markovchainFit(preproglucacon, method = "mle")
R> MAPest <- markovchainFit(preproglucacon, method = "map")
R> MLEest$estimate
MLE Fit
A 4 - dimensional discrete Markov Chain defined by the following states:
A, C, G, T
The transition matrix (by rows) is defined as follows:
A C G T
A 0.3585271 0.1434109 0.16666667 0.3313953
C 0.3840304 0.1558935 0.02281369 0.4372624
G 0.3053097 0.1991150 0.15044248 0.3451327
T 0.2844523 0.1819788 0.17667845 0.3568905
R> MAPest$estimate
Bayesian Fit
A 4 - dimensional discrete Markov Chain defined by the following states:
A, C, G, T
The transition matrix (by rows) is defined as follows:
A C G T
A 0.3585271 0.1434109 0.16666667 0.3313953
C 0.3840304 0.1558935 0.02281369 0.4372624
G 0.3053097 0.1991150 0.15044248 0.3451327
T 0.2844523 0.1819788 0.17667845 0.3568905
6. Applications
This section shows applications of DTMC in various fields.
Land of Oz
The Land of Oz is acknowledged not to have ideal weather conditions at all: the weather is
snowy or rainy very often and there are never two nice days in a row. Consider
three weather states: rainy, nice and snowy. Let the transition matrix be as in the following:
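A sketch of the matrix as given in Grinstead and Snell (2006) (the object name mcOz is an assumption):

R> mcOz <- new("markovchain", states = c("rainy", "nice", "snowy"),
R+             transitionMatrix = matrix(c(0.50, 0.25, 0.25,
R+                                         0.50, 0.00, 0.50,
R+                                         0.25, 0.25, 0.50),
R+                                       byrow = TRUE, nrow = 3),
R+             name = "Oz")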
Given that today it is a nice day, the corresponding stochastic row vector is w0 = (0 , 1 , 0)
and the forecast after 1, 2 and 3 days are given by
As can be seen from w1, if today is a nice day in the Land of Oz, tomorrow it will rain or
snow with probability 1. One week later, the prediction can be computed as
The steady state of the chain can be computed by means of the steadyStates method.
Note that, from the seventh day on, the predicted probabilities are substantially equal to the
steady state of the chain and they do not depend on the starting point, as the following code
shows.
0 1-5 6+
548 295 253
Alofi MC
A 3 - dimensional discrete Markov Chain defined by the following states:
0, 1-5, 6+
The transition matrix (by rows) is defined as follows:
0 1-5 6+
0 0.6605839 0.2299270 0.1094891
1-5 0.4625850 0.3061224 0.2312925
6+ 0.1976285 0.3122530 0.4901186
The long term daily rainfall distribution is obtained by means of the steadyStates method.
R> steadyStates(mcAlofi)
0 1-5 6+
[1,] 0.5008871 0.2693656 0.2297473
Finance
Credit ratings transitions have been successfully modelled with discrete time Markov chains.
Some rating agencies publish transition matrices that show the empirical transition probabi-
lities across credit ratings. The example that follows comes from CreditMetrics R package
(Wittmann 2007), carrying Standard & Poor’s published data.
R> rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
R> creditMatrix <- matrix(
R+ c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01,
R+ 0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01,
R+ 0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06,
R+ 0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18,
R+ 0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06,
R+ 0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20,
R+ 0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79,
R+ 0, 0, 0, 0, 0, 0, 0, 100
R+ )/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE)
It is easy to convert such matrices into markovchain objects and to perform some analyses:
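A sketch of the conversion and a first analysis (the object name creditMc is an assumption):

R> creditMc <- new("markovchain", transitionMatrix = creditMatrix,
R+                 name = "S&P Matrix")
R> absorbingStates(creditMc)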
[1] "D"
Economics
For a recent application of markovchain in Economics, see Jacob (2014).
A dynamic system generates two kinds of economic effects (Bard 2000):

$$c_i = c_i^{S} + \sum_{j=1}^{m} C_{ij}^{R}\, p_{ij}. \quad (22)$$

Let $\bar{c} = [c_i]$ and let $e_i$ be the vector valued 1 in the initial state and 0 in all others. Then, if $f_n$ is
the random variable representing the economic return associated with the stochastic process
at time $n$, Equation (23) holds:
The following example assumes that a telephone company models the transition probabilities
between customer/non-customer status by the matrix $P$ and the costs associated with the states by the
matrix $M$.
If the average revenue for an existing customer is +100, the cost per state is computed as follows.
For an existing customer, the expected gain (loss) at the fifth year is given by the following
code.
[1] 48.96009
Assuming that the a priori claim frequency per car-year is 0.05 in the class (the class being
the group of policyholders sharing the same characteristics), the underlying BM
transition matrix and its steady state are as follows.
If the underlying BM coefficients of the class are 0.5, 0.7, 0.9, 1.0, 1.25, this means that the
average BM coefficient applied on the long run to the class is given by
[1] 0.534469
This means that the average premium paid by policyholders in the portfolio almost halves in
the long run.
The data shows the probability of transition between the states of (A)ctive, (I)ll and (D)ead.
It is easy to complete the transition matrix.
R> ltcDemo<-transform(ltcDemo,
R+ pIA=0,
R+ pII=1-pID,
R+ pDD=1,
R+ pDA=0,
R+ pDI=0)
Now we build a function that returns the transition matrix for the (t + 1)-th year, assuming that
the subject has attained year t.
R> possibleStates<-c("A","I","D")
R> getMc4Age<-function(age) {
R+ transitionsAtAge<-ltcDemo[ltcDemo$age==age,]
R+
R+ myTransMatr<-matrix(0, nrow=3,ncol = 3,
R+ dimnames = list(possibleStates, possibleStates))
R+ myTransMatr[1,1]<-transitionsAtAge$pAA[1]
R+ myTransMatr[1,2]<-transitionsAtAge$pAI[1]
R+ myTransMatr[1,3]<-transitionsAtAge$pAD[1]
R+ myTransMatr[2,2]<-transitionsAtAge$pII[1]
R+ myTransMatr[2,3]<-transitionsAtAge$pID[1]
R+ myTransMatr[3,3]<-1
R+
R+ myMc<-new("markovchain", transitionMatrix = myTransMatr,
R+ states = possibleStates,
R+ name = paste("Age",age,"transition matrix"))
R+
R+ return(myMc)
R+
R+ }
Since transitions are not homogeneous across ages, we use a markovchainList object to
describe the transition probabilities for a guy starting at age 100.
R> getFullTransitionTable<-function(age){
R+ ageSequence<-seq(from=age, to=120)
R+ k=1
R+ myList=list()
R+ for ( i in ageSequence) {
R+ mc_age_i<-getMc4Age(age = i)
R+ myList[[k]]<-mc_age_i
R+ k=k+1
R+ }
R+ # assumed completion: wrap the per-age chains into a markovchainList
R+ return(new("markovchainList", markovchains = myList,
R+            name = paste("TransitionsSinceAge", age)))
R+ }
We can use such transition for simulating ten life trajectories for a guy that begins “active”
(A) aged 100:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] "A" "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[2,] "A" "A" "A" "A" "A" "A" "A" "A" "D" "D" "D" "D" "D"
[3,] "A" "A" "A" "A" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[4,] "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[5,] "A" "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[6,] "A" "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[7,] "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[8,] "A" "A" "A" "D" "D" "D" "D" "D" "D" "D" "D" "D" "D"
[9,] "A" "A" "A" "A" "A" "D" "D" "D" "D" "D" "D" "D" "D"
[10,] "A" "A" "A" "A" "A" "D" "D" "D" "D" "D" "D" "D" "D"
[,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
[1,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[2,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[3,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[4,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[5,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[6,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[7,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[8,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[9,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
[10,] "D" "D" "D" "D" "D" "D" "D" "D" "D"
Let's consider 1000 simulated life trajectories for a healthy guy aged 80. We can compute the
expected time spent disabled for a guy starting active at age 80.
R> transitionsSince80<-getFullTransitionTable(age=80)
R> lifeTrajectories<-rmarkovchain(n=1e3, object=transitionsSince80,
R+ what="matrix",t0="A",include.t0=TRUE)
R> temp<-matrix(0,nrow=nrow(lifeTrajectories),ncol = ncol(lifeTrajectories))
R> temp[lifeTrajectories=="I"]<-1
R> expected_period_disabled<-mean(rowSums((temp)))
R> expected_period_disabled
[1] 1.156
Assuming that the health insurance will pay a benefit of 12000 per year disabled and that
the real interest rate is 0.02, we can compute the lump sum premium at 80.
[1] 11426.91
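A sketch of how such a present value could be computed from the simulated trajectories (the discounting convention is an assumption, so the resulting figure may differ slightly from the one above):

R> benefit <- 12000                                   # annual disability benefit
R> v <- 1 / (1 + 0.02)                                # annual discount factor
R> discounts <- v ^ (0:(ncol(lifeTrajectories) - 1))  # one factor per projection year
R> premium <- mean((temp %*% discounts) * benefit)    # temp flags disabled years
R> premium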
6.4. Sociology
Markov chains have been actively used to model progressions and regressions between social
classes. The first study was performed by Glass and Hall (1954), while a more recent application
can be found in Jo Blanden and Machin (2005). The table that follows shows the income
quartile of the father when the son was 16 (in 1984) and the income quartile of the son when
aged 30 (in 2000) for the 1970 cohort.
R> data("blanden")
R> mobilityMc <- as(blanden, "markovchain")
R> mobilityMc
R> round(steadyStates(mobilityMc), 2)
Genetics
P. J. Avery and D. A. Henderson (1999) discusses the use of Markov chains in modelling the
preproglucacon gene's protein bases sequence. The preproglucacon dataset shipped with
markovchain contains this bases sequence.
[Figure: transition diagram of the 1970 mobility Markov chain between the Bottom, 2nd, 3rd and Top income quartiles.]
It is possible to model the transition probabilities between bases as shown in the following
code.
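A sketch of the fit producing the chain below (the column name preproglucacon and the object name mcProtein are assumptions):

R> data(preproglucacon, package = "markovchain")
R> mcProtein <- markovchainFit(preproglucacon$preproglucacon,
R+                             name = "Preproglucacon MC")$estimate
R> mcProtein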
Preproglucacon MC
A 4 - dimensional discrete Markov Chain defined by the following states:
A, C, G, T
The transition matrix (by rows) is defined as follows:
A C G T
A 0.3585271 0.1434109 0.16666667 0.3313953
C 0.3840304 0.1558935 0.02281369 0.4372624
G 0.3053097 0.1991150 0.15044248 0.3451327
T 0.2844523 0.1819788 0.17667845 0.3568905
Medicine
Discrete-time Markov chains are also employed to study the progression of chronic diseases.
The following example is taken from B. A. Craig and A. A. Sendi (2002). Starting from six
month follow-up data, the maximum likelihood estimation of the monthly transition matrix is
obtained. This transition matrix aims to describe the monthly progression of CD4-cell counts
of HIV infected subjects.
for row/column sums to be equal to one is valid up to the fifth decimal. Similarly, when extracting
the eigenvectors, only the real part is taken.
Such limitations are expected to be overcome in future releases. Similarly, future versions of
the package are expected to improve the code in terms of numerical accuracy and speed.
An initial rewriting of internal functions in C++ by means of the Rcpp package (Eddelbuettel
2013) has been started.
8. Acknowledgments
The package was selected for Google Summer of Code 2015 support. The authors wish to
thank Michael Cole, Tobi Gutman and Mildenberger Thoralf for their suggestions and bug
checks. A final thanks also to Dr. Simona C. Minotti and Dr. Mirko Signorelli for their
support in drafting this version of the vignettes.
References
Adamopoulou P (2018). MmgraphR: Graphing for Markov, Hidden Markov, and Mixture
Transition Distribution Models. R package version 0.3-1, URL https://CRAN.R-project.
org/package=MmgraphR.
Anderson TW, Goodman LA (1957). “Statistical inference about Markov chains.” The Annals
of Mathematical Statistics, pp. 89–110.
Brémaud P (1999). “Discrete-Time Markov Models.” In Markov Chains, pp. 53–93. Springer.
Chambers J (2008). Software for Data Analysis: Programming with R. Statistics and Computing.
Springer-Verlag. ISBN 9780387759357.
Ching WK, Ng MK, Fung ES (2008). “Higher-order multivariate Markov chains and their
applications.” Linear Algebra and its Applications, 428(2), 492–507.
Csardi G, Nepusz T (2006). “The igraph Software Package for Complex Network Research.”
InterJournal, Complex Systems, 1695. URL http://igraph.sf.net.
de Wreede LC, Fiocco M, Putter H (2011). “mstate: An R Package for the Analysis of
Competing Risks and Multi-State Models.” Journal of Statistical Software, 38(7), 1–30.
URL http://www.jstatsoft.org/v38/i07/.
Dobrow RP (2016). Introduction to Stochastic Processes with R. John Wiley & Sons.
Eddelbuettel D (2013). Seamless R and C++ Integration with Rcpp. Springer-Verlag, New
York. ISBN 978-1-4614-6867-7.
Feres R (2007). "Notes for Math 450 MATLAB Listings for Markov Chains." URL http://www.math.wustl.edu/~feres/Math450Lect04.pdf.
Geyer CJ, Johnson LT (2013). mcmc: Markov Chain Monte Carlo. R package version 0.9-2,
URL http://CRAN.R-project.org/package=mcmc.
Grinstead CM, Snell LJ (2006). Grinstead and Snell's Introduction to Probability. Version
dated 4 July 2006. American Mathematical Society. URL http://math.dartmouth.edu/~prob/prob/prob.pdf.
Hu YT, Kiesel R, Perraudin W (2002). “The estimation of transition matrices for sovereign
credit ratings.” Journal of Banking and Finance, 26(7), 1383–1406. ISSN 03784266. doi:
10.1016/S0378-4266(02)00268-6.
Israel RB, Rosenthal JS, Wei JZ (2001). “Finding generators for Markov chains via empirical
transition matrices, with applications to credit ratings.” Mathematical finance, 11(2), 245–
265.
Jackson CH (2011). “Multi-State Models for Panel Data: The msm Package for R.” Journal
of Statistical Software, 38(8), 1–29. URL http://www.jstatsoft.org/v38/i08/.
Jo Blanden PG, Machin S (2005). “Intergenerational Mobility in Europe and North America.”
Technical report, Center for Economic Performances. URL http://cep.lse.ac.uk/about/
news/IntergenerationalMobility.pdf.
Kullback S, Kupperman M, Ku H (1962). "Tests for Contingency Tables and Markov Chains."
Technometrics, 4(4), 573–608.
P J Avery, D A Henderson (1999). “Fitting Markov Chain Models to Discrete State Series.”
Applied Statistics, 48(1), 53–61.
R Core Team (2013). R: A Language and Environment for Statistical Computing. R Founda-
tion for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Roebuck P (2011). matlab: MATLAB emulation package. R package version 0.8.9, URL
http://CRAN.R-project.org/package=matlab.
Sison CP, Glaz J (1995). “Simultaneous confidence intervals and sample size determination
for multinomial proportions.” Journal of the American Statistical Association, 90(429),
366–369.
Skuriat-Olechnowska M (2005). Statistical inference and hypothesis testing for Markov chains
with Interval Censoring. diploma thesis, Delft University of Technology.
Spedicato GA (2017). "Discrete Time Markov Chains with R." The R Journal. URL https://journal.r-project.org/archive/2017/RJ-2017-036/index.html.
Wittmann A (2007). CreditMetrics: Functions for Calculating the CreditMetrics Risk Model.
R package version 0.0-2.
Affiliation:
Giorgio Alfredo Spedicato
Ph.D C.Stat ACAS, UnipolSai R&D
Via Firenze 11 Paderno Dugnano 20037 Italy
E-mail: spedygiorgio@gmail.com
URL: www.statisticaladvisor.com
Deepak Yadav
B-Tech student, Computer Science and Engineering
Indian Institute of Technology, Varanasi Uttar Pradesh - 221 005, India
E-mail: deepakyadav.iitbhu@gmail.com