MATH 437 / MATH 535: Applied Stochastic Processes / Advanced Applied Stochastic Processes
Broadly speaking, there are two different types of problems relating the parameters of a model to the available information.
1.1 Forward Problem
These problems arise when we know the parameter(s) involved in the model and we are interested in the evolution of the process over time. A pictorial representation of such problems is given in Figure 1.
1.2 Inverse Problem
In these problems we observe the state of the underlying process and infer the parameters of that process from the observations. For example, for an exponential growth model $P(t) = P(0)e^{rt}$, we estimate the rate $r$ by minimizing the sum of squared errors between the observed data and the model's prediction:
$$\min_{r} \sum_{n=0}^{N} \bigl( P(0)e^{r t_n} - P(t_n) \bigr)^2$$
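A minimal sketch of this minimization in MATLAB, assuming the observation times t, the data Pobs, and the initial value P0 are known (the values below are synthetic, chosen only for illustration):

    % Fit the growth rate r by least squares (inverse problem)
    t    = 0:9;                                    % observation times (hypothetical)
    P0   = 100;                                    % known initial value
    Pobs = P0 * exp(0.3 * t) + randn(size(t));     % synthetic noisy observations
    sse  = @(r) sum((P0 * exp(r * t) - Pobs).^2);  % sum of squared errors
    rhat = fminsearch(sse, 0.1);                   % minimizer is the estimate of r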
Examples
1. Let $X \sim \text{Bernoulli}(p)$. Given knowledge of the parameter $p$, we can easily sample $X$ in MATLAB using the following procedure.
    U = rand;      % generate U ~ Uniform(0, 1)
    if U < p
        X = 1;     % success, with probability p
    else
        X = 0;     % failure, with probability 1 - p
    end
Now suppose that we know the outcomes of three Bernoulli trials (say the outcomes are $X_1 = 1$, $X_2 = 0$, $X_3 = 1$) and we are interested in estimating the parameter $p$. We define the likelihood function as
$$l(\theta) = \text{Prob}(X \mid \theta).$$
In this example,
$$l(p) = \text{Prob}(X_1, X_2, X_3 \mid p) = \prod_{i=1}^{3} \text{Prob}(X_i \mid p),$$
where the product form follows from the independence of the trials. We then solve
$$\frac{\partial l}{\partial p} = 0$$
for $p$. This value of $p$ is known as the maximum likelihood estimator. It is convenient, however, to work with $L(\theta) = \log(l(\theta))$, as the logarithm converts products into sums and $\max l(\theta)$ and $\max \log l(\theta)$ are attained at the same $\theta$.
Now suppose that 10 independent Bernoulli trials yield $X = 6$ successes. To find the value of $p$ that maximizes the likelihood of the data, we evaluate
$$\text{Prob}(X = 6 \mid p) = \binom{10}{6} p^6 (1-p)^4$$
over a grid of values of $p$:

    p      Prob(X = 6 | p)
    0.1    0.0001
    0.2    0.0055
    0.3    0.036
    0.4    0.111
    0.5    0.205
    0.6    0.2508
    0.7    0.2001
    0.8    0.0881
    0.9    0.0112
Thus the likelihood is highest when $p = 0.6$. We can also verify this by differentiating the log-likelihood function $L(p) = C + 6\log(p) + 4\log(1-p)$, where $C$ is a constant: setting $L'(p) = \frac{6}{p} - \frac{4}{1-p} = 0$ gives $\hat{p} = \frac{6}{10} = 0.6$.
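As a quick numerical check, the table above can be reproduced in MATLAB (a minimal sketch matching the 10-trial, 6-success example):

    % Likelihood of 6 successes in 10 Bernoulli trials over a grid of p
    p = 0.1:0.1:0.9;
    L = nchoosek(10, 6) * p.^6 .* (1 - p).^4;
    [Lmax, idx] = max(L);
    fprintf('Grid MLE: p = %.1f, likelihood = %.4f\n', p(idx), Lmax)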
2. Let $X_1, X_2, \dots, X_n$ be independent samples from a normal distribution with mean $\mu$ and variance $\sigma^2$. The likelihood function is
$$l(\mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}} \, e^{-\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{2\sigma^2}}.$$
The corresponding log-likelihood function is given by
$$L(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{2\sigma^2}.$$
To obtain the maximum likelihood estimators, we differentiate the above log-likelihood function with respect to both $\mu$ and $\sigma^2$ and solve for them. Now
$$\frac{\partial L}{\partial \mu} = 0 \implies \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma^2} = 0 \implies \sum_{i=1}^{n} X_i = n\mu \implies \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$
Also,
$$\frac{\partial L}{\partial \sigma^2} = 0 \implies -\frac{n}{2\sigma^2} + \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{2\sigma^4} = 0 \implies \sum_{i=1}^{n} (X_i - \mu)^2 = n\sigma^2 \implies \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{\mu})^2.$$
For the observed data, these estimators give
$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i = 0.009$$
and
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - 0.009)^2 = 0.53.$$
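In MATLAB these two estimators are one-liners; the sketch below uses a synthetic sample, since the data set behind the numbers above is not reproduced here:

    % MLE for i.i.d. normal data (synthetic sample for illustration)
    n = 1000;
    X = 0.009 + sqrt(0.53) * randn(n, 1);    % hypothetical data set
    mu_hat     = sum(X) / n;                  % MLE of mu: the sample mean
    sigma2_hat = sum((X - mu_hat).^2) / n;    % MLE of sigma^2: note 1/n, not 1/(n-1)
    % equivalently: mean(X) and var(X, 1)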
Let $\{X_n\}_{n=0}^{\infty}$ be a time-homogeneous discrete-time Markov chain, and define the indicator function $I_A$ as
$$I_A(x) = \begin{cases} 1 & x \in A \\ 0 & \text{otherwise.} \end{cases}$$
We also define
$$n_{kj} = \sum_{l=0}^{m-1} I(X_{l+1} = k,\, X_l = j), \qquad n_{\cdot j} = \sum_{k=1}^{m} n_{kj}, \qquad n_{k\cdot} = \sum_{j=1}^{m} n_{kj},$$
where $n_{kj}$ counts the number of transitions from state $j$ to state $k$ in $m$ time steps.
The maximum likelihood estimator for $p_{kj}$ is now given as
$$\hat{p}_{kj} = \frac{n_{kj}}{\sum_{k=1}^{m} n_{kj}} = \frac{n_{kj}}{n_{\cdot j}}. \tag{2}$$
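A minimal MATLAB sketch of estimator (2), assuming the observed run is stored as a vector X of states labelled 1, ..., M (the sequence below is hypothetical):

    X = [1 1 2 1 2 2 2 1 1 2];         % observed DTMC run (hypothetical)
    M = 2;                              % number of states
    n = zeros(M, M);                    % n(k, j) counts transitions j -> k
    for l = 1:numel(X) - 1
        n(X(l+1), X(l)) = n(X(l+1), X(l)) + 1;
    end
    Phat = n ./ sum(n, 1);              % divide column j by n_{.j}, as in (2)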
Let us consider the rainfall model introduced at the beginning of our study of DTMCs: let 0 and 1 denote no rain and rain, respectively, on a given day. Suppose that the following observations were made over a course of 368 consecutive days.
Observed transition counts (rows: state on day $n+1$; columns: state on day $n$):

    To \ From     0      1     Total
    0           175     48      223
    1            49     96      145
    Total       224    144      368
From the table,
$$\hat{p}_{00} = \frac{n_{00}}{\sum_{k=0}^{1} n_{k0}} = \frac{n_{00}}{n_{00} + n_{10}} = \frac{175}{224} = \frac{25}{32}.$$
Similarly,
$$\hat{p}_{10} = \frac{n_{10}}{\sum_{k=0}^{1} n_{k0}} = \frac{49}{224} = \frac{7}{32}, \qquad \hat{p}_{01} = \frac{n_{01}}{\sum_{k=0}^{1} n_{k1}} = \frac{48}{144} = \frac{1}{3}, \qquad \hat{p}_{11} = \frac{n_{11}}{\sum_{k=0}^{1} n_{k1}} = \frac{96}{144} = \frac{2}{3}.$$
The estimated transition matrix is therefore
$$P = \begin{pmatrix} \frac{25}{32} & \frac{1}{3} \\ \frac{7}{32} & \frac{2}{3} \end{pmatrix}.$$
Also,
$$P^{20} = \begin{pmatrix} 0.6038 & 0.6038 \\ 0.3962 & 0.3962 \end{pmatrix},$$
i.e., both columns have converged to the stationary distribution $(0.6038,\ 0.3962)^T$.
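This long-run behaviour is easy to verify numerically:

    P = [25/32, 1/3; 7/32, 2/3];   % estimated (column-stochastic) transition matrix
    disp(P^20)                      % both columns approach (0.6038, 0.3962)'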
Now we will show that the likelihood function corresponding to $p_{kj}$ is given by
$$l(p_{kj}) = \text{Prob}(n_{kj} \mid p_{kj}) \propto \prod_{j}\prod_{k} p_{kj}^{n_{kj}}.$$
Let us define $p_{ji} = \text{Prob}(X_{n+1} = j \mid X_n = i)$ and let $\{a_1, a_2, a_3, \dots, a_{n+1}\}$ be the observed sequence in a run of a DTMC. The probability of occurrence of this particular sequence is $P_{a_1} P_{a_2 a_1} P_{a_3 a_2} \cdots P_{a_{n+1} a_n}$, where $P_{a_1}$ is the probability of the initial state. Also let $n_{ji}$ be the number of transitions from $i$ to $j$, i.e., the number of indices $m$ with $a_{m+1} = j$ and $a_m = i$. Then
$$P_{a_1} P_{a_2 a_1} P_{a_3 a_2} \cdots P_{a_{n+1} a_n} = P_{a_1} \prod_{i,j} P_{ji}^{n_{ji}}.$$
Thus
$$l(p_{kj}) \propto \prod_{j=1}^{m}\prod_{k=1}^{m} p_{kj}^{n_{kj}}, \tag{3}$$
so the log-likelihood is
$$L(p_{kj}) = C + \sum_{j=1}^{m}\sum_{k=1}^{m} n_{kj}\log(p_{kj}).$$
Since $\sum_{k=1}^{m} p_{kj} = 1$, we can eliminate $p_{mj}$ and write
$$L(p_{kj}) = C + \sum_{j=1}^{m}\sum_{k=1}^{m-1} n_{kj}\log(p_{kj}) + \sum_{j=1}^{m} n_{mj}\log\Bigl(1 - \sum_{k=1}^{m-1} p_{kj}\Bigr).$$
In the homework, you will show that the maximum likelihood estimator of $p_{kj}$ is indeed
$$\hat{p}_{kj} = \frac{n_{kj}}{\sum_{k} n_{kj}}.$$
End of Lecture