
MATH 437 / MATH 535: Applied Stochastic Processes / Advanced Applied Stochastic Processes


Lecture 27
April 24, 2014
In this lecture and the next, we will study statistical inference for estimating the parameters of a stochastic process. Specifically, given past data, we will learn how to infer the transition probabilities of a Markov chain.

Estimating Parameters of a Model

Broadly, there are two types of problems that arise when estimating the parameters of a model from the available information.

1.1 Forward Problem

These problems arise when we know the parameter(s) of the model and we are interested in the evolution of the process over time. A pictorial representation of such problems is given in Figure 1.

Figure 1: Forward estimation problem


For instance, consider the following differential equation for population growth:

dP/dt = rP        (1)

If we know the value of the growth rate r and the initial population P(0), then we can easily observe the evolution of the population over time; the solution is P(t) = P(0) e^{rt}.
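
As a concrete illustration, the forward problem can be simulated directly in MATLAB. The following is a minimal sketch; the growth rate r = 0.1, the initial population P(0) = 100, and the time grid are assumed values chosen only for illustration.

    % Forward problem: the parameter is known, the trajectory is computed.
    r  = 0.1;                % assumed growth rate
    P0 = 100;                % assumed initial population
    t  = 0:0.5:10;           % time grid
    P  = P0*exp(r*t);        % exact solution of dP/dt = rP
    plot(t, P); xlabel('t'); ylabel('P(t)');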

1.2 Inverse Problem

In these problems, we observe the state of the underlying process and use the observations to infer the parameter(s) of that process.

Figure 2: Inverse estimation problem


For instance, in model (1), suppose we know the population at the times t_n, i.e. P(t_n) for n = 0, 1, ..., N. Using the inverse-problem methodology, one can infer the parameter r of the model. Usually this is done by finding the best fit of the model, i.e. the value of r that minimizes the norm

min_r ‖ P(0) e^{r t_n} − P(t_n) ‖²

In other words, we minimize the sum of squared errors between the observed data and the model's prediction:

min_r ∑_{n=0}^N ( P(0) e^{r t_n} − P(t_n) )²
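
As a sketch of how this least-squares fit might be carried out in MATLAB (the observation times and population values below are made-up placeholders, not data from the lecture):

    % Inverse problem: the trajectory is observed, the parameter r is estimated.
    t_obs = 0:5;                                      % assumed observation times
    P_obs = [100 111 123 135 149 165];                % assumed observed populations
    P0    = P_obs(1);
    sse   = @(r) sum((P0*exp(r*t_obs) - P_obs).^2);   % sum of squared errors
    r_hat = fminbnd(sse, 0, 1)                        % minimize over r in [0, 1]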

Examples
1. Let X ~ Bernoulli(p). Given knowledge of the parameter p, we can easily sample X in MATLAB using the following procedure.

    U = rand;          % U ~ Unif(0, 1)
    if U < p
        X = 1;
    else
        X = 0;
    end
Now suppose that we have knowledge of the outcomes of three Bernoulli trials (let the outcomes be X = 1, 0, 1) and we are interested in finding an estimate of the parameter p. We define the likelihood function as

l(θ) = Prob(X | θ)

In this example,

l(p) = Prob(X_1, X_2, X_3, ..., X_n | p)

If the observations are independent of each other, then

l(p) = Prob(X_1 | p) Prob(X_2 | p) Prob(X_3 | p) ··· Prob(X_n | p)

l(p) = ∏_{i=1}^n Prob(X_i | p)

In order to obtain the value of p which maximizes the likelihood of the data, we solve

∂l/∂p = 0

for p. This value of p is known as the maximum likelihood estimator. It is convenient, however, to work with L(θ) = log(l(θ)), as the logarithm converts products into sums and max l(θ) = max log(l(θ)). Now, in this example,
l(p) = Prob(1, 0, 1 | p)
     = Prob(1 | p) Prob(0 | p) Prob(1 | p) = p (1 − p) p
l(p) = p² (1 − p)
The maximum likelihood estimator is thus obtained as follows:

∂l/∂p = 0
2p(1 − p) + p²(−1) = 0
p = 2/3

The corresponding log-likelihood function is given as

L(p) = 2 log(p) + log(1 − p)

It can easily be verified that the maximum likelihood estimator obtained using the log-likelihood is also equal to 2/3.
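
This can also be checked numerically. Below is a quick MATLAB sanity check (not from the lecture notes), which minimizes the negative log-likelihood with fminbnd.

    % Numerical check of the Bernoulli MLE for the observed outcomes 1, 0, 1.
    x     = [1 0 1];
    negL  = @(p) -(sum(x)*log(p) + sum(1 - x)*log(1 - p));   % negative log-likelihood
    p_hat = fminbnd(negL, 1e-6, 1 - 1e-6)                    % returns approximately 2/3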
2. Let us now consider X ~ Bin(10, p), where X = 6 successes were observed; then l(p) = Prob(X = 6 | p). Using MATLAB, we obtained the following results:

    p      Prob(X = 6 | p) = C(10,6) p^6 (1 − p)^4
    0.1    0.0001
    0.2    0.0055
    0.3    0.036
    0.4    0.111
    0.5    0.205
    0.6    0.2508
    0.7    0.2001
    0.8    0.0881
    0.9    0.0112

Thus the likelihood is highest when p = 0.6. We can also verify this by differentiating the log-likelihood function L(p) = C + 6 log(p) + 4 log(1 − p), where C is a constant, and solving for p as follows:

∂L/∂p = 0 + 6/p + 4/(1 − p) · (−1) = 0
6/p = 4/(1 − p)
10p = 6
p = 0.6
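
The table above can be reproduced with a few lines of MATLAB. This is only a sketch of one way to compute the values; the exact script used in the lecture is not shown in the notes.

    % Likelihood of X = 6 successes out of 10 trials over a grid of p values.
    p = 0.1:0.1:0.9;
    L = nchoosek(10, 6) * p.^6 .* (1 - p).^4;   % Prob(X = 6 | p)
    [~, idx] = max(L);
    p_best = p(idx)                             % p = 0.6 gives the largest likelihood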
3. As another example, let us now consider X ~ N(μ, σ²). In a forward problem, if we knew μ and σ², we could generate samples from this distribution. For an inverse problem, on the other hand, we estimate the parameters μ and σ² from given population data. The likelihood function is given as

l(μ, σ²) = Prob(X_1, X_2, X_3, ..., X_n | μ, σ²)

If the observations are independent of each other, then
l(μ, σ²) = Prob(X_1 | μ, σ²) Prob(X_2 | μ, σ²) Prob(X_3 | μ, σ²) ··· Prob(X_n | μ, σ²)

l(μ, σ²) = (1/√(2πσ²)) e^{−(X_1−μ)²/(2σ²)} · (1/√(2πσ²)) e^{−(X_2−μ)²/(2σ²)} ··· (1/√(2πσ²)) e^{−(X_n−μ)²/(2σ²)}

l(μ, σ²) = (1/(2πσ²)^{n/2}) e^{−∑_{i=1}^n (X_i−μ)²/(2σ²)}
The corresponding log-likelihood function is given by

L(μ, σ²) = −(n/2) log(2π) − (n/2) log(σ²) − ∑_{i=1}^n (X_i − μ)²/(2σ²)

In order to obtain the maximum likelihood estimators, we differentiate the above log-likelihood function with respect to both μ and σ² and solve for them. Now
∂L/∂μ = 0

−∑_{i=1}^n (2(X_i − μ)/(2σ²)) · (−1) = 0

∑_{i=1}^n X_i = nμ

μ̂ = (1/n) ∑_{i=1}^n X_i

Also,

∂L/∂σ² = 0

−n/(2σ²) + ∑_{i=1}^n (X_i − μ)²/(2σ⁴) = 0

∑_{i=1}^n (X_i − μ)² = n σ²

σ̂² = (1/n) ∑_{i=1}^n (X_i − μ)²

For example, if X ~ N(μ, σ²) and the data are given as

X : {0.075, 0.851, 0.92, 0.96, 0.42, 1.22, 0.1, 0.21, 0.55, 0.78}

the best estimates of the parameters are

μ̂ = (1/n) ∑_{i=1}^n X_i = 0.009

and

σ̂² = (1/n) ∑_{i=1}^n (X_i − 0.009)² = 0.53
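
In MATLAB, these two estimators can be computed directly from a data vector. The sketch below uses a synthetic placeholder vector x, which should be replaced by the observed sample.

    % Maximum likelihood estimates for an i.i.d. normal sample stored in x.
    x          = randn(10, 1);             % placeholder data; replace with the observed sample
    n          = numel(x);
    mu_hat     = sum(x)/n;                 % MLE of mu: the sample mean
    sigma2_hat = sum((x - mu_hat).^2)/n;   % MLE of sigma^2 (note the 1/n, not 1/(n-1))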

Discrete Time Markov Chains

Let {X_n}_{n=0}^∞ be a time-homogeneous discrete time Markov chain, and define the indicator function I_A as

I_A(x) = 1 if x ∈ A, and 0 otherwise.

We also define

n_kj = ∑_{l=0}^{m−1} I(X_{l+1} = k | X_l = j),   n_.j = ∑_{k=1}^m n_kj,   n_k. = ∑_{j=1}^m n_kj

where n_kj counts the number of transitions from state j to state k in m time steps.
The maximum likelihood estimator for p_kj is now given as

p̂_kj = n_kj / ∑_{k=1}^m n_kj = n_kj / n_.j        (2)
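
A small MATLAB sketch of this estimator is given below; the state sequence x is a made-up placeholder, with states encoded as 1, ..., S.

    % Estimate transition probabilities from an observed state sequence x(1), ..., x(m+1).
    x = [1 1 2 1 2 2 1 1 2 1];                  % placeholder observations; states labelled 1..S
    S = max(x);
    n = zeros(S, S);                            % n(k, j) counts transitions from j to k
    for l = 1:numel(x) - 1
        n(x(l+1), x(l)) = n(x(l+1), x(l)) + 1;
    end
    p_hat = n ./ repmat(sum(n, 1), S, 1)        % column j divided by n_.j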

Let us consider the rainfall model introduced at the beginning of the study of DTMCs. Let 0 and 1 denote, respectively, no rain and rain on any given day. Suppose that the following transition counts were observed over a course of 368 consecutive days (columns give the current state j, rows the next state k).

            From 0   From 1   Total
    To 0       175       48     223
    To 1        49       96     145
    Total      224      144     368

Using eq. (2), the following estimates for p_kj are obtained.

p̂_00 = n_00 / ∑_{k=0}^1 n_k0 = n_00 / (n_00 + n_10) = 175/224 = 25/32

Similarly,

p̂_10 = n_10 / ∑_{k=0}^1 n_k0 = 49/224 = 7/32

p̂_01 = n_01 / ∑_{k=0}^1 n_k1 = 48/144 = 1/3

p̂_11 = n_11 / ∑_{k=0}^1 n_k1 = 96/144 = 2/3

The estimated transition matrix is therefore given as

P̂ = [ 25/32   1/3
       7/32   2/3 ]

Also,

P̂^20 = [ 0.6038   0.6038
          0.3962   0.3962 ]

so each column of P̂^20 is approximately (0.6038, 0.3962)^T; after 20 steps the chain is close to its long-run distribution regardless of the starting state.
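
These matrix computations are easy to verify in MATLAB; a minimal sketch, using the column-stochastic convention from above:

    % Rainfall model: estimated transition matrix (columns = current state) and its 20-step power.
    P   = [175/224  48/144;     % row 1: transitions into state 0 (no rain)
           49/224   96/144];    % row 2: transitions into state 1 (rain)
    P20 = P^20                  % both columns approach the long-run distribution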

Now we will show that the likelihood function corresponding to p_kj satisfies

l(p_kj) = Prob(n_kj | p_kj) ∝ ∏_{k,j} p_kj^{n_kj}

Let us define p_ji = Prob(X_{n+1} = j | X_n = i) and let {a_1, a_2, a_3, ..., a_{n+1}} be the observed sequence in a run of a DTMC. Now the probability of occurrence of this particular sequence is P_{a_1} P_{a_2 a_1} P_{a_3 a_2} ··· P_{a_{n+1} a_n}. Also let n_ji be the number of transitions from i to j, i.e. the number of steps with a_{m+1} = j when a_m = i. Then

P_{a_1} P_{a_2 a_1} P_{a_3 a_2} ··· P_{a_{n+1} a_n} = P_{a_1} ∏_{i,j} P_ji^{n_ji}

Thus

l(p_kj) ∝ ∏_{k,j} p_kj^{n_kj}        (3)

The log-likelihood function corresponding to eq. (3) is now

L(p_kj) = C + ∑_{j=1}^m ∑_{k=1}^m n_kj log(p_kj)

As ∑_{k=1}^m p_kj = 1, we have

L(p_kj) = C + ∑_{j=1}^m ∑_{k=1}^{m−1} n_kj log(p_kj) + ∑_{j=1}^m n_mj log( 1 − ∑_{k=1}^{m−1} p_kj )

In the homework, you will show that the maximum likelihood estimator of p_kj is indeed

p̂_kj = n_kj / ∑_k n_kj

End of Lecture
