Expectation Maximization (EM)
• Objectives:
Jensen’s Inequality (Special Case)
EM Theorem Proof
EM Example – Missing Data
Application: Hidden Markov Models
• Resources:
Wiki: EM History
T.D.: Brown CS Tutorial
UIUC: Tutorial
F.J.: Statistical Methods
The Expectation Maximization Algorithm (Preview)
Proof (a special case of Jensen's inequality: for any two distributions p(x) and q(x), the Kullback–Leibler divergence \(\sum_x p(x)\log[p(x)/q(x)]\) is nonnegative):
\[
\sum_x p(x)\log p(x) \;-\; \sum_x p(x)\log q(x) \;\ge\; 0
\]
\[
\sum_x p(x)\log\frac{p(x)}{q(x)} \;\ge\; 0
\qquad\Longleftrightarrow\qquad
\sum_x p(x)\log\frac{q(x)}{p(x)} \;\le\; 0
\]
\[
\sum_x p(x)\log\frac{q(x)}{p(x)}
\;\le\; \sum_x p(x)\left(\frac{q(x)}{p(x)} - 1\right)
\;=\; \sum_x q(x) \;-\; \sum_x p(x) \;=\; 1 - 1 \;=\; 0
\]
The last step follows from a standard bound on the natural logarithm: \(\ln x \le x - 1\) for all \(x > 0\). We note that since both \(p(x)\) and \(q(x)\) are probability distributions, each must sum to 1.0, so the final difference vanishes and the inequality holds.
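As a quick sanity check, here is a minimal numerical sketch of both facts; the distributions are arbitrary illustrative values, and NumPy is assumed:

```python
import numpy as np

# Two arbitrary discrete distributions over the same support
# (values chosen for illustration only).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.5, 0.3])

# KL divergence: sum_x p(x) log(p(x)/q(x)) -- nonnegative by the proof above.
kl = np.sum(p * np.log(p / q))
print(f"KL(p || q) = {kl:.4f}")  # >= 0 for any pair of distributions

# The bound used in the last step: ln x <= x - 1 for all x > 0.
x = np.linspace(0.1, 5.0, 50)
assert np.all(np.log(x) <= x - 1 + 1e-12)
```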
The general form of Jensen’s inequality relates a convex function of an integral
to the integral of the convex function and is used extensively in information
theory.
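For reference, one standard statement is: for a convex function \(\varphi\) and a random variable \(X\),
\[
\varphi\big(\mathbb{E}[X]\big) \;\le\; \mathbb{E}\big[\varphi(X)\big],
\]
with the inequality reversed for concave functions such as \(\log\). Applying the concave form to \(X = q(x)/p(x)\) under \(p\), where \(\mathbb{E}_p[q/p] = 1\), recovers the special case proved above.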
Explanation: What exactly have we shown? If the last quantity is greater than
zero, then the new model will be better than the old model. This suggests a
strategy for finding the new parameters, θ: choose them to make the last
quantity positive!
• Many other reestimation algorithms have been derived using this approach.
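To make this strategy concrete, below is a minimal sketch of one such reestimation loop: EM for a two-component Gaussian mixture. The synthetic data, component count, initialization, and iteration count are illustrative assumptions, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data drawn from two Gaussians (for illustration only).
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

# Initial parameter guesses (assumptions; EM finds only a local optimum).
w = np.array([0.5, 0.5])     # mixture weights
mu = np.array([-1.0, 1.0])   # component means
var = np.array([1.0, 1.0])   # component variances

def gaussian(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: posterior probability (responsibility) of each component per point.
    lik = w * gaussian(x[:, None], mu, var)        # shape (N, 2)
    gamma = lik / lik.sum(axis=1, keepdims=True)

    # M-step: reestimate parameters; each iteration cannot decrease the
    # data likelihood -- the guarantee established by the EM theorem.
    nk = gamma.sum(axis=0)
    w = nk / len(x)
    mu = (gamma * x[:, None]).sum(axis=0) / nk
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", w, "means:", mu, "variances:", var)
```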
Example: Estimating Missing Data