Variational Autoencoder
Mark Chang
Original Paper
•  Title:
– Auto-Encoding Variational Bayes
•  Authors:
– Diederik P. Kingma
– Max Welling
•  Organization:
– Machine Learning Group, Universiteit van Amsterdam
Outline
•  Variational Inference
•  Variational Autoencoder
•  Experiment
•  Further Research
Variational Inference
•  Problem Definition
– Observable data: $x = \{x_1, x_2, ..., x_m\}$
– Hidden variable: $z = \{z_1, z_2, ..., z_n\}$
– Posterior distribution of the hidden variable given the data:

$$p(z|x) = \frac{p(z, x)}{p(x)} = \frac{p(x|z)p(z)}{\int p(x|z)p(z)\,dz}$$

The integral in the denominator is intractable to compute.
Variational Inference
•  Solutions for the Intractable Posterior
– Monte Carlo Sampling
•  Metropolis–Hastings
•  Gibbs Sampling
– Variational Inference
Variational Inference
•  Approximate $p(z|x)$ by a simpler distribution $q(z)$
•  Minimize the KL divergence:

$$D_{KL}[q(z)\,||\,p(z|x)] = \int q(z)\log\frac{q(z)}{p(z|x)}\,dz$$
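As an aside, this integral can be checked numerically. The following minimal numpy sketch (with hypothetical Gaussian parameters) estimates $E_q[\log q(z) - \log p(z)]$ by Monte Carlo and compares it against the known Gaussian closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two univariate Gaussians: q = N(mu_q, s_q^2), p = N(mu_p, s_p^2).
mu_q, s_q = 0.5, 1.0
mu_p, s_p = 0.0, 2.0

def log_normal(z, mu, s):
    return -0.5 * np.log(2 * np.pi * s**2) - (z - mu)**2 / (2 * s**2)

# Monte Carlo estimate of the integral: E_q[log q(z) - log p(z)].
z = rng.normal(mu_q, s_q, size=100_000)
kl_mc = np.mean(log_normal(z, mu_q, s_q) - log_normal(z, mu_p, s_p))

# Closed form for two Gaussians, used here only as a check.
kl_exact = np.log(s_p / s_q) + (s_q**2 + (mu_q - mu_p)**2) / (2 * s_p**2) - 0.5
print(kl_mc, kl_exact)  # the two values should agree closely
```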
Evidence (Variational) Lower Bound

$$\begin{aligned}
D_{KL}[q(z)\,||\,p(z|x)] &= \int q(z)\log\frac{q(z)}{p(z|x)}\,dz = \int q(z)\log\frac{q(z)p(x)}{p(z, x)}\,dz \\
&= \int q(z)\log\frac{q(z)}{p(z, x)}\,dz + \int q(z)\log p(x)\,dz \\
&= \int q(z)\big(\log q(z) - \log p(z, x)\big)\,dz + \log p(x) \\
&= -\big(E_{q(z)}[\log p(z, x)] - E_{q(z)}[\log q(z)]\big) + \log p(x)
\end{aligned}$$

Evidence Lower Bound (ELBO): $L[q(z)] = E_{q(z)}[\log p(z, x)] - E_{q(z)}[\log q(z)]$
Evidence Lower Bound

$$D_{KL}[q(z)\,||\,p(z|x)] = -L[q(z)] + \log p(x)
\iff \log p(x) = D_{KL}[q(z)\,||\,p(z|x)] + L[q(z)]$$

Since $\log p(x)$ does not depend on $q$, minimizing $D_{KL}[q(z)\,||\,p(z|x)]$ is equivalent to maximizing $L[q(z)]$.
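The identity $\log p(x) = D_{KL} + L[q(z)]$ can be verified on a toy discrete model. A minimal numpy sketch, with made-up probabilities:

```python
import numpy as np

# Toy discrete model: joint p(z, x) evaluated at one observed x,
# for each of 4 hidden states z (hypothetical values).
p_joint = np.array([0.05, 0.10, 0.15, 0.20])
p_x = p_joint.sum()        # evidence p(x)
p_post = p_joint / p_x     # posterior p(z|x)

# An arbitrary approximate posterior q(z).
q = np.array([0.25, 0.25, 0.25, 0.25])

kl = np.sum(q * np.log(q / p_post))                         # D_KL[q(z)||p(z|x)]
elbo = np.sum(q * np.log(p_joint)) - np.sum(q * np.log(q))  # E_q[log p(z,x)] - E_q[log q(z)]

# The identity from the slide: log p(x) = D_KL + ELBO.
assert np.isclose(np.log(p_x), kl + elbo)
```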
Mean-Field Variational Inference
•  q can be factorized:

$$q(z) = \prod_i q(z_i|\theta_i), \qquad \forall i,\ \int q(z_i|\theta_i)\,dz_i = 1$$

•  Minimize $D_{KL}[q(z)\,||\,p(z|x)]$ over this factorized family to bring $q(z)$ as close as possible to $p(z|x)$.
http://cpmarkchang.logdown.com/posts/737247-pgm-variational-inference
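A minimal sketch of what the factorization buys computationally, assuming each factor $q(z_i|\theta_i)$ is an independent univariate Gaussian with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Mean-field q(z) = prod_i q(z_i | theta_i); here each factor is a
# univariate Gaussian with its own parameters theta_i = (mu_i, s_i).
mu = np.array([0.0, 1.0, -0.5])
s = np.array([1.0, 0.5, 2.0])

# Sampling from the factorized q is just sampling each factor independently.
z = rng.normal(mu, s)

# log q(z) decomposes into a sum over the factors.
log_q = np.sum(-0.5 * np.log(2 * np.pi * s**2) - (z - mu)**2 / (2 * s**2))
```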
Variational Autoencoder
•  Encoder network: $q_\phi(z|x)$
•  Decoder network: $p_\theta(x|z)$
•  Minimize: $D_{KL}[q_\phi(z|x)\,||\,p_\theta(z|x)]$
•  The true posterior is intractable:

$$p_\theta(z|x) = \frac{p_\theta(x|z)p_\theta(z)}{p_\theta(x)}$$
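For concreteness, a minimal PyTorch sketch of the two networks, assuming MNIST-sized inputs and MLPs as in the paper's experiments (layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """q_phi(z|x): maps x to the mean and log-variance of a Gaussian over z."""
    def __init__(self, x_dim=784, h_dim=500, z_dim=20):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """p_theta(x|z): maps z to Bernoulli parameters over the pixels of x."""
    def __init__(self, x_dim=784, h_dim=500, z_dim=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.Tanh(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, z):
        return self.net(z)
```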
Variational Autoencoder
Marginal Likelihood:

$$\log p_\theta(x) = D_{KL}[q_\phi(z|x)\,||\,p_\theta(z|x)] + L(\theta, \phi, x)$$

Variational Lower Bound:

$$\begin{aligned}
L(\theta, \phi, x) &= E_{q_\phi(z|x)}[\log p_\theta(x, z) - \log q_\phi(z|x)] \\
&= E_{q_\phi(z|x)}[\log p_\theta(z) + \log p_\theta(x|z) - \log q_\phi(z|x)] \\
&= E_{q_\phi(z|x)}\Big[\log\frac{p_\theta(z)}{q_\phi(z|x)} + \log p_\theta(x|z)\Big] \\
&= -D_{KL}[q_\phi(z|x)\,||\,p_\theta(z)] + E_{q_\phi(z|x)}[\log p_\theta(x|z)]
\end{aligned}$$
Monte Carlo Gradient Estimator
The gradient of $L(\theta, \phi, x)$ contains $\nabla_\phi E_{q_\phi(z|x)}[\log p_\theta(x|z)]$, which is intractable.

Use the Monte Carlo gradient estimator:

$$\begin{aligned}
\nabla_\phi E_{q_\phi(z)}[f(z)] &= \nabla_\phi \int q_\phi(z)f(z)\,dz = \int q_\phi(z)f(z)\frac{\nabla_\phi q_\phi(z)}{q_\phi(z)}\,dz \\
&= \int q_\phi(z)f(z)\nabla_\phi \log q_\phi(z)\,dz = E_{q_\phi(z)}[f(z)\nabla_\phi \log q_\phi(z)] \\
&\approx \frac{1}{L}\sum_{l=1}^{L} f(z^{(l)})\nabla_\phi \log q_\phi(z^{(l)}), \qquad z^{(l)} \sim q_\phi(z)
\end{aligned}$$

In practice this estimator exhibits very high variance, which motivates the reparameterization trick introduced below.
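A minimal numpy sketch of this score-function estimator, using the toy choices $q_\phi(z) = N(\mu, 1)$ and $f(z) = z^2$ so the true gradient $2\mu$ is available as a check:

```python
import numpy as np

rng = np.random.default_rng(0)

# q_phi(z) = N(mu, 1), f(z) = z^2 (toy choices for illustration).
mu = 1.5
z = rng.normal(mu, 1.0, size=1_000_000)

# Score function of a unit-variance Gaussian: d/dmu log q(z) = z - mu.
grad_est = np.mean(z**2 * (z - mu))

# Analytic check: E[z^2] = mu^2 + 1, so the true gradient is 2*mu.
print(grad_est, 2 * mu)  # unbiased, but the per-sample variance is large
```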
Objective Function

$$L(\theta, \phi, x^{(i)}) = -D_{KL}[q_\phi(z|x^{(i)})\,||\,p_\theta(z)] + E_{q_\phi(z|x^{(i)})}[\log p_\theta(x^{(i)}|z)]$$

Monte Carlo Gradient Estimator:

$$\tilde{L}(\theta, \phi, x^{(i)}) = -D_{KL}[q_\phi(z|x^{(i)})\,||\,p_\theta(z)] + \frac{1}{L}\sum_{l=1}^{L}\log p_\theta(x^{(i)}|z^{(i,l)}) \approx L(\theta, \phi, x^{(i)})$$

where $z^{(i,l)} \sim q_\phi(z|x^{(i)})$
Reparameterization Trick

$$z \sim q_\phi(z|x) \;\Rightarrow\; z = g_\phi(\epsilon, x), \qquad \epsilon \sim p(\epsilon)$$

Here $z$ becomes a deterministic variable and $\epsilon$ is an auxiliary noise variable.

Example: $z \sim p(z|x) = N(\mu, \sigma^2)$ can be rewritten as $z = \mu + \sigma\epsilon$ with $\epsilon \sim N(0, 1)$.
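A minimal PyTorch sketch of the trick: because $z$ is written as a deterministic function of $(\mu, \sigma)$ plus independent noise, gradients flow back to those parameters (variable names are illustrative):

```python
import torch

mu = torch.tensor([0.5], requires_grad=True)
log_sigma = torch.tensor([0.0], requires_grad=True)

# Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
eps = torch.randn(1)
z = mu + torch.exp(log_sigma) * eps

# z is a deterministic function of (mu, sigma, eps), so backprop
# passes through the sampling step to the distribution parameters.
loss = (z ** 2).sum()
loss.backward()
print(mu.grad, log_sigma.grad)
```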
Reparameterization Trick
•  The encoder network maps $x^{(i)}$ to the parameters of $q_\phi(z|x)$:

$$\log q_\phi(z|x^{(i)}) = \log N(z;\, \mu^{(i)}, \sigma^{2(i)}I)$$

•  Sample $\epsilon^{(l)} \sim N(0, I)$ and compute $z^{(i,l)} = \mu^{(i)} + \sigma^{(i)} \odot \epsilon^{(l)}$
Reparameterization Trick
•  The full pipeline: $x^{(i)}$ → Encoder Network $q_\phi(z|x)$ → $z^{(i,l)} = \mu^{(i)} + \sigma^{(i)} \odot \epsilon^{(l)}$ with $\epsilon^{(l)} \sim N(0, I)$ → Decoder Network $p_\theta(x|z)$ → reconstruction of $x^{(i)}$
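Putting the pieces together, a sketch of the forward pass, reusing the hypothetical Encoder and Decoder modules from the earlier sketch:

```python
import torch

def vae_forward(encoder, decoder, x):
    """One forward pass: encode, reparameterize, decode."""
    mu, logvar = encoder(x)                  # parameters of q_phi(z|x)
    eps = torch.randn_like(mu)               # eps ~ N(0, I)
    z = mu + torch.exp(0.5 * logvar) * eps   # z = mu + sigma * eps
    x_recon = decoder(z)                     # Bernoulli parameters of p_theta(x|z)
    return x_recon, mu, logvar
```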
Objective Function

$$\tilde{L}(\theta, \phi, x^{(i)}) = -D_{KL}[q_\phi(z|x^{(i)})\,||\,p_\theta(z)] + \frac{1}{L}\sum_{l=1}^{L}\log p_\theta(x^{(i)}|z^{(i,l)})$$

With $p_\theta(z) = N(z;\, 0, I)$ and $q_\phi(z|x^{(i)}) = N(z;\, \mu^{(i)}, \sigma^{2(i)}I)$, the KL term has a closed form:

$$\tilde{L}(\theta, \phi, x^{(i)}) = \underbrace{\frac{1}{2}\sum_{j=1}^{J}\Big(1 + \log\big((\sigma_j^{(i)})^2\big) - (\mu_j^{(i)})^2 - (\sigma_j^{(i)})^2\Big)}_{\text{Regularization}} + \underbrace{\frac{1}{L}\sum_{l=1}^{L}\log p_\theta(x^{(i)}|z^{(i,l)})}_{\text{Reconstruction Error}}$$
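A sketch of this objective as a PyTorch loss, assuming a Bernoulli decoder (so the reconstruction term is a binary cross-entropy) and $L = 1$ sample, which is common in practice; the bound is negated to obtain a quantity to minimize:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar):
    """Negative of the lower bound L~, averaged over the batch (L = 1 sample)."""
    # Reconstruction term: log p_theta(x|z) for a Bernoulli decoder.
    recon = -F.binary_cross_entropy(x_recon, x, reduction='sum')
    # Regularization term: -D_KL[q_phi(z|x) || N(0, I)] in closed form.
    kl_neg = 0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    elbo = recon + kl_neg
    return -elbo / x.size(0)   # minimize the negative bound
```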
Training
Experiment
Horizontal axis: size of the training data
Vertical axis: evidence lower bound
Nz: dimensionality of the hidden variables
Experiment
Visualization of the 2D latent space
Further Research
DRAW: A Recurrent Neural Network For Image Generation
Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende and Daan Wierstra
Further Research
Neural Variational Inference for Text Processing
Yishu Miao, Lei Yu & Phil Blunsom
– Neural Variational Document Model
– Neural Answer Selection Model
Further Research
Deep Convolutional Inverse Graphics Network
Tejas D. Kulkarni, William F. Whitney, Pushmeet Kohli, Joshua B. Tenenbaum
Source Code
•  https://jmetzen.github.io/2015-11-27/vae.html
Reference
•  Charles Fox, Stephen Roberts. A Tutorial on Variational Bayesian Inference.
– http://www.orchid.ac.uk/eprints/40/1/fox_vbtut.pdf
•  Diederik P. Kingma, Max Welling. Auto-Encoding Variational Bayes.
– https://arxiv.org/pdf/1312.6114v10.pdf