
1

Created by S.S. in Jan 2008 
Discovery of Linear Acyclic Models 
Using Independent Component Analysis 
Shohei Shimizu, Patrik Hoyer, 
Aapo Hyvärinen and Antti Kerminen 
LiNGAM homepage: http://www.cs.helsinki.fi/group/neuroinf/lingam/


2 
Independent Component Analysis 
(Hyvärinen et al., 2001) 
x = As,   where x is m×1, A is m×n, and s is n×1 
• A is unknown, cov(s) = I 
– Typically, A is square (m = n) 
• The components s_i are assumed to be non-Gaussian and mutually independent 
• Non-Gaussianity and independence are the key assumptions in ICA 
• The model is estimable, including the rotation (Comon, 1994)
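To make this concrete, here is a minimal sketch (mine, not from the slides; the mixing matrix A and the source distributions are made up) that draws unit-variance non-Gaussian sources, mixes them as x = As with a square A, and recovers them with scikit-learn's FastICA:

    import numpy as np
    from sklearn.decomposition import FastICA  # assumes scikit-learn is installed

    rng = np.random.default_rng(0)
    n = 2000

    # Two independent non-Gaussian sources (uniform and Laplace),
    # scaled to unit variance so that cov(s) = I.
    s = np.column_stack([
        rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n),
        rng.laplace(0.0, 1.0 / np.sqrt(2.0), n),
    ])

    A = np.array([[1.0, 0.5],       # made-up square mixing matrix (m = n = 2)
                  [0.3, 1.0]])
    X = s @ A.T                     # each row of X is one observation x = As

    ica = FastICA(n_components=2, random_state=0)
    S_est = ica.fit_transform(X)    # sources, up to permutation and sign/scale
    A_est = ica.mixing_             # estimate of A, up to the same indeterminacies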


3 
Linear acyclic models 
(Bollen, 1989; Pearl, 2000; Spirtes et al., 2000) 
• Continuous variables 
• Directed acyclic graph (DAG) 
• The value assigned to each variable is a linear 
combination of those previously assigned, plus a 
disturbance term e_i 
• Disturbances (errors) are independent and have non-zero 
variances 
x_i = Σ_{k(j) < k(i)} b_ij x_j + e_i,   or in matrix form   x = Bx + e
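As an illustration (mine, with made-up coefficients b_ij; the slides do not give a numeric example here), data from such a model can be generated by assigning variables in the causal order k:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000

    # Independent disturbances with non-zero variance (here uniform; the
    # general model only requires independence and non-zero variance).
    e1, e2, e3 = (rng.uniform(-1.0, 1.0, n) for _ in range(3))

    # Causal order k: x2 first, then x1, then x3; the coefficients are made up.
    x2 = e2
    x1 = 0.8 * x2 + e1
    x3 = -0.5 * x1 + 0.4 * x2 + e3

    # Stacked together this is x = Bx + e, where B is strictly lower
    # triangular once the variables are arranged in the order k.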


4 
Our goal 
• We know 
– Data X is generated by x = Bx + e 
• We do NOT know 
– Connection strengths: B 
– Order: k(i) 
– Disturbances: e_i 
• What we observe is the data X only 
• Goal 
– Estimate B and k using the data X only!


5 
Can we recover the original network? 
[Figure: an example original network, with a question mark for the network to be estimated]


6 
Can we recover the original network using ICA? 
Yes! 
[Figure: the original network next to the estimated network]


7 
Discovery of linear acyclic models 
from non-experimental data 
• Existing methods (Bollen, 1989; Pearl, 2000; Spirtes et al., 2000) 
– Gaussian assumption on the disturbances e_i 
– Produce many equivalent models 
• Our LiNGAM approach (Shimizu et al., UAI2005; JMLR 2006) 
– Replace Gaussian assumption by non-Gaussian assumption 
– Can identify the connection strengths and structure 
Gaussianity → equivalent models 
Non-Gaussianity → no equivalent models 
[Figure: candidate networks over x1, x2, x3: indistinguishable under the Gaussian assumption, uniquely identified under the non-Gaussian assumption]


8 
Linear Non-Gaussian Acyclic Models (LiNGAM) 
• As usual, linear acyclic models, but the disturbances e_i are assumed to be non-Gaussian: 
x_i = Σ_{k(j) < k(i)} b_ij x_j + e_i,   or in matrix form   x = Bx + e 
• Examples
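As a stand-in example (mine; the 0.8 coefficient and the uniform disturbances are assumptions, not from the slides):

    x2 = e2,   x1 = 0.8 x2 + e1,   e1, e2 ~ Uniform(-1, 1)

    i.e. x = Bx + e with B = [ 0  0.8 ; 0  0 ], acyclic with order k(2) < k(1)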


9 
Outline of LiNGAM algorithm 
1. Estimate B by ICA + post-processing 
2. Find an order k(i) (DAG) 
3. Prune non-significant edges (a rough sketch of these three steps follows below) 
[Figure: a four-variable network (x1 to x4), shown before and after pruning the non-significant edges]
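A sketch of the pipeline in Python (mine; the slide only names the steps, and the Matlab/Octave implementation on the LiNGAM homepage is the reference one). The permutation score and the magnitude-based pruning threshold here are simplifications, not the exact published post-processing:

    import numpy as np
    from itertools import permutations
    from sklearn.decomposition import FastICA

    def lingam_sketch(X, threshold=0.05):
        """Steps 1-3 above, heavily simplified for illustration."""
        d = X.shape[1]

        # Step 1: ICA estimates W ~ A^{-1} = I - B, up to row permutation/scaling.
        ica = FastICA(n_components=d, random_state=0)
        ica.fit(X)
        W = ica.components_

        # Post-processing: pick the row permutation whose diagonal has no
        # zeros (scored by minimizing the sum of 1/|diagonal|; brute force
        # is fine for small d), then rescale rows to a unit diagonal and
        # read off B = I - W.
        p = min(permutations(range(d)),
                key=lambda p: sum(1.0 / max(abs(W[p[i], i]), 1e-12)
                                  for i in range(d)))
        W = W[list(p)]
        W = W / np.diag(W)[:, None]
        B = np.eye(d) - W

        # Step 2 (omitted here): permute B toward strict lower triangularity
        # to obtain the order k(i).
        # Step 3: prune non-significant edges (here: a crude magnitude cutoff).
        B[np.abs(B) < threshold] = 0.0
        return B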


10 
Key ideas 
• Observed variables x_i are linear combinations of non-Gaussian independent disturbances e_i: 
x = Bx + e,   hence   x = (I - B)^{-1} e = Ae 
– The classic case of ICA (Independent Component Analysis) 
• Permutation indeterminacy in ICA can be solved 
– It can be shown that the correct permutation is the only one which has no zeros in the diagonal (Shimizu et al., 2005; 2006) 
• Pruning edges can be done by many existing methods 
– Imposing sparseness, testing, model fit, etc.
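A small numeric check (mine, using a made-up two-variable B) of the rewrite x = (I - B)^{-1} e = Ae and of the claim about zeros on the diagonal:

    import numpy as np

    B = np.array([[0.0, 0.8],           # made-up acyclic model: x2 -> x1
                  [0.0, 0.0]])
    A = np.linalg.inv(np.eye(2) - B)    # x = (I - B)^{-1} e = Ae

    rng = np.random.default_rng(0)
    e = rng.uniform(-1.0, 1.0, size=(2, 1000))   # non-Gaussian disturbances
    x = A @ e                                    # observations in ICA form

    # W = A^{-1} = I - B has a zero-free diagonal only in the correct row
    # order; swapping its rows puts a zero on the diagonal.
    W = np.linalg.inv(A)
    print(np.diag(W))        # [1. 1.]    -> correct permutation
    print(np.diag(W[::-1]))  # [0. -0.8]  -> wrong permutation, a zero appears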


11 
Examples of estimated networks 
• All the edges correctly identified 
• All the connection strengths approximately correct 
[Figure: the original network next to the estimated network]


12 
What kinds of mistakes might LiNGAM make? 
• One falsely added edge (x1 → x7, -0.019) 
• One missing edge (x1 → x6, 0.0088) 
[Figure: the original network next to the estimated network]


13 
Real data example 1 
• Galton’s height data (Galton, 1886) 
– x1: child height 
– x2: 'midparent' height 
• (father's height + 1.08 × mother's height) / 2 
– 928 observations 
• Estimates: 
    [ x1 ]   [   0    0.67 ] [ x1 ]   [ e1 ] 
    [ x2 ] = [ 0.012    0  ] [ x2 ] + [ e2 ] 
• Estimated direction: x2 → x1 (child ← midparent); the small 0.012 entry is pruned as non-significant


14 
Real data example 2 
• Fuller’s corn data (Fuller, 1987) 
– x1: Yield of corn 
– x2: Soil Nitrogen 
– 11 observations 
• Estimates: 
    [ x1 ]   [   0    0.34 ] [ x1 ]   [ e1 ] 
    [ x2 ] = [ 0.014    0  ] [ x2 ] + [ e2 ] 
• Estimated direction: x2 → x1 (Yield ← Nitrogen); the small 0.014 entry is pruned as non-significant

15 
Conclusions 
• Discovery of linear acyclic models from non-experimental data is an important topic of current research 
• A common assumption is linear-Gaussianity, but this leads to a number of indistinguishable models 
• A non-Gaussian assumption allows all the connection strengths and the structure of linear acyclic models to be identified 
• Basic method: ICA + permutation + pruning 
• Matlab/Octave code: http://www.cs.helsinki.fi/group/neuroinf/lingam/
16 
References 
• S. Shimizu, A. Hyvärinen, Y. Kano, and P. O. Hoyer. Discovery of non-Gaussian linear causal models using ICA. In Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI2005), pp. 526–533, 2005. 
• S. Shimizu, P. O. Hoyer, A. Hyvärinen, and A. Kerminen. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7:2003–2030, 2006.