COMPUTING THE PERMANENT OF (SOME) COMPLEX MATRICES

Alexander Barvinok

June 2014

Abstract. We present a deterministic algorithm, which, for any given $0 < \epsilon < 1$
and an $n \times n$ real or complex matrix $A = (a_{ij})$ such that $|a_{ij} - 1| \le 0.19$ for all
$i, j$, computes the permanent of $A$ within relative error $\epsilon$ in $n^{O(\ln n - \ln \epsilon)}$ time. The
method can be extended to computing hafnians and multidimensional permanents.

1. Introduction and main results


The permanent of an $n \times n$ matrix $A = (a_{ij})$ is defined as
$$
\operatorname{per} A = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i\sigma(i)},
$$

where $S_n$ is the symmetric group of permutations of the set $\{1, \ldots, n\}$. The
problem of efficient computation of the permanent has attracted a lot of attention. It
is #P-hard already for 0-1 matrices [Va79], but a fully polynomial randomized
approximation scheme, based on the Markov Chain Monte Carlo approach, is
constructed for all non-negative matrices [J+04]. A deterministic polynomial time
algorithm based on matrix scaling for computing the permanent of non-negative
matrices within a factor of $e^n$ is constructed in [L+00] and the bound was recently
improved to $2^n$ in [GS13]. An approach based on the idea of “correlation decay”
from statistical physics results in a deterministic polynomial time algorithm
approximating $\operatorname{per} A$ within a factor of $(1 + \epsilon)^n$ for any $\epsilon > 0$, fixed in advance, if $A$
is the adjacency matrix of a constant degree expander [GK10].
There is also interest in computing permanents of complex matrices [AA13]. The
well-known Ryser’s algorithm (see, for example, Chapter 7 of [Mi78]) computes the
permanent of a matrix $A$ over any field in $O(n 2^n)$ time. A randomized
approximation algorithm of [Fü00] computes the permanent of a complex matrix within a
(properly defined) relative error $\epsilon$ in $O(3^{n/2} \epsilon^{-2})$ time. The randomized algorithm
of [Gu05], see also [AA13] for an exposition, computes the permanent of a complex
matrix $A$ in time polynomial in $n$ and $1/\epsilon$ within an additive error of $\epsilon \|A\|^n$, where
$\|A\|$ is the operator norm of $A$.

1991 Mathematics Subject Classification. 15A15, 68C25, 68W25.
Key words and phrases. permanent, hafnian, algorithm.
This research was partially supported by NSF Grant DMS 0856640.

In this paper, we present a new approach to computing permanents of real or
complex matrices $A$ and show that if $|a_{ij} - 1| \le \gamma$ for some absolute constant $\gamma > 0$
(we can choose $\gamma = 0.19$) and all $i$ and $j$, then, for any $\epsilon > 0$ the value of $\operatorname{per} A$
can be computed within relative error $\epsilon$ in $n^{O(\ln n - \ln \epsilon)}$ time (we say that $\alpha \in \mathbb{C}$
approximates $\operatorname{per} A$ within relative error $0 < \epsilon < 1$ if $\operatorname{per} A = \alpha(1 + \rho)$ where
$|\rho| < \epsilon$). We also discuss how the method can be extended to computing hafnians
of symmetric matrices and multidimensional permanents of tensors.
(1.1) The idea of the algorithm. Let $J$ denote the $n \times n$ matrix filled with 1s.
Given an $n \times n$ complex matrix $A$, we consider (a branch of) the univariate function
$$
f(z) = \ln \operatorname{per}\bigl(J + z(A - J)\bigr). \tag{1.1.1}
$$
Clearly,
$$
f(0) = \ln \operatorname{per} J = \ln n! \quad \text{and} \quad f(1) = \ln \operatorname{per} A.
$$
Hence our goal is to approximate $f(1)$ and we do it by using the Taylor polynomial
expansion of $f$ at $z = 0$:
$$
f(1) \approx f(0) + \sum_{k=1}^{m} \frac{1}{k!} \frac{d^k}{dz^k} f(z) \Big|_{z=0}. \tag{1.1.2}
$$
It turns out that the right hand side of (1.1.2) can be computed in $n^{O(m)}$ time.
We present the algorithm in Section 2. The quality of the approximation (1.1.2)
depends on the location of complex zeros of the permanent.
(1.2) Lemma. Suppose that there exists a real $\beta > 1$ such that
$$
\operatorname{per}\bigl(J + z(A - J)\bigr) \ne 0 \quad \text{for all } z \in \mathbb{C} \text{ satisfying } |z| \le \beta.
$$
Then for all $z \in \mathbb{C}$ with $|z| \le 1$ the value of
$$
f(z) = \ln \operatorname{per}\bigl(J + z(A - J)\bigr)
$$
is well-defined by the choice of the branch of the logarithm for which $f(0)$ is a real
number, and the right hand side of (1.1.2) approximates $f(1)$ within an additive
error of
$$
\frac{n}{(m + 1)\beta^m(\beta - 1)}.
$$
In particular, for a fixed $\beta > 1$, to ensure an additive error of $0 < \epsilon < 1$, we can
choose $m = O(\ln n - \ln \epsilon)$, which results in the algorithm for approximating $\operatorname{per} A$
within relative error $\epsilon$ in $n^{O(\ln n - \ln \epsilon)}$ time. We prove Lemma 1.2 in Section 2.
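As a rough illustration of how the degree $m$ depends on $n$, $\epsilon$ and $\beta$, the following Python sketch searches for the smallest $m$ for which the error bound of Lemma 1.2 is at most $\epsilon$. The helper names `error_bound` and `smallest_degree` are hypothetical and introduced here only for illustration; they are not part of the paper.

```python
def error_bound(n, m, beta):
    """Additive error bound n / ((m + 1) * beta**m * (beta - 1)) from Lemma 1.2."""
    return n / ((m + 1) * beta**m * (beta - 1))


def smallest_degree(n, eps, beta):
    """Smallest m >= 1 for which the Lemma 1.2 bound does not exceed eps."""
    if not (beta > 1 and 0 < eps < 1):
        raise ValueError("need beta > 1 and 0 < eps < 1")
    m = 1
    while error_bound(n, m, beta) > eps:
        m += 1
    return m


if __name__ == "__main__":
    # beta = 195/190 is the value used in the paper when |a_ij - 1| <= 0.19.
    for n in (10, 100, 1000):
        print(n, smallest_degree(n, 0.01, 195 / 190))
```

For a fixed $\beta$ the returned degree grows logarithmically in $n/\epsilon$, but the constant is large when $\beta$ is close to 1, as it is for $\beta = 195/190$.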
Thus we have to identify a class of matrices A for which the number β > 1 of
Lemma 1.2 exists. We prove the following result.
(1.3) Theorem. There is an absolute constant $\delta > 0$ (we can choose $\delta = 0.195$)
such that if $Z = (z_{ij})$ is a complex $n \times n$ matrix satisfying
$$
|z_{ij} - 1| \le \delta \quad \text{for all } i, j
$$
then
$$
\operatorname{per} Z \ne 0.
$$

We prove Theorem 1.3 in Section 3.


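For any matrix $A = (a_{ij})$ satisfying
$$
|a_{ij} - 1| \le 0.19 \quad \text{for all } i, j,
$$
we can choose $\beta = 195/190$ in Lemma 1.2 and thus obtain an approximation
algorithm for computing $\operatorname{per} A$. Indeed, for $|z| \le 195/190$ every entry of $J + z(A - J)$
differs from 1 by at most $0.19 \cdot 195/190 = 0.195$, so Theorem 1.3 applies.
The sharp value of the constant $\delta$ in Theorem 1.3 is not known to the author.
A simple example of a $2 \times 2$ matrix
$$
A = \begin{pmatrix} \frac{1+i}{2} & \frac{1-i}{2} \\ \frac{1-i}{2} & \frac{1+i}{2} \end{pmatrix}
$$
for which $\operatorname{per} A = 0$ shows that in Theorem 1.3 we must have
$$
\delta < \frac{\sqrt{2}}{2} \approx 0.71.
$$
What is also not clear is whether the constant $\delta$ can improve as the size of the
matrix grows.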
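The $2 \times 2$ example above is easy to check numerically. The short Python sketch below (variable names are ours) computes the permanent of that matrix and the largest distance from an entry to 1.

```python
import math

# The 2 x 2 matrix from the example above.
A = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]

# Permanent of a 2 x 2 matrix: a11*a22 + a12*a21.
per_A = A[0][0] * A[1][1] + A[0][1] * A[1][0]

# Largest distance |a_ij - 1| over all entries.
max_dist = max(abs(a - 1) for row in A for a in row)

print(per_A, max_dist, math.sqrt(2) / 2)
```

The permanent vanishes while every entry lies at distance $\sqrt{2}/2$ from 1, which is why $\delta$ in Theorem 1.3 cannot reach $\sqrt{2}/2$.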
(1.4) Question. Is it true that for any $0 < \epsilon < 1$ there is a positive integer $N(\epsilon)$
such that if $Z = (z_{ij})$ is a complex $n \times n$ matrix with $n > N(\epsilon)$ and
$$
|z_{ij} - 1| \le 1 - \epsilon \quad \text{for all } i, j
$$
then $\operatorname{per} Z \ne 0$?
We note that for any $0 < \epsilon < 1$, fixed in advance, a deterministic polynomial
time algorithm based on scaling approximates the permanent of a given $n \times n$ real
matrix $A = (a_{ij})$ satisfying
$$
\epsilon \le a_{ij} \le 1 \quad \text{for all } i, j
$$
within a multiplicative factor of $n^{\kappa(\epsilon)}$ for some $\kappa(\epsilon) > 0$ [BS11].
(1.5) Ramifications. In Section 4, we discuss how our approach can be used
for computing hafnians of symmetric matrices and multidimensional permanents
of tensors. The same approach can be used for computing partition functions
associated with cliques in graphs [Ba14] and graph homomorphisms [BS14]. In
each case, the main problem is to come up with a version of Theorem 1.3 bounding
the complex roots of the partition function away from the vector of all 1s. Isolating
zeros of complex extensions of real partition functions is a problem studied in
statistical physics and also in connection to combinatorics, see, for example, [SS05].
2. The algorithm
(2.1) The algorithm for approximating the permanent. Given an n × n
complex matrix A = (aij ), we present an algorithm which computes the right hand
side of the approximation (1.1.2) for the function f (z) defined by (1.1.1).
Let
$$
g(z) = \operatorname{per}\bigl(J + z(A - J)\bigr), \tag{2.1.1}
$$
so $f(z) = \ln g(z)$. Hence
$$
f'(z) = \frac{g'(z)}{g(z)} \quad \text{and} \quad g'(z) = g(z) f'(z).
$$
Therefore, for $k \ge 1$ we have
$$
\frac{d^k}{dz^k} g(z) \Big|_{z=0} = \sum_{j=0}^{k-1} \binom{k-1}{j} \left( \frac{d^j}{dz^j} g(z) \Big|_{z=0} \right) \left( \frac{d^{k-j}}{dz^{k-j}} f(z) \Big|_{z=0} \right) \tag{2.1.2}
$$
(we agree that the 0-th derivative of $g$ is $g$).
We note that $g(0) = n!$. If we compute the values of
$$
\frac{d^k}{dz^k} g(z) \Big|_{z=0} \quad \text{for } k = 1, \ldots, m, \tag{2.1.3}
$$
then the formulas (2.1.2) for $k = 1, \ldots, m$ provide a non-degenerate triangular
system of linear equations that allows us to compute
$$
\frac{d^k}{dz^k} f(z) \Big|_{z=0} \quad \text{for } k = 1, \ldots, m.
$$
Hence our goal is to compute the values (2.1.3).
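For concreteness, here are the first three equations of the triangular system (2.1.2), written with the shorthand $g_k$ and $f_k$ for the $k$-th derivatives of $g$ and $f$ at $z = 0$ (this shorthand is ours and is used only in this illustration):
$$
\begin{aligned}
g_1 &= g_0 f_1, \\
g_2 &= g_0 f_2 + g_1 f_1, \\
g_3 &= g_0 f_3 + 2 g_1 f_2 + g_2 f_1.
\end{aligned}
$$
Since $g_0 = n! \ne 0$, the equations can be solved one at a time: $f_1 = g_1/g_0$, then $f_2 = (g_2 - g_1 f_1)/g_0$, then $f_3 = (g_3 - 2 g_1 f_2 - g_2 f_1)/g_0$, and so on.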


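We have
$$
\begin{aligned}
\frac{d^k}{dz^k} g(z) \Big|_{z=0}
&= \frac{d^k}{dz^k} \sum_{\sigma \in S_n} \prod_{i=1}^{n} \Bigl(1 + z\bigl(a_{i\sigma(i)} - 1\bigr)\Bigr) \Big|_{z=0} \\
&= \sum_{\sigma \in S_n} \sum_{1 \le i_1, \ldots, i_k \le n} \bigl(a_{i_1\sigma(i_1)} - 1\bigr) \cdots \bigl(a_{i_k\sigma(i_k)} - 1\bigr) \\
&= (n - k)! \sum_{\substack{1 \le i_1, \ldots, i_k \le n \\ 1 \le j_1, \ldots, j_k \le n}} \bigl(a_{i_1 j_1} - 1\bigr) \cdots \bigl(a_{i_k j_k} - 1\bigr),
\end{aligned}
$$
where the inner sum in the second line is over all ordered $k$-subsets $(i_1, \ldots, i_k)$ of
the set $\{1, \ldots, n\}$, that is, sequences of $k$ distinct indices, and the last sum is over
all pairs of ordered $k$-subsets $(i_1, \ldots, i_k)$ and $(j_1, \ldots, j_k)$
of the set $\{1, \ldots, n\}$. Since the last sum contains $(n!/(n - k)!)^2 = n^{O(k)}$ terms, the
complexity of the algorithm is indeed $n^{O(m)}$.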
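The procedure of Section 2.1 can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the author's implementation: the function names are hypothetical, the matrix is stored as a list of rows, the brute-force `permanent` is included only to check the result on a tiny example, and the $3 \times 3$ test matrix is an arbitrary choice with $|a_{ij} - 1| \le 0.05$.

```python
import cmath
from itertools import permutations
from math import comb, factorial


def g_derivatives(A, m):
    """Values d^k/dz^k per(J + z(A - J)) at z = 0 for k = 0, ..., m."""
    n = len(A)
    derivs = [complex(factorial(n))]          # g(0) = per J = n!
    for k in range(1, m + 1):
        if k > n:                             # g has degree at most n
            derivs.append(0j)
            continue
        total = 0j
        # Sum over pairs of ordered k-subsets (rows, cols) of {0, ..., n-1}.
        for rows in permutations(range(n), k):
            for cols in permutations(range(n), k):
                term = 1 + 0j
                for i, j in zip(rows, cols):
                    term *= A[i][j] - 1
                total += term
        derivs.append(factorial(n - k) * total)
    return derivs


def f_derivatives(g):
    """Solve the triangular system (2.1.2); f[0] is f(0) = ln g(0)."""
    f = [cmath.log(g[0])]
    for k in range(1, len(g)):
        known = sum(comb(k - 1, j) * g[j] * f[k - j] for j in range(1, k))
        f.append((g[k] - known) / g[0])
    return f


def approx_log_permanent(A, m):
    """Degree-m Taylor approximation (1.1.2) of f(1) = ln per A."""
    f = f_derivatives(g_derivatives(A, m))
    return sum(f[k] / factorial(k) for k in range(m + 1))


def permanent(A):
    """Exact permanent by summing over all permutations (for checking only)."""
    n = len(A)
    total = 0j
    for sigma in permutations(range(n)):
        term = 1 + 0j
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


if __name__ == "__main__":
    A = [[1.05, 0.97, 1.02],
         [0.96, 1.04, 1.00 + 0.03j],
         [1.01, 0.98 - 0.02j, 0.95]]
    approx = cmath.exp(approx_log_permanent(A, 6))
    exact = permanent(A)
    print(approx, exact, abs(approx / exact - 1))
```

The double loop over ordered $k$-subsets mirrors the last sum in the formula above, so its running time is $n^{O(k)}$ for each $k$; no attempt is made here to reduce the constant.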
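(2.2) Proof of Lemma 1.2. The function $g(z)$ defined by (2.1.1) is a polynomial
in $z$ of degree $d \le n$ with $g(0) = n! \ne 0$, so we factor
$$
g(z) = g(0) \prod_{i=1}^{d} \left(1 - \frac{z}{\alpha_i}\right),
$$
where $\alpha_1, \ldots, \alpha_d$ are the roots of $g(z)$. By the condition of Lemma 1.2, we have
$$
|\alpha_i| \ge \beta > 1 \quad \text{for } i = 1, \ldots, d.
$$
Therefore,
$$
f(z) = \ln g(z) = \ln g(0) + \sum_{i=1}^{d} \ln\left(1 - \frac{z}{\alpha_i}\right) \quad \text{for } |z| \le 1, \tag{2.2.1}
$$
where we choose the branch of $\ln g(z)$ that is real at $z = 0$. Using the standard
Taylor expansion, we obtain
$$
\ln\left(1 - \frac{1}{\alpha_i}\right) = -\sum_{k=1}^{m} \frac{1}{k} \left(\frac{1}{\alpha_i}\right)^k + \zeta_m,
$$
where
$$
|\zeta_m| = \left| \sum_{k=m+1}^{+\infty} \frac{1}{k} \left(\frac{1}{\alpha_i}\right)^k \right| \le \frac{1}{(m + 1)\beta^m(\beta - 1)}.
$$
Therefore, from (2.2.1) we obtain
$$
f(1) = f(0) + \sum_{k=1}^{m} \left( -\frac{1}{k} \sum_{i=1}^{d} \left(\frac{1}{\alpha_i}\right)^k \right) + \eta_m,
$$
where
$$
|\eta_m| \le \frac{n}{(m + 1)\beta^m(\beta - 1)}.
$$
It remains to notice that
$$
-\frac{1}{k} \sum_{i=1}^{d} \left(\frac{1}{\alpha_i}\right)^k = \frac{1}{k!} \frac{d^k}{dz^k} f(z) \Big|_{z=0}. \qquad \square
$$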
3. Proof of Theorem 1.3
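Let us denote by $U^{n \times n}(\delta) \subset \mathbb{C}^{n \times n}$ the closed polydisc
$$
U^{n \times n}(\delta) = \bigl\{ Z = (z_{ij}) : \ |z_{ij} - 1| \le \delta \ \text{for all } i, j \bigr\}.
$$
Thus Theorem 1.3 asserts that $\operatorname{per} Z \ne 0$ for $Z \in U^{n \times n}(\delta)$ and $\delta = 0.195$.
First, we establish a simple geometric lemma.
(3.1) Lemma. Let $u_1, \ldots, u_n \in \mathbb{R}^d$ be non-zero vectors such that for some
$0 \le \alpha < \pi/2$ the angle between any two vectors $u_i$ and $u_j$ does not exceed $\alpha$. Let
$u = u_1 + \ldots + u_n$. Then
$$
\|u\| \ge \sqrt{\cos \alpha} \, \sum_{i=1}^{n} \|u_i\|.
$$
Proof. We have
$$
\|u\|^2 = \sum_{1 \le i, j \le n} \langle u_i, u_j \rangle \ge \sum_{1 \le i, j \le n} \|u_i\| \|u_j\| \cos \alpha = (\cos \alpha) \left( \sum_{i=1}^{n} \|u_i\| \right)^2,
$$
and the proof follows. $\square$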
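As a quick sanity check of Lemma 3.1 (not a proof), the following Python sketch compares the two sides of the inequality for planar vectors whose directions all lie in an arc of width $\alpha$, so that every pairwise angle is at most $\alpha$. The function name and the sampling ranges are arbitrary choices made for this illustration.

```python
import math
import random


def lemma_3_1_sides(angles, lengths, alpha):
    """Return (||u||, sqrt(cos alpha) * sum ||u_i||) for planar vectors u_i."""
    ux = sum(r * math.cos(t) for r, t in zip(lengths, angles))
    uy = sum(r * math.sin(t) for r, t in zip(lengths, angles))
    return math.hypot(ux, uy), math.sqrt(math.cos(alpha)) * sum(lengths)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = 1.2                      # any value in [0, pi/2)
    for _ in range(5):
        # Directions in [0, alpha] keep all pairwise angles at most alpha.
        angles = [rng.uniform(0, alpha) for _ in range(6)]
        lengths = [rng.uniform(0.1, 2.0) for _ in range(6)]
        print(lemma_3_1_sides(angles, lengths, alpha))
```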


We prove Theorem 1.3 by induction on n, using Lemma 3.1 and the following
two lemmas.
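(3.2) Lemma. For an $n \times n$ matrix $Z = (z_{ij})$ and $j = 1, \ldots, n$, let $Z_j$ be the
$(n - 1) \times (n - 1)$ matrix obtained from $Z$ by crossing out the first row and the $j$-th
column of $Z$.
Suppose for some $\delta > 0$ and for some $0 < \tau < 1$, for any $Z \in U^{n \times n}(\delta)$ we have
$\operatorname{per} Z \ne 0$ and
$$
|\operatorname{per} Z| \ge \tau \sum_{j=1}^{n} |z_{1j}| \, |\operatorname{per} Z_j|.
$$
Let $A, B \in U^{n \times n}(\delta)$ be any two $n \times n$ matrices that differ in one column (or
in one row) only. Then the angle between two complex numbers $\operatorname{per} A$ and $\operatorname{per} B$,
interpreted as vectors in $\mathbb{R}^2 = \mathbb{C}$, does not exceed
$$
\theta = \frac{2\delta}{(1 - \delta)\tau}.
$$
Proof. Since $\operatorname{per} Z \ne 0$ for all $Z \in U^{n \times n}(\delta)$, we may consider a branch of $\ln \operatorname{per} Z$
defined for $Z \in U^{n \times n}(\delta)$.
Using the expansion
$$
\operatorname{per} Z = \sum_{j=1}^{n} z_{1j} \operatorname{per} Z_j, \tag{3.2.1}
$$
we conclude that
$$
\frac{\partial}{\partial z_{1j}} \ln \operatorname{per} Z = \frac{\operatorname{per} Z_j}{\operatorname{per} Z} \quad \text{for } j = 1, \ldots, n.
$$
Therefore, since $|z_{1j}| \ge 1 - \delta$ for $j = 1, \ldots, n$, we conclude that for any $Z \in U^{n \times n}(\delta)$,
we have
$$
\sum_{j=1}^{n} \left| \frac{\partial}{\partial z_{1j}} \ln \operatorname{per} Z \right| \le \frac{1}{(1 - \delta)\tau}. \tag{3.2.2}
$$
Since the permanent is invariant under permutations of rows, permutations of
columns and taking the transpose of the matrix, without loss of generality we may
assume that the matrix $B \in U^{n \times n}(\delta)$ is obtained from $A \in U^{n \times n}(\delta)$ by replacing
the entries $a_{1j}$ by numbers $b_{1j}$ such that
$$
|b_{1j} - 1| \le \delta \quad \text{for } j = 1, \ldots, n.
$$
Then
$$
|\ln \operatorname{per} A - \ln \operatorname{per} B| \le \left( \sup_{Z \in U^{n \times n}(\delta)} \sum_{j=1}^{n} \left| \frac{\partial}{\partial z_{1j}} \ln \operatorname{per} Z \right| \right) \max_{j = 1, \ldots, n} |a_{1j} - b_{1j}|.
$$
Since
$$
|b_{1j} - a_{1j}| \le 2\delta \quad \text{for all } j = 1, \ldots, n,
$$
the proof follows from (3.2.2). $\square$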
(3.3) Lemma. Suppose that for some
$$
0 \le \theta < \frac{\pi}{2} - 2 \arcsin \delta
$$
and for any two matrices $A, B \in U^{n \times n}(\delta)$ which differ in one row (or in one
column), the angle between two complex numbers $\operatorname{per} A$ and $\operatorname{per} B$, interpreted as
vectors in $\mathbb{R}^2 = \mathbb{C}$, does not exceed $\theta$. Then for any matrix $Z \in U^{(n+1) \times (n+1)}(\delta)$,
we have
$$
|\operatorname{per} Z| \ge \tau \sum_{j=1}^{n+1} |z_{1j}| \, |\operatorname{per} Z_j|
$$
with
$$
\tau = \sqrt{\cos(\theta + 2 \arcsin \delta)},
$$
where $Z_j$ is the $n \times n$ matrix obtained from $Z$ by crossing out the first row and the
$j$-th column.
Proof. We use the first row expansion (3.2.1) and observe that any two matrices
$Z_j$ and $Z_k$ can be obtained one from another by replacing one column and
permuting columns. Therefore, the angle between any two complex numbers
$\operatorname{per} Z_j$ and $\operatorname{per} Z_k$ does not exceed $\theta$. Since
$$
-\arcsin \delta \le \arg z_{1j} \le \arcsin \delta \quad \text{for } j = 1, \ldots, n + 1,
$$
the angle between any two numbers $z_{1j} \operatorname{per} Z_j$ and $z_{1k} \operatorname{per} Z_k$ does not exceed
$\theta + 2 \arcsin \delta$. The proof follows by Lemma 3.1. $\square$
(3.4) Proof of Theorem 1.3. One can see that for a sufficiently small $\delta > 0$, the
equation
$$
\theta = \frac{2\delta}{(1 - \delta)\sqrt{\cos(\theta + 2 \arcsin \delta)}} \tag{3.4.1}
$$
has a solution $0 < \theta < \pi/2$. Numerical computations show that we can choose
$\delta = 0.195$ and
$$
\theta \approx 0.7611025127.
$$
Let
$$
\tau = \sqrt{\cos(\theta + 2 \arcsin \delta)} \approx 0.6365398112.
$$
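The quoted numerical values can be checked by substituting them back into equation (3.4.1). The Python sketch below does only that; it does not solve the equation, and the variable names are ours.

```python
import math

delta = 0.195
theta = 0.7611025127          # value quoted in the text

two_arcsin = 2 * math.asin(delta)
tau = math.sqrt(math.cos(theta + two_arcsin))
rhs = 2 * delta / ((1 - delta) * tau)   # right hand side of (3.4.1)

# rhs - theta should be close to zero if theta solves (3.4.1).
print(two_arcsin, tau, rhs - theta)
```

The same substitution also confirms that $\theta + 2 \arcsin \delta < \pi/2$, which is the range required in Lemma 3.3.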
We proceed by induction on $n$. More precisely, we prove the following three
statements (3.4.2)–(3.4.4) by induction on $n$:

(3.4.2) For every $Z \in U^{n \times n}(\delta)$, we have $\operatorname{per} Z \ne 0$;

(3.4.3) Suppose $A, B \in U^{n \times n}(\delta)$ are two matrices which differ by one row (or
one column). Then the angle between two complex numbers $\operatorname{per} A$ and $\operatorname{per} B$,
interpreted as vectors in $\mathbb{R}^2 = \mathbb{C}$, does not exceed $\theta$;

(3.4.4) For a matrix $Z \in U^{n \times n}(\delta)$, $Z = (z_{ij})$, let $Z_j$ be the $(n - 1) \times (n - 1)$
matrix obtained by crossing out the first row and the $j$-th column. Then
$$
|\operatorname{per} Z| \ge \tau \sum_{j=1}^{n} |z_{1j}| \, |\operatorname{per} Z_j|.
$$

For $n = 1$ the statement (3.4.2) is obviously true. Moreover, the angle between
any two numbers $a, b \in U^{1 \times 1}(\delta)$ does not exceed
$$
2 \arcsin \delta \approx 0.3925149004 < \theta,
$$
so (3.4.3) holds as well. The statement (3.4.4) is vacuous.
Lemma 3.3 implies that if the statement (3.4.3) holds for $n \times n$ matrices then
the statement (3.4.4) holds for $(n + 1) \times (n + 1)$ matrices.
The statement (3.4.4) for $(n + 1) \times (n + 1)$ matrices together with the statement
(3.4.2) for $n \times n$ matrices implies the statement (3.4.2) for $(n + 1) \times (n + 1)$ matrices.
Finally, Lemma 3.2 implies that if the statement (3.4.4) holds for $(n + 1) \times (n + 1)$
matrices then the statement (3.4.3) holds for $(n + 1) \times (n + 1)$ matrices.
This concludes the proof of (3.4.2)–(3.4.4) for all positive integers $n$. $\square$

4. Ramifications
A similar approach can be applied to computing other quantities of interest.
(4.1) Hafnians. Let $A = (a_{ij})$ be a $2n \times 2n$ real or complex matrix. The quantity
$$
\operatorname{haf} A = \sum_{\{i_1, j_1\}, \ldots, \{i_n, j_n\}} a_{i_1 j_1} \cdots a_{i_n j_n},
$$
where the sum is taken over all $(2n)!/(n!\,2^n)$ unordered partitions of the set $\{1, \ldots, 2n\}$
into $n$ pairwise disjoint unordered pairs $\{i_1, j_1\}, \ldots, \{i_n, j_n\}$, is called the hafnian
of $A$, see for example, Section 8.2 of [Mi78]. For any $n \times n$ matrix $A$ we have
$$
\operatorname{haf} \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix} = \operatorname{per} A
$$
and hence computing the permanent of an $n \times n$ matrix reduces to computing the
hafnian of a symmetric $2n \times 2n$ matrix. The computational complexity of hafnians
is understood less well than that of permanents. Unlike in the case of the
permanent, no fully polynomial (randomized or deterministic) approximation
scheme is known to compute the hafnian of a non-negative real symmetric matrix.
Unlike in the case of the permanent, no deterministic polynomial time algorithm
approximating the hafnian of a $2n \times 2n$ non-negative symmetric matrix within a
factor of $c^n$, where $c > 0$ is an absolute constant, is known. On the other hand
there is a polynomial time randomized algorithm based on the representation of
the hafnian as the expectation of the determinant of a random matrix, which
approximates the hafnian of a given non-negative symmetric $2n \times 2n$ matrix within
a factor of $c^n$, where $c \approx 0.56$ [Ba99]. Also, for any $0 < \epsilon < 1$ fixed in advance,
there is a deterministic polynomial time algorithm based on scaling, which, given
a $2n \times 2n$ symmetric matrix $A = (a_{ij})$ satisfying
$$
\epsilon \le a_{ij} \le 1 \quad \text{for all } i, j,
$$
computes $\operatorname{haf} A$ within a multiplicative factor of $n^{\kappa(\epsilon)}$ for some $\kappa(\epsilon) > 0$ [BS11].
With minimal changes, the approach of this paper can be applied to computing
hafnians. Namely, let $J$ denote the $2n \times 2n$ matrix filled with 1s and let us define
$$
f(z) = \ln \operatorname{haf}\bigl(J + z(A - J)\bigr).
$$
Then
$$
f(0) = \ln \operatorname{haf} J = \ln \frac{(2n)!}{n!\,2^n} \quad \text{and} \quad f(1) = \ln \operatorname{haf} A
$$
and one can use the Taylor polynomial approximation (1.1.2) to estimate $f(1)$. As
in Section 2, one can compute the right hand side of (1.1.2) in $n^{O(m)}$ time. The
statement and the proof of Theorem 1.3 carry over to hafnians almost verbatim.
Namely, let $\delta > 0$ be a real for which the equation (3.4.1) has a solution $0 < \theta < \pi/2$
(hence one can choose $\delta = 0.195$). Then $\operatorname{haf} Z \ne 0$ as long as $Z = (z_{ij})$ is a $2n \times 2n$
symmetric complex matrix satisfying
$$
|z_{ij} - 1| \le \delta \quad \text{for all } i, j.
$$
Instead of the row expansion of the permanent (3.2.1) used in Lemmas 3.2 and 3.3,
one should use the row expansion of the hafnian
$$
\operatorname{haf} Z = \sum_{j=2}^{2n} z_{1j} \operatorname{haf} Z_j,
$$
where $Z_j$ is the symmetric $(2n - 2) \times (2n - 2)$ matrix obtained from $Z$ by crossing
out the first and the $j$-th row and the first and the $j$-th column. As in Section 2,
we obtain an algorithm of $n^{O(\ln n - \ln \epsilon)}$ complexity for approximating $\operatorname{haf} Z$ within
relative error $\epsilon > 0$, where $Z = (z_{ij})$ is a $2n \times 2n$ symmetric complex matrix
satisfying
$$
|z_{ij} - 1| \le \gamma \quad \text{for all } i, j,
$$
and $\gamma > 0$ is an absolute constant (one can choose $\gamma = 0.19$).
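The row expansion of the hafnian and the identity relating hafnians to permanents can be illustrated with a short Python sketch. It is an exponential-time recursion meant only for tiny matrices, not the approximation algorithm of this section; the function names and the integer test matrix are our own choices.

```python
from itertools import permutations


def hafnian(Z):
    """Hafnian of a symmetric matrix of even size via expansion along the first row."""
    size = len(Z)
    if size == 0:
        return 1                      # empty product
    total = 0
    for j in range(1, size):
        # Z_j: cross out rows and columns 0 and j.
        keep = [t for t in range(1, size) if t != j]
        minor = [[Z[r][c] for c in keep] for r in keep]
        total += Z[0][j] * hafnian(minor)
    return total


def permanent(A):
    """Permanent by direct summation over permutations."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= A[i][sigma[i]]
        total += term
    return total


def block_matrix(A):
    """The symmetric 2n x 2n matrix [[0, A], [A^T, 0]]."""
    n = len(A)
    top = [[0] * n + list(A[i]) for i in range(n)]
    bottom = [[A[i][j] for i in range(n)] + [0] * n for j in range(n)]
    return top + bottom


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
    print(hafnian(block_matrix(A)), permanent(A))
```

For the all-ones matrix the recursion counts perfect matchings, which gives $(2n)!/(n!\,2^n)$, in agreement with the value of $f(0)$ above.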
(4.2) Multidimensional permanents. Let us fix an integer $\nu \ge 2$ and let
$$
A = \bigl(a_{i_1 \ldots i_\nu}\bigr), \quad 1 \le i_1, \ldots, i_\nu \le n,
$$
be a $\nu$-dimensional cubical $n \times \ldots \times n$ array of real or complex numbers. We define
$$
\operatorname{PER} A = \sum_{\sigma_1, \ldots, \sigma_{\nu-1} \in S_n} \prod_{i=1}^{n} a_{i \sigma_1(i) \ldots \sigma_{\nu-1}(i)}.
$$
If $\nu = 2$ then $A$ is an $n \times n$ matrix and $\operatorname{PER} A = \operatorname{per} A$. For $\nu > 2$ it is already an
NP-hard problem to tell $\operatorname{PER} A$ from 0 even if $a_{i_1 \ldots i_\nu} \in \{0, 1\}$ since the problem
reduces to detecting a perfect matching in a hypergraph, see, for example, Problem
SP1 in [A+99]. However, for any $0 < \epsilon < 1$, fixed in advance, there is a polynomial
time deterministic algorithm based on scaling, which, given a real array $A$ satisfying
$$
\epsilon \le a_{i_1 \ldots i_\nu} \le 1 \quad \text{for all } 1 \le i_1, \ldots, i_\nu \le n
$$
computes $\operatorname{PER} A$ within a multiplicative factor of $n^{\kappa(\epsilon, \nu)}$ for some $\kappa(\epsilon, \nu) > 0$ [BS11].
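To make the definition of $\operatorname{PER} A$ concrete, here is a brute-force Python sketch that evaluates it directly from the formula above. It is exponential in $n$ and intended only as an illustration; the function name and the choice of storing the array as a dictionary keyed by index tuples are our assumptions.

```python
from itertools import permutations, product
from math import factorial


def multi_permanent(A, n, nu):
    """PER of a nu-dimensional n x ... x n array given as a dict keyed by index tuples."""
    perms = list(permutations(range(n)))
    total = 0
    # One permutation sigma_1, ..., sigma_{nu-1} for each index after the first.
    for sigmas in product(perms, repeat=nu - 1):
        term = 1
        for i in range(n):
            term *= A[(i,) + tuple(s[i] for s in sigmas)]
        total += term
    return total


if __name__ == "__main__":
    n, nu = 3, 3
    ones = {idx: 1 for idx in product(range(n), repeat=nu)}
    print(multi_permanent(ones, n, nu), factorial(n) ** (nu - 1))
```

For the array filled with 1s every term of the sum equals 1, so the result is $(n!)^{\nu - 1}$; for $\nu = 2$ the function reduces to the ordinary permanent.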
With some modifications, the method of this paper can be applied to computing
this multidimensional version of the permanent. Namely, let $J$ be the array filled
with 1s and let us define
$$
f(z) = \ln \operatorname{PER}\bigl(J + z(A - J)\bigr).
$$
Then
$$
f(0) = \ln \operatorname{PER} J = (\nu - 1) \ln n! \quad \text{and} \quad f(1) = \ln \operatorname{PER} A
$$
and one can use the Taylor polynomial approximation (1.1.2) to estimate $f(1)$.
As in Section 2, one can compute the right hand side of (1.1.2) in $n^{O(m)}$ time,
where the implicit constant in “$O(m)$” depends on $\nu$. The proof of Theorem 1.3
carries over to multidimensional permanents with some modifications. Namely, for some
sufficiently small $\delta_\nu > 0$ the equation
$$
\theta = \frac{2\delta_\nu}{(1 - \delta_\nu)\sqrt{\cos\bigl((\nu - 1)\theta + 2 \arcsin \delta_\nu\bigr)}}
$$
has a solution $\theta \ge 0$ such that $(\nu - 1)\theta + 2 \arcsin \delta_\nu < \pi/2$. For $\nu = 2$, we get the
equation (3.4.1) with a possible choice of $\delta_2 = 0.195$, while for $\nu = 3$ we can choose
$\delta_3 = 0.125$ and for $\nu = 4$ we can choose $\delta_4 = 0.093$. Then $\operatorname{PER} Z \ne 0$ as long as
$Z = (z_{i_1 \ldots i_\nu})$ is an array of complex numbers satisfying
$$
|z_{i_1 \ldots i_\nu} - 1| \le \delta_\nu \quad \text{for all } 1 \le i_1, \ldots, i_\nu \le n.
$$
We proceed as in the proof of Theorem 1.3, only instead of the first row expansion of
the permanent (3.2.1) used in Lemmas 3.2 and 3.3, we use the first index expansion
$$
\operatorname{PER} Z = \sum_{1 \le j_2, \ldots, j_\nu \le n} z_{1 j_2 \ldots j_\nu} \operatorname{PER} Z_{j_2 \ldots j_\nu},
$$
where $Z_{j_2 \ldots j_\nu}$ is the $\nu$-dimensional array of size $(n - 1) \times \cdots \times (n - 1)$ obtained from $Z$
by crossing out the section with the first index 1, the section with the second index
$j_2$ and so forth, concluding with crossing out the section with the last index $j_\nu$. As
in Section 2, we obtain an algorithm of $n^{O(\ln n - \ln \epsilon)}$ complexity for approximating
$\operatorname{PER} Z$ within relative error $\epsilon > 0$, where $Z$ is a $\nu$-dimensional cubic $n \times \cdots \times n$
array of complex numbers satisfying
$$
|z_{i_1 \ldots i_\nu} - 1| \le \gamma_\nu \quad \text{for all } 1 \le i_1, \ldots, i_\nu \le n,
$$
and $0 < \gamma_\nu < \delta_\nu$ are absolute constants (one can choose $\gamma_2 = 0.19$, $\gamma_3 = 0.12$ and
$\gamma_4 = 0.09$).
References
[AA13] S. Aaronson and A. Arkhipov, The computational complexity of linear optics, Theory of Computing 9 (2013), 143–252.
[A+99] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, and M. Protasi, Complexity and Approximation. Combinatorial Optimization Problems and their Approximability Properties, Springer-Verlag, Berlin, 1999.
[Ba99] A. Barvinok, Polynomial time algorithms to approximate permanents and mixed discriminants within a simply exponential factor, Random Structures & Algorithms 14 (1999), no. 1, 29–61.
[Ba14] A. Barvinok, Computing the partition function for cliques in a graph, preprint arXiv:1405.1974 (2014).
[BS11] A. Barvinok and A. Samorodnitsky, Computing the partition function for perfect matchings in a hypergraph, Combinatorics, Probability and Computing 20 (2011), no. 6, 815–835.
[BS14] A. Barvinok and P. Soberón, Computing the partition function for graph homomorphisms, preprint arXiv:1406.1771 (2014).
[C+13] J.-Y. Cai, X. Chen and P. Lu, Graph homomorphisms with complex values: a dichotomy theorem, SIAM Journal on Computing 42 (2013), no. 3, 924–1029.
[Fü00] M. Fürer, Approximating permanents of complex matrices, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, ACM, New York, 2000, pp. 667–669.
[GK10] D. Gamarnik and D. Katz, A deterministic approximation algorithm for computing the permanent of a 0, 1 matrix, Journal of Computer and System Sciences 76 (2010), no. 8, 879–883.
[Gu05] L. Gurvits, On the complexity of mixed discriminants and related problems, Mathematical Foundations of Computer Science 2005, Lecture Notes in Computer Science, vol. 3618, Springer, Berlin, 2005, pp. 447–458.
[GS13] L. Gurvits and A. Samorodnitsky, Bounds on the permanent and some applications, preprint available at http://www.cs.huji.ac.il/~salex/ (2013).
[J+04] M. Jerrum, A. Sinclair and E. Vigoda, A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries, Journal of the ACM 51 (2004), no. 4, 671–697.
[L+00] N. Linial, A. Samorodnitsky, and A. Wigderson, A deterministic strongly polynomial algorithm for matrix scaling and approximate permanents, Combinatorica 20 (2000), no. 4, 545–568.
[Mi78] H. Minc, Permanents, Encyclopedia of Mathematics and its Applications, Vol. 6, Addison-Wesley Publishing Co., Reading, Mass., 1978.
[SS05] A.D. Scott and A.D. Sokal, The repulsive lattice gas, the independent-set polynomial, and the Lovász local lemma, Journal of Statistical Physics 118 (2005), no. 5-6, 1151–1261.
[Va79] L.G. Valiant, The complexity of computing the permanent, Theoretical Computer Science 8 (1979), no. 2, 189–201.

Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, USA
E-mail address: barvinok@umich.edu
