
COMPUTING THE PERMANENT OF (SOME) COMPLEX MATRICES

Alexander Barvinok

June 2014

Abstract. We present a deterministic algorithm which, for any given 0 < ε < 1 and an n × n real or complex matrix A = (a_{ij}) such that |a_{ij} − 1| ≤ 0.19 for all i, j, computes the permanent of A within relative error ε in n^{O(ln n − ln ε)} time. The method can be extended to computing hafnians and multidimensional permanents.

1. Introduction and main results

The permanent of an n × n matrix A = (a_{ij}) is defined as

per A = ∑_{σ ∈ S_n} ∏_{i=1}^{n} a_{iσ(i)},

where S_n is the symmetric group of permutations of the set {1, . . . , n}. The problem of efficient computation of the permanent has attracted a lot of attention. It is #P-hard already for 0-1 matrices [Va79], but a fully polynomial randomized approximation scheme, based on the Monte Carlo approach, is constructed for all non-negative matrices [J+04]. A deterministic polynomial time algorithm based on matrix scaling that approximates the permanent of a non-negative matrix within a factor of e^n is constructed in [L+00], and the bound was recently improved to 2^n in [GS13]. An approach based on the idea of "correlation decay" from statistical physics results in a deterministic polynomial time algorithm approximating per A within a factor of (1 + ε)^n for any ε > 0, fixed in advance, if A is the adjacency matrix of a constant degree expander [GK10]. There is also interest in computing permanents of complex matrices [AA13].

1991 Mathematics Subject Classification. 15A15, 68C25, 68W25. Key words and phrases. permanent, hafnian, algorithm. This research was partially supported by NSF Grant DMS 0856640.

The well-known Ryser's algorithm (see, for example, Chapter 7 of [Mi78]) computes the permanent of a matrix A over any field in O(n 2^n) time. A randomized approximation algorithm of [Fü00] computes the permanent of a complex matrix within a (properly defined) relative error ε in O(3^{n/2} ε^{−2}) time. The randomized algorithm of [Gu05], see also [AA13] for an exposition, computes the permanent of a complex matrix A in time polynomial in n and 1/ε within an additive error of ε‖A‖^n, where ‖A‖ is the operator norm of A.

In this paper, we present a new approach to computing permanents of real or complex matrices A and show that if |a_{ij} − 1| ≤ γ for some absolute constant γ > 0 (we can choose γ = 0.19) and all i and j, then, for any ε > 0, the value of per A can be computed within relative error ε in n^{O(ln n − ln ε)} time (we say that α ∈ C approximates per A within relative error 0 < ε < 1 if per A = α(1 + ρ), where |ρ| < ε). We also discuss how the method can be extended to computing hafnians of symmetric matrices and multidimensional permanents of tensors.

(1.1) The idea of the algorithm. Let J denote the n × n matrix filled with 1s. Given an n × n complex matrix A, we consider (a branch of) the univariate function

(1.1.1)   f(z) = ln per(J + z(A − J)).

Clearly, f(0) = ln per J = ln n! and f(1) = ln per A. Hence our goal is to approximate f(1), and we do it by using the Taylor polynomial expansion of f at z = 0:

(1.1.2)   f(1) ≈ f(0) + ∑_{k=1}^{m} (1/k!) d^k/dz^k f(z) |_{z=0}.

It turns out that the right hand side of (1.1.2) can be computed in n^{O(m)} time. We present the algorithm in Section 2. The quality of the approximation (1.1.2) depends on the location of complex zeros of the permanent.

(1.2) Lemma. Suppose that there exists a real β > 1 such that

per(J + z(A − J)) ≠ 0 for all z ∈ C satisfying |z| ≤ β.

Then for all z ∈ C with |z| ≤ 1 the value of f(z) = ln per(J + z(A − J)) is well-defined by the choice of the branch of the logarithm for which f(0) is a real number, and the right hand side of (1.1.2) approximates f(1) within an additive error of

n / ( (m + 1) β^m (β − 1) ).

In particular, for a fixed β > 1, to ensure an additive error of 0 < ε < 1, we can choose m = O(ln n − ln ε), which results in an algorithm approximating per A within relative error ε in n^{O(ln n − ln ε)} time.

We prove Lemma 1.2 in Section 2. Thus we have to identify a class of matrices A for which the number β > 1 of Lemma 1.2 exists. We prove the following result.

(1.3) Theorem. There is an absolute constant δ > 0 (we can choose δ = 0.195) such that if Z = (z_{ij}) is a complex n × n matrix satisfying

|z_{ij} − 1| ≤ δ for all i, j,

then per Z ≠ 0.
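Although it plays no role in the argument, Theorem 1.3 is easy to spot-check numerically on small matrices. The Python sketch below computes the permanent by brute force directly from the definition and samples random matrices from the polydisc |z_{ij} − 1| ≤ 0.195; the sampler, the matrix size and the number of trials are arbitrary illustrative choices, not taken from the paper.

```python
import cmath
import random
from itertools import permutations

def per(A):
    """Permanent of a square matrix, computed directly from the definition
    (O(n * n!) time, so only usable as a reference for tiny n)."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        p = 1
        for i in range(n):
            p *= A[i][sigma[i]]
        total += p
    return total

def random_polydisc_matrix(n, delta):
    """Random n x n complex matrix with |z_ij - 1| <= delta."""
    def entry():
        r = delta * random.random() ** 0.5            # radius, uniform in the disc
        return 1 + r * cmath.exp(2j * cmath.pi * random.random())
    return [[entry() for _ in range(n)] for _ in range(n)]

random.seed(0)
smallest = min(abs(per(random_polydisc_matrix(5, 0.195))) for _ in range(1000))
print("smallest |per Z| over 1000 samples:", smallest)   # observed to stay far from 0
```

The same brute-force per is convenient as a reference value when testing the approximation algorithm of Section 2 on small matrices.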

We prove Theorem 1.3 in Section 3. For any matrix A = (a_{ij}) satisfying

|a_{ij} − 1| ≤ 0.19 for all i, j,

we can choose β = 195/190 in Lemma 1.2 and thus obtain an approximation algorithm for computing per A.

The sharp value of the constant δ in Theorem 1.3 is not known to the author. A simple example of the 2 × 2 matrix

A = ( (1+i)/2   (1−i)/2 )
    ( (1−i)/2   (1+i)/2 ),

for which per A = 0, shows that in Theorem 1.3 we must have

δ < √2/2 ≈ 0.71.

What is also not clear is whether the constant δ can improve as the size of the matrix grows.

(1.4) Question. Is it true that for any 0 < ε < 1 there is a positive integer N(ε) such that if Z = (z_{ij}) is a complex n × n matrix with n > N(ε) and

|z_{ij} − 1| ≤ 1 − ε for all i, j,

then per Z ≠ 0?

We note that for any 0 < ε < 1, fixed in advance, a deterministic polynomial time algorithm based on scaling approximates the permanent of a given n × n real matrix A = (a_{ij}) satisfying

ε ≤ a_{ij} ≤ 1 for all i, j

within a multiplicative factor of n^{κ(ε)} for some κ(ε) > 0 [BS11].

(1.5) Ramifications. In Section 4, we discuss how our approach can be used for computing hafnians of symmetric matrices and multidimensional permanents of tensors. The same approach can be used for computing partition functions associated with cliques in graphs [Ba14] and graph homomorphisms [BS14]. In each case, the main problem is to come up with a version of Theorem 1.3 bounding the complex roots of the partition function away from the vector of all 1s. Isolating zeros of complex extensions of real partition functions is a problem studied in statistical physics and also in connection with the Lovász Local Lemma, see, for example, [SS05].

2. The algorithm

(2.1) The algorithm for approximating the permanent. Given an n × n complex matrix A = (a_{ij}), we present an algorithm which computes the right hand side of the approximation (1.1.2) for the function f(z) defined by (1.1.1). Let

(2.1.1)   g(z) = per(J + z(A − J)),

so f(z) = ln g(z). Hence

f′(z) = g′(z) / g(z)   and   g′(z) = g(z) f′(z).

Therefore, for k ≥ 1 we have

(2.1.2)   d^k/dz^k g(z) |_{z=0} = ∑_{j=0}^{k−1} \binom{k-1}{j} ( d^j/dz^j g(z) |_{z=0} ) ( d^{k−j}/dz^{k−j} f(z) |_{z=0} )

(we agree that the 0-th derivative of g is g). We note that g(0) = n!. If we compute the values of

(2.1.3)   d^k/dz^k g(z) |_{z=0}   for k = 1, . . . , m,

then the formulas (2.1.2) for k = 1, . . . , m provide a non-degenerate triangular system of linear equations that allows us to compute

d^k/dz^k f(z) |_{z=0}   for k = 1, . . . , m.

Hence our goal is to compute the values (2.1.3). We have

d^k/dz^k g(z) |_{z=0} = d^k/dz^k |_{z=0} ∑_{σ ∈ S_n} ∏_{i=1}^{n} ( 1 + z ( a_{iσ(i)} − 1 ) )

  = ∑_{σ ∈ S_n} ∑_{1 ≤ i_1, . . . , i_k ≤ n} ( a_{i_1 σ(i_1)} − 1 ) ··· ( a_{i_k σ(i_k)} − 1 )

  = (n − k)! ∑_{1 ≤ i_1, . . . , i_k ≤ n; 1 ≤ j_1, . . . , j_k ≤ n} ( a_{i_1 j_1} − 1 ) ··· ( a_{i_k j_k} − 1 ),

where the sums over i_1, . . . , i_k and over j_1, . . . , j_k are over tuples of pairwise distinct indices, that is, the last sum is over all pairs of ordered k-subsets (i_1, . . . , i_k) and (j_1, . . . , j_k) of the set {1, . . . , n}. Since the last sum contains (n!/(n − k)!)^2 = n^{O(k)} terms, the complexity of the algorithm is indeed n^{O(m)}.

(2.2) Proof of Lemma 1.2. The function g(z) defined by (2.1.1) is a polynomial in z of degree d ≤ n with g(0) = n! ≠ 0, so we factor

g(z) = g(0) ∏_{i=1}^{d} ( 1 − z/α_i ),

where α_1, . . . , α_d are the roots of g(z). By the condition of Lemma 1.2, we have

|α_i| ≥ β > 1 for i = 1, . . . , d.

Therefore,

(2.2.1)   f(z) = ln g(z) = ln g(0) + ∑_{i=1}^{d} ln( 1 − z/α_i )   for |z| ≤ 1,

where we choose the branch of ln g(z) that is real at z = 0. Using the standard Taylor expansion, we obtain

ln( 1 − 1/α_i ) = − ∑_{k=1}^{m} (1/k) (1/α_i)^k + ζ_m,

where

|ζ_m| = | ∑_{k=m+1}^{+∞} (1/k) (1/α_i)^k | ≤ 1 / ( (m + 1) β^m (β − 1) ).

Therefore, from (2.2.1) we obtain

f(1) = f(0) + ∑_{k=1}^{m} ( −(1/k) ∑_{i=1}^{d} (1/α_i)^k ) + η_m,

where

|η_m| ≤ n / ( (m + 1) β^m (β − 1) ).

It remains to notice that

−(1/k) ∑_{i=1}^{d} (1/α_i)^k = (1/k!) d^k/dz^k f(z) |_{z=0}.  □

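To make the algorithm of this section concrete, here is a minimal Python transcription (an illustration only, not taken from the paper; the function name and the brute-force enumeration over pairs of ordered k-tuples are illustrative choices). It computes the derivatives (2.1.3) by the last displayed formula of Section 2.1, solves the triangular system (2.1.2), and returns n! · exp of the truncated Taylor series (1.1.2). The double enumeration makes the n^{O(m)} complexity explicit, so the sketch is only practical for small n and m.

```python
import cmath
import math
from itertools import permutations

def taylor_permanent(A, m):
    """Approximate per A by the Taylor scheme of Section 2 (small n only).

    Assumes the entries of A are close enough to 1 that ln per(J + z(A - J))
    is analytic on a disc of radius > 1 (Lemma 1.2 and Theorem 1.3)."""
    n = len(A)
    # g^{(k)}(0) = (n-k)! * sum over ordered k-tuples of pairwise distinct rows
    # and pairwise distinct columns of prod_l (a_{i_l j_l} - 1); g(0) = n!.
    g = [complex(math.factorial(n))]
    for k in range(1, m + 1):
        total = 0j
        for rows in permutations(range(n), k):
            for cols in permutations(range(n), k):
                p = 1 + 0j
                for i, j in zip(rows, cols):
                    p *= A[i][j] - 1
                total += p
        g.append(math.factorial(n - k) * total)
    # Triangular system (2.1.2):
    # g^{(k)}(0) = sum_{j=0}^{k-1} binom(k-1, j) g^{(j)}(0) f^{(k-j)}(0).
    f = [0j] * (m + 1)          # f[k] holds f^{(k)}(0) for k >= 1
    for k in range(1, m + 1):
        s = sum(math.comb(k - 1, j) * g[j] * f[k - j] for j in range(1, k))
        f[k] = (g[k] - s) / g[0]
    # Taylor approximation (1.1.2): per A ~ n! * exp( sum_k f^{(k)}(0) / k! ).
    return g[0] * cmath.exp(sum(f[k] / math.factorial(k) for k in range(1, m + 1)))
```

Comparing taylor_permanent(A, m) with a brute-force permanent on small matrices whose entries lie within 0.19 of 1 should reproduce the exact value up to the additive error in ln per A guaranteed by Lemma 1.2.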
3. Proof of Theorem 1.3

Let us denote by U^{n×n}(δ) ⊂ C^{n×n} the closed polydisc

U^{n×n}(δ) = { Z = (z_{ij}) : |z_{ij} − 1| ≤ δ for all i, j }.

Thus Theorem 1.3 asserts that per Z ≠ 0 for Z ∈ U^{n×n}(δ) and δ = 0.195. First, we establish a simple geometric lemma.

(3.1) Lemma. Let u_1, . . . , u_n ∈ R^d be non-zero vectors such that for some 0 ≤ α < π/2 the angle between any two vectors u_i and u_j does not exceed α. Let u = u_1 + · · · + u_n. Then

‖u‖ ≥ √(cos α) ∑_{i=1}^{n} ‖u_i‖.

Proof. We have

‖u‖^2 = ∑_{1≤i,j≤n} ⟨u_i, u_j⟩ ≥ ∑_{1≤i,j≤n} ‖u_i‖ ‖u_j‖ cos α = (cos α) ( ∑_{i=1}^{n} ‖u_i‖ )^2,

and the proof follows. □

We prove Theorem 1.3 by induction on n, using Lemma 3.1 and the following two lemmas.

(3.2) Lemma. For an n × n matrix Z = (z_{ij}) and j = 1, . . . , n, let Z_j be the (n − 1) × (n − 1) matrix obtained from Z by crossing out the first row and the j-th column of Z. Suppose that for some δ > 0 and some 0 < τ < 1, for any Z ∈ U^{n×n}(δ) we have per Z ≠ 0 and

|per Z| ≥ τ ∑_{j=1}^{n} |z_{1j}| |per Z_j|.

Let A, B ∈ U^{n×n}(δ) be any two n × n matrices that differ in one column (or in one row) only. Then the angle between the two complex numbers per A and per B, interpreted as vectors in R^2 = C, does not exceed

θ = 2δ / ( (1 − δ) τ ).

Proof. Since per Z ≠ 0 for all Z ∈ U^{n×n}(δ), we may consider a branch of ln per Z defined for Z ∈ U^{n×n}(δ). Using the expansion

(3.2.1)   per Z = ∑_{j=1}^{n} z_{1j} per Z_j,

we conclude that

∂/∂z_{1j} ln per Z = per Z_j / per Z   for j = 1, . . . , n.

Therefore, since |z_{1j}| ≥ 1 − δ for j = 1, . . . , n, we conclude that for any Z ∈ U^{n×n}(δ), we have

(3.2.2)   ∑_{j=1}^{n} | ∂/∂z_{1j} ln per Z | ≤ 1 / ( (1 − δ) τ ).

Since the permanent is invariant under permutations of rows, permutations of columns and taking the transpose of the matrix, without loss of generality we may assume that the matrix B ∈ U^{n×n}(δ) is obtained from A ∈ U^{n×n}(δ) by replacing the entries a_{1j} by numbers b_{1j} such that

|b_{1j} − 1| ≤ δ for j = 1, . . . , n.

Then

|ln per A − ln per B| ≤ ( sup_{Z ∈ U^{n×n}(δ)} ∑_{j=1}^{n} | ∂/∂z_{1j} ln per Z | ) ( max_{j=1,...,n} |a_{1j} − b_{1j}| ).

Since |b_{1j} − a_{1j}| ≤ 2δ for all j = 1, . . . , n, the proof follows from (3.2.2). □

(3.3) Lemma. Suppose that for some

0 ≤ θ < π/2 − 2 arcsin δ

and for any two matrices A, B ∈ U^{n×n}(δ) which differ in one row (or in one column), the angle between the two complex numbers per A and per B, interpreted as vectors in R^2 = C, does not exceed θ. Then for any matrix Z ∈ U^{(n+1)×(n+1)}(δ), we have

|per Z| ≥ τ ∑_{j=1}^{n+1} |z_{1j}| |per Z_j|

with

τ = √( cos(θ + 2 arcsin δ) ),

where Z_j is the n × n matrix obtained from Z by crossing out the first row and the j-th column.

Proof. We use the first row expansion (3.2.1) and observe that any two matrices Z_j and Z_k can be obtained from one another by replacing one column and permuting columns. Therefore, the angle between any two complex numbers per Z_j and per Z_k does not exceed θ. Since

−arcsin δ ≤ arg z_{1j} ≤ arcsin δ for j = 1, . . . , n,

the angle between any two numbers z_{1j} per Z_j and z_{1k} per Z_k does not exceed θ + 2 arcsin δ. The proof follows by Lemma 3.1. □

(3.4) Proof of Theorem 1.3. One can see that for a sufficiently small δ > 0, the equation

(3.4.1)   θ = 2δ / ( (1 − δ) √( cos(θ + 2 arcsin δ) ) )

has a solution 0 < θ < π/2. Numerical computations show that we can choose δ = 0.195 and θ ≈ 0.7611025127. Let τ = √( cos(θ + 2 arcsin δ) ) ≈ 0.6365398112. We proceed by induction on n. More precisely, we prove the following three statements (3.4.2)–(3.4.4) by induction on n:

(3.4.2) For every Z ∈ U^{n×n}(δ), we have per Z ≠ 0;

(3.4.3) Suppose A, B ∈ U^{n×n}(δ) are two matrices which differ by one row (or one column). Then the angle between the two complex numbers per A and per B, interpreted as vectors in R^2 = C, does not exceed θ;

(3.4.4) For a matrix Z ∈ U^{n×n}(δ), Z = (z_{ij}), let Z_j be the (n − 1) × (n − 1) matrix obtained by crossing out the first row and the j-th column. Then

|per Z| ≥ τ ∑_{j=1}^{n} |z_{1j}| |per Z_j|.

For n = 1 the statement (3.4.2) is obviously true. Moreover, the angle between any two numbers a, b ∈ U^{1×1}(δ) does not exceed

2 arcsin δ ≈ 0.3925149004 < θ,

so (3.4.3) holds as well. The statement (3.4.4) is vacuous. Lemma 3.3 implies that if the statement (3.4.3) holds for n × n matrices then the statement (3.4.4) holds for (n + 1) × (n + 1) matrices. The statement (3.4.4) for (n + 1) × (n + 1) matrices together with the statement (3.4.2) for n × n matrices implies the statement (3.4.2) for (n + 1) × (n + 1) matrices. Finally, Lemma 3.2 implies that if the statement (3.4.4) holds for (n + 1) × (n + 1) matrices then the statement (3.4.3) holds for (n + 1) × (n + 1) matrices. This concludes the proof of (3.4.2)–(3.4.4) for all positive integers n. □
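The numerical values quoted in (3.4) are easy to reproduce; the sketch below (not part of the paper) solves equation (3.4.1) for δ = 0.195 by naive fixed-point iteration, an arbitrary choice of method whose convergence is simply observed here rather than proved.

```python
import math

def solve_341(delta, iterations=500):
    """Solve (3.4.1): theta = 2*delta / ((1 - delta) * sqrt(cos(theta + 2*asin(delta))))
    by fixed-point iteration starting from theta = 0."""
    theta = 0.0
    for _ in range(iterations):
        theta = 2 * delta / ((1 - delta) * math.sqrt(math.cos(theta + 2 * math.asin(delta))))
    return theta

theta = solve_341(0.195)
tau = math.sqrt(math.cos(theta + 2 * math.asin(0.195)))
print(theta, tau)   # approximately 0.7611025127 and 0.6365398112, as quoted above
```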

4. Ramifications

A similar approach can be applied to computing other quantities of interest.

(4.1) Hafnians. Let A = (a_{ij}) be a 2n × 2n real or complex matrix. The quantity

haf A = ∑_{ {i_1, j_1}, . . . , {i_n, j_n} } a_{i_1 j_1} ··· a_{i_n j_n},

where the sum is taken over all (2n)!/(n! 2^n) unordered partitions of the set {1, . . . , 2n} into n pairwise disjoint unordered pairs {i_1, j_1}, . . . , {i_n, j_n}, is called the hafnian of A, see, for example, Section 8.2 of [Mi78]. For any n × n matrix A we have

haf ( 0     A )
    ( A^T   0 )  =  per A,

and hence computing the permanent of an n × n matrix reduces to computing the hafnian of a symmetric 2n × 2n matrix. The computational complexity of hafnians is understood less well than that of permanents. Unlike in the case of the permanent, no fully polynomial approximation scheme (randomized or deterministic) is known to compute the hafnian of a non-negative real symmetric matrix. Unlike in the case of the permanent, no deterministic polynomial time algorithm approximating the hafnian of a 2n × 2n non-negative symmetric matrix within a factor of c^n, where c > 0 is an absolute constant, is known. On the other hand, there is a polynomial time randomized algorithm, based on the representation of the hafnian as the expectation of the determinant of a random matrix, which approximates the hafnian of a given non-negative symmetric 2n × 2n matrix within a factor of c^n, where c ≈ 0.56 [Ba99]. Also, for any 0 < ε < 1 fixed in advance, there is a deterministic polynomial time algorithm based on scaling, which, given a 2n × 2n symmetric matrix A = (a_{ij}) satisfying

ε ≤ a_{ij} ≤ 1 for all i, j,

computes haf A within a multiplicative factor of n^{κ(ε)} for some κ(ε) > 0 [BS11]. With minimal changes, the approach of this paper can be applied to computing hafnians. Namely, let J denote the 2n × 2n matrix filled with 1s and let us define

f(z) = ln haf( J + z(A − J) ).

Then

f(0) = ln haf J = ln( (2n)! / (n! 2^n) )   and   f(1) = ln haf A,

and one can use the Taylor polynomial approximation (1.1.2) to estimate f(1). As in Section 2, one can compute the right hand side of (1.1.2) in n^{O(m)} time. The statement and the proof of Theorem 1.3 carry over to hafnians almost verbatim. Namely, let δ > 0 be a real number for which the equation (3.4.1) has a solution 0 < θ < π/2 (hence one can choose δ = 0.195). Then haf Z ≠ 0 as long as Z = (z_{ij}) is a 2n × 2n symmetric complex matrix satisfying

|z_{ij} − 1| ≤ δ for all i, j.
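As an aside (not from the paper), a small brute-force hafnian, written via the first-row expansion used below, is handy for spot-checking both this non-vanishing claim and the identity relating the hafnian of (0 A; A^T 0) to per A on small matrices; the function names are illustrative choices.

```python
def hafnian(B):
    """Hafnian of a symmetric matrix of even order, by expansion along the first row.
    Exponential time; intended only for small sanity checks."""
    m = len(B)
    if m == 0:
        return 1
    total = 0
    for j in range(1, m):
        rest = [r for r in range(1, m) if r != j]        # remove rows/columns 0 and j
        minor = [[B[r][c] for c in rest] for r in rest]
        total += B[0][j] * hafnian(minor)
    return total

def block_matrix(A):
    """The symmetric 2n x 2n matrix (0 A; A^T 0) built from an n x n matrix A."""
    n = len(A)
    Z = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            Z[i][n + j] = A[i][j]
            Z[n + j][i] = A[i][j]
    return Z
```

For any small n × n matrix A, hafnian(block_matrix(A)) coincides with the brute-force permanent of A, and for random symmetric 2n × 2n matrices with entries within 0.195 of 1 the value hafnian(Z) is observed to stay away from zero, in line with the claim above.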

Instead of the row expansion of the permanent (3.2.1) used in Lemmas 3.2 and 3.3, one should use the row expansion of the hafnian

haf Z = ∑_{j=2}^{2n} z_{1j} haf Z_j,

where Z_j is the symmetric (2n − 2) × (2n − 2) matrix obtained from Z by crossing out the first and the j-th row and the first and the j-th column. As in Section 2, we obtain an algorithm of n^{O(ln n − ln ε)} complexity approximating haf Z within relative error ε > 0, where Z = (z_{ij}) is a 2n × 2n symmetric complex matrix satisfying

|z_{ij} − 1| ≤ γ for all i, j,

and γ > 0 is an absolute constant (one can choose γ = 0.19).

(4.2) Multidimensional permanents. Let us fix an integer ν ≥ 2 and let

A = (a_{i_1 ... i_ν}),   1 ≤ i_1, . . . , i_ν ≤ n,

be a ν-dimensional cubical n × · · · × n array of real or complex numbers. We define

PER A = ∑_{σ_1, . . . , σ_{ν−1} ∈ S_n} ∏_{i=1}^{n} a_{i σ_1(i) . . . σ_{ν−1}(i)}.
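To make the definition concrete, here is a small brute-force evaluator (an illustration, not from the paper); it enumerates all (n!)^{ν−1} tuples of permutations and is therefore usable only for tiny n and ν.

```python
from itertools import permutations, product

def multi_per(A, nu):
    """PER of a nu-dimensional cubical n x ... x n array A given as nested lists,
    directly from the definition: a sum over (nu - 1)-tuples of permutations."""
    n = len(A)
    total = 0
    for sigmas in product(permutations(range(n)), repeat=nu - 1):
        p = 1
        for i in range(n):
            entry = A[i]
            for sigma in sigmas:
                entry = entry[sigma[i]]    # pick the index sigma_t(i) in dimension t + 1
            p *= entry
        total += p
    return total
```

For example, multi_per([[1, 2], [3, 4]], 2) returns 10, the permanent of that 2 × 2 matrix, which gives a quick consistency check with the case ν = 2 discussed next.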

If ν = 2 then A is an n × n matrix and PER A = per A. For ν > 2 it is already an

NP-hard problem to tell PER A from 0 even if a_{i_1 ... i_ν} ∈ {0, 1}, since the problem reduces to detecting a perfect matching in a hypergraph, see, for example, Problem SP1 in [A+99]. However, for any 0 < ε < 1, fixed in advance, there is a polynomial time deterministic algorithm based on scaling, which, given a real array A satisfying

ε ≤ a_{i_1 ... i_ν} ≤ 1 for all 1 ≤ i_1, . . . , i_ν ≤ n,

computes PER A within a multiplicative factor of n^{κ(ε,ν)} for some κ(ε, ν) > 0 [BS11]. With some modifications, the method of this paper can be applied to computing this multidimensional version of the permanent. Namely, let J be the array filled with 1s and let us define

f(z) = ln PER( J + z(A − J) ).

Then

f(0) = ln PER J = (ν − 1) ln n!   and   f(1) = ln PER A,

and one can use the Taylor polynomial approximation (1.1.2) to estimate f(1). As in Section 2, one can compute the right hand side of (1.1.2) in n^{O(m)} time, where the implicit constant in “O(m)” depends on ν. The proof of Theorem 1.3 carries over to multidimensional permanents with some modifications. Namely, for some sufficiently small δ_ν > 0 the equation

θ = 2δ_ν / ( (1 − δ_ν) √( cos( (ν − 1)θ + 2 arcsin δ_ν ) ) )

has a solution θ ≥ 0 such that (ν − 1)θ + 2 arcsin δ_ν < π/2. For ν = 2, we get the equation (3.4.1) with a possible choice of δ_2 = 0.195, while for ν = 3 we can choose δ_3 = 0.125 and for ν = 4 we can choose δ_4 = 0.093. Then PER Z ≠ 0 as long as

Z = (z_{i_1 ... i_ν}) is an array of complex numbers satisfying

|z_{i_1 ... i_ν} − 1| ≤ δ_ν for all 1 ≤ i_1, . . . , i_ν ≤ n.
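In the same spirit as the check after the proof of Theorem 1.3, the constants δ_ν quoted above can be reproduced numerically; the sketch below (an illustration, with fixed-point iteration again an arbitrary choice of method) also verifies the constraint (ν − 1)θ + 2 arcsin δ_ν < π/2 along the way.

```python
import math

def theta_nu(nu, delta, iterations=500):
    """Fixed-point iteration for the equation above; returns theta, or None if the
    constraint (nu - 1)*theta + 2*asin(delta) < pi/2 fails during the iteration."""
    theta = 0.0
    for _ in range(iterations):
        x = (nu - 1) * theta + 2 * math.asin(delta)
        if x >= math.pi / 2:
            return None
        theta = 2 * delta / ((1 - delta) * math.sqrt(math.cos(x)))
    return theta

for nu, delta in [(2, 0.195), (3, 0.125), (4, 0.093)]:
    theta = theta_nu(nu, delta)
    print(nu, delta, theta)   # a solution is found for each of the quoted delta_nu
```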

We proceed as in the proof of Theorem 1.3, only instead of the first row expansion of the permanent (3.2.1) used in Lemmas 3.2 and 3.3, we use the first index expansion

PER Z = ∑_{1 ≤ j_2, . . . , j_ν ≤ n} z_{1 j_2 ... j_ν} PER Z_{j_2 ... j_ν},

where Z_{j_2 ... j_ν} is the ν-dimensional array of size (n − 1) × · · · × (n − 1) obtained from Z by crossing out the section with the first index 1, the section with the second index j_2, and so forth, concluding with crossing out the section with the last index j_ν. As in Section 2, we obtain an algorithm of n^{O(ln n − ln ε)} complexity approximating PER Z within relative error ε > 0, where Z is a ν-dimensional cubical n × · · · × n array of complex numbers satisfying

|z_{i_1 ... i_ν} − 1| ≤ γ_ν for all 1 ≤ i_1, . . . , i_ν ≤ n,

and 0 < γ_ν < δ_ν are absolute constants (one can choose γ_2 = 0.19, γ_3 = 0.12 and γ_4 = 0.09).

References

[AA13] S. Aaronson and A. Arkhipov, The computational complexity of linear optics, Theory of Computing 9 (2013), 143–252.

[A+99] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, and M. Protasi, Complexity and Approximation. Combinatorial Optimization Problems and their Approximability Properties, Springer-Verlag, Berlin, 1999.

[Ba99] A. Barvinok, Polynomial time algorithms to approximate permanents and mixed discriminants within a simply exponential factor, Random Structures & Algorithms 14 (1999), no. 1, 29–61.

[Ba14] A. Barvinok, Computing the partition function for cliques in a graph, preprint arXiv:1405.1974 (2014).

[BS11] A. Barvinok and A. Samorodnitsky, Computing the partition function for perfect matchings in a hypergraph, Combinatorics, Probability and Computing 20 (2011), no. 6, 815–835.

[BS14] A. Barvinok and P. Soberón, Computing the partition function for graph homomorphisms, preprint arXiv:1406.1771 (2014).

[C+13] J.-Y. Cai, X. Chen and P. Lu, Graph homomorphisms with complex values: a dichotomy theorem, SIAM Journal on Computing 42 (2013), no. 3, 924–1029.

[Fü00] M. Fürer, Approximating permanents of complex matrices, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, ACM, New York, 2000, pp. 667–669.

[GK10] D. Gamarnik and D. Katz, A deterministic approximation algorithm for computing the permanent of a 0, 1 matrix, Journal of Computer and System Sciences 76 (2010), no. 8, 879–883.

[Gu05] L. Gurvits, On the complexity of mixed discriminants and related problems, Mathematical Foundations of Computer Science 2005, Lecture Notes in Computer Science, vol. 3618, Springer, Berlin, 2005, pp. 447–458.

[GS13] L. Gurvits and A. Samorodnitsky, Bounds on the permanent and some applications, preprint available at http://www.cs.huji.ac.il/~salex/ (2013).

[J+04] M. Jerrum, A. Sinclair and E. Vigoda, A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries, Journal of the ACM 51 (2004), no. 4, 671–697.

[L+00] N. Linial, A. Samorodnitsky, and A. Wigderson, A deterministic strongly polynomial algorithm for matrix scaling and approximate permanents, Combinatorica 20 (2000), no. 4, 545–568.

[Mi78] H. Minc, Permanents, Encyclopedia of Mathematics and its Applications, Vol. 6, Addison-Wesley Publishing Co., Reading, Mass., 1978.

[SS05] A.D. Scott and A.D. Sokal, The repulsive lattice gas, the independent-set polynomial, and the Lovász local lemma, Journal of Statistical Physics 118 (2005), no. 5-6, 1151–1261.

[Va79] L.G. Valiant, The complexity of computing the permanent, Theoretical Computer Science 8 (1979), no. 2, 189–201.

Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, USA E-mail address: [email protected]
