
Polynomial Time Algorithms to Approximate Permanents and Mixed Discriminants Within a Simply Exponential Factor*

Alexander Barvinok Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1109; e-mail: [email protected]

Received 30 April 1997; revised 4 June 1998; accepted 15 July 1998

ABSTRACT: We present real, complex, and quaternionic versions of a simple randomized polynomial time algorithm to approximate the permanent of a nonnegative matrix and, more generally, the mixed discriminant of positive semidefinite matrices. The algorithm provides an unbiased estimator, which, with high probability, approximates the true value within a factor of $O(c^n)$, where $n$ is the size of the matrix (matrices) and where $c \approx 0.28$ for the real version, $c \approx 0.56$ for the complex version, and $c \approx 0.76$ for the quaternionic version. We discuss possible extensions of our method as well as applications of mixed discriminants to problems of combinatorial counting. © 1999 John Wiley & Sons, Inc. Random Struct. Alg., 14, 29-61, 1999

Key Words: permanent; mixed discriminant; randomized algorithms; approximation algorithms

1. INTRODUCTION

In this paper, we construct a family of randomized polynomial time algorithms to approximate the permanent of a nonnegative matrix. In particular, one of our algorithms (the quaternionic algorithm of Section 2.3) provides the best known polynomial time approximation for the permanent of an arbitrary nonnegative matrix. Our approximation algorithms generalize naturally to mixed discriminants, quantities of independent interest. Possible extensions of our method and applications of mixed discriminants to problems of combinatorial counting are discussed in the last two sections.

* This research was partially supported by an Alfred P. Sloan Research Fellowship, by NSF Grants DMS 9501129 and DMS 9734138, and by the Mathematical Sciences Research Institute, Berkeley, CA, through NSF Grant DMS 9022140.

1.1. Permanent

Let $A = (a_{ij})$ be an $n \times n$ matrix and let $S_n$ be the symmetric group, that is, the group of all permutations of the set $\{1, \ldots, n\}$. The number

$$\operatorname{per} A = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i\sigma(i)}$$

is called the permanent of $A$. We assume that $A$ is nonnegative, that is, $a_{ij} \ge 0$ for all $i, j = 1, \ldots, n$. If $A$ is a 0-1 matrix, then $\operatorname{per} A$ can be interpreted as the number of perfect matchings in a bipartite graph $G$ on $2n$ vertices $v_1, \ldots, v_n$ and $u_1, \ldots, u_n$, where $(v_i, u_j)$ is an edge of $G$ if and only if $a_{ij} = 1$. To compute the permanent of a given 0-1 matrix is a #P-complete problem, and even to estimate $\operatorname{per} A$ seems to be difficult. Polynomial time algorithms for computing $\operatorname{per} A$ are known when $A$ has some special structure, for example, when $A$ has a small rank [5], or when $A$ is a 0-1 matrix and $\operatorname{per} A$ is small (see [14] and Section 7.3 of [25]). Since the exact computation is difficult, a natural question is how well one can approximate the permanent in polynomial time. In particular, is it true that for any $\epsilon > 0$ there is a polynomial time (possibly randomized) algorithm that approximates the permanent of a given matrix within a relative error $\epsilon$? In other words, does there exist a polynomial time approximation scheme? Polynomial time approximation schemes are known for dense 0-1 matrices [15], for "almost all" 0-1 matrices (see [15, 12], and [27]), and for some special 0-1 matrices, such as those corresponding to lattice graphs (see [16] for a survey on approximation algorithms). However, no polynomial time approximation scheme is known for an arbitrary 0-1 matrix (see [18] for the fastest known "mildly exponential" approximation scheme). In [6], the author suggested a polynomial time randomized algorithm which, given an $n \times n$ nonnegative matrix $A$, outputs a nonnegative number $\alpha$ approximating $\operatorname{per} A$ within a simply exponential in $n$ factor. The algorithm uses randomization, so $\alpha$ is a random variable. The expectation of $\alpha$ is $\operatorname{per} A$ and with high probability (say, with probability at least 0.9) we have

$$c^n \operatorname{per} A \le \alpha \le C \operatorname{per} A, \qquad (1.1.1)$$

where $C$ and $c > 0$ are absolute constants (with $c \approx 0.28$). However, as usual, the probability 0.9 can be improved to $1 - \epsilon$ by running the algorithm independently $O(\log \epsilon^{-1})$ times and choosing $\alpha$ to be the median of the computed $\alpha$'s. Recently, N. Linial, A. Samorodnitsky, and A. Wigderson [22] constructed a polynomial time deterministic algorithm which achieves (1.1.1) with $C = 1$ and $c = 1/e \approx 0.37$. The algorithm uses a scaling of a given nonnegative matrix to a doubly stochastic matrix.

In this paper, we present a family of randomized polynomial time algorithms for approximating the permanent within a simply exponential factor. We present real, complex, and quaternionic versions of an unbiased estimator, each achieving a better degree of approximation than the previous one. Our estimators produce a number $\alpha$ whose expectation is the permanent and which with high probability satisfies (1.1.1), where $c \approx 0.28$ for the real algorithm, $c \approx 0.56$ for the complex algorithm, and $c \approx 0.76$ for the quaternionic algorithm. The last algorithm provides the best known polynomial time approximation for the permanent of an arbitrary nonnegative matrix. The algorithms have a much simpler structure and are easier to implement than the algorithm of [6]. The first version of the paper (see [7]) contained the real algorithm only. The complex algorithm was suggested to the author by M. Dyer and M. Jerrum [8]. Building on the complex version, the author constructed the quaternionic version.

1.2. Mixed Discriminant

Let $Q_1, \ldots, Q_n$ be $n \times n$ real symmetric matrices and let $t_1, \ldots, t_n$ be variables. Then $\det(t_1 Q_1 + \cdots + t_n Q_n)$ is a homogeneous polynomial of degree $n$ in $t_1, \ldots, t_n$. The number

$$D(Q_1, \ldots, Q_n) = \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 Q_1 + \cdots + t_n Q_n)$$

is called the mixed discriminant of $Q_1, \ldots, Q_n$. Sometimes the normalizing factor $1/n!$ is used (cf. [21]). The mixed discriminant $D(Q_1, \ldots, Q_n)$ is a polynomial in the entries of $Q_1, \ldots, Q_n$: for $Q_k = (q_{ij,k})$, $i, j = 1, \ldots, n$, $k = 1, \ldots, n$, we have

$$D(Q_1, \ldots, Q_n) = \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1) \operatorname{sgn}(\sigma_2) \prod_{k=1}^{n} q_{\sigma_1(k)\sigma_2(k), k}. \qquad (1.2.1)$$

The mixed discriminant can be considered as a generalization of the permanent.

Indeed, from (1.2.1) we deduce that for diagonal matrices $Q_1, \ldots, Q_n$, where $Q_i = \operatorname{diag}\{a_{i1}, \ldots, a_{in}\}$, we have

$$D(Q_1, \ldots, Q_n) = \operatorname{per} A, \quad \text{where } A = (a_{ij}).$$
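For illustration, the identity can be spot-checked numerically. The following is a minimal sketch, assuming Python with numpy (the helper names are ours); both sides are computed by brute force over permutations via (1.2.1), so it is practical only for small $n$:

```python
import itertools
import numpy as np

def permanent(A):
    # Brute-force permanent: sum over all permutations.
    n = A.shape[0]
    return sum(np.prod([A[i, s[i]] for i in range(n)])
               for s in itertools.permutations(range(n)))

def perm_sign(p):
    # Sign of a permutation via counting inversions.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def mixed_discriminant(Qs):
    # Formula (1.2.1): a double sum over permutations sigma_1, sigma_2.
    n = len(Qs)
    return sum(perm_sign(s1) * perm_sign(s2) *
               np.prod([Qs[k][s1[k], s2[k]] for k in range(n)])
               for s1 in itertools.permutations(range(n))
               for s2 in itertools.permutations(range(n)))

rng = np.random.default_rng(0)
A = rng.random((4, 4))
Qs = [np.diag(A[i]) for i in range(4)]     # Q_i = diag of the ith row of A
print(permanent(A), mixed_discriminant(Qs))  # the two values agree
```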

Mixed discriminants were introduced by A. D. Aleksandrov in his proof of the Aleksandrov-Fenchel inequality for mixed volumes ([2], see also [21]). The relation between the mixed discriminant and the permanent was used in the proof of the van der Waerden conjecture for permanents of doubly stochastic matrices (see [9]). It is known that $D(Q_1, \ldots, Q_n) \ge 0$ provided $Q_1, \ldots, Q_n$ are positive semidefinite (see [21]). Just as it is natural to restrict the permanent to nonnegative matrices, it is natural to restrict the mixed discriminant to positive semidefinite matrices.

Mixed discriminants generalize permanents, but they also have some independent applications to computationally hard problems of combinatorial counting, some of which we describe in this paper. Suppose, for example, we are given a connected graph with $n + 1$ vertices, whose edges are colored in $n$ colors. Then the number of spanning trees having exactly one edge of each color can be expressed as the mixed discriminant of some $n$ positive semidefinite matrices, explicitly computed from the incidence matrix of the graph. Mixed discriminants play an important role in convex and integral geometry (see [21]), and the problem of their efficient computation/approximation is not less interesting (but certainly less publicized) than the problem of efficient computation/approximation of permanents.

In [6], the author suggested a randomized polynomial time algorithm which, given $n$ positive semidefinite matrices $Q_1, \ldots, Q_n$, computes a number $\alpha$ which with high probability satisfies the inequalities

$$c^n D(Q_1, \ldots, Q_n) \le \alpha \le C \cdot D(Q_1, \ldots, Q_n),$$

where $c > 0$ and $C$ are absolute constants (with $c \approx 0.28$). In this paper, we construct a family of algorithms (again, real, complex, and quaternionic) which achieve $c \approx 0.28$ (real), $c \approx 0.56$ (complex), and $c \approx 0.76$ (quaternionic). The algorithms are natural generalizations of the permanent approximation algorithms. The real algorithm (Section 3.1) can be interpreted as a "parallelization" of the algorithm from [6]. One can note that the permanent approximation algorithm of [22] does not obviously generalize to mixed discriminants.

The paper is organized as follows. In Section 2, we describe the permanent approximation algorithms. In Section 3, we describe the mixed discriminant approximation algorithms. In Section 4, we prove some general inequalities for quadratic forms on Euclidean spaces endowed with a Gaussian measure. The results of the section constitute the core of our proofs. In Section 5, we prove a technical martingale-type result, which we use in our proofs. In Section 6, we prove the main results of the paper. In Section 7, we discuss possible extensions of our method. In particular, we discuss how to approximate hafnians and sums of subpermanents of a rectangular matrix. In Section 8, we discuss some applications of mixed discriminants to counting.

1.3. Notation

Our approximation constants belong to a certain family which we define here. For a positive integer $k$, let $x_1, \ldots, x_k$ be independent real valued random variables having the Gaussian distribution with the density

$$\psi(x) = \sqrt{\frac{k}{2\pi}} \exp\left\{-\frac{k x^2}{2}\right\}.$$

Thus the expectation of $x_i$ is 0 and the expectation of $x_i^2$ is $1/k$, so the expected value of $x_1^2 + \cdots + x_k^2$ is 1. Let us define $\mathcal{E}_k$ to be the expectation of $\ln(x_1^2 + \cdots + x_k^2)$; that is,

$$\mathcal{E}_k = \left(\frac{k}{2\pi}\right)^{k/2} \int_{\mathbb{R}^k} \ln(x_1^2 + \cdots + x_k^2) \exp\left\{-\frac{k(x_1^2 + \cdots + x_k^2)}{2}\right\}\,dx_1 \cdots dx_k.$$

In particular, we will be interested in the following values:

$$\mathcal{E}_1 \approx -1.270362845, \quad \mathcal{E}_2 \approx -0.5772156649, \quad \text{and} \quad \mathcal{E}_4 \approx -0.2703628455.$$

One can show that $\mathcal{E}_1 = -\gamma - \ln 2$, $\mathcal{E}_2 = -\gamma$, and $\mathcal{E}_4 = 1 - \gamma - \ln 2$, where $\gamma \approx 0.5772156649$ is the Euler constant. Let us define $\vartheta_k = \exp\{\mathcal{E}_k\}$. In particular,

$$\vartheta_1 = \frac{e^{-\gamma}}{2} \approx 0.2807297419, \quad \vartheta_2 = e^{-\gamma} \approx 0.5614594836, \quad \text{and} \quad \vartheta_4 = \frac{e^{1-\gamma}}{2} \approx 0.7631025558.$$

It turns out that $\vartheta_1$ is the approximation constant in the real algorithms, $\vartheta_2$ is the approximation constant in the complex algorithms, and $\vartheta_4$ is the approximation constant in the quaternionic algorithms.

One can argue that the quaternionic version of our algorithms is more efficient than the complex version and that the latter is more efficient than the real version. However, the author believes that all three versions are of interest, and, consequently, discusses all three in detail. The general method can be extended to other problems of approximate counting (see Section 7), and it may happen, for example, that there is an obvious real extension, whereas the existence of a quaternionic extension is problematic.
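These closed forms are easy to confirm by a quick Monte Carlo computation. A minimal sketch, assuming Python with numpy, which estimates $\mathcal{E}_k$ for $k = 1, 2, 4$:

```python
import numpy as np

gamma = 0.5772156649015329
for k, exact in [(1, -gamma - np.log(2)),
                 (2, -gamma),
                 (4, 1 - gamma - np.log(2))]:
    rng = np.random.default_rng(k)
    # Each x_i is Gaussian with mean 0 and variance 1/k.
    x = rng.normal(0.0, np.sqrt(1.0 / k), size=(1_000_000, k))
    est = np.mean(np.log(np.sum(x * x, axis=1)))
    print(k, est, exact, np.exp(exact))   # exp(exact) is the constant theta_k
```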

2. PERMANENT APPROXIMATION ALGORITHMS

In this section, we present the three versions of the permanent approximation algorithm: real, complex, and quaternionic. The algorithms are very similar: their input is an $n \times n$ nonnegative matrix $A = (a_{ij})$ and the output is a nonnegative number $\alpha$ approximating $\operatorname{per} A$. Our algorithms are randomized. In the course of the computation, we generate a random variable, so the output $\alpha$ is a random variable. Specifically, we use sampling from the Gaussian distribution in $\mathbb{R}^1$ with the density $\psi_\sigma(x) = (1/\sigma\sqrt{2\pi})\,e^{-x^2/2\sigma^2}$. However, as is known (see, for example, Section 3.4.1 of [20]), sampling from the Gaussian distribution can be efficiently simulated from the standard Bernoulli distribution (sampling a random bit).

The output $\alpha$ is an unbiased estimator for $\operatorname{per} A$, and it turns out that $\alpha$ is unlikely to overestimate $\operatorname{per} A$ by a large constant factor (easy to prove) and unlikely to underestimate $\operatorname{per} A$ by a factor of $O(c^n)$, where $c > 0$ is an absolute constant (harder to prove). For example, it follows that for all sufficiently large $n$, the probability that $\alpha$ satisfies the inequalities

$$c^n \operatorname{per} A \le \alpha \le 3 \operatorname{per} A$$

is at least 0.6. However, as we noted in Section 1, for any $\epsilon > 0$ we can improve the probability to $1 - \epsilon$ by running the algorithm $O(\log \epsilon^{-1})$ times and choosing the median of the computed $\alpha$'s, cf. [17]. We can choose $c = 0.28$ in the real algorithm, $c = 0.56$ in the complex algorithm, and $c = 0.76$ in the quaternionic algorithm.

The computational model is the RAM (random access machine) with the uniform cost criterion (see [1]), so the algorithms operate with real numbers and allow arithmetic operations (addition, subtraction, multiplication, and division) and comparison of real numbers. A preprocessing requires taking the square root of each entry of the matrix $A$. The computational complexity of the algorithms is bounded by a polynomial in $n$. Each subsequent algorithm attains a better degree of approximation but is more time-consuming. The real version reduces to the computation of the determinant of an $n \times n$ real matrix, the complex version reduces to the computation of the determinant of an $n \times n$ complex matrix, and the quaternionic version reduces to the computation of the determinant of a $2n \times 2n$ complex matrix. However, as is known, the determinant of an $n \times n$ real or complex matrix can be routinely computed using $O(n^3)$ arithmetic operations. Our algorithms can be rewritten in the binary ("Turing") mode, so that they remain in polynomial time. This transformation is straightforward; it is sketched in the first version of the paper [7]. The general technique of going from "real randomized" to "binary randomized" algorithms can be found in [24].

2.1. The Real Algorithm

Input: A nonnegative $n \times n$ matrix $A$.

Output: A nonnegative number $\alpha$ approximating $\operatorname{per} A$.

Algorithm: Sample independently $n^2$ numbers $u_{ij}$: $i, j = 1, \ldots, n$ at random from the Gaussian distribution in $\mathbb{R}$ with the density $\psi_{\mathbb{R}}(x) = (1/\sqrt{2\pi})\,e^{-x^2/2}$. Compute the $n \times n$ matrix $B = (b_{ij})$, where $b_{ij} = u_{ij}\sqrt{a_{ij}}$. Compute $\alpha = (\det B)^2$. Output $\alpha$.
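For illustration, the whole algorithm is a few lines in code. A minimal sketch, assuming Python with numpy (the function name is ours):

```python
import numpy as np

def real_permanent_estimator(A, rng):
    # One sample of alpha = (det B)^2 with b_ij = u_ij * sqrt(a_ij),
    # where the u_ij are independent standard Gaussians.
    U = rng.standard_normal(A.shape)
    B = U * np.sqrt(A)
    return np.linalg.det(B) ** 2

rng = np.random.default_rng(1)
A = np.ones((5, 5))                       # per A = 5! = 120
samples = [real_permanent_estimator(A, rng) for _ in range(100_000)]
print(np.mean(samples))                   # the sample mean approaches per A
```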

2.1.1. Theorem.

(1) The expectation of $\alpha$ is $\operatorname{per} A$.

(2) For any $C > 1$ the probability that $\alpha \ge C \cdot \operatorname{per} A$ does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_1)^n \operatorname{per} A$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_1 \approx 0.28$ is the constant defined in Section 1.3.

Remark (Relation to the Godsil-Gutman Estimator). It is immediately seen that Algorithm 2.1 is a modification of the Godsil-Gutman estimator (see [13] and Chap. 8 of [23]). Indeed, in the Godsil-Gutman estimator we sample $u_{ij}$ from the binary distribution

$$u_{ij} = \begin{cases} 1, & \text{with probability } 1/2, \\ -1, & \text{with probability } 1/2. \end{cases}$$

Furthermore, parts 1 and 2 of Theorem 2.1.1 remain true as long as we sample $u_{ij}$ independently from some distribution with expectation 0 and variance 1. However, part 3 does not hold true for the binary distribution. In fact, one can show that as long as $u_{ij}$ are sampled from some fixed discrete distribution (or, more generally, from some distribution with atoms), there exist 0-1 matrices with arbitrarily large permanents for which the expected number of trials to produce a nonzero $\alpha$ is exponentially large. Indeed, consider a version of the Godsil-Gutman estimator where $u_{ij}$ can accept a certain value with a positive probability. Let us fix an integer $k > 1$ and let us consider the $k \times k$ matrix $J_k$ filled with 1s. If we multiply the $(i, j)$th entry of $J_k$ by $u_{ij}$, there is some positive probability $p > 0$ that we get a matrix with two identical rows, so the obtained matrix has zero determinant. For $n > 0$, consider an $(nk) \times (nk)$ matrix $A_n$ which consists of $n$ diagonal blocks $J_k$. Then $\operatorname{per} A_n = (\operatorname{per} J_k)^n = (k!)^n$. However, the probability that the output $\alpha = 0$ is at least $1 - (1-p)^n$, which approaches 1 exponentially fast as $n$ grows. Note that even the scaled value $(\operatorname{per} A_n)^{1/nk}$ can be made arbitrarily large, since $(\operatorname{per} A_n)^{1/nk} = (k!)^{1/k} \approx k/e$ for large $k$. For example, let us choose $k = 8$. Then $A_n$ is a 0-1 $8n \times 8n$ matrix and $\operatorname{per} A_n = (40{,}320)^n$. The binary version of the Godsil-Gutman estimator described above outputs 0 with probability at least $1 - (0.8)^n$. The complex discrete version of [19] (see also the remark in Section 2.2) outputs 0 with probability at least $1 - (0.99)^n$. Algorithm 2.1 outputs at least $(1.52)^n$ with probability approaching 1 as $n \to +\infty$. Similarly, the complex version (Algorithm 2.2) outputs at least $(390)^n$ and the quaternionic version (Algorithm 2.3) outputs at least $(4{,}487)^n$ with probability approaching 1 as $n \to +\infty$. Of course, for other matrices (for example, for the identity matrix) and even for the majority of matrices (cf. [12]) the binary version of the estimator may perform better than Algorithm 2.1. However, any version of the Godsil-Gutman estimator which approximates the permanent of an arbitrary 0-1 matrix within some positive constant in expected polynomial time must either use a continuous distribution or a sequence of discrete distributions which depend on the size of the matrix.
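The failure mode is easy to see in simulation. A minimal sketch, assuming Python with numpy, of the binary estimator on the matrix $A_n$ built from blocks $J_8$:

```python
# On A_n = diag(J_8, ..., J_8) the matrix B is block diagonal, so alpha is
# the product of squared block determinants; alpha = 0 as soon as one random
# +/-1 block is singular, which happens with a fixed positive probability.
import numpy as np

rng = np.random.default_rng(2)
k, blocks, trials = 8, 10, 2000
zeros = 0
for _ in range(trials):
    dets = [np.linalg.det(rng.choice([-1.0, 1.0], size=(k, k)))
            for _ in range(blocks)]
    if any(abs(d) < 0.5 for d in dets):   # exact block dets are integers
        zeros += 1
print(zeros / trials)                     # a large fraction of runs give 0
```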

2.2. The Complex Algorithm

By the standard complex Gaussian distribution in $\mathbb{C}$ we mean the distribution of a complex random variable $z = x + iy$ with the density

$$\psi_{\mathbb{C}}(z) = \frac{1}{\pi} e^{-(x^2 + y^2)} = \frac{1}{\pi} e^{-|z|^2}.$$

Here $|z| = \sqrt{x^2 + y^2}$ is the absolute value of $z$. In particular, the expectation of $|z|^2$ is 1 and the expectation of $\ln |z|^2$ is $\mathcal{E}_2$ (cf. Section 1.3).

Input: A nonnegative $n \times n$ matrix $A$.

Output: A nonnegative number $\alpha$ approximating $\operatorname{per} A$.

Algorithm: Sample independently $n^2$ numbers $u_{ij}$ at random from the standard complex Gaussian distribution in $\mathbb{C}$ with the density $\psi_{\mathbb{C}}$. Compute the $n \times n$ matrix $B = (b_{ij})$, where $b_{ij} = u_{ij}\sqrt{a_{ij}}$. Compute $\alpha = |\det B|^2$. Output $\alpha$.
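Compared with the sketch of Algorithm 2.1 above, only the sampling step and the absolute value change. A minimal sketch, assuming Python with numpy (real and imaginary parts get variance $1/2$ each, so that $E|u_{ij}|^2 = 1$):

```python
import numpy as np

def complex_permanent_estimator(A, rng):
    # Standard complex Gaussian entries: density (1/pi) exp(-|z|^2).
    U = (rng.standard_normal(A.shape)
         + 1j * rng.standard_normal(A.shape)) / np.sqrt(2)
    return abs(np.linalg.det(U * np.sqrt(A))) ** 2
```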

2.2.1. Theorem.

(1) The expectation of $\alpha$ is $\operatorname{per} A$.

(2) For any $C > 1$ the probability that $\alpha \ge C \cdot \operatorname{per} A$ does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_2)^n \operatorname{per} A$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_2 \approx 0.56$ is the constant defined in Section 1.3.

Remark (Relation to the Karmarkar-Karp-Lipton-Lovász-Luby Estimator). It is seen that Algorithm 2.2 is a simple modification of the estimator from [19]. In [19], the authors sample $u_{ij}$ from the set of roots of unity of degree 3 and then use averaging over a large number of trials. The goal of [19] is somewhat dual to our goal. We want to minimize the error of approximation keeping the running time polynomial, whereas the authors of [19] want to minimize the time needed to approximate the permanent keeping the relative error of approximation at most $(1 + \epsilon)$, where $\epsilon > 0$ is a part of the input. The complexity of the algorithm from [19] is $\operatorname{poly}(n)\,\epsilon^{-2} 2^{n/2}$ and, while still exponential, is better than the $n2^n$ complexity of Ryser's exact algorithm (see Section 7.2 of [25]). Obviously, both approaches have their advantages. Algorithms 2.1-2.3 provide a quick rough estimate, whereas the algorithms from [18] and [19] are much more precise but also more time-consuming. It would be interesting to find out whether by applying repeated sampling and averaging in Algorithm 2.2 one can get the running time to achieve the relative error $\epsilon$ comparable to that of [19], and if that can be extended to arbitrary nonnegative matrices (as the performance of the algorithm from [19] is guaranteed for 0-1 matrices only).

2.3. The Quaternionic Algorithm

We recall that the algebra $\mathbb{H}$ of quaternions is the four-dimensional real vector space with the basis vectors $1, i, j, k$ that satisfy the following multiplication rules:

$$i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j,$$

and $1i = i1 = i$, $1j = j1 = j$, $1k = k1 = k$. The norm $|h|$ of a quaternion $h = a + ib + jc + kd$ is given by $|h|^2 = a^2 + b^2 + c^2 + d^2$. By the standard quaternionic Gaussian distribution in $\mathbb{H}$ we mean the distribution of a quaternionic random variable $h = a + ib + jc + kd$ with the density

$$\psi_{\mathbb{H}}(h) = \frac{4}{\pi^2}\, e^{-2(a^2 + b^2 + c^2 + d^2)}.$$

In particular, the expectation of $|h|^2$ is 1 and the expectation of $\ln |h|^2$ is $\mathcal{E}_4$ (cf. Section 1.3).

Input: A nonnegative $n \times n$ matrix $A$.

Output: A number $\alpha$ approximating $\operatorname{per} A$.

Algorithm: Sample independently $n^2$ quaternions $u_{ij}$ at random from the standard quaternionic Gaussian distribution in $\mathbb{H}$ with the density $\psi_{\mathbb{H}}$. Compute the $n \times n$ quaternionic matrix $H = (h_{ij})$, where $h_{ij} = u_{ij}\sqrt{a_{ij}}$. Write $H = R + iB + jC + kD$, where $R$, $B$, $C$, and $D$ are real $n \times n$ matrices. Construct the $2n \times 2n$ complex matrix

$$H_{\mathbb{C}} = \begin{pmatrix} R + iB & C + iD \\ -C + iD & R - iB \end{pmatrix}.$$

Compute $\alpha = \det H_{\mathbb{C}}$. Output $\alpha$.
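A minimal sketch of this version, assuming Python with numpy (the function name is ours; each real component is drawn with variance $1/4$, so that $E|u_{ij}|^2 = 1$, matching $\psi_{\mathbb{H}}$):

```python
import numpy as np

def quaternionic_permanent_estimator(A, rng):
    s = np.sqrt(A)
    # Components of h_ij = u_ij sqrt(a_ij): each of R, B, C, D is real
    # Gaussian with variance 1/4 per entry, scaled entrywise by sqrt(a_ij).
    R, B, C, D = (0.5 * rng.standard_normal(A.shape) * s for _ in range(4))
    HC = np.block([[R + 1j * B, C + 1j * D],
                   [-C + 1j * D, R - 1j * B]])
    return np.linalg.det(HC).real      # imaginary part vanishes up to rounding

rng = np.random.default_rng(3)
A = np.ones((4, 4))                    # per A = 4! = 24
print(np.mean([quaternionic_permanent_estimator(A, rng)
               for _ in range(100_000)]))
```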

2.3.1. Theorem.

(1) The output $\alpha$ is a nonnegative real number and its expectation is $\operatorname{per} A$.

(2) For any $C > 1$ the probability that

$$\alpha \ge C \cdot \operatorname{per} A$$

does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_4)^n \operatorname{per} A$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_4 \approx 0.76$ is the constant defined in Section 1.3.

Remark (Relations to the Real and Complex Estimators). An obvious obstacle to constructing a quaternionic version of Algorithms 2.1 and 2.2 is that it appears that we need to use the "determinant" of a quaternionic matrix. However, a closer look reveals that what we need is the squared norm of the determinant. This "squared norm of the determinant" can be defined using the canonical two-dimensional complex representation of the quaternions. It turns out that the determinant of the complex matrix $H_{\mathbb{C}}$, computed in Algorithm 2.3, is the right definition of the squared norm of the determinant of a quaternionic matrix $H$ (it is known as the "reduced norm," or the squared norm of the Dieudonné determinant; see Chap. VI, Section 1 of [3]). As an analogue, let us point out that the squared absolute value of the determinant of an $n \times n$ complex matrix can be interpreted as the determinant of a $2n \times 2n$ real matrix:

$$|\det(A + iB)|^2 = \det\begin{pmatrix} A & B \\ -B & A \end{pmatrix}.$$

Example (The Identity Matrix). The following example provides some intuition why the quaternionic estimator gives a better approximation than the complex estimator and why the complex estimator gives a better approximation than the real estimator. Suppose that $A = I$ is the $n \times n$ identity matrix, so $\operatorname{per} A = 1$. Algorithm 2.1 approximates 1 by the product

$$\alpha = x_1^2 \cdots x_n^2,$$

where $x_i = u_{ii}$ are independent random variables sampled from the standard Gaussian distribution in $\mathbb{R}^1$. The expectation $E(x_i^2)$ of each factor is 1, so the expected value of the product is 1. However, since the values of $x_i^2$ can deviate from the expectation, we can accumulate an exponentially large deviation. Since

$$\frac{\ln \alpha}{n} = \frac{1}{n}\sum_{i=1}^{n} \ln x_i^2,$$

the law of large numbers implies that $(\ln \alpha)/n$ concentrates around $\mathcal{E}_1 = E(\ln x_i^2)$ (cf. Section 1.3), so $\alpha$ is reasonably close to $\exp\{n\mathcal{E}_1\} = \vartheta_1^n$ most of the time. Algorithm 2.2 approximates 1 by the product

$$\alpha = \left(\frac{x_1^2 + y_1^2}{2}\right) \cdots \left(\frac{x_n^2 + y_n^2}{2}\right),$$

where $x_i, y_i$ are independent random variables sampled from the standard Gaussian distribution in $\mathbb{R}^1$. Again, the expectation of each factor is 1, but because of the averaging, $(x_i^2 + y_i^2)/2$ is concentrated around its expectation somewhat more sharply than either $x_i^2$ or $y_i^2$. Therefore, the accumulated error, while still exponential, is smaller than that in Algorithm 2.1. Indeed, we have (cf. Section 1.3)

$$\vartheta_2 = \exp\left\{E \ln\left(\frac{x_i^2 + y_i^2}{2}\right)\right\} > \vartheta_1.$$

Finally, Algorithm 2.3 approximates 1 by the product

$$\alpha = \left(\frac{a_1^2 + b_1^2 + c_1^2 + d_1^2}{4}\right) \cdots \left(\frac{a_n^2 + b_n^2 + c_n^2 + d_n^2}{4}\right),$$

where $a_i, b_i, c_i, d_i$ are independent random variables sampled from the standard Gaussian distribution in $\mathbb{R}^1$. Here we have a still sharper concentration of each factor around its expectation, so the total error gets smaller. For the constant $\vartheta_4$ we have (cf. Section 1.3)

$$\vartheta_4 = \exp\left\{E \ln\left(\frac{a_i^2 + b_i^2 + c_i^2 + d_i^2}{4}\right)\right\} > \vartheta_2.$$
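The three levels of concentration are visible in a direct simulation. A minimal sketch, assuming Python with numpy, which draws the factors for $k = 1, 2, 4$ and reports the sample value of $(\ln\alpha)/n$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 50, 20_000
for k, label in [(1, "real"), (2, "complex"), (4, "quaternionic")]:
    # Each factor is a sum of k squares of Gaussians with variance 1/k.
    x = rng.normal(0.0, np.sqrt(1.0 / k), size=(trials, n, k))
    log_alpha = np.sum(np.log(np.sum(x * x, axis=2)), axis=1)
    print(label, np.median(log_alpha) / n)   # approx -1.27, -0.58, -0.27
```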

It appears that Algorithms 2.1-2.3 are related to Clifford algebras $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{H}$ with 0, 1, and 3 generators, respectively (cf. Sect. 41 of [26]). There seem to be ways to associate an approximation algorithm with any Clifford algebra, but it is not clear at the moment whether those algorithms can be of any interest. It would be interesting to find out whether for any $k$ there is a version of our algorithm achieving a $\vartheta_k^n$ approximation (we note that $\vartheta_k \to 1$ as $k \to +\infty$).

Remark. How well can we approximate the permanent in polynomial time? Suppose we have a polynomial time (probabilistic or deterministic) algorithm that for any given $n \times n$ (nonnegative or 0-1) matrix $A$ computes a number $\alpha$ such that

$$\phi(n) \operatorname{per} A \le \alpha \le \operatorname{per} A.$$

What sort of function might $\phi(n)$ be? For an $n \times n$ matrix $A$ and $k > 0$, let us construct the $nk \times nk$ block-diagonal matrix $A_k$, having $k$ diagonal copies of $A$. We observe that $A_k$ is nonnegative if $A$ is nonnegative and $A_k$ is 0-1 if $A$ is 0-1, and that $\operatorname{per} A = (\operatorname{per} A_k)^{1/k}$. Applying our algorithm to $A_k$ and taking the root, we get an approximation $\alpha_k$, where

$$\big(\phi(nk)\big)^{1/k} \operatorname{per} A \le \alpha_k \le \operatorname{per} A.$$

Therefore, we can always improve $\phi$ to $\phi_k = (\phi(nk))^{1/k}$, where $k$ must be bounded by a polynomial in $n$ to keep polynomial time complexity. For example, if $\phi(n) = n^{-\beta}$ for some $\beta > 0$ or if $\phi(n) = \exp\{-n^\beta\}$ for some $0 < \beta < 1$, then by choosing a sufficiently large $k = k(n)$ we can always improve $\phi$ to $\phi_k = (1 - \epsilon)$ with any given $1 > \epsilon > 0$. There are few obvious choices for "nonimprovable" functions $\phi$.

(a) $\phi(n) \equiv 1$. This does not look likely, given that the problem is #P-hard.

(b) For any $\epsilon > 0$ one can choose $\phi_\epsilon(n) = 1 - \epsilon$, and the algorithm is polynomial in $\epsilon^{-1}$. In the author's opinion this conjecture is overly optimistic.

(c) For any $\epsilon > 0$ one can choose $\phi_\epsilon(n) = (1 - \epsilon)^n$, but the algorithm is not polynomial in $\epsilon^{-1}$. The existence of this type of approximation was conjectured by V. D. Milman.

(d) $\phi(n) = c^n$ for some fixed constant $c$. This is the type of bound achieved by Algorithms 2.1-2.3. An interesting question is, what is the best possible constant $c$?

3. MIXED DISCRIMINANT APPROXIMATION ALGORITHMS

It turns out that the permanent approximation algorithms of Section 2 can be naturally extended to mixed discriminants. The input of the algorithms consists of $n$ positive semidefinite $n \times n$ matrices $Q_1, \ldots, Q_n$ and the output is a nonnegative number $\alpha$ approximating $D(Q_1, \ldots, Q_n)$. As in Section 2, we use the real model of computation. A preprocessing requires representing each matrix $Q_i$ as the product $Q_i = T_i T_i^*$ of a real matrix and its transpose. This is a standard procedure of linear algebra, which requires $O(n^3)$ arithmetic operations and $n$ square root extractions (see, for example, Chap. 2, Section 10 of [11]).

3.1. The Real Algorithm

The algorithm requires sampling vectors $x \in \mathbb{R}^n$, $x = (\xi_1, \ldots, \xi_n)$, from the standard Gaussian distribution in $\mathbb{R}^n$ with the density

$$\psi_{\mathbb{R}^n}(x) = (2\pi)^{-n/2} \exp\left\{-\frac{\|x\|^2}{2}\right\}, \quad \text{where } \|x\|^2 = \xi_1^2 + \cdots + \xi_n^2.$$

We interpret $x$ as an $n$-column of real numbers $\xi_1, \ldots, \xi_n$. Note that the expectation of $\xi_i^2$ is 1 and that the expectation of $\ln \xi_i^2$ is $\mathcal{E}_1$ (see Section 1.3). To sample from the distribution, we can sample the coordinates $\xi_1, \ldots, \xi_n$ independently from the one-dimensional standard Gaussian distribution with the density $\psi_{\mathbb{R}}(x) = (1/\sqrt{2\pi})\,e^{-x^2/2}$, cf. Section 2.1. Our algorithm is the following:

Input: Positive semidefinite $n \times n$ matrices $Q_1, \ldots, Q_n$.

Output: A number $\alpha$ approximating the mixed discriminant $D(Q_1, \ldots, Q_n)$.

Algorithm: For $i = 1, \ldots, n$ compute a decomposition $Q_i = T_i T_i^*$. Sample independently $n$ vectors $u_1, \ldots, u_n$ at random from the standard Gaussian distribution in $\mathbb{R}^n$ with the density $\psi_{\mathbb{R}^n}(x)$. Compute

$$\alpha = \big(\det[T_1 u_1, \ldots, T_n u_n]\big)^2,$$

the squared determinant of the matrix with the columns $T_1 u_1, \ldots, T_n u_n$. Output $\alpha$.
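A minimal sketch, assuming Python with numpy; Cholesky factorization plays the role of the decomposition $Q_i = T_i T_i^*$ (it requires positive definite $Q_i$; a semidefinite input would need an eigenvalue-based square root instead):

```python
import numpy as np

def mixed_discriminant_estimator(Qs, rng):
    n = len(Qs)
    Ts = [np.linalg.cholesky(Q) for Q in Qs]          # Q_i = T_i T_i^*
    cols = [T @ rng.standard_normal(n) for T in Ts]   # columns T_i u_i
    return np.linalg.det(np.column_stack(cols)) ** 2

rng = np.random.default_rng(5)
Qs = [np.eye(4)] * 4                  # D(I, ..., I) = 4! = 24
print(np.mean([mixed_discriminant_estimator(Qs, rng)
               for _ in range(100_000)]))
```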

3.1.1. Theorem.

(1) The expectation of $\alpha$ is the mixed discriminant $D(Q_1, \ldots, Q_n)$.

(2) For any $C > 1$ the probability that

$$\alpha \ge C \cdot D(Q_1, \ldots, Q_n)$$

does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_1)^n D(Q_1, \ldots, Q_n)$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_1 \approx 0.28$ is the constant defined in Section 1.3.

Remark (Relation to the Estimator from [6]). The first randomized polynomial time algorithm to approximate the mixed discriminant within an exponential factor was constructed in the author's paper [6]. It achieves asymptotically the same degree of approximation as Algorithm 3.1. The idea of the algorithm from [6] is to apply repeatedly the following two steps: first, by applying an appropriate linear transformation, we reduce the problem to the special case where $Q_n = I$ is the identity matrix. Next, we replace $D(Q_1, \ldots, Q_{n-1}, I)$ by $nD(Q_1', \ldots, Q_{n-1}')$, where the $Q_i'$ are $(n-1) \times (n-1)$ symmetric matrices interpreted as projections of the $Q_i$ onto a randomly chosen linear hyperplane in $\mathbb{R}^n$. Algorithm 3.1 can be interpreted as a parallelization of the algorithm from [6]. Instead of the successive projections, we independently project the matrices $Q_i$ onto randomly chosen lines in $\mathbb{R}^n$.

3.2. The Complex Algorithm

The algorithm requires sampling random complex vectors $z = (\zeta_1, \ldots, \zeta_n) \in \mathbb{C}^n$ from the standard Gaussian distribution with the density

$$\psi_{\mathbb{C}^n}(z) = \frac{1}{\pi^n} \exp\{-\|z\|^2\}, \quad \text{where } \|z\|^2 = |\zeta_1|^2 + \cdots + |\zeta_n|^2.$$

We interpret $z$ as an $n$-column of complex numbers $\zeta_1, \ldots, \zeta_n$. Note that the expectation of $|\zeta_i|^2$ is 1 and that the expectation of $\ln |\zeta_i|^2$ is $\mathcal{E}_2$ (see Section 1.3). To sample from the distribution, we can sample each $\zeta_i$ independently from the standard one-dimensional complex Gaussian distribution with the density $\psi_{\mathbb{C}}(\zeta) = \frac{1}{\pi} e^{-|\zeta|^2}$, cf. Section 2.2. Our algorithm is the following:

Input: Real positive semidefinite $n \times n$ matrices $Q_1, \ldots, Q_n$.

Output: A real number $\alpha$ approximating the mixed discriminant $D(Q_1, \ldots, Q_n)$.

Algorithm: For $i = 1, \ldots, n$ compute a decomposition $Q_i = T_i T_i^*$. Sample independently $n$ vectors $u_1, \ldots, u_n$ at random from the standard complex Gaussian distribution in $\mathbb{C}^n$. Compute

$$\alpha = \big|\det[T_1 u_1, \ldots, T_n u_n]\big|^2,$$

the squared absolute value of the determinant of the matrix with the columns $T_1 u_1, \ldots, T_n u_n$. Output $\alpha$.

3.2.1. Theorem.

(1) The expectation of $\alpha$ is the mixed discriminant $D(Q_1, \ldots, Q_n)$.

(2) For any $C > 1$ the probability that

$$\alpha \ge C \cdot D(Q_1, \ldots, Q_n)$$

does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_2)^n D(Q_1, \ldots, Q_n)$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_2 \approx 0.56$ is the constant defined in Section 1.3.

3.3. The Quaternionic Algorithm

Let $\mathbb{H}^n$ be the set of all $n$-tuples (vectors) $h = (\tau_1, \ldots, \tau_n)$ of quaternions $\tau_i \in \mathbb{H}$. For an $n \times n$ real matrix $T$ and a vector $h = a + ib + jc + kd \in \mathbb{H}^n$, where $a, b, c, d \in \mathbb{R}^n$, by $Th \in \mathbb{H}^n$ we understand the vector $Ta + iTb + jTc + kTd$. The algorithm requires sampling a random vector $h = (\tau_1, \ldots, \tau_n)$ from the standard quaternionic Gaussian distribution in $\mathbb{H}^n$ with the density

$$\psi_{\mathbb{H}^n}(h) = \frac{4^n}{\pi^{2n}} \exp\{-2\|h\|^2\}, \quad \text{where } \|h\|^2 = |\tau_1|^2 + \cdots + |\tau_n|^2.$$

We interpret $h$ as an $n$-column of quaternions $\tau_1, \ldots, \tau_n$. Note that the expectation of $|\tau_i|^2$ is 1 and that the expectation of $\ln |\tau_i|^2$ is $\mathcal{E}_4$ (see Section 1.3). To sample from the distribution, we can sample each $\tau_i \in \mathbb{H}$ independently from the standard one-dimensional quaternionic Gaussian distribution with the density $\psi_{\mathbb{H}}(\tau) = \frac{4}{\pi^2} e^{-2|\tau|^2}$, cf. Section 2.3. Our algorithm is the following:

Input: Real positive semidefinite $n \times n$ matrices $Q_1, \ldots, Q_n$.

Output: A number $\alpha$ approximating the mixed discriminant $D(Q_1, \ldots, Q_n)$.

Algorithm: For $i = 1, \ldots, n$ compute a decomposition $Q_i = T_i T_i^*$. Sample independently $n$ vectors $u_1, \ldots, u_n$ at random from the standard quaternionic Gaussian distribution in $\mathbb{H}^n$. Compute the $n \times n$ quaternionic matrix $H = [T_1 u_1, \ldots, T_n u_n]$, whose columns are $T_1 u_1, \ldots, T_n u_n$. Write $H = A + iB + jC + kD$, where $A$, $B$, $C$, and $D$ are real $n \times n$ matrices. Construct the $2n \times 2n$ complex matrix

$$H_{\mathbb{C}} = \begin{pmatrix} A + iB & C + iD \\ -C + iD & A - iB \end{pmatrix}.$$

Compute $\alpha = \det H_{\mathbb{C}}$. Output $\alpha$.
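A minimal sketch, assuming Python with numpy, combining the Cholesky preprocessing used for Algorithm 3.1 with the $2n \times 2n$ complex embedding of Algorithm 2.3 (again for positive definite $Q_i$; the function name is ours):

```python
import numpy as np

def quaternionic_md_estimator(Qs, rng):
    n = len(Qs)
    Ts = [np.linalg.cholesky(Q) for Q in Qs]        # Q_i = T_i T_i^*
    # Columns T_i u_i with u_i standard quaternionic Gaussian in H^n:
    # each real component has variance 1/4.
    A, B, C, D = (np.column_stack([T @ (0.5 * rng.standard_normal(n))
                                   for T in Ts])
                  for _ in range(4))
    HC = np.block([[A + 1j * B, C + 1j * D],
                   [-C + 1j * D, A - 1j * B]])
    return np.linalg.det(HC).real
```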

3.3.1. Theorem.

(1) The output $\alpha$ is a nonnegative real number and its expectation is the mixed discriminant $D(Q_1, \ldots, Q_n)$.

(2) For any $C > 1$ the probability that

$$\alpha \ge C \cdot D(Q_1, \ldots, Q_n)$$

does not exceed $C^{-1}$.

(3) For any $1 > \epsilon > 0$ the probability that

$$\alpha \le (\epsilon\vartheta_4)^n D(Q_1, \ldots, Q_n)$$

does not exceed $8/(n \ln^2 \epsilon)$, where $\vartheta_4 \approx 0.76$ is the constant defined in Section 1.3.

3.4. Relation to the Permanent Approximation Algorithms

Let $A = (a_{ij})$ be an $n \times n$ nonnegative matrix. Let us construct $n$ matrices $Q_1, \ldots, Q_n$ by placing the $i$th row of $A$ on the diagonal of $Q_i$: $Q_i = \operatorname{diag}\{a_{i1}, \ldots, a_{in}\}$. Then we have $\operatorname{per} A = D(Q_1, \ldots, Q_n)$ (cf. Section 1.2). Since $A$ is nonnegative, $Q_1, \ldots, Q_n$ are positive semidefinite matrices. Now it is seen that if we choose $T_i = \operatorname{diag}\{\sqrt{a_{i1}}, \ldots, \sqrt{a_{in}}\}$, then Algorithms 3.1-3.3 with the input $Q_1, \ldots, Q_n$ transform into Algorithms 2.1-2.3 with the input $A$. Hence Theorems 3.1.1-3.3.1 are straightforward generalizations of Theorems 2.1.1-2.3.1.

4. GAUSSIAN MEASURES AND QUADRATIC FORMS

4.1. Gaussian Measures

Given a space $X$ with a probability measure $\mu$, we denote the expectation of a function $f: X \to \mathbb{R}$ by $E(f)$. In this paper, $X$ will be the space $\mathbb{R}^n$, $\mathbb{C}^n = \mathbb{R}^{2n}$, or $\mathbb{H}^n = \mathbb{R}^{4n}$ and $\mu$ will be a Gaussian measure (distribution).

Let $\|x\| = \sqrt{\xi_1^2 + \cdots + \xi_m^2}$ be the standard norm of a vector $x = (\xi_1, \ldots, \xi_m)$ in $\mathbb{R}^m$. As usual, we interpret $x \in \mathbb{R}^m$ as an $m$-column of numbers $\xi_1, \ldots, \xi_m$. The probability measure $\mu$ in $\mathbb{R}^m$ with the density $\psi(x) = (2\pi)^{-m/2} \exp\{-\|x\|^2/2\}$ is called the standard Gaussian (or normal) distribution. A Gaussian distribution in $\mathbb{R}^m$ is the distribution of the vector $Tx$, where $x \in \mathbb{R}^m$ has the standard Gaussian distribution and $T$ is a fixed $m \times m$ matrix (if $T$ is degenerate, the distribution is concentrated on the image of $T$).

If $\mu$ is a probability distribution of $x = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^m$, the matrix $Q = (q_{ij})$, where $q_{ij} = E(\xi_i \xi_j)$, is called the covariance matrix of $x$. For example, if $\mu$ is the standard Gaussian distribution in $\mathbb{R}^m$, the covariance matrix is the $m \times m$ identity matrix $I$. We often use the following fact: if $x \in \mathbb{R}^m$ has the standard Gaussian distribution in $\mathbb{R}^m$ and $T$ is an $m \times m$ matrix, then the covariance matrix of the vector $Tx$ is $Q = TT^*$.

4.2. Theorem. Let us fix a Gaussian measure $\mu$ in $\mathbb{R}^n$. Let $q: \mathbb{R}^n \to \mathbb{R}$ be a positive semidefinite quadratic form such that

$$E(q) = 1.$$

Then

$$\mathcal{E}_1 \le E(\ln q) \le 0, \qquad (1)$$

where $\mathcal{E}_1 \approx -1.27$ is the constant defined in Section 1.3, and

$$0 \le E(\ln^2 q) \le 8. \qquad (2)$$

Proof. Without loss of generality, we can assume that $\mu$ is the standard Gaussian measure with the density

$$\psi(x) = (2\pi)^{-n/2} \exp\left\{-\frac{\|x\|^2}{2}\right\}$$

(otherwise, we apply a suitable linear transformation). Since $\ln$ is a concave function, by Jensen's inequality we have $E(\ln q) \le \ln E(q) = 0$. Let us decompose $q$ into a nonnegative linear combination $q = \lambda_1 q_1 + \cdots + \lambda_n q_n$ of positive semidefinite forms $q_i$ of rank 1. We can scale the $q_i$ so that $E(q_i) = 1$ for $i = 1, \ldots, n$, and then we have $\lambda_1 + \cdots + \lambda_n = 1$. In fact, one can choose $\lambda_i$ to be the eigenvalues of the matrix of $q$ and $q_i = \langle x, u_i \rangle^2$, where $u_i$ is the corresponding unit eigenvector and $\langle \cdot, \cdot \rangle$ is the standard scalar product in $\mathbb{R}^n$. Since $\ln$ is a concave function, we have

$$\ln(\lambda_1 q_1 + \cdots + \lambda_n q_n) \ge \lambda_1 \ln q_1 + \cdots + \lambda_n \ln q_n.$$

Furthermore, since $q_i$ is a positive semidefinite form of rank 1, by an orthogonal transformation of the coordinates it can be brought into the form $q_i(x) = \alpha x_1^2$. Since $E(x_1^2) = 1$, we conclude that $\alpha = 1$. Therefore, $E(\ln q_i) = E(\ln x_1^2) = \mathcal{E}_1$ (cf. Sections 3.1 and 1.3) and

$$E(\ln q) \ge \lambda_1 E(\ln q_1) + \cdots + \lambda_n E(\ln q_n) = (\lambda_1 + \cdots + \lambda_n)\mathcal{E}_1 = \mathcal{E}_1,$$

so Part 1 is proved [we note that this reasoning proves that $E(\ln q)$ is well-defined].

Let $X = \{x \in \mathbb{R}^n : q(x) \le 1\}$ and $Y = \mathbb{R}^n \setminus X$. Then

$$E(\ln^2 q) = \int_X \psi(x) \ln^2 q(x)\,dx + \int_Y \psi(x) \ln^2 q(x)\,dx.$$

Let us estimate the first integral. Decomposing $q = \lambda_1 q_1 + \cdots + \lambda_n q_n$ as above, we get $\ln q \ge \lambda_1 \ln q_1 + \cdots + \lambda_n \ln q_n$. Since $\ln q(x) \le 0$ for $x \in X$, we get that

$$\ln^2 q(x) \le \sum_{i,j=1}^{n} \lambda_i \lambda_j \big(\ln q_i(x)\big)\big(\ln q_j(x)\big)$$

for $x \in X$. Therefore,

$$\int_X \psi(x) \ln^2 q(x)\,dx \le \sum_{i,j=1}^{n} \lambda_i \lambda_j \int_X \psi(x) \big(\ln q_i(x)\big)\big(\ln q_j(x)\big)\,dx$$

$$\le \sum_{i,j=1}^{n} \lambda_i \lambda_j \left(\int_X \psi(x) \ln^2 q_i(x)\,dx\right)^{1/2} \left(\int_X \psi(x) \ln^2 q_j(x)\,dx\right)^{1/2}$$

(we applied the Cauchy-Schwarz inequality)

$$\le \sum_{i,j=1}^{n} \lambda_i \lambda_j \big(E(\ln^2 q_i)\big)^{1/2} \big(E(\ln^2 q_j)\big)^{1/2}.$$

Now, as in the proof of Part 1 we have

$$E(\ln^2 q_i) = E(\ln^2 x_1^2) = \frac{8}{\sqrt{2\pi}} \int_0^{+\infty} (\ln^2 t)\, e^{-t^2/2}\,dt \approx 6.548623960 \le 7.$$

Summarizing, we get

$$\int_X \psi(x) \ln^2 q(x)\,dx \le \sum_{i,j=1}^{n} \lambda_i \lambda_j \big(E(\ln^2 q_i)\big)^{1/2} \big(E(\ln^2 q_j)\big)^{1/2} \le 7 \sum_{i,j=1}^{n} \lambda_i \lambda_j = 7.$$

Since $0 \le \ln t \le \sqrt{t}$ for $t \ge 1$, we have

$$\int_Y \psi(x) \ln^2 q(x)\,dx \le \int_Y q(x)\psi(x)\,dx \le E(q) = 1.$$

Therefore, $E(\ln^2 q) \le 7 + 1 = 8$ and Part 2 is proved. ∎
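The numerical constant in the proof can be confirmed directly: for a standard Gaussian $x_1$ one has $E(\ln^2 x_1^2) = \pi^2/2 + (\gamma + \ln 2)^2 \approx 6.5486$, the variance plus the squared mean of $\ln x_1^2$. A minimal check, assuming Python with numpy:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.standard_normal(5_000_000)
print(np.mean(np.log(x * x) ** 2))                # Monte Carlo, about 6.5486
gamma = 0.5772156649015329
print(np.pi ** 2 / 2 + (gamma + np.log(2)) ** 2)  # closed form 6.5486...
```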

Remark (Role of the Gaussian Distribution). Let us consider Algorithm 3.1. A natural question is to what extent it is important to sample vectors $u_1, \ldots, u_n$ from the Gaussian distribution, as compared to sampling from some other distribution $\mu$ in $\mathbb{R}^n$. Our method carries over as long as $\mu$ satisfies the following properties: first, the expectation of a random vector $x \in \mathbb{R}^n$ is 0 and the covariance matrix is the identity matrix $I$; and second, if $q: \mathbb{R}^n \to \mathbb{R}$ is a positive semidefinite quadratic form such that $E(q) = 1$, then $E(\ln q)$ is bounded below by a universal constant $\mathcal{E} = \mathcal{E}(\mu)$. Furthermore, the closer to 0 that $\mathcal{E}$ can be chosen, the better approximation we get. It is seen that any discrete distribution (or, more generally, a distribution with atoms) fails the test for $n > 1$, since if a vector $x \in \mathbb{R}^n$ occurs with a positive probability and $q(x) = 0$, then $E(\ln q) = -\infty$. One can use continuous distributions other than Gaussian, but the author suspects that asymptotically, for large $n$, the Gaussian distribution provides the best constant $\mathcal{E}$. For example, the uniform distribution on the sphere $\xi_1^2 + \cdots + \xi_n^2 = n$ in $\mathbb{R}^n$ might give better constants for small $n$, but asymptotically it gives the same constant $\mathcal{E}_1$ (cf. Theorem 3.5 of [6]).

The next two results of this section state that for quadratic forms from some particular classes and some special Gaussian distributions we can get better estimates than in the general case.

Hermitian Forms. We recall that a function $q: \mathbb{C}^n \to \mathbb{R}$,

$$q(z) = \sum_{i,j=1}^{n} \beta_{ij} \zeta_i \bar{\zeta}_j, \quad \text{where } z = (\zeta_1, \ldots, \zeta_n)$$

and $\beta_{ij} = \bar{\beta}_{ji}$ for $i, j = 1, \ldots, n$, is called a Hermitian form with the matrix $(\beta_{ij})$ (see, for example, Section 19 of [26]). The form is called positive semidefinite if $q(z) \ge 0$ for all $z \in \mathbb{C}^n$. We note that $q$ can be considered as a real quadratic form $q: \mathbb{R}^{2n} \to \mathbb{R}$, if we identify $\mathbb{C}^n = \mathbb{R}^{2n}$. We fix the standard complex Gaussian distribution in $\mathbb{C}^n$ with the density $\psi_{\mathbb{C}^n} = \pi^{-n} e^{-\|z\|^2}$, cf. Section 3.2.

4.3. Theorem. Let us fix the standard complex Gaussian distribution in $\mathbb{C}^n$. Let $q: \mathbb{C}^n \to \mathbb{R}$ be a positive semidefinite Hermitian form such that $E(q) = 1$. Then

$$\mathcal{E}_2 \le E(\ln q) \le 0,$$

where $\mathcal{E}_2 \approx -0.58$ is the constant defined in Section 1.3.

Proof. The proof is similar to that of part (1) of Theorem 4.2. Since $q$ is a positive semidefinite Hermitian form, it can be represented as a convex combination $q = \lambda_1 q_1 + \cdots + \lambda_n q_n$ of positive semidefinite Hermitian forms $q_i$ of rank 1 such that $E(q_i) = 1$. Since $\ln$ is a concave function, we get that

$$E(\ln q) \ge \lambda_1 E(\ln q_1) + \cdots + \lambda_n E(\ln q_n).$$

Since $q_i$ is a positive semidefinite Hermitian form of rank 1, by a unitary transformation of $\mathbb{C}^n$ it can be brought into the form

$$q_i(z) = \alpha |\zeta_1|^2, \quad \text{where } z = (\zeta_1, \ldots, \zeta_n)$$

and $\alpha$ is a nonnegative real. Since $E(q_i) = 1$, we must have $\alpha = 1$, so $E(\ln q_i) = E(\ln |\zeta_1|^2) = \mathcal{E}_2$ (cf. Sections 3.2 and 1.3). Summarizing, we get

$$E(\ln q) \ge \lambda_1 E(\ln q_1) + \cdots + \lambda_n E(\ln q_n) = (\lambda_1 + \cdots + \lambda_n)\mathcal{E}_2 = \mathcal{E}_2,$$

and the proof follows. ∎

As we see, the lower bound for the expectation of $\ln q$, where $q$ is a Hermitian form, is better than for general quadratic forms. The reason for this improvement is, roughly, the following: the "worst possible" forms are those of rank 1. However, a Hermitian form of rank 1 is a real form of rank 2. Next, we see that quaternionic quadratic forms provide a still better bound.

Quaternionic Forms. Let $\mathbb{H}^n$ be the real vector space of all $n$-tuples $(\tau_1, \ldots, \tau_n)$ of quaternions $\tau_i \in \mathbb{H}$. As a vector space, $\mathbb{H}$ can be identified with $\mathbb{R}^4$ via the identification $a + ib + jc + kd = (a, b, c, d)$, and so $\mathbb{H}^n$ can be identified with $\mathbb{R}^{4n}$. We fix the structure of a right $\mathbb{H}$-module on $\mathbb{H}^n$: for $u = (\tau_1, \ldots, \tau_n)$ and $\tau \in \mathbb{H}$, we let $u\tau = (\tau_1\tau, \ldots, \tau_n\tau)$. We note that the right multiplications by $i$, $j$, and $k$ are orthogonal transformations of $\mathbb{R}^{4n}$ without fixed nonzero vectors. By a quadratic form on $\mathbb{H}^n$ we understand a function $q: \mathbb{H}^n \to \mathbb{R}$ which is an ordinary quadratic form under the identification $\mathbb{H}^n = \mathbb{R}^{4n}$. We say that $q$ is positive semidefinite if $q(u) \ge 0$ for any $u \in \mathbb{H}^n$. Let us fix the standard quaternionic Gaussian distribution in $\mathbb{H}^n$ with the density $\psi_{\mathbb{H}^n}(u) = 4^n \pi^{-2n} e^{-2\|u\|^2}$, cf. Section 3.3.

4.4. Theorem. Let us fix the standard quaternionic Gaussian distribution in $\mathbb{H}^n$. Let $q: \mathbb{H}^n \to \mathbb{R}$ be a positive semidefinite quadratic form such that $E(q) = 1$. Suppose, further, that $q(u) = q(ui) = q(uj) = q(uk)$ for any $u \in \mathbb{H}^n$. Then

$$\mathcal{E}_4 \le E(\ln q) \le 0,$$

where $\mathcal{E}_4 \approx -0.27$ is the constant defined in Section 1.3.

Proof. Let $Q$ be the $4n \times 4n$ real symmetric matrix of $q: \mathbb{R}^{4n} \to \mathbb{R}$ as a real quadratic form, so $q(x) = \langle Qx, x \rangle$, where $\langle \cdot, \cdot \rangle$ is the standard scalar product on $\mathbb{R}^{4n}$. Then for the differential of $q$ at a point $x \in \mathbb{R}^{4n}$ we have $dq(x) = 2\langle Qx, \cdot\rangle$. Let $S = \{x \in \mathbb{H}^n : \|x\| = 1\}$ be the unit sphere. As is known, $x \in S$ is an eigenvector of $Q$ with an eigenvalue $\lambda$ if and only if $x$ is a critical point of the restriction $q: S \to \mathbb{R}$ (that is, $dq(x)$ is 0 on the tangent space at $x$) with the corresponding critical value $\lambda = q(x)$. Since $q$ is invariant under the orthogonal transformations given by right multiplication by $i$, $j$, and $k$, we conclude that if $x$ is an eigenvector of $Q$ with an eigenvalue $\lambda$, then so are $xi$, $xj$, and $xk$. It follows that each eigenspace of $Q$ is a right $\mathbb{H}$-submodule of $\mathbb{H}^n$. In particular, the multiplicity of each eigenvalue of $Q$ is a multiple of 4. Therefore, $q$ can be expressed as a nonnegative linear combination $q = \lambda_1 q_1 + \cdots + \lambda_n q_n$ of quadratic forms, such that for each $i$, $E(q_i) = 1$ and, by an orthogonal transformation of $\mathbb{R}^{4n}$, $q_i$ can be written as a normalized sum of four squared coordinates: $q_i(x) = \alpha(\xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2)$. Since $E(q_i) = 1$ we have $\alpha = 1$ and $E(\ln q_i) = \mathcal{E}_4$ (cf. Sections 3.3 and 1.3). Since $\ln$ is a concave function, we conclude that

$$E(\ln q) \ge \lambda_1 E(\ln q_1) + \cdots + \lambda_n E(\ln q_n) \ge (\lambda_1 + \cdots + \lambda_n)\mathcal{E}_4 = \mathcal{E}_4,$$

and the proof follows. ∎

5. A MARTINGALE INEQUALITY

In this section, we prove a general result from probability theory. Although the technique is quite standard, we present a complete proof here, since we need our estimate in a particular form suitable for proving the main results of the paper.

5.1. Conditional Expectations

In this subsection, we summarize some general results on measures and integration, which we exploit in Lemma 5.2 below. As a general source, we use [4]. Let us fix a probability measure $\mu$ on the Euclidean space $\mathbb{R}^m$. Suppose that $\mu$ is absolutely continuous with respect to the Lebesgue measure and let $\psi(x)$ be the density of $\mu$. Suppose that we have $k$ copies of the Euclidean space $\mathbb{R}^m$, each endowed with the measure $\mu$. We consider functions $f: \mathbb{R}^m \times \cdots \times \mathbb{R}^m \to \mathbb{R}$ that are defined almost everywhere and integrable with respect to the product measure $\nu_k = \mu \times \cdots \times \mu$. Let $f(u_1, \ldots, u_k)$ be such a function. Then for almost all $(k-1)$-tuples $(u_1, \ldots, u_{k-1})$ with $u_i \in \mathbb{R}^m$, the function $f(u_1, \ldots, u_{k-1}, \cdot)$ is integrable (Fubini's theorem) and we can define the conditional expectation $E_k(f)$, which is a function of the first $k-1$ variables $u_1, \ldots, u_{k-1}$:

$$E_k(f)(u_1, \ldots, u_{k-1}) = \int_{\mathbb{R}^m} f(u_1, \ldots, u_{k-1}, u_k)\,\psi(u_k)\,du_k.$$

Fubini's theorem implies that

$$E(f) = E_1 \cdots E_k(f),$$

where $E$ is the expectation with respect to the product measure $\nu_k$. Tonelli's theorem states that if $f$ is $\nu_k$-measurable and nonnegative almost surely with respect to $\nu_k$ and if $E_1 \cdots E_k(f) < +\infty$, then $f$ is $\nu_k$-integrable.

If $f(u_1, \ldots, u_i)$ is a function of $i < k$ arguments, we may formally extend it to $\mathbb{R}^m \times \cdots \times \mathbb{R}^m$ ($k$ times) by letting $f(u_1, \ldots, u_k) = f(u_1, \ldots, u_i)$. If $f(u_1, \ldots, u_i)$ is $\nu_i$-integrable, then $f(u_1, \ldots, u_k)$ is $\nu_k$-integrable. We note the following useful facts:

(5.1.1) The operator $E_k$ is linear and monotone; that is, if $f(u_1, \ldots, u_k) \le g(u_1, \ldots, u_k)$ almost surely with respect to $\nu_k$, then $E_k(f) \le E_k(g)$ almost surely with respect to $\nu_{k-1}$.

(5.1.2) If $f(u_1, \ldots, u_k)$ is integrable and $g(u_1, \ldots, u_i)$, $i < k$, is a $\nu_i$-measurable function, then $E_k(gf) = gE_k(f)$.

(5.1.3) If $f = a$ is a constant almost surely with respect to $\nu_k$, then $E_k(f) = a$ almost surely with respect to $\nu_{k-1}$.

In this section, we prove the following technical lemma (a martingale inequality).

5.2. Lemma. Suppose that $f_k(u_1, \ldots, u_k)$, $k = 1, \ldots, n$, is an integrable function on the product of $k$ copies of $\mathbb{R}^m$, and let $\nu_n = \mu \times \cdots \times \mu$ be the $n$th product measure. Suppose that for some $a, b \in \mathbb{R}$ and all $k = 1, \ldots, n$,

$$a \le E_k(f_k) \quad \text{and} \quad E_k(f_k^2) \le b \quad \text{almost surely with respect to } \nu_{k-1}.$$

Then for any $\delta > 0$,

$$\nu_n\left\{(u_1, \ldots, u_n) : \frac{1}{n}\sum_{k=1}^{n} f_k(u_1, \ldots, u_k) \le a - \delta\right\} \le \frac{b}{\delta^2 n}.$$

Proof. Let $g_k = E_k(f_k)$ and let $h_k = f_k - g_k$. Since $g_k$ does not depend on $u_k$, using (5.1.2), we get

$$E_k(h_k^2) = E_k(f_k^2) - 2E_k(g_k f_k) + E_k(g_k^2) = E_k(f_k^2) - g_k^2.$$

Hence we may write

$$f_k = g_k + h_k, \quad \text{where } E_k(h_k) = 0, \quad g_k \ge a, \quad \text{and} \quad E_k(h_k^2) \le b,$$

almost surely with respect to $\nu_{k-1}$. Let

$$H(u_1, \ldots, u_n) = \frac{1}{n}\sum_{k=1}^{n} h_k(u_1, \ldots, u_k).$$

Now for $U = (u_1, \ldots, u_n)$ we have

$$\nu_n\left\{U : \frac{1}{n}\sum_{k=1}^{n} f_k(u_1, \ldots, u_k) \le a - \delta\right\} = \nu_n\left\{U : H(U) + \frac{1}{n}\sum_{k=1}^{n} g_k(u_1, \ldots, u_{k-1}) \le a - \delta\right\}$$

$$\le \nu_n\{U : H(U) \le -\delta\} \le \frac{E(H^2)}{\delta^2} \quad (\text{we used Chebyshev's inequality})$$

$$= \frac{1}{\delta^2 n^2}\sum_{k=1}^{n} E(h_k^2) + \frac{2}{\delta^2 n^2}\sum_{1 \le i < j \le n} E(h_i h_j).$$

We note that it is legitimate to pass to global expectations $E$ here. Indeed, since $h_k^2$ is nonnegative and $E_k(h_k^2) \le E_k(f_k^2) \le b$, it follows by formulas (5.1.1)-(5.1.3) and Tonelli's theorem that $E(h_k^2) \le b$. Since $|h_i h_j| \le (h_i^2 + h_j^2)/2$, the function $h_i h_j$ is $\nu_n$-integrable as well. Moreover, for $i < j$,

$$E(h_i h_j) = E_1 \cdots E_n(h_i h_j) = E_1 \cdots E_j(h_i h_j) = E_1 \cdots E_{j-1}\big(h_i E_j(h_j)\big) = 0.$$

The proof now follows. ∎

6. PROOFS

As we noted in Section 3.4, Theorems 3.1.1-3.3.1 imply Theorems 2.1.1-2.3.1. Proofs of Theorems 3.1.1, 3.2.1, and 3.3.1 are very similar. Theorem 3.3.1 provides the best approximation known so far and it is the hardest to prove, so we present its detailed proof here. Theorem 2.1.1 is the easiest to prove, so we present its detailed proof as well, because there the main ideas of the general method can be easily traced. Then we describe the modifications one needs to make to prove Theorems 2.2.1, 3.1.1, and 3.2.1.

Proof of Theorem 2.1.1. The proof of part 1 follows the proof that the Godsil-Gutman estimator is unbiased (see Chap. 8 of [23]). Let us write $\alpha$ as a polynomial in the $u_{ij}$:

$$\alpha = (\det B)^2 = \left(\sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} u_{i\sigma(i)}\sqrt{a_{i\sigma(i)}}\right)^2 = \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \prod_{i=1}^{n} u_{i\sigma_1(i)} u_{i\sigma_2(i)} \sqrt{a_{i\sigma_1(i)} a_{i\sigma_2(i)}}.$$

For every pair of permutations $\sigma_1, \sigma_2 \in S_n$, the corresponding summand is a monomial in the variables $u_{ij}$. Each variable $u_{ij}$ from the monomial occurs either with degree 2, if $\sigma_1(i) = \sigma_2(i) = j$, or with degree 1 otherwise. Next, we observe that unless $\sigma_1 = \sigma_2$, the corresponding summand contains some $u_{ij}$ with degree 1. Since the expectation of $u_{ij}$ is 0, the expectation of the whole monomial is 0. Hence,

$$E(\alpha) = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)^2 \prod_{i=1}^{n} E(u_{i\sigma(i)}^2)\, a_{i\sigma(i)}.$$

Since the $u_{ij}$ are independent and $E(u_{ij}^2) = 1$, we conclude that $E(\alpha) = \operatorname{per} A$.

Part 2 follows from part 1, the nonnegativity of $\alpha$, and the Chebyshev inequality.

To prove part 3, let us introduce vectors $u_1, \ldots, u_n \in \mathbb{R}^n$, where $u_i = (u_{i1}, \ldots, u_{in})$ is the $i$th row of the matrix $(u_{ij})$. Thus $u_1, \ldots, u_n$ are vectors sampled independently from the standard Gaussian distribution in $\mathbb{R}^n$ and the output $\alpha = \alpha(u_1, \ldots, u_n)$ is a function of $u_1, \ldots, u_n$:

$$\alpha: \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}.$$

Since $\alpha = (\det B)^2$ and the determinant is a linear function of every row, we conclude that for each $i = 1, \ldots, n$, the output $\alpha$ is a quadratic form in $u_i \in \mathbb{R}^n$, provided $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$ are fixed. Furthermore, since $\alpha \ge 0$, we deduce that $\alpha$ is a positive semidefinite quadratic form in $u_i$.

As in Section 5.1, let us introduce the conditional expectation $E_k$ with respect to $u_k$, $k = 1, \ldots, n$. Hence we can write

$$\operatorname{per} A = E(\alpha) = E_1 \cdots E_n\, \alpha(u_1, \ldots, u_n).$$

Let $\alpha_k(u_1, \ldots, u_k) = E_{k+1} \cdots E_n(\alpha)$. Thus $\alpha_k$ is a polynomial function in the first $k$ vectors $u_1, \ldots, u_k$, and $\alpha_k$ is a positive semidefinite quadratic form in $u_k$, provided $u_1, \ldots, u_{k-1}$ are fixed. Naturally, $\alpha_0 = \operatorname{per} A$ and $\alpha_n = \alpha$. Without loss of generality, we may assume that $\operatorname{per} A > 0$, for if $\operatorname{per} A = 0$, then by part 2, $\alpha = 0$ almost surely and part 3 is obvious. Hence $\alpha_k(u_1, \ldots, u_k) > 0$ for almost all $k$-tuples $(u_1, \ldots, u_k)$. We may write

$$\frac{\alpha}{\operatorname{per} A} = \prod_{k=1}^{n} \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})},$$

and, therefore,

$$\frac{1}{n} \ln \frac{\alpha}{\operatorname{per} A} = \frac{1}{n}\sum_{k=1}^{n} \ln \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})}.$$

We observe that for each fixed $(k-1)$-tuple $(u_1, \ldots, u_{k-1})$ such that $\alpha_{k-1}(u_1, \ldots, u_{k-1}) \ne 0$, the ratio

$$\frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})}$$

is a positive semidefinite quadratic form in $u_k$ whose expectation is 1. Therefore, by Theorem 4.2,

$$\mathcal{E}_1 \le E_k\left(\ln \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})}\right)$$

and

$$E_k\left(\ln^2 \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})}\right) \le 8.$$

Now we apply Lemma 5.2 with

$$f_k(u_1, \ldots, u_k) = \ln \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})}, \quad a = \mathcal{E}_1, \quad b = 8, \quad \text{and} \quad \delta = -\ln \epsilon,$$

to conclude that for any $1 > \epsilon > 0$,

$$\text{Probability}\left\{\frac{1}{n} \ln \frac{\alpha}{\operatorname{per} A} \le \mathcal{E}_1 + \ln \epsilon\right\} \le \frac{8}{n \ln^2 \epsilon},$$

and, since $\vartheta_1 = \exp\{\mathcal{E}_1\}$ (see Section 1.3),

$$\text{Probability}\big\{\alpha \le (\vartheta_1\epsilon)^n \operatorname{per} A\big\} \le \frac{8}{n \ln^2 \epsilon}.$$

The proof of part 3 now follows. ∎

As we see, the proof is based on the following three observations: First, the output $\alpha$ is a nonnegative number. Second, the expectation of $\alpha$ is the value we are seeking to approximate. Third, $\alpha$ can be represented as a function $\alpha(u_1, \ldots, u_n)$ of vectors $u_i$, drawn independently from a Gaussian distribution in $\mathbb{R}^n$, so that for any fixed $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$, the function $\alpha$ is a quadratic form in $u_i$.

To obtain the proof of Theorem 3.1.1, we need to make some modifications. First, we note that it is clear that $\alpha$ is nonnegative and that $\alpha(u_1, \ldots, u_n)$ is a quadratic form in $u_i$, provided $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$ are fixed. The proof that $\alpha$ provides an unbiased estimator is very similar to that of Theorem 2.1.1. Let $w_k = T_k u_k$, $w_k = (\omega_{k1}, \ldots, \omega_{kn})$. Then the covariance matrix of $w_k$ is $T_k T_k^* = Q_k$ (cf. Section 4.1), so $E(\omega_{ki}\omega_{kj}) = q_{ij,k}$, where $Q_k = (q_{ij,k})$. Furthermore, the vectors $w_i$ and $w_j$ are independent for $i \ne j$. Now

$$E(\alpha) = E\big(\det[w_1, \ldots, w_n]\big)^2 = E\left(\sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{k=1}^{n} \omega_{k\sigma(k)}\right)^2$$

$$= E\left(\sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \prod_{k=1}^{n} \omega_{k\sigma_1(k)}\omega_{k\sigma_2(k)}\right) = \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \prod_{k=1}^{n} E\big(\omega_{k\sigma_1(k)}\omega_{k\sigma_2(k)}\big)$$

$$= \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \prod_{k=1}^{n} q_{\sigma_1(k)\sigma_2(k),k} = D(Q_1, \ldots, Q_n)$$

by (1.2.1).

To prove Theorems 2.2.1 and 3.2.1, we note that, if $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n \in \mathbb{C}^n$ are fixed, $\alpha$ is a Hermitian form in $u_i \in \mathbb{C}^n$, so instead of part 1 of Theorem 4.2 one should refer to Theorem 4.3.

The proof of Theorem 3.3.1 is much simplified if we use the exterior algebra formalism (see, for example, Section 28 of [26]).

6.1. Exterior Algebra

Let $V$ be a complex vector space and suppose that $\dim V = m$. Recall that the exterior algebra $\bigwedge V$, as a vector space, is the direct sum $\bigwedge V = \bigoplus_{k=1}^{m} \bigwedge^k V$, where $\bigwedge^k V$ is spanned by wedge products

$$v_1 \wedge \cdots \wedge v_k, \quad v_i \in V, \quad i = 1, \ldots, k,$$

which are linear in each argument,

$$v_1 \wedge \cdots \wedge v_{i-1} \wedge (\alpha v_i' + \beta v_i'') \wedge v_{i+1} \wedge \cdots \wedge v_k = \alpha(v_1 \wedge \cdots \wedge v_{i-1} \wedge v_i' \wedge v_{i+1} \wedge \cdots \wedge v_k) + \beta(v_1 \wedge \cdots \wedge v_{i-1} \wedge v_i'' \wedge v_{i+1} \wedge \cdots \wedge v_k),$$

and skew-symmetric,

$$v_{\sigma(1)} \wedge \cdots \wedge v_{\sigma(k)} = (\operatorname{sgn} \sigma)(v_1 \wedge \cdots \wedge v_k)$$

for any permutation $\sigma \in S_k$. Vectors from $\bigwedge^k V$ are called $k$-vectors.

Let us fix a basis $e_1, \ldots, e_m$ of $V$, thus identifying $V = \mathbb{C}^m$. For an $m \times m$ matrix $U$ with the columns $u_1, \ldots, u_m$ we have

$$u_1 \wedge \cdots \wedge u_m = (\det U)(e_1 \wedge \cdots \wedge e_m).$$

We identify $\bigwedge^m(\mathbb{C}^m) = \mathbb{C}$, so we write simply

$$u_1 \wedge \cdots \wedge u_m = \det U.$$

Let $x = (\xi_1, \ldots, \xi_m)$, $y = (\eta_1, \ldots, \eta_m) \in \mathbb{C}^m$. Then

$$x \wedge y = \sum_{1 \le i < j \le m} (\xi_i \eta_j - \xi_j \eta_i)\, e_i \wedge e_j. \qquad (6.1.1)$$

6.2. Proposition. For a vector $u \in \mathbb{H}^n$, let us define a 2-vector $\omega(u) \in \bigwedge^2 \mathbb{C}^{2n}$ as follows: if $u = a + ib + jc + kd$ with $a, b, c, d \in \mathbb{R}^n$, we let

$$\omega(u) = (a + ib, -c + id) \wedge (c + id, a - ib).$$

Then

(1) For any $u \in \mathbb{H}^n$, $\omega(u) = \omega(ui) = \omega(uj) = \omega(uk)$.

(2) Suppose that $H$ is a quaternionic $n \times n$ matrix with the columns $u_1, \ldots, u_n \in \mathbb{H}^n$. Let us write $H = A + iB + jC + kD$ for $n \times n$ real matrices $A$, $B$, $C$, and $D$ and let $H_{\mathbb{C}}$ be the $2n \times 2n$ complex matrix

$$H_{\mathbb{C}} = \begin{pmatrix} A + iB & C + iD \\ -C + iD & A - iB \end{pmatrix}.$$

Then

$$\det H_{\mathbb{C}} = (-1)^{[n(n-1)]/2}\, \omega(u_1) \wedge \cdots \wedge \omega(u_n).$$

(3) Let $T$ be an $n \times n$ real matrix and let $Q = TT^*$, $Q = (q_{ij})$. Suppose that $u$ is sampled from the standard quaternionic Gaussian distribution in $\mathbb{H}^n$. Then the expectation of $\omega(Tu)$ is the 2-vector

$$\sum_{i,j=1}^{n} q_{ij}\,(e_i \wedge e_{j+n}),$$

where $e_1, \ldots, e_{2n}$ is the standard basis of $\mathbb{C}^{2n}$.

Proof. Part 1 is proved by direct computation. Let us denote $v_1 = (a + ib, -c + id)$ and $v_2 = (c + id, a - ib)$, so $\omega(u) = v_1 \wedge v_2$. Then $\omega(ui) = (iv_1) \wedge (-iv_2) = v_1 \wedge v_2$, $\omega(uj) = (-v_2) \wedge v_1 = v_1 \wedge v_2$, and $\omega(uk) = (iv_2) \wedge (iv_1) = -(v_2 \wedge v_1) = v_1 \wedge v_2$.

It is convenient to think of vectors as columns of numbers, so we write

$$\omega(u) = \begin{pmatrix} a + ib \\ -c + id \end{pmatrix} \wedge \begin{pmatrix} c + id \\ a - ib \end{pmatrix}$$

for $u = a + ib + jc + kd$.

To prove part 2, let us denote the $k$th columns of $A$, $B$, $C$, and $D$ by $a_k$, $b_k$, $c_k$, and $d_k$, respectively. Then

$$\det\begin{pmatrix} A + iB & C + iD \\ -C + iD & A - iB \end{pmatrix} = \begin{pmatrix} a_1 + ib_1 \\ -c_1 + id_1 \end{pmatrix} \wedge \cdots \wedge \begin{pmatrix} a_n + ib_n \\ -c_n + id_n \end{pmatrix} \wedge \begin{pmatrix} c_1 + id_1 \\ a_1 - ib_1 \end{pmatrix} \wedge \cdots \wedge \begin{pmatrix} c_n + id_n \\ a_n - ib_n \end{pmatrix}.$$

Rearranging the vectors in the wedge product, we get

$$\det\begin{pmatrix} A + iB & C + iD \\ -C + iD & A - iB \end{pmatrix} = (-1)^{[n(n-1)]/2} \begin{pmatrix} a_1 + ib_1 \\ -c_1 + id_1 \end{pmatrix} \wedge \begin{pmatrix} c_1 + id_1 \\ a_1 - ib_1 \end{pmatrix} \wedge \cdots \wedge \begin{pmatrix} a_n + ib_n \\ -c_n + id_n \end{pmatrix} \wedge \begin{pmatrix} c_n + id_n \\ a_n - ib_n \end{pmatrix} = (-1)^{[n(n-1)]/2}\, \omega(u_1) \wedge \cdots \wedge \omega(u_n).$$

To prove part 3, let $u = a + ib + jc + kd$ for $a, b, c, d \in \mathbb{R}^n$. Then $Tu = Ta + iTb + jTc + kTd$ and

$$\omega(Tu) = \begin{pmatrix} Ta + iTb \\ -Tc + iTd \end{pmatrix} \wedge \begin{pmatrix} Tc + iTd \\ Ta - iTb \end{pmatrix}.$$

Let $Ta = (\alpha_1, \ldots, \alpha_n)$, $Tb = (\beta_1, \ldots, \beta_n)$, $Tc = (\gamma_1, \ldots, \gamma_n)$, and $Td = (\delta_1, \ldots, \delta_n)$ be the coordinates of $Ta$, $Tb$, $Tc$, and $Td$, respectively. Since $a$, $b$, $c$, and $d$ are sampled independently from the Gaussian distribution in $\mathbb{R}^n$ with the covariance matrix $\frac{1}{4}I$, it follows that $Ta$, $Tb$, $Tc$, and $Td$ are sampled from the distribution with the covariance matrix $\frac{1}{4}Q$ (see Section 4.1), so

$$E(\alpha_i\alpha_j) = E(\beta_i\beta_j) = E(\gamma_i\gamma_j) = E(\delta_i\delta_j) = \frac{q_{ij}}{4},$$

and all other pairs from the set $\alpha_1, \beta_1, \gamma_1, \delta_1, \ldots, \alpha_n, \beta_n, \gamma_n, \delta_n$ are uncorrelated. Using (6.1.1), we can write $\omega(Tu) = \mathrm{I} + \mathrm{II} + \mathrm{III}$, where

$$\mathrm{I} = \sum_{1 \le k < s \le n} \big((\alpha_k + i\beta_k)(\gamma_s + i\delta_s) - (\alpha_s + i\beta_s)(\gamma_k + i\delta_k)\big)\, e_k \wedge e_s,$$

$$\mathrm{II} = \sum_{1 \le k, s \le n} \big((\alpha_k + i\beta_k)(\alpha_s - i\beta_s) - (-\gamma_s + i\delta_s)(\gamma_k + i\delta_k)\big)\, e_k \wedge e_{s+n},$$

and

$$\mathrm{III} = \sum_{1 \le k < s \le n} \big((-\gamma_k + i\delta_k)(\alpha_s - i\beta_s) - (-\gamma_s + i\delta_s)(\alpha_k - i\beta_k)\big)\, e_{k+n} \wedge e_{s+n}.$$

It is seen that the expectations of the first and the last sums are 0, whereas the expectation of the second sum is

$$\sum_{1 \le k, s \le n} q_{ks}\,(e_k \wedge e_{s+n}). \qquad ∎$$

Proof of Theorem 3.3.1. The output $\alpha$ is a nonnegative real number, since it is the reduced norm (squared Dieudonné determinant) of a quaternionic matrix (see Chap. IV, Section 1 of [3]). A direct proof of this fact is as follows (cf. [3]). One can observe that $\det H_{\mathbb{C}} = \overline{\det H_{\mathbb{C}}}$, so $\alpha = \det H_{\mathbb{C}}$ is a real number. The correspondence $H \mapsto H_{\mathbb{C}}$ is an embedding of the group $GL_n(\mathbb{H})$ of nondegenerate $n \times n$ quaternionic matrices in the group $GL_{2n}(\mathbb{C})$ of $2n \times 2n$ nondegenerate complex matrices. The group $GL_n(\mathbb{H})$ is known to be connected; therefore, $\det H_{\mathbb{C}}$ cannot change sign as $H$ changes within the group. Substituting the identity matrix $H = I$, we conclude that $\det H_{\mathbb{C}}$ is positive for any $H \in GL_n(\mathbb{H})$. Since the group $GL_n(\mathbb{H})$ is dense in the space of all $n \times n$ quaternionic matrices $H$, we conclude that the output $\alpha$ is nonnegative.

Let us prove that the expectation of $\alpha$ is the mixed discriminant $D(Q_1, \ldots, Q_n)$. Applying part 2 of Proposition 6.2, we can write

$$\alpha = \alpha(u_1, \ldots, u_n) = (-1)^{[n(n-1)]/2}\, \omega(T_1 u_1) \wedge \cdots \wedge \omega(T_n u_n).$$

Let $E_k$ be the conditional expectation with respect to $u_k \in \mathbb{H}^n$. Since the wedge product is linear in every term, we may write

$$E(\alpha) = E_1 \cdots E_n(\alpha) = (-1)^{[n(n-1)]/2}\, E_1(\omega(T_1 u_1)) \wedge \cdots \wedge E_n(\omega(T_n u_n)).$$

Let $Q_k = (q_{ij,k})$ for $k = 1, \ldots, n$ and $1 \le i, j \le n$. Applying part 3 of Proposition 6.2, we get

$$E(\alpha) = (-1)^{[n(n-1)]/2} \left(\sum_{i,j=1}^{n} q_{ij,1}\,(e_i \wedge e_{j+n})\right) \wedge \cdots \wedge \left(\sum_{i,j=1}^{n} q_{ij,n}\,(e_i \wedge e_{j+n})\right).$$

Rearranging the terms in the wedge product and canceling wedge products containing repeated vectors, we get

$$E(\alpha) = (-1)^{[n(n-1)]/2} \sum_{\substack{1 \le i_1, \ldots, i_n \le n \\ 1 \le j_1, \ldots, j_n \le n}} \left(\prod_{k=1}^{n} q_{i_k j_k,k}\right) e_{i_1} \wedge e_{j_1+n} \wedge \cdots \wedge e_{i_n} \wedge e_{j_n+n}$$

$$= (-1)^{[n(n-1)]/2} \sum_{\sigma_1, \sigma_2 \in S_n} \left(\prod_{k=1}^{n} q_{\sigma_1(k)\sigma_2(k),k}\right) e_{\sigma_1(1)} \wedge e_{\sigma_2(1)+n} \wedge \cdots \wedge e_{\sigma_1(n)} \wedge e_{\sigma_2(n)+n}$$

$$= (-1)^{[n(n-1)]/2} \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \left(\prod_{k=1}^{n} q_{\sigma_1(k)\sigma_2(k),k}\right) e_1 \wedge e_{n+1} \wedge \cdots \wedge e_n \wedge e_{2n}$$

$$= \sum_{\sigma_1, \sigma_2 \in S_n} \operatorname{sgn}(\sigma_1)\operatorname{sgn}(\sigma_2) \prod_{k=1}^{n} q_{\sigma_1(k)\sigma_2(k),k} = D(Q_1, \ldots, Q_n)$$

by (1.2.1), and part 1 is proven.

Part 2 follows by part 1 and the Chebyshev inequality. Let us prove part 3. As in the proof of Theorem 2.1.1, we introduce conditional expectations, ␣ s иии ␣ k Ž.u1 ,...,ukkE q1 En Ž. sy w nnŽ.y1 xr2 ␻ n иии Ž.1 ŽTu11 . n␻ n ␻ n иии n ␻ Ž.TukkE kq1Ž.Ž. ŽTukq1 kq1 .Ennn Ž.Tu . ␣ s␣ ␣ s We have 0 and n DQŽ.1,...,Qn . Since the wedge product is linear in every term, we conclude that for any fixed ␣ u1,...,uky1, the function kŽ.Žu1,...,uky1, uk is a necessarily positive semidefinite . gވn quadratic form in uk . Furthermore, since the multiplications by a real matrix T and the units i, j, and k commute, by part 1 of Proposition 6.2 we ␣ conclude that kŽ.u1,...,uky1, uk is invariant under the right multiplication of uk by i, j, and k. ) Without loss of generality, we may suppose that DQŽ.1,...,Qn 0, since if s ␣s DQŽ.1,...,Qn 0, by part 1 we have 0 almost surely and the proof would ␣ follow immediately. Since kŽ.u1,...,uk is a polynomial in u1,...,uk , we conclude ␣ ) that kŽ.u1,...,uk 0 almost surely. We may write ␣␣n Ž.u ,...,u s k 1 k Ł ␣ , DQŽ.1 ,...,Qnkks1 y11 Žu ,...,uky1 . and, therefore, 1 ␣ 1 n ␣ Ž.u ,...,u s k 1 k ln Ý ln␣ . nDQŽ.1 ,...,Qnnkks1 y11 Žu ,...,uky1 . y ␣ We observe that for each fixed Ž.k 1 -tuple Žu1,...,uky1 ., such that ky11 Žu ,..., / uky1. 0, the ratio, ␣ k Ž.u1 ,...,uk ␣ ky11Ž.u ,...,uky1 is a positive semidefinite quadratic form in uk , which is invariant under the right multiplication by i, j, and k and which has expectation 1. Therefore, by Theorem 4.4, ␣ k Ž.u1 ,...,uk ᑝ FE ln , 4 k ␣ ž/ky11Ž.u ,...,uky1 and by part 2 of Theorem 4.2, ␣ k Ž.u1 ,...,uk E ln2 F8. k ␣ ž/ky11Ž.u ,...,uky1 Now we apply Lemma 5.2 with ␣ k Ž.u1 ,...,uk fu,...,u sln , asᑝ , bs8, and ␦syln ⑀ , k Ž.1 k ␣ 4 ky11Ž.u ,...,uky1 56 BARVINOK to conclude that for any 1)⑀)0,

$$\text{Probability}\left\{ \frac{1}{n} \ln \frac{\alpha}{D(Q_1, \ldots, Q_n)} \le c_4 + \ln \epsilon \right\} \le \frac{8}{n \ln^2 \epsilon},$$
and, since $\kappa_4 = \exp\{c_4\}$ (cf. Section 1.3),
$$\text{Probability}\bigl\{ \alpha \le (\kappa_4 \epsilon)^n D(Q_1, \ldots, Q_n) \bigr\} \le \frac{8}{n \ln^2 \epsilon}.$$
The proof of part 3 now follows. ∎
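As a small numerical illustration of the nonnegativity argument at the start of the proof, the Python sketch below forms the complex image $H_{\mathbb{C}}$ of a random quaternionic matrix $H = A + Bj$ and checks that its determinant is real and nonnegative; the block convention for the embedding and all names are our assumptions, not notation from the text.

```python
import numpy as np

def complexify(A, B):
    # Complex image H_C of the quaternionic matrix H = A + B*j,
    # assuming the block embedding H -> [[A, B], [-conj(B), conj(A)]].
    return np.block([[A, B], [-B.conj(), A.conj()]])

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

d = np.linalg.det(complexify(A, B))
print(d)  # imaginary part ~ 0 and real part >= 0, as the proof asserts
```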

7. POSSIBLE RAMIFICATIONS

It appears that our method can be used in the following general situation. Suppose we want to approximate some quantity $\alpha_0$ of interest. Suppose that we have a function

$$\alpha : \mathbb{R}^{m_1} \times \cdots \times \mathbb{R}^{m_n} \to \mathbb{R},$$

where each space $\mathbb{R}^{m_i}$ is endowed with a Gaussian probability measure $\mu_i$. Suppose that the function $\alpha(u_1, \ldots, u_n)$ has the following properties:

(a) The value $\alpha(u_1, \ldots, u_n)$ is a nonnegative number, which can be efficiently computed for any given vectors $u_1 \in \mathbb{R}^{m_1}, \ldots, u_n \in \mathbb{R}^{m_n}$.

(b) For any fixed $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$, the value $\alpha(u_1, \ldots, u_n)$ is a quadratic form in $u_i \in \mathbb{R}^{m_i}$.

(c) The expectation of $\alpha$ with respect to the product measure $\mu_1 \times \cdots \times \mu_n$ is the quantity $\alpha_0$.

Then we get an efficient randomized algorithm to approximate $\alpha_0$ within a simply exponential factor $O(c^n)$, where $1 > c > 0$ is an absolute constant. The algorithm consists of sampling $u_1, \ldots, u_n$ independently and at random and computing $\alpha(u_1, \ldots, u_n)$; a sketch of this scheme appears below.
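To make the scheme concrete, here is a minimal Python sketch instantiating properties (a)-(c) with a Godsil-Gutman type estimator for the permanent of a nonnegative matrix, in the spirit of the real algorithm of Section 2.1; the function names and the single-sample interface are our own assumptions.

```python
import numpy as np

def alpha(A, U):
    # With B_ij = U_ij * sqrt(A_ij) and alpha = (det B)^2:
    # (a) alpha is nonnegative and cheap to evaluate;
    # (b) det B is linear in each row of U, so alpha is a quadratic form in it;
    # (c) E[alpha] = per A, since cross terms vanish for i.i.d. N(0, 1) entries.
    B = U * np.sqrt(A)
    return np.linalg.det(B) ** 2

def estimate(A, rng=np.random.default_rng(0)):
    # One sample u_1, ..., u_n (the rows of U) from the standard Gaussian measure.
    return alpha(A, rng.standard_normal(A.shape))
```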

Example (Sums of Subpermanents). Let $A = (a_{ij})$ be a rectangular $n \times m$ matrix, $m \ge n$. For a subset $I \subset \{1, \ldots, m\}$ of cardinality $n$, let $A_I$ be the $n \times n$ submatrix of $A$ consisting of the columns whose indices are in $I$. Let
$$\operatorname{PER} A = \sum_{|I| = n} \operatorname{per} A_I,$$
the sum of the permanents of all $n \times n$ submatrices of $A$.
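One concrete unbiased estimator for $\operatorname{PER} A$ can be sketched by combining the Cauchy-Binet identity with the squaring device above; this particular construction is our own illustration and is not necessarily the estimator intended here.

```python
import numpy as np

def per_sum_estimate(A, rng=np.random.default_rng(0)):
    # A is a nonnegative n x m matrix, m >= n.  With B_ij = U_ij * sqrt(A_ij),
    # Cauchy-Binet gives det(B B^T) = sum over |I| = n of (det B_I)^2, and
    # E[(det B_I)^2] = per A_I, so E[det(B B^T)] = PER A.
    B = rng.standard_normal(A.shape) * np.sqrt(A)
    return np.linalg.det(B @ B.T)
```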

For estimators of this kind it is natural to relax property (b) to the following weaker property:

(b′) For any fixed $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$, the function $\alpha(u_1, \ldots, u_n)$ is a quadratic polynomial (not necessarily homogeneous) in $u_i$.

One can construct some interesting estimators satisfying this weaker property.

Example (Approximating the Hafnian). Let $A$ be an $n \times n$ nonnegative symmetric matrix. Suppose that $n$ is even, $n = 2k$. The number

$$\operatorname{haf} A = \frac{1}{2^k\, k!} \sum_{\sigma \in S_n} \prod_{i=1}^{k} a_{\sigma(2i-1), \sigma(2i)}$$
is called the hafnian of $A$; see Section 8.2 of [25]. Thus $\operatorname{haf} A$ is the sum of all monomials $a_{i_1 j_1} \cdots a_{i_k j_k}$, where $\{i_1, j_1\}, \ldots, \{i_k, j_k\}$ constitutes a partition of the set $\{1, \ldots, n\}$ into unordered pairs. For example, if $A$ is the adjacency matrix of an undirected graph, then $\operatorname{haf} A$ is the number of perfect matchings in the graph.

Let us sample $u_{ij}$, $1 \le i < j \le n$, independently and at random from the standard Gaussian distribution in $\mathbb{R}$. Let us construct a skew-symmetric matrix $B = (b_{ij})$, where
$$b_{ij} = \begin{cases} u_{ij} \sqrt{a_{ij}}, & \text{if } 1 \le i < j \le n, \\ -u_{ji} \sqrt{a_{ij}}, & \text{if } 1 \le j < i \le n, \\ 0, & \text{if } i = j. \end{cases}$$

Let $\alpha = \det B$. Let us introduce vectors $u_i \in \mathbb{R}^{n-i}$, $i = 1, \ldots, n-1$, where $u_i = (u_{i,i+1}, \ldots, u_{i,n})$. Then $\alpha$ is a function of $u_1, \ldots, u_{n-1}$. One can prove that $\alpha$ satisfies the properties (a), (b′), and (c) (cf. Chap. 8 of [23]) and that with high probability $\alpha$ approximates $\operatorname{haf} A$ within a factor of $O(c^n)$, where $1 > c > 0$ is an absolute constant. Similarly, a complex estimator can be constructed, and the author has a conjecture as to what a quaternionic version may look like.
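Under this reading of the construction, a minimal Python sketch of the hafnian estimator is as follows (the names are ours):

```python
import numpy as np

def hafnian_estimate(A, rng=np.random.default_rng(0)):
    # A: symmetric nonnegative n x n, n even.  Build the random skew-symmetric
    # B with b_ij = u_ij * sqrt(a_ij) for i < j; then det B = (Pf B)^2 >= 0
    # and E[det B] = haf A, since cross terms between distinct pairings vanish.
    n = A.shape[0]
    B = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            B[i, j] = rng.standard_normal() * np.sqrt(A[i, j])
            B[j, i] = -B[i, j]
    return np.linalg.det(B)
```

For instance, averaging many independent samples for the $4 \times 4$ all-ones matrix recovers $\operatorname{haf} A = 3$, the number of ways to split four points into unordered pairs.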

Finally, an interesting question is what we can gain (or lose) by further relaxing (b′) to

(b″) For any fixed $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n$, the function $\alpha(u_1, \ldots, u_n)$ is a polynomial of a fixed (independent of $n$) degree.

These and related questions will be addressed elsewhere.

8. APPLICATIONS OF MIXED DISCRIMINANTS TO COUNTING

For a vector $x \in \mathbb{R}^n$, $x = (\xi_1, \ldots, \xi_n)$, let us denote by $x \otimes x$ the $n \times n$ matrix whose $(i,j)$th entry is $\xi_i \cdot \xi_j$. Thus $x \otimes x$ is a positive semidefinite matrix whose rank does not exceed 1. Applications of mixed discriminants to problems of combinatorial counting are based on the following simple result.

8.1. Lemma. Let $u_1, \ldots, u_n$ be $n$ vectors from $\mathbb{R}^n$. Then
$$D(u_1 \otimes u_1, \ldots, u_n \otimes u_n) = \bigl(\det[u_1, \ldots, u_n]\bigr)^2,$$
the squared determinant of the matrix with the columns $u_1, \ldots, u_n$.

Proof. Let $e_1, \ldots, e_n$ be the standard orthonormal basis of $\mathbb{R}^n$. Let $G$ be the matrix with the columns $u_1, \ldots, u_n$. Then $u_i = Ge_i$, $u_i \otimes u_i = G(e_i \otimes e_i)G^*$, and from the definition of the mixed discriminant (see Section 1.2), we get
$$\begin{aligned}
D(u_1 \otimes u_1, \ldots, u_n \otimes u_n) &= \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 u_1 \otimes u_1 + \cdots + t_n u_n \otimes u_n) \\
&= \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det\bigl( G (t_1 e_1 \otimes e_1 + \cdots + t_n e_n \otimes e_n) G^* \bigr) \\
&= \det(GG^*)\, \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 e_1 \otimes e_1 + \cdots + t_n e_n \otimes e_n) \\
&= (\det G)^2. \qquad \blacksquare
\end{aligned}$$
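The lemma is easy to check numerically. The following Python sketch computes the mixed discriminant of small matrices by extracting the coefficient of $t_1 \cdots t_n$ in $\det(t_1 Q_1 + \cdots + t_n Q_n)$ through inclusion-exclusion over subsets (a standard finite-difference device, not anything specific to this paper), and verifies the identity of Lemma 8.1 on random vectors.

```python
import numpy as np
from itertools import combinations

def mixed_discriminant(Qs):
    # D(Q_1, ..., Q_n) is the coefficient of t_1...t_n in det(t_1 Q_1 + ... + t_n Q_n);
    # finite differences over t in {0, 1}^n extract it:
    #   D = sum over S of (-1)^(n - |S|) * det(sum_{k in S} Q_k).
    n = len(Qs)
    total = 0.0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            M = sum((Qs[k] for k in S), np.zeros((n, n)))
            total += (-1) ** (n - r) * np.linalg.det(M)
    return total

rng = np.random.default_rng(1)
U = rng.standard_normal((4, 4))
Qs = [np.outer(U[:, k], U[:, k]) for k in range(4)]
# Lemma 8.1: D(u_1 (x) u_1, ..., u_n (x) u_n) = (det[u_1, ..., u_n])^2
assert np.isclose(mixed_discriminant(Qs), np.linalg.det(U) ** 2)
```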

Suppose we are given a rectangular $n \times m$ matrix $A$ with the columns $u_1, \ldots, u_m$, which we interpret as vectors from $\mathbb{R}^n$. Suppose that for any subset $I \subset \{1, \ldots, m\}$, $I = \{i_1, \ldots, i_n\}$, the determinant of the submatrix $A_I = [u_{i_1}, \ldots, u_{i_n}]$ is either $0$, $-1$, or $1$. Such an $A$ represents a unimodular matroid on the set $\{1, \ldots, m\}$, and a subset $I$ with $\det A_I \ne 0$ is called a base of the matroid (see [29]).

Suppose now that the columns of $A$ are colored with $n$ different colors. The coloring induces a partition $\{1, \ldots, m\} = J_1 \cup \cdots \cup J_n$. We are interested in the number of bases that have precisely one index of each color. Let us define the positive semidefinite matrices $Q_1, \ldots, Q_n$ as follows:
$$Q_k = \sum_{i \in J_k} u_i \otimes u_i, \qquad k = 1, \ldots, n.$$

Then the number of such bases can be expressed as $D(Q_1, \ldots, Q_n)$. The mixed discriminant is linear in every argument; that is,
$$D(Q_1, \ldots, Q_{i-1}, \alpha Q_i' + \beta Q_i'', Q_{i+1}, \ldots, Q_n) = \alpha\, D(Q_1, \ldots, Q_{i-1}, Q_i', Q_{i+1}, \ldots, Q_n) + \beta\, D(Q_1, \ldots, Q_{i-1}, Q_i'', Q_{i+1}, \ldots, Q_n)$$

(see, for example, [21]). Using the linearity and Lemma 8.1, we have

$$\begin{aligned}
D(Q_1, \ldots, Q_n) &= \sum_{I = \{i_1, \ldots, i_n\}} D(u_{i_1} \otimes u_{i_1}, \ldots, u_{i_n} \otimes u_{i_n}) \\
&= \sum_{I = \{i_1, \ldots, i_n\}} \bigl(\det[u_{i_1}, \ldots, u_{i_n}]\bigr)^2,
\end{aligned}$$
where the sums are taken over all $n$-subsets $I$ having precisely one element of each color, and the proof follows. R. Stanley obtained a similar formula, which involves the mixed volume of zonotopes instead of the mixed discriminant [28].

Example (Trees in a Graph). Suppose we have a connected graph $G$ with $n$ vertices and $m$ edges. Suppose further that the edges of $G$ are colored with $n-1$ different colors. We are interested in spanning trees $T$ of $G$ such that all edges of $T$ have different colors. Let us number the vertices of $G$ by $1, \ldots, n$ and the edges of $G$ by $1, \ldots, m$. Let us make $G$ an oriented graph by orienting its edges arbitrarily. We consider the truncated incidence matrix (with the last row removed) $A = (a_{ij})$ for $1 \le i \le n-1$ and $1 \le j \le m$ as an $(n-1) \times m$ matrix such that
$$a_{ij} = \begin{cases} 1, & \text{if } i \text{ is the beginning of } j, \\ -1, & \text{if } i \text{ is the end of } j, \\ 0, & \text{otherwise.} \end{cases}$$

The spanning trees of $G$ are in a one-to-one correspondence with the nondegenerate $(n-1) \times (n-1)$ submatrices of $A$, and the determinant of such a submatrix is either $1$ or $-1$ (see, for example, Chap. 4 of [10]). Hence counting colored trees reduces to computing the mixed discriminant of some positive semidefinite matrices computed from the incidence matrix of the graph.
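As an illustration, the following Python sketch counts the rainbow spanning trees of $K_4$ under a hypothetical coloring of its six edges with three colors, once by brute force over edge triples and once as a mixed discriminant (reusing the `mixed_discriminant` helper from the sketch in the previous section); the graph and coloring are our own choices.

```python
import numpy as np
from itertools import combinations

def mixed_discriminant(Qs):
    n = len(Qs)
    return sum((-1) ** (n - len(S)) *
               np.linalg.det(sum((Qs[k] for k in S), np.zeros((n, n))))
               for r in range(n + 1) for S in combinations(range(n), r))

n = 4
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # K_4
colors = [0, 1, 2, 2, 1, 0]                               # hypothetical 3-coloring

def column(i, j):
    # Column of the truncated incidence matrix (last row removed).
    a = np.zeros(n - 1)
    if i < n - 1:
        a[i] = 1.0
    if j < n - 1:
        a[j] = -1.0
    return a

cols = [column(i, j) for (i, j) in edges]
Qs = [sum((np.outer(c, c) for c, col in zip(cols, colors) if col == k),
          np.zeros((n - 1, n - 1))) for k in range(n - 1)]

# Brute force: nondegenerate (n-1)-subsets of columns with all colors distinct.
brute = sum(1 for T in combinations(range(len(edges)), n - 1)
            if abs(np.linalg.det(np.column_stack([cols[j] for j in T]))) > 0.5
            and len({colors[j] for j in T}) == n - 1)

assert round(mixed_discriminant(Qs)) == brute  # both equal the colored-tree count
```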

Note Added in Proof: The applications of mixed discriminants described in Section 8 are known; see Chapter V of R. B. Bapat and T. E. S. Raghavan, Nonnegative Matrices and Applications, Cambridge University Press, 1997.

ACKNOWLEDGMENTS

I am grateful to Mark Jerrum and Martin Dyer for many useful conversations during the Schloss Dagstuhl meeting on approximation algorithms in August, 1997. It was their suggestion to explore a complex version (now Algorithm 2.2) of the original real algorithm (now Algorithm 2.1).

REFERENCES

[1] A. Aho, J. Hopcroft, and J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.
[2] A. D. Aleksandrov, On the theory of mixed volumes of convex bodies, IV, Mixed discriminants and mixed volumes, Mat. Sb. (N.S.) 3 (1938), 227-251 (in Russian).
[3] E. Artin, Geometric Algebra, Wiley-Interscience, New York, 1988.
[4] R. G. Bartle, The Elements of Integration, John Wiley & Sons, New York, 1966.
[5] A. I. Barvinok, Two algorithmic results for the traveling salesman problem, Math. Oper. Res. 21 (1996), 65-84.
[6] A. I. Barvinok, Computing mixed discriminants, mixed volumes, and permanents, Discrete Comput. Geom. 18 (1997), 205-207.
[7] A. I. Barvinok, A simple polynomial time algorithm to approximate the permanent within a simply exponential factor, MSRI Preprint No. 1997-031.
[8] M. Dyer and M. Jerrum, Schloss Dagstuhl meeting on approximation algorithms, August 1998, personal communication.
[9] G. P. Egorychev, The solution of van der Waerden's problem for permanents, Adv. in Math. 42 (1981), 299-305.
[10] V. A. Emelichev, M. M. Kovalev, and M. K. Kravtsov, Polytopes, Graphs and Optimization, Cambridge University Press, New York, 1984.
[11] V. N. Faddeeva, Computational Methods of Linear Algebra, Dover, New York, 1975.
[12] A. Frieze and M. Jerrum, An analysis of a Monte Carlo algorithm for estimating the permanent, Combinatorica 15 (1995), 67-83.
[13] C. D. Godsil and I. Gutman, On the matching polynomial of a graph, in Algebraic Methods in Graph Theory, Vol. I, II (Szeged, 1978), North-Holland, Amsterdam-New York, 1981, pp. 241-249.
[14] D. Yu. Grigoriev and M. Karpinski, The matching problem for bipartite graphs with polynomially bounded permanents is in NC, Proc. 28th Annual IEEE Symp. on Foundations of Computer Science, IEEE Computer Soc. Press, Washington, DC, 1987, pp. 162-172.
[15] M. Jerrum and A. Sinclair, Approximating the permanent, SIAM J. Comput. 18 (1989), 1149-1178.
[16] M. Jerrum and A. Sinclair, The Markov chain Monte Carlo method: An approach to approximate counting and integration, in Approximation Algorithms for NP-hard Problems, D. S. Hochbaum (Editor), PWS, Boston, MA, 1996, pp. 482-520.
[17] M. R. Jerrum, L. G. Valiant, and V. V. Vazirani, Random generation of combinatorial structures from a uniform distribution, Theoret. Comput. Sci. 43 (1986), 169-188.
[18] M. Jerrum and U. Vazirani, A mildly exponential approximation algorithm for the permanent, Algorithmica 16 (1996), 392-401.
[19] N. Karmarkar, R. Karp, R. Lipton, L. Lovász, and M. Luby, A Monte Carlo algorithm for estimating the permanent, SIAM J. Comput. 22 (1993), 284-293.
[20] D. E. Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 2nd ed., Addison-Wesley, Reading, MA, 1981.
[21] K. Leichtweiss, Convexity and differential geometry, in Handbook of Convex Geometry, P. M. Gruber and J. M. Wills (Editors), North-Holland, Amsterdam, 1993, Vol. B, Chap. 4.1, pp. 1045-1080.
[22] N. Linial, A. Samorodnitsky, and A. Wigderson, A deterministic strongly polynomial algorithm for matrix scaling and approximate permanents, preprint.
[23] L. Lovász and M. D. Plummer, Matching Theory, North-Holland, Amsterdam-New York, and Akad. Kiadó, Budapest, 1986.
[24] L. Lovász and M. Simonovits, Random walks in a convex body and an improved volume algorithm, Random Structures Algorithms 4 (1993), 359-412.
[25] H. Minc, Permanents, Addison-Wesley, Reading, MA, 1978.
[26] V. V. Prasolov, Problems and Theorems in Linear Algebra, Amer. Math. Soc., Providence, RI, 1994.
[27] L. E. Rasmussen, Approximating the permanent: A simple approach, Random Structures Algorithms 5 (1994), 349-361.
[28] R. P. Stanley, Two combinatorial applications of the Alexandrov-Fenchel inequalities, J. Combin. Theory Ser. A 17 (1981), 56-65.
[29] N. White, Unimodular matroids, in Combinatorial Geometries, Cambridge University Press, Cambridge, U.K., 1987, pp. 40-52.