arXiv:2012.09502v1 [cs.DS] 17 Dec 2020

Sampling Arborescences in Parallel

Nima Anari¹, Nathan Hu¹, Amin Saberi¹, and Aaron Schild²

¹Stanford University, {anari,zixia314,saberi}@stanford.edu
²University of Washington, [email protected]

December 18, 2020

Abstract

We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial-time solvable; the exact number of arborescences can be computed by a determinant [Tut48], and sampling can be reduced to counting [JVV86; JS96]. However, the classic reduction from sampling to counting seems to be inherently sequential. This raises the question of designing efficient parallel algorithms for sampling. We show that sampling arborescences can be done in RNC.

For several well-studied combinatorial structures, counting can be reduced to the computation of a determinant, which is known to be in NC [Csa75]. These include arborescences, planar graph perfect matchings, Eulerian tours in digraphs, and determinantal point processes. However, not much is known about efficient parallel sampling of these structures. Our work is a step towards resolving this mystery.

1 Introduction

Approximate sampling and counting are known to be equivalent for a wide class of problems, including the so-called self-reducible ones [JVV86]. This equivalence is nontrivial and most useful in the direction of reducing sampling to counting [JVV86; JS96]. This is indeed the basis of the Markov Chain Monte Carlo (MCMC) method to approximate counting, which is arguably the most successful approach to counting, resolving long-standing problems such as approximating the permanent [JSV04] and computing the volume of convex bodies [DFK91]. However, for some problems, the "easier" direction of this equivalence, namely the reduction of counting to sampling, proves useful: for these problems, almost by definition, we can count via approaches other than MCMC.

One of the mysterious approaches to counting is via determinant computations. A range of counting problems can be (exactly) solved by simply computing a determinant. A non-exhaustive list is provided below.

• Spanning trees in a graph can be counted by computing a determinant related to the Laplacian of the graph, a result known as the matrix-tree theorem [Kir47].

• Arborescences in a directed graph can be counted by computing a determinant related to the directed Laplacian [Tut48].

• The number of Eulerian tours in an Eulerian digraph is directly tied to the number of arborescences, and consequently to the determinant related to the directed Laplacian [AB87; TS41].

• The number of perfect matchings in a planar graph can be computed as the Pfaffian (square root of the determinant) of an appropriately signed version of the adjacency matrix, a.k.a. the Tutte matrix [Kas63].
• Given vectors v1, ..., vn ∈ ℝ^d, the volume sampling distribution on subsets S ∈ ([n] choose d) can be defined as follows:

P[S] ∝ det([vi]_{i∈S})².

The partition function of this distribution is simply det(∑i vi vi^⊺) [see, e.g., DR10].

• The number of non-intersecting paths between specified terminals in a lattice, and more generally applications of the Lindström-Gessel-Viennot lemma [Lin73; GV89].

Efficient counting for these problems follows from the polynomial-time computability of the associated determinants. In turn, one obtains efficient sampling algorithms for all of these problems; we remark that not all of these problems are known to be self-reducible, but nevertheless "easy" slightly varied sampling-to-counting reductions exist for all of them.

While polynomial-time sampling for all of these problems has long been settled, we reopen the investigation of these problems by considering efficient parallel sampling algorithms. We focus on the computational model of PRAM, and specifically on the complexity classes NC and RNC. Here, a polynomially bounded number of processors are allowed access to a shared memory, and the goal is for the running time to be polylogarithmically bounded; the class RNC additionally has access to random bits.

Determinants can be computed efficiently in parallel, in the class NC [Csa75], and as a result there are NC counting algorithms for all of the aforementioned problems. However, the sampling-to-counting reductions completely break down for parallel algorithms, as there seems to be an inherent sequentiality in these reductions. Take spanning trees in a graph as an example. The classic reduction from sampling to counting proceeds as follows:

for each edge e do
    A ←− number of spanning trees containing e
    B ←− total number of spanning trees
    Flipping a coin with bias A/B, decide whether e should be part of the tree.
    Either contract or delete the edge e based on this decision.
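To make the sequentiality concrete, the following is a minimal Python sketch of this reduction (not taken from the paper). The counting oracle count(vertices, edges) is an assumed helper, e.g. a matrix-tree determinant computation; graphs are multigraphs given as vertex and edge lists, and the graph is assumed connected.

import random

def contract(vertices, edges, e):
    # Merge the endpoints of e; keep parallel edges, drop self-loops.
    u, v = e
    new_vertices = [x for x in vertices if x != v]
    new_edges = []
    for (a, b) in edges:
        a2, b2 = (u if a == v else a), (u if b == v else b)
        if a2 != b2:
            new_edges.append((a2, b2))
    return new_vertices, new_edges

def sample_spanning_tree(vertices, edges, count):
    # Sequential sampling-to-counting reduction; tree edges are reported
    # up to the endpoint renaming done by contractions.
    tree = []
    while len(vertices) > 1:
        e = edges[0]                    # examine any edge of the current multigraph
        B = count(vertices, edges)      # all spanning trees
        cv, ce = contract(vertices, edges, e)
        A = count(cv, ce)               # spanning trees containing e
        if random.random() < A / B:
            tree.append(e)
            vertices, edges = cv, ce    # keep e: contract it
        else:
            edges = edges[1:]           # delete e
    return tree

Each oracle call depends on every earlier contract-or-delete decision, which is exactly the sequential chain discussed next.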

Each iteration of this loop uses a counting oracle to compute A and B. However, the decision of whether to include an edge e as part of the tree affects future values of A and B for other edges, and this seems to be the inherent sequentiality in this algorithm. The sampling-to-counting reduction for all other listed problems encounters the same sequentiality obstacle.

In this paper, we take a step towards resolving the mysterious disparity between counting and sampling in the parallel algorithms world. We resolve the question of sampling arborescences in weighted directed graphs, and as a special case spanning trees in weighted undirected graphs, using efficient parallel algorithms with access to randomness, a.k.a. the class RNC.

We remark that the special case of sampling spanning trees in unweighted undirected graphs was implicitly solved by the work of Teng [Ten95a], who showed how to simulate random walks in RNC. When combined with earlier work of Aldous [Ald90] and Broder [Bro89], this algorithm would simulate a random walk on the graph, and from its transcript extract a random spanning tree. However, adding either weights or directions to the graph results in the need for potentially exponentially long random walks, which cannot be done in RNC. Our work removes this obstacle.

Theorem 1. There is an RNC algorithm which takes a directed graph G = (V, E) together with edge weights w : E → ℝ≥0 as input and outputs a random directed rooted tree T, a.k.a. an arborescence. The output T follows the distribution

P[T] ∝ ∏_{e∈T} w(e).

1.1 Related Work and Techniques

There is a long line of research on algorithms for sampling and counting spanning trees and more generally arborescences. The matrix-tree theorem of Kirchhoff [Kir47] showed how to count spanning trees

[Figure 1: Starting from the left it takes Θ(W) steps to cover.]

[Figure 2: A random walk started from the left node covers all n nodes in time Θ(2^n).]

in undirected graphs, and later Tutte [Tut48] generalized this to arborescences in digraphs. Somewhat surprisingly, Aldous [Ald90] and Broder [Bro89] showed that random spanning trees and more generally random arborescences of a graph can be extracted from the transcript of a random walk on the graph itself. The main focus of subsequent work on this problem has been on improving the total running time of sequential algorithms for sampling. After a long line of work [Wil96; CMN96; KM09; MST15; Dur+17b; Dur+17a], Schild [Sch18] obtained the first almost-linear time algorithm for sampling spanning trees. More recently, Anari, Liu, Oveis Gharan, and Vinzant [Ana+20] improved this to nearly-linear time. Many of these works are based on speeding up the Aldous-Broder algorithm. Our main result, Theorem 1, is also built on the Aldous-Broder algorithm, but we focus on parallelizing it instead of optimizing the total running time. No almost-linear time algorithm is yet known for sampling arborescences in digraphs, as opposed to spanning trees in undirected graphs.

While sampling spanning trees has a multitude of applications [see Sch18], there are also a number of applications for the directed graph generalization. Most notably, there is a many-to-one direct correspondence between Eulerian tours in an Eulerian digraph and arborescences of the graph. This correspondence, known as the BEST theorem [AB87; TS41], allows one to generate random Eulerian tours by generating random arborescences [see Cre10]. We leave the question of whether the correspondence in the BEST theorem is implementable in NC to future work, but note that sampling Eulerian tours has interesting applications in biology and sequence processing [Jia+08; Riv+08]. We remark that, unlike for directed graphs, generating random Eulerian tours of undirected Eulerian graphs in polynomial time is a major open problem [TV01].

In a slightly different direction related to this work, Balaji and Datta [BD13] considered the space complexity of counting arborescences. They showed this problem is in L for graphs of bounded tree-width, obtaining algorithms for counting Eulerian tours in these digraphs as well.

In perhaps the first major result of its kind in the search for RNC sampling algorithms, Teng [Ten95b] showed implicitly how to sample spanning trees in undirected graphs in RNC. Teng [Ten95b] showed how to parallelize the simulation of a random walk on a graph; more precisely, he showed how to output a length-L trace of a walk on size-n Markov chains in parallel running time polylog(L, n) using only poly(L, n) many processors. The Aldous-Broder algorithm extracts an arborescence from the trace of a random walk by extracting the so-called first-visit edge to each vertex. If a random walk is simulated until all vertices are visited at least once, the trace of the random walk has enough information to extract all such first-visit edges. This allows RNC sampling of arborescences in graphs where the number of steps needed to visit all vertices, known as the cover time, is polynomially bounded.

While the cover time of a random walk on an undirected unweighted graph is polynomially bounded in the size of the graph [see, e.g., LP17], adding either weights or directions can make the cover time exponentially large. See Figs. 1 and 2. We overcome the obstacle of exponentially large cover times by taking a page from some of the recent advances on the sequential sampling algorithms for spanning trees.
Instead of simulating the entirety of the random walk until cover time, we extract only the first-visit edges to each vertex. We use the same insight used in several prior works that once a region of the graph has been visited, subsequent visits of the random walk can be shortcut to the first edge that exits this region. However, this involves a careful construction of a hierarchy of “regions”, and bounding the number of steps needed inside each region to fully cover it. Unlike undirected graphs, in directed graphs arguing about the covering time of a region becomes complicated. We build on some of the techniques developed by Boczkowski, Peres, and Sousi [BPS18] to bound these covering times in the case of Eulerian digraphs. We then reduce the arborescence sampling problem for arbitrary digraphs to that of Eulerian digraphs.

1.2 Overview of the Algorithm

At a high level, our algorithm first proceeds by reducing the problem to sampling an arborescence from an Eulerian graph. We then construct a "loose decomposition" of the Eulerian graph into weighted cycles. We then construct a hierarchy of vertex sets by starting with the empty graph and adding the weighted cycles one by one, from the highest weight to the lowest, adding the connected components in each iteration to the hierarchy. This results in a hierarchical clustering of vertices, with the intuitive property that once we enter a cluster, we spend a lot of time exploring inside before exiting.

Our algorithm proceeds by "simulating" polynomially many jumps between the children of each cluster, before exiting that cluster. The resulting jumps are then stitched together at all levels of the hierarchy to form a partial subset of the transcript of a random walk.

The shallow "simulation" of edges jumping across children of a cluster in the hierarchy can be done in RNC by using a doubling trick as in the work of Teng [Ten95b]; in order to "simulate" L jumping edges, we combine the first L/2 with the last L/2 in a recursive fashion. A naive implementation of this is not parallelizable, however, as one needs to know where the first L/2 jumps eventually land before "simulating" the rest. However, since the number of locations is polynomially bounded, one can precompute an answer for each possible landing location in parallel to simulating the first L/2 edges.

Once shallow jumps are simulated, our algorithm then proceeds by stitching these jumping edges with jumping edges of one level lower (i.e., inside of the children clusters), and so on. Again, a naive implementation of this stitching is not in RNC, but we employ a similar doubling trick and an additional caching trick to parallelize everything. We then argue that with high probability all of the edges extracted by this algorithm contain the first-visit edges to every vertex.

2 Preliminaries

We use the notation [n] to denote the set {1, ..., n}. We use the notation Õ(·) to hide polylogarithmic factors. When we use the term high probability, we mean with probability 1 − 1/poly(n), where n is the size of the input, and the polynomial can be taken arbitrarily large by appropriately setting parameters.

2.1 Notations

When a graph G = (V, E) is clear from the context, we use n to refer to |V| and m to refer to |E|. For a subset of vertices S ⊂ V, δ+(S) denotes the set of all incoming edges {e = (u, v) | v ∈ S, u ∉ S}. Similarly, δ−(S) will denote the set of all outgoing edges {e = (u, v) | u ∈ S, v ∉ S}. Lastly, let G(S) denote the subgraph induced by S.

Definition 2. Given a weighted graph G = (V, E) with weights w : E → ℝ≥0, a subset of the vertices S ⊆ V is said to be strongly connected by edges of weight wc if for every s, t ∈ S, there exists a path of edges inside S connecting s to t such that every edge e of the path has weight w(e) ≥ wc.

Definition 3. An arborescence is a directed graph rooted at vertex r such that for any other vertex v, there is exactly one path from v to r.

Arborescences are also known as directed rooted trees and are a natural analog of spanning trees in directed graphs. When a background graph G = (V, E) is clear from context, by an arborescence we mean a subgraph of G that is an arborescence.

Given a digraph G = (V, E) together with a weight function w : E → ℝ>0, the random arborescence

distribution is the distribution that assigns to each arborescence T probability

P[T] ∝ ∏_{e∈T} w(e).

When w is not given, it is assumed to be the constant 1 function, and the random arborescence distribution becomes uniformly distributed over all arborescences. We view an undirected graph G = (V, E) as a directed graph with double the number of edges, with each undirected edge producing two directed copies in the two possible directions. It is easy to see that the weighted/uniform random arborescence distribution on the directed version of an undirected graph is the same as the weighted/uniform spanning tree distribution on the original graph; viewing each arborescence without directions yields the corresponding spanning tree. We call a (possibly weighted) digraph G = (V, E) Eulerian if

∑_{e∈δ+(v)} w(e) = ∑_{e∈δ−(v)} w(e)

for all vertices v. Note that the directed version of an undirected graph is automatically Eulerian.
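As a quick illustration of the balance condition, here is a small Python check; the graph representation w[u][v] = weight (a nested dict with every vertex present as a key) is an assumption of this sketch, not notation from the paper.

def is_eulerian(w, tol=1e-9):
    # Weighted in-degree must equal weighted out-degree at every vertex.
    indeg = {u: 0.0 for u in w}
    for u in w:
        for v, weight in w[u].items():
            indeg[v] += weight
    return all(abs(indeg[u] - sum(w[u].values())) <= tol for u in w)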

2.2 Random Walks and Arborescences

Our approach centers around the Aldous-Broder algorithm on directed graphs [Ald90; Bro89]. Their work reduces the task of randomly generating a spanning tree or arborescence to simulating a random walk on the graph until all vertices have been visited. While most famously known for generating spanning trees, this algorithm can be used to generate arborescences as well.

A weighted digraph defines a natural random walk. This is a Markov process X0, X1, ..., where each Xi is obtained as a random neighbor of Xi−1 by choosing an outgoing edge with probability proportional to its weight, and transitioning to its endpoint:

P[Xi | Xi−1] ∝ w(Xi−1, Xi).

A stationary distribution π is a distribution on the vertices such that if X0 is chosen according to π, then all Xi are distributed as π. Under mild conditions, namely strong connectivity and aperiodicity, the stationary distribution is unique.
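A single step of this walk is straightforward to sample; the following sketch reuses the nested-dict graph representation assumed above.

import random

def walk_step(w, u):
    # Choose an outgoing edge of u with probability proportional to its weight.
    neighbors = list(w[u])
    return random.choices(neighbors, weights=[w[u][v] for v in neighbors])[0]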

A random walk on an undirected graph is known to be time-reversible. That is, if X0 is started from the stationary distribution, then (Xi, Xi+1) is identical in distribution to (Xi+1, Xi). This does not hold for directed graphs. However, the time reversal of the process ..., Xi, Xi+1, ... corresponds to a random walk on a different digraph.

Definition 4. For a weighted digraph G = (V, E) with weights w : E → ℝ≥0 and stationary distribution π, define the time-reversal to be the graph G′ = (V, E′) on the same set of vertices, with edges reversed in direction, and weights given by w′(v, u) = π(u)w(u, v)/π(v).
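Definition 4 translates directly into code; here pi is assumed to be given as a dict mapping vertices to stationary probabilities.

def time_reversal(w, pi):
    # Reverse every edge (u, v) and set w'(v, u) = pi(u) * w(u, v) / pi(v).
    wr = {u: {} for u in w}
    for u in w:
        for v, weight in w[u].items():
            wr[v][u] = pi[u] * weight / pi[v]
    return wr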

The random walk on G′ shares the same stationary distribution π as G. The random walk on G′ (started from π) is identical in distribution to the time-reversed random walk on G (started from π); see [LP17].

Theorem 5 ([Ald90; Bro89]). Suppose that G = (V, E) is a strongly connected weighted graph, with weights given by w : E → ℝ>0. Perform a time-reversed random walk, starting from a vertex r, until all vertices are visited at least once. For each vertex v ∈ V \ {r}, record the edge used in the random walk to reach v for the first time. Let T be the collection of all these first-visit edges (with directions reversed). Then T is an arborescence of G rooted at r, and

P[T] ∝ ∏_{e∈T} w(e).
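For reference, here is a direct sequential rendering of Theorem 5; it runs the time-reversed walk to the cover time (which can be exponentially large), so it is a baseline rather than the parallel algorithm of this paper. It reuses walk_step and time_reversal from the sketches above, and assumes the stationary distribution pi is given.

def aldous_broder_arborescence(w, pi, r):
    # Walk on the time-reversal from r until all vertices are visited,
    # recording each vertex's first-visit edge with its direction reversed.
    wr = time_reversal(w, pi)
    first_visit = {}
    cur, visited = r, {r}
    while len(visited) < len(w):
        nxt = walk_step(wr, cur)
        if nxt not in visited:
            visited.add(nxt)
            first_visit[nxt] = (nxt, cur)  # reversal of the step cur -> nxt
        cur = nxt
    return list(first_visit.values())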

This allows us to sample an r-rooted arborescence from a digraph by performing a random walk on the time-reversal. It is also known that among all arborescences, the total weight of those rooted at r is proportional to π(r), where π is the stationary distribution [Ald90; Bro89]. Therefore to resolve Theorem 1, it is enough to show how to sample r-rooted arborescences.

Theorem 6. There is an RNC algorithm which takes a directed graph G = (V, E) together with edge weights w : E → ℝ≥0 and a root vertex r ∈ V as input and outputs a random r-rooted arborescence, where

P[T] ∝ ∏_{e∈T} w(e).

We can fix the root because the stationary distribution π can be computed in NC, by solving a system of linear equations.
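Concretely, one can set up the system πP = π with a normalization constraint; a numpy sketch (a sequential stand-in for the NC linear-algebra primitives) follows, assuming an irreducible chain.

import numpy as np

def stationary_distribution(w):
    # Solve pi^T P = pi^T together with sum(pi) = 1.
    vs = list(w)
    n = len(vs)
    idx = {u: i for i, u in enumerate(vs)}
    P = np.zeros((n, n))
    for u in w:
        deg = sum(w[u].values())
        for v, weight in w[u].items():
            P[idx[u], idx[v]] = weight / deg
    A = P.T - np.eye(n)
    A[-1, :] = 1.0           # replace one redundant equation by normalization
    b = np.zeros(n)
    b[-1] = 1.0
    pi = np.linalg.solve(A, b)
    return {u: pi[idx[u]] for u in vs}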

2.3 Parallel Algorithms

In this paper we consider parallel algorithms run on the PRAM model, and we will construct algorithms to show that sampling problems are in the class RNC, the randomized version of NC. In this class, we are allowed to use polynomially (in the input size) many processors which share access to a common random access memory, and also have access to random bits. The running time of our algorithms must be polylogarithmic in the input size.

While our work hinges upon the simulation of a random walk, a process which is inherently sequential, for polynomially many steps, such a task is known to be in RNC through the use of a "doubling trick."

Theorem 7 ([Ten95b]). Suppose that G = (V, E) is a directed graph, with weights given by w : E → ℝ≥0. The transcript of a random walk starting from a given vertex v0 ∈ V and running for L time steps can be produced in the PRAM model using poly(|E|, L) processors in polylog(|E|, L) time.

Theorem 5 and Theorem 7 naturally lead to a parallel algorithm that randomly samples arborescences using poly(|E|, L) processors and polylog(|E|, L) time, where L is the cover time of the digraph. However, L can be exponential in the size of the graph, as the cover time depends closely on the edge weights of a graph. In directed graphs, the cover time can be exponential even if we do not allow weights. See Figs. 1 and 2.

Luckily, Theorem 5 only needs the first-visit edges to produce the random arborescence. This insight has been heavily used to improve the running time of sequential algorithms for sampling spanning trees [KM09; MST15; Sch18], by shortcutting the random walk. Here we use the same insight to design an RNC algorithm. Our algorithm identifies a hierarchy of clusters of vertices S ⊆ V and will only simulate incoming and outgoing edges of each cluster as opposed to the entire walk. To do this, we will use a well-known primitive that the probability of entering or exiting a cluster through any edge can be computed using a system of linear equations.

Lemma 8 ([see, e.g., Sch18]). Given a set S ⊂ V and vertex v ∈ S, the probability of a random walk started at v exiting S through any particular edge e ∈ δ−(S) can be computed by solving a system of linear equations involving the Laplacian.

Some of the most powerful primitives for NC algorithms come from linear algebra. In particular, multiplying matrices, computing determinants, and inverting matrices all have NC algorithms [Csa75]. Combining this with Lemma 8 we obtain the following folklore result.

Lemma 9. Given a set S ⊆ V and vertex v ∈ S, the probability of a random walk started at v exiting S through any particular edge e ∈ δ−(S) can be computed in NC.
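The linear system behind Lemma 8 is easy to write down explicitly; the sketch below solves it with numpy (again a sequential stand-in for the NC primitives), for a walk on the nested-dict graph w, a vertex set S that the walk eventually leaves, a start vertex v, and an exit edge e = (s, t) with s in S and t outside S.

import numpy as np

def exit_probability(w, S, v, e):
    # For u in S: p(u) = sum_{x in S} P(u, x) p(x) + [u == s] * P(s, t),
    # i.e. (I - P_SS) p = b with b supported on the tail of e.
    s, t = e
    vs = list(S)
    idx = {u: i for i, u in enumerate(vs)}
    A = np.eye(len(vs))
    b = np.zeros(len(vs))
    for u in vs:
        deg = sum(w[u].values())
        for x, weight in w[u].items():
            if x in S:
                A[idx[u], idx[x]] -= weight / deg
        if u == s:
            b[idx[u]] = w[s][t] / deg
    p = np.linalg.solve(A, b)
    return p[idx[v]]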

2.4 Schur Complements

A key tool which we shall use in our analysis of random walks is Schur complements.

Definition 10. For any weighted digraph G = (V, E), for any subset of the vertices S ⊆ V, we define the graph GS to be the Schur complement of S̄. GS is the directed graph formed by starting with the induced subgraph of S. Then, for every pair of vertices (u, v) in S, add an edge from u to v with weight deg(u) · P[u → v], where P[u → v] is the probability that a random walk on G currently at u exits S with the next move and its first return to S is at the vertex v.

Then, it is easy to see that for any starting vertex v0 ∈ S, the distribution over random walk transcripts in GS starting at v0 is the same as the distribution over random walk transcripts in G starting at v0 with all vertices outside of S removed. We also observe that: Proposition 11. For any Eulerian weighted digraph G = (V, E), for any subset of the vertices S ⊂ V, the Schur complement GS is also an Eulerian weighted digraph and the degree of any vertex in S is the same in GS as in G.

3 Reduction to Eulerian Graphs

A key part of our algorithm is the analysis of the time it takes for random walks to cover regions of a digraph. Cover times are in general difficult to analyze on directed graphs, but the situation becomes much easier on Eulerian graphs. For more precise statements, see Section 6. As a first step in our algorithm, we show how to reduce the design of sampling algorithms on arbitrary digraphs to Eulerian ones.

We can reduce the problem of sampling an arborescence of a digraph G to sampling a random arborescence rooted at a vertex r on a strongly connected Eulerian digraph G′′, as described in Algorithm 1. We begin by selecting the vertex at which the arborescence is rooted, using the Markov chain tree theorem [Ald90; Bro89].

Theorem 12 (The Markov chain tree theorem). Let G = (V, E) be a weighted directed graph. Assume that the natural Markov chain associated with G has stationary distribution π = (π1, ..., πn). Then we have the following relationship between π and the arborescences of G:

πi = ∑{w(T) | T arborescence rooted at i} / ∑{w(T) | T arborescence},

where w(T) is the product of edge weights in T.

Finding the stationary distribution of a random walk on some graph G reduces to solving a system of n linear equations and can be done in NC. After the root r ∈ V of an arborescence is selected, we can add outgoing edges from r to graph G to guarantee that G is strongly connected. As no arborescence rooted at r will contain these new edges, the addition of such edges does not affect the distribution of arborescences rooted at r. Then, we note that for any vertex v, if we multiply the weights of all edges in δ−(v) by some constant, the distribution of arborescences rooted at any vertex does not change. We can therefore rescale edges to ensure that the resulting graph is Eulerian.

Proposition 13. Graph G′′ is Eulerian.

Proof. We have that, for all v in G′,

∑_{(u,v)∈δ+(v)} π′(u) · w′(u, v)/deg′out(u) = π′(v).

Then, considering G′′:

deg′′in(v) = ∑_{(u,v)∈δ+(v)} w′′(u, v) = ∑_{(u,v)∈δ+(v)} π′(u) · w′(u, v)/deg′out(u) = π′(v)

= π′(v) · ∑_{(v,x)∈δ−(v)} w′(v, x)/deg′out(v) = ∑_{(v,x)∈δ−(v)} w′′(v, x) = deg′′out(v).

Algorithm 1: Reduction to strongly-connected Eulerian graphs

Compute the stationary distribution π(v) of a random walk on G
Choose vertex r to be the root of the arborescence with probability π(r)
G′ = (V, E′) ←− G
for v ∈ V \ {r} in parallel do
    if e′ = (r, v) ∉ E then
        Add e′ to E′ with weight w′(e′) = 1
Compute the stationary distribution π′(v) of a random walk on G′
G′′ = (V, E′′) ←− G′
for v ∈ V do
    for e = (v, u) ∈ E′′ do
        w′′(e) ←− π′(v) · w′(e)/deg′out(v)
Randomly sample an arborescence rooted at r from G′′
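A sequentialized Python sketch of this reduction (reusing stationary_distribution from Section 2) may help; the two stationary-distribution computations and the rescaling are the linear-algebra steps, and the loops are trivially parallel.

def make_eulerian(w, r):
    # G': add weight-1 edges r -> v to guarantee strong connectivity.
    wp = {u: dict(nbrs) for u, nbrs in w.items()}
    for v in wp:
        if v != r and v not in wp[r]:
            wp[r][v] = 1.0
    # G'': rescale each vertex's outgoing edges; by Proposition 13 the result is Eulerian.
    pi = stationary_distribution(wp)
    wpp = {}
    for v in wp:
        deg_out = sum(wp[v].values())
        wpp[v] = {u: pi[v] * weight / deg_out for u, weight in wp[v].items()}
    return wpp

Since each vertex's outgoing weights are scaled by a common factor, the distribution of r-rooted arborescences is unchanged, as argued above.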

Thus, in RNC, the task of randomly sampling an arborescence in some digraph can be reduced to the task of randomly sampling an arborescence rooted at some vertex r from a strongly connected Eulerian digraph.

4 Cycle Decomposition

Because Theorem 5 only needs the first-visit edges to produce a random arborescence, our main idea is to create a hierarchical clustering of the graph that we call T. Every element S ∈ T in the clustering is a subset of V which is strongly connected by edges of relatively high weight compared to the edges in δ−(S). Intuitively, a random walk will spend much time traversing edges inside a cluster before venturing out. Aside from first-visit edges, any other traversals do not need to be simulated; thus much of the work can be avoided by only simulating the first few edges which enter or exit a cluster. Consider the pseudocode in Algorithm 2 for generating one such decomposition:

Algorithm 2: Generating a hierarchical decomposition

for edge e ∈ G in parallel do
    Find a cycle Ce of edges such that e ∈ Ce and for every e′ ∈ Ce, w(e′) ≥ w(e)/m, using Algorithm 3
Sort the cycles found by minimum edge weight in decreasing order C1, C2, ..., Cm
T ←− {}
for i = 1, 2, ..., m in parallel do
    Find Si,1, Si,2, ..., Si,j, the vertex sets of the connected components in (V, C1 ∪ C2 ∪ ··· ∪ Ci)
    Add all elements found to T
for Si,j ∈ T \ {Sm,1} in parallel do
    Find the node of the form Si+1,k such that Si,j ⊂ Si+1,k and assign Si+1,k as the parent of Si,j
Contract duplicate nodes in T
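The hierarchy construction is essentially repeated union of cycle edge sets; the following sequential sketch uses union-find, with cycles given as lists of (u, v, weight) triples (an assumed input format), and records each newly formed non-singleton component.

class DSU:
    def __init__(self, xs):
        self.p = {x: x for x in xs}
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]
            x = self.p[x]
        return x
    def union(self, a, b):
        self.p[self.find(a)] = self.find(b)

def build_hierarchy(vertices, cycles):
    # Add cycles in decreasing order of their minimum edge weight; after each
    # addition, every connected component becomes a cluster (duplicates skipped).
    cycles = sorted(cycles, key=lambda c: min(e[2] for e in c), reverse=True)
    dsu = DSU(vertices)
    seen, hierarchy = set(), []
    for cyc in cycles:
        for (u, v, _) in cyc:
            dsu.union(u, v)
        comps = {}
        for x in vertices:
            comps.setdefault(dsu.find(x), []).append(x)
        for comp in comps.values():
            key = frozenset(comp)
            if len(comp) > 1 and key not in seen:
                seen.add(key)
                hierarchy.append(sorted(comp))
    return hierarchy

Parent pointers in T then go from each recorded cluster to the smallest later cluster containing it.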

We use Algorithm 3 to construct a cycle of comparably high-weight edges containing a given edge e. The collection of these cycles is a "loose decomposition" of the Eulerian graph into cycles. While an Eulerian graph can be decomposed into cycles in general, we do not know if this can be done in NC. Instead we find a collection of cycles whose sum and poly(m) times it sandwich the Eulerian graph. For each edge e, Algorithm 3 successfully returns a cycle since:

Proposition 14. In a directed weighted Eulerian graph G = (V, E), for any edge e ∈ E, there exists a cycle of edges Ce such that e ∈ Ce and for all e′ ∈ Ce, w(e′) ≥ w(e)/m.

Algorithm 3: FindCycle(e = (e0, e1), G = (V, E))

Construct G′ = (V, E′), where E′ = {e′ ∈ E : w(e′) ≥ w(e)/m}
for v1, v2 ∈ V in parallel do
    if (v1, v2) ∈ E′ then
        R(v1, v2, 1) ←− True
for v ∈ V in parallel do
    R(v, v, 0) ←− True
    R(v, v, 1) ←− True
for l ∈ {2, 4, ..., 2^⌈ln n⌉} do
    for v1, v2, v3 ∈ V in parallel do
        if R(v1, v2, l/2) and R(v2, v3, l/2) then
            R(v1, v3, l) ←− True
c0 ←− e1, c_{2^⌈ln n⌉} ←− e0
for l ∈ {2^{⌈ln n⌉−1}, 2^{⌈ln n⌉−2}, ..., 1} do
    for i ∈ [2^⌈ln n⌉] such that l is the largest power of 2 which divides i, in parallel do
        Set ci to be some vertex such that R(c_{i−l}, ci, l) and R(ci, c_{i+l}, l)
Prune the transcript of vertices {c0, c1, ..., c_{2^⌈ln n⌉}} to remove any duplicates and return the resulting cycle.
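In place of the log-depth reachability relation R(·, ·, ·) built by repeated squaring, a sequential sketch can simply search the thresholded subgraph; Proposition 14 guarantees the search succeeds on Eulerian inputs.

from collections import deque

def find_cycle(w, e, m):
    # Keep only edges of weight >= w(e)/m, then find a path from e's head e1
    # back to its tail e0; together with e this closes the desired cycle.
    e0, e1 = e
    thresh = w[e0][e1] / m
    heavy = {u: [v for v, wt in w[u].items() if wt >= thresh] for u in w}
    parent = {e1: None}
    q = deque([e1])
    while q:
        u = q.popleft()
        if u == e0:
            break
        for v in heavy[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    path = [e0]
    while path[-1] != e1:
        path.append(parent[path[-1]])
    path.reverse()  # vertices in walk order e1, ..., e0
    return [(e0, e1)] + list(zip(path, path[1:]))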

Proof. Note that any Eulerian digraph can be decomposed into at most m cycles, where each cycle contains edges of uniform weight. The cycles containing e have weights summing to w(e), so for any edge e, one of these cycles must have weight at least w(e)/m.

Given that a cycle is found containing every edge in G, it follows that Algorithm 2 successfully generates a hierarchical decomposition which satisfies:

Proposition 15. T is a laminar family, i.e. it has the following properties:

1. Every node S of T is a subset of V and is the vertex set of a connected component of G.
2. Children of any node in T are proper subsets of it.
3. |T| ≤ 2n.

Given this clustering T, for any S ∈ T, we say an edge e ∈ E jumps between children of S if S is the lowest node in T which contains both endpoints of e. We call e a jumping edge of S. In addition, we note that

Lemma 16. For any S ∈ T, let wmax(S) be the maximum weight among the jumping edges of S. G(S) is strongly connected by edges of weight wmax(S)/m.

Proof. We prove by contradiction. Assume that there is some edge e jumping between children of S and two vertices u, v ∈ S such that every path between u and v contains an edge of weight less than w(e)/m. Let the cycles which connect S in Algorithm 2 be C1, ..., Ck, with Ck being the cycle with an edge of minimum weight. Then, consider the path of edges connecting u to v only containing edges in these cycles. The weight of every edge in the path must be at least w(Ck), the weight of the minimum weight edge in Ck. Thus by our assumption we must have that w(e)/m > w(Ck).

But now consider Ce, the cycle found containing edge e, with minimum edge weight w(Ce) ≥ w(e)/m. As w(Ce) > w(Ck), this cycle must have been used to construct a child of S in T. This implies that e is contained in a child of S and is therefore not a jumping edge of S, a contradiction.

Lemma 16 intuitively states that the nodes in T are tightly connected by edges of relatively high weight compared to incoming or outgoing edges. This will be critical in bounding the number of edges which need to be simulated.

5 Parallel Sampling

For clarity, a partially sampled random walk shall be represented as an alternating series of clusters and edges: S1, e1, S2, e2, ..., where each Si ∈ T and ei ∈ δ−(Si) ∩ δ+(Si+1). The sequence has a natural interpretation of a random walk in Si which then exits Si through ei to continue in Si+1 before traversing ei+1 and so on.

Once we have the hierarchical decomposition T found in Section 4, a natural way to use such a decomposition is to simulate a random walk as follows:

1. Simulate just the edges jumping between high-level clusters of a random walk, producing a transcript of the form S1, e1, S2, e2, ....

2. For each i, recursively simulate a random walk on Si conditioned on exiting Si through edge ei.

While this recursive procedure as described is not in RNC, using doubling tricks and memoization, we show how to perform this in RNC. One can view this as a generalization of the doubling trick used by Teng [Ten95b].

5.1 Simulating Jumping Edges

To simulate edges jumping across children of a node, we generalize the techniques of [Ten95b] to the following:

Theorem 17. Suppose that G = (V, E) is a directed graph, with weights given by w : E → ℝ≥0. Let S ⊆ V be the disjoint union of S1, S2, ..., Sk, where S and each Si is strongly connected. The transcript of the first L edges jumping across different Si in a random walk starting from a given vertex v0 ∈ S and conditioned on exiting S through some edge e ∈ δ−(S) can be produced in the PRAM model using poly(|E|, L) processors and polylog(|E|, L) time.

Proof. We denote this function JumpingEdges(S, v0, e, L). See Algorithm 4 for pseudocode. Note that we can condition a walk on S to exit via some edge e ∈ δ−(S) by considering a random walk on the graph induced by S with an additional edge e and an absorbing state at the endpoint of e. For any vertex v ∈ S, let Sv denote the unique Si which contains v.

From Lemma 9, for any v ∈ S, the probability that a random walk currently at v exits Sv through each edge in δ−(Sv) can be computed in NC. Thus, after using poly(|E|, L) processors and polylog(|E|, L) time to compute all such probabilities for every vertex v ∈ S, we can assume access to a deterministic function Next(v, X) which takes a vertex v ∈ S and a number X ∈ [0, 1] and outputs an edge e ∈ δ−(Sv) such that if X is chosen uniformly from [0, 1], then the probability that Next(v, X) returns some edge e ∈ δ−(Sv) is the probability that a random walk currently at v exits Sv through edge e.
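Next(v, X) is just an inverse-CDF lookup over precomputed exit probabilities; a small sketch follows, where exit_probs[v] is an assumed list of (edge, probability) pairs summing to one (computable in NC by Lemma 9).

import bisect

def make_next(exit_probs):
    # Build cumulative tables once; each Next(v, X) call is then a binary search.
    tables = {}
    for v, pairs in exit_probs.items():
        edges = [e for e, _ in pairs]
        cum, total = [], 0.0
        for _, p in pairs:
            total += p
            cum.append(total)
        tables[v] = (edges, cum)
    def Next(v, X):
        edges, cum = tables[v]
        i = min(bisect.bisect_left(cum, X), len(edges) - 1)
        return edges[i]
    return Next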

Given this function, if we generate independent uniformly random numbers X1, ..., XL ∈ [0, 1], the transcript of the jumping edges in the random walk can be extracted by starting at v = v0 and repeatedly applying Next(v, Xi) to get the next vertex v, for i ∈ [L]. Of course, a naive implementation does not leverage parallelism, so we instead use a doubling trick. We compute the values End(v, t, l), which would be the final edge jumping between children of S when we run a random walk starting at v and applying Next with random inputs Xt, Xt+1, ..., Xt+l−1. For l = 1, these are 1-step random walks and we use Next to compute them. For simplicity, we allow the first argument of End to also be an edge, in which case the random walk starts at the ending vertex of that edge. Then we use the following identity, which allows us to compute End values for a particular l from End values for l/2:

End(v, t, l) = End(End(v, t, l/2), t + l/2, l/2).

This is because we can break an l-step random walk into two l/2-step random walks. So we can compute all End values when l ranges over powers of 2, in log(L) steps. Finally, to compute the edge taken by the random walk at any time l = 1, ..., L, we simply write down l as a sum of powers of 2, and repeatedly use the End function to compute the l-th edge of the random walk which jumps between children of S.

Algorithm 4: JumpingEdges(S, v0, eend, L)

for t ∈ {1, 2, ..., L} in parallel do
    Let Xt be a uniformly random number sampled from [0, 1]
    for v ∈ S in parallel do
        End(v, t, 1) ←− Next(v, Xt)
for l ∈ {2, 4, 8, ..., 2^⌊lg L⌋} do
    for v ∈ S, t ∈ {0, 1, ..., L − 1 − l} in parallel do
        End(v, t, l) ←− End(End(v, t, l/2), t + l/2, l/2)
for l ∈ {1, 2, ..., L} in parallel do
    Decompose l as a sum of a subset S ⊂ {1, 2, 4, ..., 2^⌊lg L⌋} of powers of 2
    e ←− (∗, v0)
    t ←− 0
    for s ∈ S do
        e ←− End(e, t, s)
        t ←− t + s
    el ←− e
    vl ←− the ending vertex of el
if ek = eend for some k then
    return Sv0, e1, Sv1, e2, ..., Svk−1, ek = eend
else
    return Sv0, e1, Sv1, e2, ..., SvL−1, eL, S, eend
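The doubling recurrence is easy to state in code; this sequentialized sketch fills the End table level by level (each level is what Algorithm 4 computes in one parallel round) and omits the conditioning on the exit edge eend. Next(u, x) and endpoint(e) are assumed helpers.

def jumping_edges(vertices, Next, endpoint, v0, L, X):
    # End[(u, t, l)] = last edge reached by applying Next with randomness
    # X[t], ..., X[t + l - 1] starting from vertex u.
    End = {}
    for t in range(L):
        for u in vertices:
            End[(u, t, 1)] = Next(u, X[t])
    l = 2
    while l <= L:
        for t in range(L - l + 1):
            for u in vertices:
                half = End[(u, t, l // 2)]
                End[(u, t, l)] = End[(endpoint(half), t + l // 2, l // 2)]
        l *= 2
    def kth_edge(k):
        # Decompose k into powers of two and chain End lookups in time order.
        u, t = v0, 0
        for p in reversed(range(k.bit_length())):
            if (k >> p) & 1:
                e = End[(u, t, 1 << p)]
                u, t = endpoint(e), t + (1 << p)
        return e
    return [kth_edge(k) for k in range(1, L + 1)]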

Remark 18. For clarity of exposition, we allow sampling numbers from [0, 1]. However, sampling a number with polynomially many bits is enough for our purposes. In our algorithms, we only need to compare our samples Xt with deterministic numbers derived from the input. To this end, we can simply sample the first N digits of the binary expansion of Xt for some large N. The probability that the comparison of Xt and a fixed number is not determined from the first N digits is 2^−N. By taking N to be polynomially large, the probability of failure will be bounded by an exponentially small number. If one insists on avoiding even this small probability of failure, we can continue sampling digits of Xt whenever we run into a situation where the first N are not enough. Since the probability of running into failure is very small, the overall expected running time still remains polylogarithmic.

5.2 Extracting First-Visit Edges

For a cluster, once we know which edges jump across its children, we can recursively fill in the transcript of the walk. This is because we now have a similar subproblem for each child node, where we have a starting vertex and a prespecified exit edge. However, a naive implementation of this is not in RNC. Instead our strategy is to again use a doubling trick.

Consider a node S of T and a vertex v ∈ S. Using the algorithm JumpingEdges, we can extract the edges that jump across children of S. We can then extend these to include edges that jump across children of children of S by more applications of JumpingEdges. We will construct a function AllEdges(S, v, e, L, l) that besides the arguments to JumpingEdges also takes an integer l ≥ 1. The goal of this function is to extract from the transcript of a random walk started at v and conditioned on exiting S through e, the first L edges which jump between descendants of depth at most l below S in T. Then, each intermediate node in the transcript is either an individual vertex, an l-th descendant of S, or a cluster for which L edges jumping between its children have already been found. We will show that for polynomially large L, this transcript will contain all first-visit edges with high probability.

A natural approach for computing AllEdges is as follows. We already know how to compute AllEdges for l = 1; that is just a call to JumpingEdges. Once the value of AllEdges has been computed for all settings of parameters for a particular l, we can compute it for 2l by the following procedure:

1. Let the transcript returned by AllEdges(S, v, e, L, l) be S1, e1, S2, e2, ..., Sk, ek.

2. For each intermediate node Si that is not a single vertex, if the transcript does not already contain L edges jumping between children of Si, let u be the endpoint of ei−1 and replace Si, ei with AllEdges(Si, u, ei, L, l).

3. Trim the transcript to only contain at most the first L edges jumping between children of any node.

There is one key problem with this approach. We cannot precompute AllEdges(·, ·, ·, ·, l) and use the same precomputed answer to fill in the gaps for larger values of l. Each time we need to use AllEdges for a particular setting of its input parameters, we really need to use fresh randomness; otherwise the transcript we extract will not be from a true random walk. We resolve this by using a caching trick. Instead of computing AllEdges(S, v, e, L, l) for each setting of parameters once and reusing the same output for larger subproblems, we compute M possible answers for a large enough M and store all M answers. For larger subproblems, every time that we need to use the values of a smaller subproblem, we randomly pick one of the stored M answers. We will inevitably reuse some of the answers in this process; however, when we restrict our attention to the unraveling of a particular answer for a subproblem, there is a high probability of not having reused any answers. We will formalize this in Lemma 19. Pseudocode for computing AllEdges can be found in Algorithm 5. In the end we use AllEdges(V, v, ∅, L, |V|) to extract a list of edges, and with high probability all first-visit edges will be among this list.

Algorithm 5: Computing the random walk transcript extracts

for S ∈ T, e ∈ δ−(S) ∪ {∅}, v ∈ S, i ∈ [M] in parallel do
    AllEdges(S, v, e, L, 1)[i] ←− JumpingEdges(S, v, e, L)
for l ∈ {1, 2, 4, ..., 2^{⌈lg L⌉−1}} do
    for S ∈ T, e ∈ δ−(S) ∪ {∅}, v ∈ S, i ∈ [M] in parallel do
        if l is greater than the depth of the deepest child of S then
            AllEdges(S, v, e, L, 2l)[i] ←− AllEdges(S, v, e, L, l)[i]
        else
            Let S1, e1, S2, e2, ..., Sk be the output of AllEdges(S, v, e, L, l)[i]
            for j ∈ [k] in parallel do
                if the transcript does not already contain L edges jumping between children of Sj then
                    Replace Sj, ej with a randomly chosen instance of AllEdges(Sj, v′, ej, L, l) where v′ is the endpoint of ej−1
            Trim the resulting transcript to only contain at most the first L edges jumping between children of any node.
            Save the resulting transcript as a solution to AllEdges(S, v, e, L, 2l)[i]
if the final transcript of AllEdges(V, v, ∅, L, |V|) contains multiple subsequences which depend on the same call to JumpingEdges then
    Replace all but the first subsequence with a freshly sampled transcript

Lemma 19. For M = poly(L), when we unravel the recursion tree for the computation of the value AllEdges(V, v, ∅, L, |V|), no stored answer of AllEdges for any of the subproblems will be used more than once with high probability.

Proof. Note that for any S ∈ T, v ∈ S, e ∈ δ−(S), the recursion tree for an answer to AllEdges(V, v, ∅, L, |V|) will rely on at most L calls to JumpingEdges(S, v, e, L), since each call to JumpingEdges(S, v, e, L) corresponds to a subsequence of the final transcript which ends in e, and e can only occur L times in the transcript by the definition of AllEdges. As each sample of JumpingEdges(S, v, e, L) is chosen uniformly at random from M samples, for polynomially large M, the probability of the same sample being chosen twice can be made small, of the form 1/poly(n). Lastly, via a union bound over polynomially many possible combinations of arguments for JumpingEdges, we have that for polynomially large M, with high probability no stored answer of JumpingEdges or AllEdges will be used more than once.
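The caching trick can be packaged as a tiny helper; compute is an assumed procedure producing one fresh random answer for a subproblem, and M is the number of stored answers per subproblem.

import random

def cached_answers(compute, M):
    # Store M independent answers per argument tuple; serve a uniformly random
    # one per request. Reuse is possible, but by Lemma 19 it is unlikely along
    # any single unraveling, and Algorithm 5 patches residual duplicates.
    store = {}
    def get(*key):
        if key not in store:
            store[key] = [compute(*key) for _ in range(M)]
        return random.choice(store[key])
    return get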

Putting everything together, we have shown that:

Theorem 20. Suppose that G = (V, E) is a directed graph, with weights given by w : E → ℝ>0. Let T denote a hierarchical decomposition of G. The transcript of a random walk starting at v0 ∈ V which contains the first L edges jumping between children of S for any S ∈ T can be produced in the PRAM model using poly(|E|, L) processors in polylog(|E|, L) expected time.

6 Hierarchical Exploration Time

The last element needed to obtain an RNC algorithm for the random sampling of arborescences is to show that for the decomposition achieved above, for polynomially large L, simulating the first L edges which jump across each node S ∈ T will contain all first-visit edges to each vertex with high probability. We do this by bounding the number of edge traversals between children of each node before that node is covered. We build on techniques developed by Boczkowski, Peres, and Sousi [BPS18]. We start by bounding the number of times a given vertex is visited before cover time. For any v, s, t ∈ V, let Hv(s, t) denote the expected number of times a random walk starting at s visits v before reaching t; in the case that s = t, Hv(s, s) denotes the number of times a walk starting at s reaches v before returning to s.

It is easy to see that Hv satisfies a triangle inequality, namely:

Proposition 21. On any directed graph G = (V, E), for any v, s, t, u ∈ V: Hv(s, t) ≤ Hv(s, u) + Hv(u, t).

Additionally, as the stationary distribution of vertices on an Eulerian graph is proportional to the degree of a vertex, it follows that:

Proposition 22. On any directed Eulerian graph G = (V, E), for any v, s ∈ V: Hv(s, s) = π(v)/π(s) = deg(v)/deg(s).

Proof. Consider the Schur complement GS where S = {v, s}. Since G is Eulerian, so is GS and the degrees of v, s are the same in G and GS. Letting the weights of edges in GS be w(v, s), w(s, v), w(v, v), w(s, s), we must have w(s, v) = w(v, s) for GS to be Eulerian. As random walks on GS are distributed like random walks on G whose transcripts have been restricted to only contain vertices in S, the values of Hv(s, s), Hv(v, s) are the same for a walk on G as for one on GS. Considering one-step transitions, we can construct the following system of equations:

Hv(s, s) = (w(s, v)/deg(s)) · (Hv(v, s) + 1) + (w(s, s)/deg(s)) · 0,
Hv(v, s) = (w(v, v)/deg(v)) · (Hv(v, s) + 1) + (w(v, s)/deg(v)) · 0.

Solving yields Hv(v, s) = w(v, v)/w(v, s) and Hv(s, s) = (w(s, v)/deg(s)) · (w(v, v)/w(v, s) + 1) = (w(s, v)/deg(s)) · (deg(v)/w(v, s)) = deg(v)/deg(s).

For any two vertices which are connected by an edge, we have:

Lemma 23. On any directed Eulerian graph G = (V, E), for any v, s, t ∈ V such that e = (s, t) ∈ E, Hv(s, t) ≤ deg(v)/w(s, t).

Proof. Note that every time the walk is currently at s, there is a w(s, t)/deg(s) chance of moving to t; thus, in expectation, the number of times the walk returns to s before reaching t is at most deg(s)/w(s, t). Between each return to s, the expected number of visits to v is deg(v)/deg(s) by Proposition 22. Thus, we conclude that:

Hv(s, t) ≤ (deg(s)/w(s, t)) · (deg(v)/deg(s)) = deg(v)/w(s, t).

On graphs which are strongly connected by edges of some weight wc, this observation leads to the conclusion that:

Lemma 24. On an Eulerian directed graph G = (V, E) strongly connected by edges of weight wc, for any s, t ∈ V, Hs(s, t) ≤ n · deg(s)/wc.

Proof. By connectivity assumptions, there exists a sequence of at most n vertices s = v0, v1,..., vk = t which form a path connecting s and t such that w(vi−1, vi) ≥ wc for all i ∈ [k]. Then, combining Proposition 21 with Lemma 23,

Hs(s, t) ≤ ∑_{i=1}^{k} Hs(v_{i−1}, v_i) ≤ ∑_{i=1}^{k} deg(s)/w(v_{i−1}, v_i) ≤ n · deg(s)/wc.

This trivially bounds the expected number of returns to a vertex s before a graph G is covered by a random walk by n² · deg(s)/wc.

Remark 25. While this bound is sufficient for our purposes, using Matthews' trick, this bound can be further tightened to n log n · deg(s)/wc [Mat88].

Lemma 26. For any node S ∈ T in the decomposition obtained by Algorithm 2, the expected number of edges traversed between children of S before every vertex in S has been visited is at most n²m².

Proof. For each edge e = (u, v) which jumps between children of S, the expected number of times edge e is traversed is w(e)/deg(u) times the expected number of times vertex u is reached by a random walk before all vertices in S are reached. To bound the number of times u is reached before S is covered, consider a random walk on the Schur complement GS. The expected number of times u is reached in a random walk on G before S is covered is the same as the expected number of times u is reached by a random walk on GS before GS is covered.

By Proposition 11, GS will be Eulerian and the degrees of all vertices in S will be the same in G and GS. By Lemma 16, S is strongly connected by edges of weight wmax(S)/m, and so GS is as well. Then, by Lemma 24, the expected number of times u is hit before GS is covered is at most deg(u) · n²m/wmax(S). Thus, the expected number of times edge e is traversed before S is covered is at most

(deg(u) · n²m/wmax(S)) · (w(e)/deg(u)) ≤ n²m,

since wc ≥ w(e)/m. As there are at most m jumping edges of S, in expectation, at most n²m² edge traversals between children of S will occur before every vertex in S has been reached.

Thus, for large enough L = poly(n, m), by Markov's inequality, with high probability every node S ∈ T is covered by the time L edges have jumped across S. This means the transcript returned by AllEdges will contain all first-visit edges with high probability. As this transcript is polynomial in length, all first-visit edges and the corresponding arborescence they form can be extracted and returned in NC.

7 Discussion and Open Problems

We showed how to sample arborescences, and as a special case spanning trees, from a given weighted graph using an RNC algorithm. While this is a step in resolving the disparity between parallel sampling and parallel counting algorithms, more investigation is needed. In particular, for the list of problems with determinant-based counting in Section 1, designing RNC sampling algorithms remains open.

One of these problems in particular, namely sampling Eulerian tours from an Eulerian digraph, is intimately connected to sampling arborescences, due to the BEST theorem [AB87; TS41]. However the reduction, while polynomial-time implementable, is not known to be in NC.

Question 27. Is there an RNC algorithm for sampling Eulerian tours uniformly at random in Eulerian digraphs?

The important tool our result relied on was the Aldous-Broder algorithm. So it is natural to ask whether there are generalizations of the Aldous-Broder result to settings beyond spanning trees and arborescences. In particular, the bases of a regular matroid are a proper generalization of spanning trees in a graph, and they have a decomposition in terms of graphic, co-graphic, and some special constant-sized matroids [Sey80].

Question 28. Can we sample from a regular matroid, or equivalently from a volume-based distribution defined by totally unimodular vectors, in RNC?

Yet another direction for generalization is higher-dimensional equivalents of spanning trees and arborescences. For example, Gorodezky and Pak [GP14] provided a generalization of the algorithm of Wilson [Wil96] for sampling arborescences from graphs to hypergraphs. While Wilson's algorithm is different from that of Aldous-Broder, it is closely related. Can these generalizations be efficiently parallelized?

References

[AB87] T Aardenne-Ehrenfest and NG Bruijn. “Circuits and Trees in Oriented Linear Graphs”. In: Classic Papers in Combinatorics (1987), pp. 149–163.
[Ald90] David J Aldous. “The random walk construction of uniform spanning trees and uniform labelled trees”. In: SIAM Journal on Discrete Mathematics 3.4 (1990), pp. 450–465.
[Ana+20] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomials IV: Exchange Properties, Tight Mixing Times, and Faster Sampling of Spanning Trees”. In: arXiv preprint arXiv:2004.07220 (2020).
[BD13] Nikhil Balaji and Samir Datta. “Tree-width and logspace: Determinants and counting Euler tours”. In: arXiv preprint arXiv:1312.7468 (2013).
[BPS18] Lucas Boczkowski, Yuval Peres, and Perla Sousi. “Sensitivity of mixing times in Eulerian digraphs”. In: SIAM Journal on Discrete Mathematics 32.1 (2018), pp. 624–655.
[Bro89] Andrei Broder. “Generating random spanning trees”. In: Foundations of Computer Science, 1989., 30th Annual Symposium on. IEEE, 1989, pp. 442–447.
[CMN96] Charles J Colbourn, Wendy J Myrvold, and Eugene Neufeld. “Two algorithms for unranking arborescences”. In: Journal of Algorithms 20.2 (1996), pp. 268–281.
[Cre10] Patrick John Creed. “Counting and sampling problems on Eulerian graphs”. In: (2010).
[Csa75] Laszlo Csanky. “Fast parallel matrix inversion algorithms”. In: Foundations of Computer Science, 1975., 16th Annual Symposium on. IEEE. 1975, pp. 11–12.
[DFK91] Martin Dyer, Alan Frieze, and Ravi Kannan. “A random polynomial-time algorithm for approximating the volume of convex bodies”. In: Journal of the ACM (JACM) 38.1 (1991), pp. 1–17.
[DR10] Amit Deshpande and Luis Rademacher. “Efficient volume sampling for row/column subset selection”. In: Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE. 2010, pp. 329–338.
[Dur+17a] David Durfee, Rasmus Kyng, John Peebles, Anup B Rao, and Sushant Sachdeva. “Sampling random spanning trees faster than matrix multiplication”. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. 2017, pp. 730–742.
[Dur+17b] David Durfee, John Peebles, Richard Peng, and Anup B Rao. “Determinant-preserving sparsification of SDDM matrices with applications to counting and sampling spanning trees”. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE. 2017, pp. 926–937.
[GP14] Igor Gorodezky and Igor Pak. “Generalized loop-erased random walks and approximate reachability”. In: Random Structures & Algorithms 44.2 (2014), pp. 201–223.
[GV89] Ira M Gessel and Xavier Viennot. “Determinants, paths, and plane partitions”. In: preprint 132.197.15 (1989).
[Jia+08] Minghui Jiang, James Anderson, Joel Gillespie, and Martin Mayne. “uShuffle: a useful tool for shuffling biological sequences while preserving the k-let counts”. In: BMC bioinformatics 9.1 (2008), p. 192.
[JS96] Mark Jerrum and Alistair Sinclair. “The Markov chain Monte Carlo method: an approach to approximate counting and integration”. In: Approximation algorithms for NP-hard problems (1996), pp. 482–520.
[JSV04] Mark Jerrum, Alistair Sinclair, and Eric Vigoda. “A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries”. In: Journal of the ACM (JACM) 51.4 (2004), pp. 671–697.
[JVV86] Mark R Jerrum, Leslie G Valiant, and Vijay V Vazirani. “Random generation of combinatorial structures from a uniform distribution”. In: Theoretical computer science 43 (1986), pp. 169–188.
[Kas63] Pieter W Kasteleyn. “Dimer statistics and phase transitions”. In: Journal of Mathematical Physics 4.2 (1963), pp. 287–293.
[Kir47] Gustav Kirchhoff. “Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird”. In: Annalen der Physik 148.12 (1847), pp. 497–508.
[KM09] Jonathan A Kelner and Aleksander Madry. “Faster generation of random spanning trees”. In: Foundations of Computer Science, 2009. FOCS’09. 50th Annual IEEE Symposium on. IEEE. 2009, pp. 13–21.
[Lin73] Bernt Lindström. “On the vector representations of induced matroids”. In: Bulletin of the London Mathematical Society 5.1 (1973), pp. 85–90.
[LP17] David A Levin and Yuval Peres. Markov chains and mixing times. Vol. 107. American Mathematical Soc., 2017.
[Mat88] Peter Matthews. “Covering Problems for Brownian Motion on Spheres”. In: The Annals of Probability 16.1 (1988), pp. 189–199. doi: 10.1214/aop/1176991894.
[MST15] Aleksander Mądry, Damian Straszak, and Jakub Tarnawski. “Fast generation of random spanning trees and the effective resistance metric”. In: Proceedings of the twenty-sixth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics. 2015, pp. 2019–2036.
[Riv+08] Romain Rivière, Dominique Barth, Johanne Cohen, and Alain Denise. “Shuffling biological sequences with motif constraints”. In: Journal of Discrete Algorithms 6.2 (2008), pp. 192–204.
[Sch18] Aaron Schild. “An almost-linear time algorithm for uniform random spanning tree generation”. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. ACM. 2018, pp. 214–227.
[Sey80] Paul D Seymour. “Decomposition of regular matroids”. In: Journal of combinatorial theory, Series B 28.3 (1980), pp. 305–359.
[Ten95a] Shang-Hua Teng. “Independent sets versus perfect matchings”. In: Theoretical Computer Science 145.1-2 (1995), pp. 381–390.
[Ten95b] Shang-Hua Teng. “Independent sets versus perfect matchings”. In: Theoretical Computer Science 145.1-2 (1995), pp. 381–390.
[TS41] WT Tutte and Cedric AB Smith. “On unicursal paths in a network of degree 4”. In: The American Mathematical Monthly 48.4 (1941), pp. 233–237.
[Tut48] W. T. Tutte. “The dissection of equilateral triangles into equilateral triangles”. In: Mathematical Proceedings of the Cambridge Philosophical Society 44.4 (1948), pp. 463–482. doi: 10.1017/S030500410002449X.
[TV01] Prasad Tetali and Santosh Vempala. “Random sampling of Euler tours”. In: Algorithmica 30.3 (2001), pp. 376–385.
[Wil96] David Bruce Wilson. “Generating random spanning trees more quickly than the cover time”. In: Proceedings of the twenty-eighth annual ACM symposium on Theory of computing. 1996, pp. 296–303.