Advanced Acceptance/Rejection Methods for Monte Carlo Algorithms

Mark Huber
Department of Mathematics and Institute of Statistics and Decision Sciences, Duke University
Math Phys & Prob Seminar, March 14, 2006

Monte Carlo methods

The basic question: for a random variable X and a measurable event A, what is P(X ∈ A)?

Classical statistics: finding p-values
Bayesian statistics: learning about posterior distributions
Statistical physics: approximating a partition function
Computer science: approximation of #P-complete problems

Basic acceptance/rejection

Fixed number of trials A/R
Input: A, L(X), n
Output: p̂, an estimate of P(X ∈ A)
1) Let s ← 0
2) For i from 1 to n do
3)   Draw T from the distribution of X
4)   If T ∈ A, let s ← s + 1
5) Let p̂ ← s/n

Running time

Definition. Suppose problem instance I has a true solution S(I), and a randomized algorithm $\mathcal{A}$ returns the random variable $\mathcal{A}(I)$. Then $\mathcal{A}$ is a $(1+\delta, \epsilon)$ randomized approximation algorithm if
$$\mathbb{P}\left(\frac{1}{1+\delta} \le \frac{S(I)}{\mathcal{A}(I)} \le 1+\delta\right) \ge 1-\epsilon.$$

Theorem. Suppose p = P(X ∈ A). Then basic A/R is a $(1+\delta, \epsilon)$ randomized approximation algorithm for
$$n = \Theta\left(\frac{1}{p}\cdot\frac{1}{\delta^2}\cdot\ln(\epsilon^{-1})\right).$$
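The fixed-number-of-trials A/R procedure from the Basic acceptance/rejection slide can be sketched in Python as follows. This is a minimal illustration, not code from the talk: the names `basic_ar`, `draw`, and `in_A` are hypothetical, and the toy event (a Unif([0, 1]) draw falling below 0.3, so p = 0.3) is an assumption chosen only to exercise the routine.

```python
import random


def basic_ar(draw, in_A, n, rng):
    """Estimate P(X in A) from a fixed number n of trials.

    draw(rng) returns one draw T from the distribution of X (hypothetical helper);
    in_A(T) returns True when T lies in the event A (hypothetical helper).
    """
    s = 0  # number of successes so far
    for _ in range(n):
        if in_A(draw(rng)):
            s += 1
    return s / n  # fraction of trials that landed in A


# Assumed toy example: X ~ Unif([0, 1]) and A = [0, 0.3), so p = 0.3.
rng = random.Random(0)
p_hat = basic_ar(lambda r: r.random(), lambda x: x < 0.3, 10_000, rng)
print(p_hat)
```

The number of trials n is fixed in advance, so the accuracy of the estimate depends on the unknown p, which is the 1/p dependence in the theorem.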
Random successes to random trials

Basic A/R has a fixed number of trials and a random number of successes:
[Figure: 10 trials, estimate is 3/10]

Better idea: a fixed number of successes and a random number of trials:
[Figure: 4 successes, estimate is 4/11]

Solving the 1/p problem

Recall: for G ∼ Geo(p), E[G] = 1/p.

Fixed number of successes A/R (Dagum, Luby, Karp, Ross 2000)
Input: A, L(X), k
Output: p̂, an estimate of P(X ∈ A)
1) Let t ← 0
2) For i from 1 to k do
3)   Draw T from the distribution of X
4)   Let t ← t + 1
5)   If T ∉ A, go to step 3
6) Let p̂ ← k/t

No reliance on p

Theorem. The fixed number of successes A/R is a $(1+\delta, \epsilon)$ randomized approximation algorithm when
$$k = \Theta\left(\frac{1}{\delta^2}\ln(\epsilon^{-1})\right).$$
Let R be the running time of this algorithm. Then
$$\mathbb{E}[R] = \Theta\left(\frac{1}{p}\cdot\frac{1}{\delta^2}\ln(\epsilon^{-1})\right).$$
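A minimal Python sketch of the fixed-number-of-successes A/R procedure, under the same assumptions as a generic sampler interface: `fixed_success_ar`, `draw`, and `in_A` are hypothetical names, and the toy event with p = 0.3 is an assumed example rather than anything from the talk.

```python
import random


def fixed_success_ar(draw, in_A, k, rng):
    """Estimate P(X in A) by drawing until k successes have been seen.

    Returns (p_hat, t), where t is the random total number of draws.
    draw and in_A are hypothetical helpers supplied by the caller.
    """
    t = 0  # total number of draws used
    for _ in range(k):
        while True:  # repeat draws until one lands in A
            t += 1
            if in_A(draw(rng)):
                break
    return k / t, t


# Assumed toy example: X ~ Unif([0, 1]) and A = [0, 0.3), so p = 0.3.
rng = random.Random(0)
p_hat, trials = fixed_success_ar(lambda r: r.random(), lambda x: x < 0.3, 2_000, rng)
print(p_hat, trials)
```

Here k is chosen without knowing p. The total number of draws is a sum of k geometric random variables, so its mean is k/p, matching the 1/p factor in the expected running time.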
Note on $\mathbb{E}[R] = \Theta\left(\frac{1}{p}\cdot\frac{1}{\delta^2}\ln(\epsilon^{-1})\right)$: 1) the algorithm is noninterruptible, 2) the 1/p factor is bad.

Methods for improving on acceptance/rejection

1 Weighting draws with exponential shifts
    Chernoff bounds
2 Sequential Acceptance/Rejection
    Perfect matchings, the permanent, and Bregman’s Theorem
3 The Randomness Recycler
    The Ising model

Tails of sums of random variables

Consider the following problem: given iid random variables X1, X2, X3, ..., can upper and lower bounds be found for
$$p = \mathbb{P}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} \ge \alpha\right)?$$

Chernoff bounds

One approach is to use Chernoff bounds. For all t > 0:
$$\begin{aligned}
\mathbb{P}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} \ge \alpha\right)
&= \mathbb{P}\big(t(X_1 + X_2 + \cdots + X_n) \ge tn\alpha\big) \\
&= \mathbb{P}\big(e^{t(X_1 + \cdots + X_n)} \ge e^{tn\alpha}\big) \\
&\le \frac{\mathbb{E}[e^{t(X_1 + \cdots + X_n)}]}{e^{tn\alpha}} \\
&\le \left[\frac{\mathbb{E}[e^{tX_1}]}{e^{t\alpha}}\right]^n
\end{aligned}$$
Nice feature: the bound captures exponential behavior in n.

Questions about Chernoff bounds

What value of t is best?
How accurate are the bounds?
Is there a way to use this upper bound in acceptance/rejection?

A bad approach

Naive tail sums
1) Draw X1, ..., Xn independently from the distribution of X1
2) Accept if (X1 + ··· + Xn)/n ≥ α

Problem: the chance of landing in the tail is exponentially small.
Example: for U1, ..., U100 ∼ Unif([0, 1]),
$$\mathbb{P}\big((U_1 + \cdots + U_{100})/100 \ge 0.65\big) \approx 7 \cdot 10^{-8}.$$
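To get a feel for the question of which t is best, the Chernoff bound for the Unif([0, 1]) example can be minimized numerically. The sketch below assumes the closed form E[e^{tU}] = (e^t − 1)/t for U ∼ Unif([0, 1]); the function names and the ternary search are illustrative choices, not part of the talk.

```python
import math


def chernoff_log_bound(t, alpha, n):
    """Log of (E[e^{tU}] / e^{t*alpha})^n for U ~ Unif([0, 1]), t > 0."""
    # E[e^{tU}] = (e^t - 1) / t; expm1 keeps this accurate for small t.
    return n * (math.log(math.expm1(t) / t) - t * alpha)


def best_t(alpha, n, lo=1e-6, hi=50.0, iters=200):
    """Ternary search for the minimizing t (the log bound is convex in t)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if chernoff_log_bound(m1, alpha, n) < chernoff_log_bound(m2, alpha, n):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


t_star = best_t(0.65, 100)
bound = math.exp(chernoff_log_bound(t_star, 0.65, 100))
print(t_star, bound)
```

The optimized bound is an upper bound on the tail probability, so it should come out above the value of about 7 · 10^−8 quoted in the example; how far above is one measure of the accuracy of the bound.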
Weighting draws

Solution: weight draws towards larger values (Bucklew 2005).
Add a factor of $e^{tx}$ to the distribution:
$$\mathbb{P}(R_1 \in dx) = \frac{e^{tx}\,\mathbb{P}(X_1 \in dx)}{\int e^{tx}\,\mathbb{P}(X_1 \in dx)} = \frac{e^{tx}\,\mathbb{P}(X_1 \in dx)}{\mathbb{E}[e^{tX_1}]}$$
[Figure: the distribution before and after weighting]

Summing with weights

$$\begin{aligned}
\mathbb{P}(R_1 \in dx_1, \ldots, R_n \in dx_n)
&= \frac{e^{tx_1} e^{tx_2} \cdots e^{tx_n}}{C}\,\mathbb{P}(X_1 \in dx_1, \ldots, X_n \in dx_n) \\
&= \frac{e^{t(x_1 + \cdots + x_n)}}{C}\,\mathbb{P}(X_1 \in dx_1, \ldots, X_n \in dx_n)
\end{aligned}$$
where C is the normalizing constant.

Consequences

Let $S_n = X_1 + \cdots + X_n$ and $T_n = R_1 + \cdots + R_n$. Note
$$\mathbb{P}(T_n \in ds) = e^{ts}\,\mathbb{P}(S_n \in ds)/\mathbb{E}[e^{tS_n}].$$
To remove the $e^{ts}$ factor in the measure, let $Y \mid T_n \sim \mathrm{Unif}[0, e^{tT_n}]$. Now
$$\mathbb{P}(T_n \in ds, Y \in dy) = \mathbb{P}(S_n \in ds)\,\mathbf{1}(y \in [0, e^{ts}])\,dy/\mathbb{E}[e^{tS_n}].$$

Using the auxiliary variable

Note that for t > 0, if $s \ge n\alpha$, then $e^{ts} \ge e^{tn\alpha}$, so the region $y \le e^{tn\alpha}$ lies inside $[0, e^{ts}]$ for every such s.

Theorem.
$$[T_n \mid T_n \ge n\alpha,\ Y \le e^{tn\alpha}] \sim [S_n \mid S_n \ge n\alpha].$$

Theorem to algorithm

Weighted tail sums
1) Draw R1, ..., Rn independently from the distribution of X1 weighted by $e^{tx}$
2) Draw Y uniformly from 0 to $e^{t(R_1 + \cdots + R_n)}$
3) Accept if (R1 + ··· + Rn)/n ≥ α and $Y \le e^{tn\alpha}$

Important fact: the probability of acceptance in this scheme is
$$\mathbb{P}\big((X_1 + \cdots + X_n)/n \ge \alpha\big) \Big/ \left[\frac{\mathbb{E}[e^{tX_1}]}{e^{t\alpha}}\right]^n,$$
that is, the true tail probability divided by the Chernoff bound.

Running time results

Theorem (Huber 2006 [6]). Suppose that
P(X1 > α) > 0,
E[R1] = α, and
$\mathbb{E}[|R_1|^3]$ exists.
Then the probability of acceptance is $\Theta(1/\sqrt{n})$, giving an $O(n^{3/2})$ sampling algorithm.
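The weighted tail sums procedure can be sketched for the Unif([0, 1]) example with n = 100 and α = 0.65. Several pieces here are assumptions made for illustration: the weighted uniform is sampled by inverting its CDF, t is found by bisection so that E[R1] = α, and step 2 is implemented by drawing U ∼ Unif([0, 1]) and accepting when U ≤ e^{t(nα − T)}, which is the same as Y = U · e^{tT} ≤ e^{tnα}. All function names are hypothetical.

```python
import math
import random


def tilted_mean(t):
    """Mean of the density proportional to e^{tx} on [0, 1], for t > 0."""
    return 1.0 / (-math.expm1(-t)) - 1.0 / t


def solve_t(alpha, lo=1e-3, hi=50.0):
    """Bisection for the t with tilted_mean(t) = alpha (the mean increases in t)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if tilted_mean(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def draw_tilted(t, rng):
    """Inverse-CDF draw from the density proportional to e^{tx} on [0, 1]."""
    return math.log1p(rng.random() * math.expm1(t)) / t


def weighted_tail_sum(n, alpha, t, rng):
    """Return (draws, proposals) with sum(draws) >= n * alpha."""
    proposals = 0
    while True:
        proposals += 1
        r = [draw_tilted(t, rng) for _ in range(n)]
        total = sum(r)
        # Y = U * e^{t*total} <= e^{t*n*alpha}  <=>  U <= e^{t*(n*alpha - total)}
        if total >= n * alpha and rng.random() <= math.exp(t * (n * alpha - total)):
            return r, proposals


rng = random.Random(1)
t = solve_t(0.65)
results = [weighted_tail_sum(100, 0.65, t, rng) for _ in range(200)]
accept_rate = len(results) / sum(p for _, p in results)
print(t, accept_rate)
```

Each accepted vector has a sum of at least nα, and the observed acceptance rate can be compared with the ratio of the tail probability to the Chernoff bound stated in the important fact.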
Under these conditions, the Berry-Esseen Theorem says the sum of the Ri is close to normal, and so close to its mean.

2 Sequential Acceptance/Rejection

Self-reducible problems

[Figure: problem A splits into subproblems A1, A2, A3]

Perfect matchings/Dimer coverings

Definition. A perfect matching is a collection of edges of a graph such that each node is an endpoint of exactly one edge in the collection.

The Permanent

Bipartite graphs can be encoded with an n by n matrix of 0’s and 1’s:
1 1 1
1 1 0
0 1 1
The number of perfect matchings is the permanent of the matrix.

Relation to the determinant

The determinant of a matrix A can be defined as:
$$\det(A) = \sum_{\sigma \in S_n} (-1)^{\operatorname{sign}(\sigma)} \prod_{i=1}^{n} A(i, \sigma(i))$$
The permanent of a matrix A can be defined as:
$$\operatorname{per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} A(i, \sigma(i))$$

Goals

Twin goals:
Generate uniformly from the set of perfect matchings
Estimate the number of perfect matchings (#P-complete)

Markov chain approach
Broder [1] created a chain on matchings + near perfect matchings
Jerrum/Sinclair [8] showed the chain is polynomial under certain conditions
Jerrum/Sinclair/Vigoda [9] gave a method for all 0-1 matrices, $\Theta(n^9)$
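For small matrices the permanent can be computed directly from its definition as a sum over permutations, which gives a check on any sampling or counting scheme. The sketch below is a brute-force illustration only (it takes n! steps); the name `permanent` is hypothetical, and the 3 by 3 matrix is the one from the permanent slide, read row by row as an assumption about its flattened layout.

```python
from itertools import permutations


def permanent(a):
    """Permanent of a square matrix: sum over sigma of prod_i a[i][sigma(i)]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= a[i][sigma[i]]
        total += prod
    return total


# Matrix from the permanent slide, assumed to be listed row by row.
example = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]
print(permanent(example))  # number of perfect matchings of the bipartite graph
```

For a 0-1 matrix each permutation contributes 1 exactly when it picks out a perfect matching, so the sum counts the perfect matchings.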
Induction and an algorithm

Say problem A breaks into problems A1, ..., Ak and
bound(A) ≥ bound(A1) + ··· + bound(Ak)

Sequential A/R
1) Let bound(A0) ← bound(A) − $\sum_j$ bound(Aj)
2) Choose X with P(X = i) = bound(Ai)/bound(A) for i = 0, 1, ..., k
3) If X = 0, reject and quit
4) Else let A ← A_X and go to 1)

Running time

The running time is at most O(k · bound(A)/soln(A)).
It is important to find an upper bound that is within a polynomial factor of the solution.

Bregman’s Theorem

Theorem (Bregman’s Theorem). For an n by n 0-1 matrix A with row sums $r_i$,
$$\operatorname{per}(A) \le \prod_{i=1}^{n} (r_i!)^{1/r_i}.$$
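Bregman’s bound is cheap to evaluate, which is what makes it a candidate for bound(A). The sketch below computes it for a 0-1 matrix; `bregman_bound` is a hypothetical name, and the example is the 3 by 3 matrix from the permanent slide, assumed to be listed row by row. The sketch treats a matrix with an all-zero row as having bound 0, since its permanent is 0.

```python
import math


def bregman_bound(a):
    """Bregman's upper bound prod_i (r_i!)^(1/r_i) on the permanent of a 0-1 matrix."""
    bound = 1.0
    for row in a:
        r = sum(row)  # row sum r_i
        if r == 0:
            return 0.0  # an all-zero row forces the permanent to be 0
        bound *= math.factorial(r) ** (1.0 / r)
    return bound


# Matrix from the permanent slide, assumed to be listed row by row.
example = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]
print(bregman_bound(example))
```

This matrix has 3 perfect matchings, so the ratio of the bound to the permanent, which drives the running time bound(A)/soln(A), stays close to 1 here.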
