Approximate Inference on Planar Graphs Using Loop Calculus and Belief Propagation

Vicenç Gómez, Michael Chertkov, Hilbert J. Kappen

Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, 6525 EZ Nijmegen, The Netherlands
Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545

Abstract

We introduce novel results for approximate inference on planar graphical models using the loop calculus framework. The loop calculus (Chertkov and Chernyak, 2006b) allows one to express the exact partition function $Z$ of a graphical model as a finite sum of terms that can be evaluated once the belief propagation (BP) solution is known. In general, full summation over all correction terms is intractable. We develop an algorithm for the approach presented in Chertkov et al. (2008) which represents an efficient truncation scheme on planar graphs and a new representation of the series in terms of Pfaffians of matrices. We analyze in detail both the loop series and the Pfaffian series for models with binary variables and pairwise interactions, and show that the first term of the Pfaffian series can provide very accurate approximations. The algorithm outperforms previous truncation schemes of the loop series and is competitive with other state-of-the-art methods for approximate inference.

1 Introduction

Graphical models are popular tools widely used in many areas which require modeling of uncertainty. They provide an effective approach through a compact representation of the joint probability distribution. The two most common types of graphical models are Bayesian Networks (BNs) and Markov Random Fields (MRFs).

The partition function of a graphical model, which plays the role of normalization constant in an MRF or probability of evidence (likelihood) in a BN, is a fundamental quantity which arises in many contexts such as hypothesis testing or parameter estimation. Exact computation of this quantity is only feasible when the graph is not too complex, or equivalently, when its tree-width is small. Currently many methods are devoted to approximating this quantity.

The belief propagation (BP) algorithm (Pearl, 1988) is at the core of many of these methods. Initially conceived as an exact algorithm for tree graphs, it is widely used as an approximation method for loopy graphs (Murphy et al., 1999; Frey and MacKay, 1998). The exact partition function $Z$ is explicitly related to the BP approximation through the loop calculus framework introduced by Chertkov and Chernyak (2006b). Loop calculus allows one to express $Z$ as a finite sum of terms (loop series) that can be evaluated once the BP solution is known. Each term maps uniquely to a subgraph, also denoted as a generalized loop, in which every node has degree at least 2. Summation of the entire loop series is a hard combinatorial task since the number of generalized loops is typically exponential in the size of the graph. However, different approximations can be obtained by considering different subsets of generalized loops in the graph.

Although it has been shown empirically (Gómez et al., 2007; Chertkov and Chernyak, 2006a) that truncating this series may provide efficient corrections to the initial BP approximation, a formal characterization of the classes of tractable problems via loop calculus remains an open question. The work of Chertkov et al. (2008) represents a step in this direction, where it was shown that for any graphical model, summation of a certain subset of terms can be mapped to a summation of weighted perfect matchings on an extended graph. For planar graphs (graphs that can be embedded into a plane without crossing edges), this summation can be performed in polynomial time by evaluating the Pfaffian of a skew-symmetric matrix associated with the extended graph. Furthermore, the full loop series can be expressed as a sum over Pfaffian terms, each one accounting for a large number of loops and solvable in polynomial time as well.

This approach builds on classical results from the 1960s by Kasteleyn (1963), Fisher (1966) and other physicists who showed that in a planar graphical model defined in terms of binary variables, computing $Z$ can be mapped to a weighted perfect matching problem and calculated in polynomial time under the key restriction that interactions only depend on agreement or disagreement between the signs of their variables. Such a model is known in statistical physics as the Ising model without external field. Notice that exact inference for a general binary graphical model on a planar graph (i.e. Ising model with external field) is intractable (Barahona, 1982).

Recently, some methods for inference over graphical models, based on the works of Kasteleyn and Fisher, have been introduced. Globerson and Jaakkola (2007) obtained upper bounds on $Z$ for non-planar graphs with binary variables by decomposition of $Z$ into a weighted sum over partition functions of spanning tractable (zero field) planar models. Another example is the work of Schraudolph and Kamenetsky (2009), which provides a framework for exact inference on a restricted class of planar graphs using the approach of Kasteleyn and Fisher.

Contrary to the two aforementioned approaches, which rely on exact inference on a tractable planar model, the loop calculus directly leads to a framework for approximate inference on general planar graphs. Truncating the loop series according to Chertkov et al. (2008) already gives the exact result in the zero external field case. In the general planar case, however, this truncation may result in an accurate approximation that can be incrementally corrected by considering subsequent terms in the series.

2 Belief Propagation and Loop Series for Planar Graphs

We consider the Forney graph representation, also called general vertex model (Forney, 2001; Loeliger, 2004), of a probability distribution $p(\boldsymbol{\sigma})$ defined over a vector $\boldsymbol{\sigma}$ of binary variables (vectors are denoted using bold symbols). Forney graphs are associated with general graphical models which subsume other factor graphs, e.g. those corresponding to BNs and MRFs.

A binary Forney graph $\mathcal{G} := (\mathcal{V}, \mathcal{E})$ consists of a set of nodes $\mathcal{V}$ where each node $a \in \mathcal{V}$ represents an interaction and each edge $(a, b) \in \mathcal{E}$ represents a binary variable $ab$ which takes values $\sigma_{ab} := \{\pm 1\}$. We denote by $\bar{a}$ the set of neighbors of node $a$. Interactions $f_a(\boldsymbol{\sigma}_a)$ are arbitrary functions defined over typically small subsets of variables, where $\boldsymbol{\sigma}_a$ is the vector of variables associated with node $a$, i.e. $\boldsymbol{\sigma}_a := (\sigma_{ab_1}, \sigma_{ab_2}, \ldots)$ where $b_i \in \bar{a}$. The joint probability distribution of such a model factorizes as:

$$p(\boldsymbol{\sigma}) = Z^{-1} \prod_{a \in \mathcal{V}} f_a(\boldsymbol{\sigma}_a), \qquad Z = \sum_{\boldsymbol{\sigma}} \prod_{a \in \mathcal{V}} f_a(\boldsymbol{\sigma}_a), \tag{1}$$

where $Z$ is the partition function.

From a variational perspective, a fixed point of the BP algorithm represents a stationary point of the Bethe "free energy" approximation under proper constraints (Yedidia et al., 2005):

$$Z^{BP} = \exp\left(-F^{BP}\right), \tag{2}$$

$$F^{BP} = \sum_{a} \sum_{\boldsymbol{\sigma}_a} b_a(\boldsymbol{\sigma}_a) \ln \frac{b_a(\boldsymbol{\sigma}_a)}{f_a(\boldsymbol{\sigma}_a)} - \sum_{b \in \bar{a}} \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) \ln b_{ab}(\sigma_{ab}), \tag{3}$$

where $b_a(\boldsymbol{\sigma}_a)$ and $b_{ab}(\sigma_{ab})$ are the beliefs (pseudo-marginals) associated with each node $a \in \mathcal{V}$ and variable $ab$. For graphs without loops, Equation (2) coincides with the Gibbs "free energy" and therefore $Z^{BP}$ coincides with $Z$. If the graph contains loops, $Z^{BP}$ is just an approximation critically dependent on how strong the influence of the loops is.

We now introduce some convenient definitions.

Definition 1 A generalized loop in a graph $\mathcal{G}$ is any subgraph $C$ such that each node in $C$ has degree 2 or larger.

We use the term "loop" instead of "generalized loop" for the rest of the manuscript. $Z$ is explicitly represented in terms of the BP solution via the loop series expansion:

$$Z = Z^{BP} \cdot z, \qquad z = \left(1 + \sum_{C \in \mathcal{C}} r_C\right), \qquad r_C = \prod_{a \in C} \mu_{a;\bar{a}_C}, \tag{4}$$

where $\mathcal{C}$ is the set of all the loops within the graph. Each loop term $r_C$ is a product of terms $\mu_{a;\bar{a}_C}$ associated with every node $a$ of the loop. $\bar{a}_C$ denotes the set of neighbors of $a$ within the loop $C$:

$$\mu_{a;\bar{a}_C} = \frac{\displaystyle\sum_{\boldsymbol{\sigma}_a} b_a(\boldsymbol{\sigma}_a) \prod_{b \in \bar{a}_C} (\sigma_{ab} - m_{ab})}{\displaystyle\prod_{b \in \bar{a}_C} \sqrt{1 - m_{ab}^2}}, \qquad m_{ab} = \sum_{\sigma_{ab}} \sigma_{ab}\, b_{ab}(\sigma_{ab}). \tag{5}$$

We consider planar graphs with all nodes of degree not larger than 3, i.e. $|\bar{a}_C| \le 3$. We denote by triplet a node with degree 3 in the graph.

Figure 1: Example. (a) A Forney graph. (b) Corresponding extended graph. (c) Loops (in bold) included in the 2-regular partition function $Z_\emptyset$. (d) Loops (in bold and red) not included in $Z_\emptyset$. Marked in red, the triplets associated with each loop. Grouped in gray squares, the loops considered in different subsets

Figure 2: Fisher's rules. (Top) A node $a$ of degree 2 in $\mathcal{G}$ is split into 2 nodes in $\mathcal{G}_{ext}$. (Bottom) A node $a$ of degree 3 in $\mathcal{G}$ is split into 3 nodes in $\mathcal{G}_{ext}$. Right boxes include all matchings in $\mathcal{G}_{ext}$ related to node $a$ in $\mathcal{G}$.

[...] otherwise. The following identity allows one to obtain the Pfaffian up to a sign by computing the determinant:

$$\mathrm{Pfaffian}^2(A) = \mathrm{Det}(A).$$
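The identity relating the squared Pfaffian to the determinant can be checked on a small example. The following is a minimal sketch, not the authors' implementation: the helper names `pfaffian`, `determinant` and `minor`, and the 4x4 test matrix `A`, are assumptions introduced only for illustration. Both helpers use plain recursive expansion along the first row, which takes exponential time and is suitable only for tiny matrices; it is chosen here so that the arithmetic stays exact and easy to follow.

```python
def minor(m, keep):
    """Submatrix of m restricted to the row/column indices in keep."""
    return [[m[i][j] for j in keep] for i in keep]


def pfaffian(a):
    """Pfaffian of a skew-symmetric matrix, by expansion along row 0.

    Assumes a is square and skew-symmetric (a[i][j] == -a[j][i]).
    """
    n = len(a)
    if n == 0:
        return 1          # Pfaffian of the empty matrix
    if n % 2 == 1:
        return 0          # odd-sized skew-symmetric matrices have Pfaffian 0
    total = 0
    for j in range(1, n):
        # Pair index 0 with index j, then recurse on the remaining indices.
        keep = [k for k in range(1, n) if k != j]
        sign = 1 if j % 2 == 1 else -1
        total += sign * a[0][j] * pfaffian(minor(a, keep))
    return total


def determinant(m):
    """Determinant by cofactor expansion along row 0 (exact for integers)."""
    n = len(m)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(sub)
    return total


# Hypothetical 4x4 skew-symmetric matrix with integer weights.
A = [
    [0, 1, 2, 3],
    [-1, 0, 4, 5],
    [-2, -4, 0, 6],
    [-3, -5, -6, 0],
]

pf = pfaffian(A)        # 1*6 - 2*5 + 3*4 = 8
det = determinant(A)    # 64
print(pf, det, pf * pf == det)
```

For this matrix the three terms of the expansion correspond to the three perfect matchings of four indices, {01, 23}, {02, 13} and {03, 12}. Recovering the Pfaffian from the determinant in practice yields only its absolute value, so the sign has to be fixed by other means.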

