From Elimination to Belief Propagation

School of Computer Science The Belief Propagation (Sum-Product) Algorithm Probabilistic Graphical Models (10-708) Lecture 5, Sep 31, 2007 Receptor A Receptor B X1 X2 Eric Xing X Kinase C 3 Kinase D X4 Kinase E X5 TF F X6 Reading: J-Chap 4 Gene G X 7 Gene H X8 1 From Elimination to Belief Propagation z Recall that Induced dependency during marginalization is captured in elimination cliques z Summation <-> elimination z Intermediate term <-> elimination clique B A B A A C A A C D A E F C D E z Can this lead to an generic E E F inference algorithm? G H Eric Xing 2 1 Tree GMs Undirected tree: a Directed tree: all Poly tree: can have unique path between nodes except the root multiple parents any pair of nodes have exactly one parent We will come back to this later Eric Xing 3 Equivalence of directed and undirected trees z Any undirected tree can be converted to a directed tree by choosing a root node and directing all edges away from it z A directed tree and the corresponding undirected tree make the same conditional independence assertions z Parameterizations are essentially the same. z Undirected tree: z Directed tree: z Equivalence: z Evidence:? Eric Xing 4 2 From elimination to message passing z Recall ELIMINATION algorithm: z Choose an ordering Z in which query node f is the final node z Place all potentials on an active list z Eliminate node i by removing all potentials containing i, take sum/product over xi. z Place the resultant factor back on the list z For a TREE graph: z Choose query node f as the root of the tree z View tree as a directed tree with edges pointing towards from f z Elimination ordering based on depth-first traversal z Elimination of each node can be considered as message-passing (or Belief Propagation) directly along tree branches, rather than on some transformed graphs Æ thus, we can use the tree itself as a data-structure to do general inference!! Eric Xing 5 The elimination algorithm Procedure Initialize (G, Z) Procedure Normalization (φ∗) ∗ ∗ 1. Let Z1, . ,Zk be an ordering of Z 1. P(X|E)=φ (X)/∑xφ (X) such that Zi ≺ Zj iff i < j 2. Initialize F with the full the set of factors Procedure Evidence (E) 1. for each i∈ΙE , Procedure Sum-Product-Eliminate-Var ( F =F ∪δ(Ei, ei) F, // Set of factors Procedure Sum-Product-Variable- Z // Variable to be eliminated ) Elimination (F, Z, ≺) 1. F ′←{φ ∈ F : Z ∈ Scope[φ]} 1. for i = 1, . , k 2. F ′′ ← F − F ′ F ← Sum-Product-Eliminate-Var(F, Zi) ∗ 3. ψ ←∏φ∈F ′ φ 2. φ ← ∏φ∈F φ 4. τ ← ∑ ψ 3. return φ∗ Z 5. return F ′′ ∪ {τ} 4. Normalization (φ∗) Eric Xing 6 3 Message passing for trees Let mij(xi) denote the factor resulting from f eliminating variables from bellow up to i, which is a function of xi: This is reminiscent of a message sent i from j to i. j k l mij(xi) represents a "belief" of xi from xj! Eric Xing 7 z Elimination on trees is equivalent to message passing along tree branches! f i j k l Eric Xing 8 4 The message passing protocol: z A node can send a message to its neighbors when (and only when) it has received messages from all its other neighbors. z Computing node marginals: z Naïve approach: consider each node as the root and execute the message passing algorithm X m21(x1) 1 Computing P(X1) m32(x2) m42(x2) X2 X3 X4 Eric Xing 9 The message passing protocol: z A node can send a message to its neighbors when (and only when) it has received messages from all its other neighbors. z Computing node marginals: z Naïve approach: consider each node as the root and execute the message passing algorithm X m12(x2) 1 Computing P(X2) m32(x2)m42(x2) X2 X3 X4 Eric Xing 10 5 The message passing protocol: z A node can send a message to its neighbors when (and only when) it has received messages from all its other neighbors. z Computing node marginals: z Naïve approach: consider each node as the root and execute the message passing algorithm X m12(x2) 1 Computing P(X3) m23(x3) m42(x2) X2 X3 X4 Eric Xing 11 Computing node marginals z Naïve approach: z Complexity: NC z N is the number of nodes z C is the complexity of a complete message passing z Alternative dynamic programming approach z 2-Pass algorithm (next slide Î) z Complexity: 2C! Eric Xing 12 6 The message passing protocol: z A two-pass algorithm: X1 m21(X 1) m12(X 2) m32(X 2) X2 m42(X 2) X4 X3 m24(X 4) m23(X 3) Eric Xing 13 Belief Propagation (SP-algorithm): Sequential implementation Eric Xing 14 7 Belief Propagation (SP-algorithm): Parallel synchronous implementation z For a node of degree d, whenever messages have arrived on any subset of d-1 node, compute the message for the remaining edge and send! z A pair of messages have been computed for each edge, one for each direction z All incoming messages are eventually computed for each node Eric Xing 15 Correctness of BP on tree z Collollary: the synchronous implementation is "non-blocking" z Thm: The Message Passage Guarantees obtaining all marginals in the tree z What about non-tree? Eric Xing 16 8 Another view of SP: Factor Graph z Example 1 X X X f X 1 5 fa 1 d 5 X X 3 fc 3 f fe X2 X4 b X2 X4 P(X1) P(X2) P(X3|X1,X2) P(X5|X1,X3) P(X4|X2,X3) fa(X1) fb(X2) fc(X3,X1,X2) fd(X5,X1,X3) fe(X4,X2,X3) Eric Xing 17 Factor Graphs z Example 2 X1 X1 fa fc X2 X3 X2 X3 f ψ(x1,x2,x3) = fa(x1,x2)fb(x2,x3)fc(x3,x1) b z Example 3 X1 X1 fa X2 X3 X2 X3 ψ(x1,x2,x3) = fa(x1,x2,x3) Eric Xing 18 9 Factor Tree z A Factor graph is a Factor Tree if the undirected graph obtained by ignoring the distinction between variable nodes and factor nodes is an undirected tree X1 X1 fa X2 X3 X2 X3 ψ(x1,x2,x3) = fa(x1,x2,x3) Eric Xing 19 Message Passing on a Factor Tree z Two kinds of messages 1. ν: from variables to factors 2. µ: from factors to variables x f1 j x f xi fs i s f3 xk Eric Xing 20 10 Message Passing on a Factor Tree, con'd z Message passing protocol: z A node can send a message to a neighboring node only when it has received messages from all its other neighbors z Marginal probability of nodes: x f1 j x f xi fs i s f3 xk P(xi) ∝∏s 2 N(i) µsi(xi) ∝νis(xi)µsi(xi) Eric Xing 21 BP on a Factor Tree X X3 1 X2 ν µ 1d µd2 e2 ν3e f f X X1 d X2 e 3 µ ν ν µ d1 2d 2e e3 µ µa1 ν3c c3 ν1a µb2 ν2b f fa fb c Eric Xing 22 11 Why factor graph? z Tree-like graphs to Factor trees X1 X1 X 2 X2 X X3 4 X 3 X4 X5 X6 X5 X6 Eric Xing 23 Poly-trees to Factor trees X X1 2 X 1 X2 X3 X4 X3 X4 X5 X5 Eric Xing 24 12 Why factor graph? z Because FG turns tree-like X1 X1 graphs to factor trees, z and trees are a data-structure X 2 X2 that guarantees correctness of X X3 4 BP ! X3 X4 X5 X6 X5 X6 X1 X2 X 1 X2 X3 X4 X3 X4 X5 X5 Eric Xing 25 Max-product algorithm: computing MAP probabilities f i j k l Eric Xing 26 13 Max-product algorithm: computing MAP configurations using a final bookkeeping backward pass f i j k l Eric Xing 27 Summary z Sum-Product algorithm computes singleton marginal probabilities on: z Trees z Tree-like graphs z Poly-trees z Maximum a posteriori configurations can be computed by replacing sum with max in the sum-product algorithm z Extra bookkeeping required Eric Xing 28 14 Inference on general GM z Now, what if the GM is not a tree-like graph? z Can we still directly run message message-passing protocol along its edges? z For non-trees, we do not have the guarantee that message-passing will be consistent! z Then what? z Construct a graph data-structure from P that has a tree structure, and run message-passing on it! Æ Junction tree algorithm Eric Xing 29 Elimination Clique z Recall that Induced dependency during marginalization is captured in elimination cliques z Summation <-> elimination z Intermediate term <-> elimination clique B A B A A C A A C D A E F C D E z Can this lead to an generic E E F inference algorithm? G H Eric Xing 30 15 A Clique Tree B A B A A m m C c b A A md C D mf A E F me C D E mh E E F mg G H me (a,c,d) = ∑ p(e | c,d)mg (e)m f (a,e) e Eric Xing 31 From Elimination to Message Passing z Elimination ≡ message passing on a clique tree B A B A B A B A B A B A B A B A A C D C D C D C D C D C D C E F E F E F E F E G H G H G ≡ B A B A A m m C c b A A md C D mf A m (a,c,d) E F e me C D = p(e | c,d)mg (e)m f (a,e) ∑ E mh e E E F mg G H z Messages can be reused Eric Xing 32 16 From Elimination to Message Passing z Elimination ≡ message passing on a clique tree z Another query ..

Load more