Image Segmentation Combining Markov Random Fields and Dirichlet Processes

ANR meeting

Jessica SODJO
IMS, Groupe Signal Image, Talence
Supervisors: A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon

Outline
1 Introduction
2 Segmentation using DP models
   Mixed MRF / DP model
   Inference: Swendsen-Wang algorithm
3 Hierarchical segmentation with shared classes
   Principle
   HDP theory
4 Conclusion and perspective

Introduction

Segmentation
– partition of an image into $K$ homogeneous regions called classes
– label the pixels: pixel $i \leftrightarrow z_i \in \{1, \dots, K\}$

Bayesian approach
– prior on the distribution of the pixels
– all the pixels in a class have the same distribution, characterized by a parameter vector $U_k$
– Markov Random Fields (MRF): exploit the similarity of pixels in the same neighbourhood

Constraint: $K$ must be fixed a priori
Idea: use Bayesian nonparametric (BNP) models to estimate $K$ directly

Segmentation using DP models

Notations
– $N$ is the number of pixels
– $Y$ is the observed image
– $Z = \{z_1, \dots, z_N\}$
– $\Pi = \{A_1, \dots, A_K\}$ is a partition and $m = \{m_1, \dots, m_K\}$ with $m_k = |A_k|$

FIGURE: Example of partition (regions $A_1$, $A_2$, $A_3$, $A_K$ with sizes $m_1 = 1$, $m_2 = 5$, $m_3 = 6$, $m_K = 4$)

Mixed MRF / DP model

Markov Random Fields (MRF)
– Description of the image by a neighbourhood system

FIGURE: Examples of neighbourhood systems (4-neighbours and 8-neighbours of the considered pixel)

– A clique $c$ is either a singleton or a set of pixels that are all neighbours of one another

Let $\theta_i \in \{U_1, \dots, U_K\}$ be the parameter vector associated with the $i$-th pixel.

MRF $\Leftrightarrow$ $p(\theta_i \mid \theta_{-i}) = p(\theta_i \mid \theta_{V(i)})$, where $V(i)$ is the set of neighbours of pixel $i$.

Hammersley-Clifford theorem $\Rightarrow$ Gibbs field:

$$p(\theta) = \frac{1}{Z_\Phi} \exp\left(-\Phi(\theta)\right) = \frac{1}{Z_\Phi} \exp\Big(-\sum_{c} \Phi_c(\theta_c)\Big) \tag{1}$$

with $\Phi_c(\theta_c)$ the local potential and $\Phi(\theta)$ the global one.

Limitation: $K$ is assumed to be known

Potts model

The Potts model is a special MRF defined by:

$$M(\Pi) \propto \exp\Big(\sum_{i \leftrightarrow j} \beta_{ij} \mathbb{1}_{z_i = z_j}\Big) \tag{2}$$

where
– $i \leftrightarrow j$ means that the pixels $i$ and $j$ are neighbours
– $\beta_{ij} > 0$ if $i$ and $j$ are neighbours and $\beta_{ij} = 0$ otherwise

The DP model

$$\tau'_k \mid \gamma, H \sim \mathrm{Beta}(1, \gamma), \qquad \tau_k = \tau'_k \prod_{l=1}^{k-1} (1 - \tau'_l) \tag{3}$$

where $\mathrm{Beta}(\cdot)$ is the Beta distribution.

Let us write $\tau \sim \mathrm{Stick}(\gamma)$, with $\tau = \{\tau_1, \tau_2, \dots\}$ and $\sum_{k=1}^{\infty} \tau_k = 1$.

$$G \mid \gamma, H \sim \mathrm{DP}(\gamma, H), \qquad G = \sum_{k=1}^{\infty} \tau_k \delta_{U_k} \tag{4}$$

with

$$U_k \mid H \overset{\text{iid}}{\sim} H \tag{5}$$

The distribution of the observations is $f$, defined as:

$$y_i \mid \theta_i \sim f(\cdot \mid \theta_i) \quad \text{and} \quad \theta_i \mid G \sim G \tag{6}$$

The Chinese Restaurant Process gives the predictive distribution:

$$\theta_i \mid \theta_{-i} \sim \sum_{k=1}^{K^{-i}} \frac{m_k^{-i}}{N - 1 + \gamma} \delta_{U_k} + \frac{\gamma}{N - 1 + \gamma} H$$

– $m_k^{-i}$ is the size of cluster $k$ if we remove pixel $i$ from the partition
– $K^{-i}$ is the number of clusters in the image with the $i$-th pixel removed
– $U_k$ is the parameter vector associated with the $k$-th cluster

Limitation: the spatial interactions are not taken into account

Principle of the segmentation using DP models

Define a distribution on the partitions using:
– an MRF model, which makes pixels in the same neighbourhood likely to be in the same cluster
– a DP model, to infer the number of clusters automatically (and, if needed, their parameters)
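The stick-breaking construction (3) and the Chinese Restaurant Process weights above can be illustrated with a minimal sketch. The function names `stick_breaking` and `crp_weights`, the truncation level, and the example cluster sizes (taken from the partition figure) are assumptions made for illustration; they are not part of the slides.

```python
import random

def stick_breaking(gamma, truncation, rng):
    """Truncated draw of tau ~ Stick(gamma), following eq. (3).

    The truncation level is an assumption of this sketch: the true
    construction has infinitely many weights.
    """
    weights, remaining = [], 1.0
    for _ in range(truncation):
        tau_prime = rng.betavariate(1.0, gamma)  # tau'_k ~ Beta(1, gamma)
        weights.append(tau_prime * remaining)    # tau_k = tau'_k * prod_{l<k} (1 - tau'_l)
        remaining *= 1.0 - tau_prime
    return weights, remaining                    # remaining = mass not yet allocated

def crp_weights(cluster_sizes, gamma):
    """CRP predictive weights for one pixel, given the sizes m_k^{-i}."""
    denom = sum(cluster_sizes) + gamma           # N - 1 + gamma
    existing = [m / denom for m in cluster_sizes]
    new = gamma / denom
    return existing, new

rng = random.Random(0)
weights, leftover = stick_breaking(gamma=1.0, truncation=20, rng=rng)
existing, new = crp_weights([1, 5, 6, 4], gamma=1.0)
```

In this sketch a larger `gamma` shifts mass towards the "new cluster" weight, which is how the DP lets the number of classes grow with the data.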
Prior distribution mixing DP and MRF

$$p(\theta) \propto \underbrace{\frac{1}{Z_G} \exp\Big(-\sum_{i} \Phi_i(\theta_i)\Big)}_{\Psi(\theta):\ \text{DP model}} \; \underbrace{\frac{1}{Z_M} \exp\Big(-\sum_{c \in C_2} \Phi_c(\theta_c)\Big)}_{M(\theta):\ \text{MRF model}}$$

where
– $C_2$ is the set of cliques with $|c| \ge 2$, and $|\cdot|$ is the size
– $\Phi_i(\cdot)$ is defined as:

$$\Phi_i(\theta_i) = -\log G(\theta_i) \quad \text{and} \quad Z_G = \int \prod_{i=1}^{N} \exp\left(-\Phi_i(\theta_i)\right) d\theta_1 \dots d\theta_N$$

$$\Rightarrow \Psi(\theta) = \prod_{i=1}^{N} G(\theta_i)$$

P. Orbanz & J. M. Buhmann, Nonparametric Bayesian image segmentation, International Journal of Computer Vision, 2007

We can deduce:

$$P(\theta_i \mid \theta_{-i}) \propto \sum_{k=1}^{K} M(\theta_i \mid \theta_{-i}) \, m_k^{-i} \, \delta_{U_k} + \frac{\gamma}{Z_\Phi} H \tag{7}$$

Probability of assignment to a new cluster:

$$q_{i0} \propto \int_{\Omega_\theta} f(y_i \mid \theta) H(\theta) \, d\theta \tag{8}$$

Probability of assignment to an existing cluster:

$$q_{ik} \propto m_k^{-i} \exp\left(-\Phi(U_k \mid \theta_{-i})\right) f(y_i \mid U_k) \tag{9}$$

Parameter update:

$$U_k \sim G_0(U_k) \prod_{i \,:\, i \in A_k} f(y_i \mid U_k) \tag{10}$$

Inference: Swendsen-Wang algorithm

Swendsen-Wang algorithm: principle

* Estimation is based on the joint posterior $p(\theta, Z \mid Y)$
* This posterior is intractable $\Rightarrow$ Markov Chain Monte Carlo (MCMC)

Problem: very slow convergence
Goal: sample the partition of the image faster

– Introduce a new set of latent variables $r$ such that:

$$p(\Pi, r) = p(\Pi) \, p(r \mid \Pi)$$

$$p(r \mid \Pi) = \prod_{1 \le i < j \le N} p(r_{ij} \mid \Pi)$$

$$p(r_{ij} = 1 \mid \Pi) = 1 - \exp\left(-\beta_{ij} \delta_{ij} \mathbb{1}_{z_i = z_j}\right)$$

The marginal posterior $p(\theta, Z \mid Y)$ is unchanged.

– The links define the so-called spin-clusters
– Update the labels of the spin-clusters. This operation simultaneously updates the labels of all the pixels in a spin-cluster.

FIGURE: Example of label update for spin-clusters
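The link-sampling step of Swendsen-Wang can be sketched as follows: links are drawn between neighbouring pixels that share a label, and the linked pixels are grouped into spin-clusters with a union-find structure. This is only an illustration under stated assumptions: a 4-neighbour grid, a constant $\beta_{ij} = \beta$ and a constant $\delta_{ij} = \delta$ (the slides do not specify $\delta_{ij}$), and the function name `sample_spin_clusters` is hypothetical.

```python
import math
import random

def sample_spin_clusters(labels, beta, delta, rng):
    """Draw links r_ij and return a spin-cluster id for every pixel.

    labels: 2D list of labels z_i on a grid (4-neighbour system assumed).
    beta, delta: constants standing in for beta_ij and delta_ij (assumption).
    """
    h, w = len(labels), len(labels[0])
    parent = list(range(h * w))          # union-find forest over pixels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # p(r_ij = 1) = 1 - exp(-beta * delta) when z_i = z_j, and 0 otherwise
    p_link = 1.0 - math.exp(-beta * delta)
    for r in range(h):
        for c in range(w):
            # right and down neighbours: visits each 4-neighbour pair once
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w and labels[r][c] == labels[rr][cc]:
                    if rng.random() < p_link:
                        parent[find(r * w + c)] = find(rr * w + cc)
    return [[find(r * w + c) for c in range(w)] for r in range(h)]

demo_labels = [[0, 0, 1],
               [0, 1, 1]]
strong = sample_spin_clusters(demo_labels, beta=50.0, delta=1.0, rng=random.Random(1))
none = sample_spin_clusters(demo_labels, beta=0.0, delta=1.0, rng=random.Random(1))
```

With a very large `beta` every same-label neighbour pair is linked, so the spin-clusters coincide with the connected components of the current partition; with `beta = 0` no link is drawn and every pixel is its own spin-cluster.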
– $r_{ij} \sim \mathrm{Ber}\left(1 - \exp\left(-\beta_{ij} \delta_{ij} \mathbb{1}_{z_i = z_j}\right)\right)$, where $\mathrm{Ber}(\cdot)$ is the Bernoulli distribution

Let $S = \{S_1, \dots, S_p\}$ be the set of spin-clusters.

– $\Pi_{-l} = \{A_1^{-l}, \dots, A_{K_{-l}}^{-l}\}$ is the partition obtained by removing all pixels in spin-cluster $S_l$, with $m_k^{-l} = |A_k^{-l}|$

For $l = 1, \dots, p$:

* The probability of assigning the pixels in spin-cluster $S_l$ to cluster $k$ is:

$$q_{lk} \propto \Psi\left(m_1^{-l}, \dots, m_k^{-l} + |S_l|, \dots, m_{K_{-l}}^{-l}\right) \, p\left(y_{S_l} \mid y_{A_k^{-l}}\right) \prod_{\{(i,j) \,\mid\, i \in S_l,\ r_{ij} = 0\}} \exp\left(\beta_{ij} (1 - \delta_{ij}) \mathbb{1}_{z_i = z_j}\right)$$

* The probability of assigning the pixels in spin-cluster $S_l$ to a new cluster is:

$$q_{l0} \propto \Psi\left(m_1^{-l}, \dots, m_{K_{-l}}^{-l}, |S_l|\right) \, p\left(y_{S_l}\right)$$

with $p(y_{A_k}) = \int \prod_{i \in A_k} f(y_i \mid U_k) H(U_k) \, dU_k$

Hierarchical segmentation with shared classes

Principle

Proposed idea
– Different levels of classification can be considered
– Coarse categories: urban, sub-urban, forest, etc.
– Sub-classes shared between the categories: trees, roads, buildings

Taking into account the fact that the classes are shared between different categories can help estimate their parameters and thereby improve the segmentation.

HDP theory

Solution: Hierarchical DP

Let $J$ be the number of categories.

$$G_0 \mid \gamma, H \sim \mathrm{DP}(\gamma, H)$$

$$G_j \mid \alpha_0, G_0 \sim \mathrm{DP}(\alpha_0, G_0) \quad \text{for } j = 1, \dots, J$$

with $\alpha_0 \in \mathbb{R}_+^*$.

$G_0$ is a discrete distribution.
Discreteness of $G_0$ $\Rightarrow$ clusters shared among categories

$$G_0 = \sum_{k=1}^{\infty} \tau_k \delta_{U_k} \tag{11}$$

where $\tau \mid \gamma \sim \mathrm{Stick}(\gamma)$, $\tau = \{\tau_1, \tau_2, \dots\}$ and $U_k \mid H \sim H$

$$G_j = \sum_{k=1}^{\infty} \pi_{jk} \delta_{U_k} \tag{12}$$

with $\pi_j \mid \alpha_0, \tau \sim \mathrm{DP}(\alpha_0, \tau)$ and $\pi_j = \{\pi_{j1}, \pi_{j2}, \dots\}$

$$\varphi_{ji} \mid G_j \sim G_j \tag{13}$$

So, samples of the processes $G_0$ and $G_j$ can be seen as countably infinite mixtures of Dirac measures with respective weights $\tau$ and $\pi_j$.

Principle - Chinese Restaurant Franchise

NOTATIONS
– $J$ restaurants
– Same menu for all restaurants: $U_1, U_2, \dots$
– $T_j$ is the number of tables in restaurant $j$
– $\theta_{jt}$ is the $t$-th table of restaurant $j$
– $\varphi_{ji}$ is the $i$-th client in restaurant $j$
– $n_{jt}$ is the number of clients at table $t$ of restaurant $j$
– $\eta_{jk}$ is the number of tables in restaurant $j$ which have chosen dish $U_k$, and $\eta_k = \sum_j \eta_{jk}$

FIGURE: Chinese Restaurant Franchise, Restaurant 1 (clients $\varphi_{11}$, $\varphi_{12}$, $\varphi_{13}$, $\varphi_{14}$; tables serving dishes $U_1$, $U_2$, $U_2$, ..)
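The Chinese Restaurant Franchise notations can be made concrete with a small simulation. The seating rules used here are the standard ones for the HDP and are an assumption with respect to the slides, which stop before stating them: a client joins table $t$ with weight $n_{jt}$ or opens a new table with weight $\alpha_0$, and a new table picks dish $k$ with weight $\eta_k$ or a new dish with weight $\gamma$. The function name `simulate_crf` is hypothetical.

```python
import random

def simulate_crf(clients_per_restaurant, alpha0, gamma, rng):
    """Simulate dish assignments in a Chinese Restaurant Franchise.

    Returns (dishes, eta): dishes[j][i] is the index k of the dish U_k eaten
    by client i of restaurant j; eta[k] is the number of tables, over all
    restaurants, serving dish k.
    """
    eta = []                                  # eta_k, shared by all restaurants
    dishes = []
    for n_clients in clients_per_restaurant:
        table_counts = []                     # n_jt for the current restaurant
        table_dish = []                       # dish index served at each table
        client_dish = []
        for _ in range(n_clients):
            # existing table t with weight n_jt, new table with weight alpha0
            weights = table_counts + [alpha0]
            t = rng.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_counts):
                # new table: existing dish k with weight eta_k, new dish with weight gamma
                dish_weights = eta + [gamma]
                k = rng.choices(range(len(dish_weights)), weights=dish_weights)[0]
                if k == len(eta):
                    eta.append(0)
                eta[k] += 1
                table_counts.append(0)
                table_dish.append(k)
            table_counts[t] += 1
            client_dish.append(table_dish[t])
        dishes.append(client_dish)
    return dishes, eta

dishes, eta = simulate_crf([10, 10, 10], alpha0=1.0, gamma=1.0, rng=random.Random(0))
```

Because `eta` is shared by all restaurants, a dish opened in one restaurant can be chosen again in another, which mirrors how the discreteness of $G_0$ lets categories share classes.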
