
ANR meeting

Image segmentation combining Markov Random Fields and Dirichlet Processes

Jessica SODJO

IMS, Groupe Signal Image, Talence
Supervisors: A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon

Jessica SODJO ANR meeting 1 / 28 ANR meeting

Plan

1 Introduction

2 Segmentation using DP models
  – Mixed MRF / DP model
  – Inference: Swendsen-Wang algorithm

3 Hierarchical segmentation with shared classes
  – Principle
  – HDP theory

4 Conclusion and perspective

Introduction

Segmentation
– partition of an image into K homogeneous regions called classes
– label the pixels: pixel i ↔ $z_i \in \{1, \dots, K\}$

Bayesian approach
– prior on the distribution of the pixels
– all the pixels in a class have the same distribution, characterized by a parameter vector $U_k$
– Markov Random Fields (MRF): exploit the similarity of pixels in the same neighbourhood

Constraint: K must be fixed a priori

Idea: use Bayesian nonparametric (BNP) models to estimate K directly

Segmentation using DP models

Notations

– N is the number of pixels
– Y is the observed image
– $Z = \{z_1, \dots, z_N\}$
– $\Pi = \{A_1, \dots, A_K\}$ is a partition and $m = \{m_1, \dots, m_K\}$ with $m_k = |A_k|$

FIGURE: Example of partition into clusters $A_1, A_2, A_3, \dots, A_K$ with sizes $m_1 = 1$, $m_2 = 5$, $m_3 = 6$, $m_K = 4$

Mixed MRF / DP model

Markov Random Fields (MRF)

– Description of the image by a neighbouring system

FIGURE: Examples of neighbouring systems (4-neighbours and 8-neighbours), showing the considered pixel and its neighbours

– A clique c is either a singleton or a set of pixels that are mutual neighbours
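To make the two neighbouring systems concrete, here is a minimal sketch that lists the neighbours of a pixel on a rectangular grid. The function name `neighbours` and the (row, column) indexing are illustrative assumptions, not part of the slides.

```python
def neighbours(row, col, height, width, connectivity=4):
    """Return the in-image neighbours of pixel (row, col).

    Hypothetical helper: `connectivity` selects the 4- or 8-neighbour system.
    """
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")
    result = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        # Pixels on the border have fewer neighbours: drop out-of-image positions.
        if 0 <= r < height and 0 <= c < width:
            result.append((r, c))
    return result
```

With these neighbourhoods, the pairwise cliques are the pairs (i, j) such that j is in `neighbours(i)`.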


Let $\theta_i \in \{U_1, \dots, U_K\}$ be the parameter vector associated with the i-th pixel

MRF ⇔ $p(\theta_i \mid \theta_{-i}) = p(\theta_i \mid \theta_{V(i)})$, where $V(i)$ is the set of neighbours of pixel i

Hammersley-Clifford theorem ⇒ Gibbs field

\[ p(\theta) = \frac{1}{Z_\Phi} \exp(-\Phi(\theta)) = \frac{1}{Z_\Phi} \exp\Big(-\sum_c \Phi_c(\theta_c)\Big) \tag{1} \]

with $\Phi_c(\theta_c)$ the local potential and $\Phi(\theta)$ the global one

Limitation: K is assumed to be known

Potts model

The Potts model is a special MRF defined by:

\[ M(\Pi) \propto \exp\Big(\sum_{i \leftrightarrow j} \beta_{ij} \mathbf{1}_{z_i = z_j}\Big) \tag{2} \]

where
– $i \leftrightarrow j$ means that pixels i and j are neighbours
– $\beta_{ij} > 0$ if i and j are neighbours and $\beta_{ij} = 0$ otherwise
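As an illustration of equation (2), the sketch below evaluates the unnormalized log-weight of a label grid. It assumes a 4-neighbour system, a constant $\beta_{ij} = \beta$ for all neighbouring pairs, and that each unordered pair is counted once; the function name is hypothetical.

```python
def potts_log_weight(labels, beta=1.0):
    """Unnormalized log M(Pi) for a 2-D grid of labels (list of rows).

    Assumptions: 4-neighbour system, constant beta for every neighbouring
    pair, each unordered pair counted once.
    """
    height, width = len(labels), len(labels[0])
    total = 0.0
    for r in range(height):
        for c in range(width):
            # Visit each pair once by looking only right and down.
            if c + 1 < width and labels[r][c] == labels[r][c + 1]:
                total += beta
            if r + 1 < height and labels[r][c] == labels[r + 1][c]:
                total += beta
    return total
```

A uniform labelling maximizes this quantity, which is how the model favours spatially smooth segmentations.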

The DP model

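\[ \tau'_k \mid \gamma, H \sim \mathrm{Beta}(1, \gamma), \qquad \tau_k = \tau'_k \prod_{l=1}^{k-1} (1 - \tau'_l) \tag{3} \]

where Beta(.) is the Beta distribution

Let us write $\tau \sim \mathrm{Stick}(\gamma)$, with $\tau = \{\tau_1, \tau_2, \dots\}$ and $\sum_{k=1}^{\infty} \tau_k = 1$

\[ G \mid \gamma, H \sim \mathrm{DP}(\gamma, H), \qquad G = \sum_{k=1}^{\infty} \tau_k \delta_{U_k} \tag{4} \]

with

\[ U_k \mid H \overset{\text{iid}}{\sim} H \tag{5} \]

The distribution of the observations is f, defined as:

\[ y_i \mid \theta_i \sim f(\cdot \mid \theta_i) \quad \text{and} \quad \theta_i \mid G \sim G \tag{6} \]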
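The stick-breaking construction of equation (3) can be sketched as follows. Only the first weights can be drawn in practice, so the sketch returns the first `truncation` weights together with the leftover mass; the function name and the truncation are assumptions made for illustration.

```python
import random

def stick_breaking(gamma, truncation, rng=random):
    """Draw the first `truncation` weights tau_k of tau ~ Stick(gamma).

    Returns (weights, remaining), where `remaining` is the mass of all
    sticks beyond the truncation level (hypothetical convention).
    """
    weights = []
    remaining = 1.0  # product of (1 - tau'_l) over the sticks broken so far
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, gamma)   # tau'_k ~ Beta(1, gamma)
        weights.append(fraction * remaining)     # tau_k = tau'_k * prod_{l<k} (1 - tau'_l)
        remaining *= 1.0 - fraction
    return weights, remaining
```

Small values of `gamma` concentrate the mass on the first few weights, while large values spread it over many clusters.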


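The Chinese Restaurant Process gives the conditional distribution of $\theta_i$ given the other parameters:

\[ \theta_i \mid \theta_{-i} \sim \sum_{k=1}^{K^{-i}} \frac{m_k^{-i}}{N - 1 + \gamma} \delta_{U_k} + \frac{\gamma}{N - 1 + \gamma} H \]

– $m_k^{-i}$ is the size of cluster k when pixel i is removed from the partition
– $K^{-i}$ is the number of clusters in the image with the i-th pixel removed
– $U_k$ is the parameter vector associated with the k-th cluster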

Limitation : the spatial interactions are not taken into account
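The Chinese Restaurant Process weights above reduce to simple arithmetic on the cluster sizes. A minimal sketch, with a hypothetical function name, where `cluster_sizes` holds the sizes $m_k^{-i}$ (which sum to N − 1):

```python
def crp_weights(cluster_sizes, gamma):
    """Return (probabilities of the existing clusters, probability of a new cluster).

    `cluster_sizes` are the sizes m_k^{-i} once pixel i has been removed,
    so their sum is N - 1.
    """
    denominator = sum(cluster_sizes) + gamma  # N - 1 + gamma
    existing = [m / denominator for m in cluster_sizes]
    new = gamma / denominator
    return existing, new
```

For example, with sizes (1, 5, 6, 4) and γ = 2, the denominator is 18, so the second cluster has probability 5/18 and a new cluster has probability 2/18. No pixel position enters the computation, which is the limitation noted above.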

Principle of the segmentation using DP models

Define a distribution on the partitions using:
– an MRF model, so that pixels in the same neighbourhood are likely to be in the same cluster
– a DP model, to infer the number of clusters automatically (and, if needed, their parameters)

Prior distribution mixing DP and MRF

\[ p(\theta) \propto \underbrace{\frac{1}{Z_G} \exp\Big(-\sum_i \Phi_i(\theta_i)\Big)}_{\Psi(\theta):\ \text{DP model}} \; \underbrace{\frac{1}{Z_M} \exp\Big(-\sum_{c \in C_2} \Phi_c(\theta_c)\Big)}_{M(\theta):\ \text{MRF model}} \]

where
– $C_2$ is the set of cliques of size $|c| \ge 2$, with $|\cdot|$ the size
– $\Phi_i(\cdot)$ is defined as:

\[ \Phi_i(\theta_i) = -\log G(\theta_i) \quad \text{and} \quad Z_G = \int \prod_{i=1}^{N} \exp(-\Phi_i(\theta_i)) \, d\theta_1 \dots d\theta_N \]

\[ \Rightarrow \Psi(\theta) = \prod_{i=1}^{N} G(\theta_i) \]

P. Orbanz & J. M. Buhmann, Nonparametric Bayesian image segmentation, International Journal of Computer Vision, 2007

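We can deduce:

\[ P(\theta_i \mid \theta_{-i}) \propto \sum_{k=1}^{K} M(\theta_i \mid \theta_{-i}) \, m_k^{-i} \delta_{U_k} + \frac{\gamma}{Z_\Phi} H \tag{7} \]

Probability of assignment to a new cluster:

\[ q_{i0} \propto \int_{\Omega_\theta} f(y_i \mid \theta) H(\theta) \, d\theta \tag{8} \]

Probability of assignment to an existing cluster:

\[ q_{ik} \propto m_k^{-i} \exp(-\Phi(U_k \mid \theta_{-i})) f(y_i \mid U_k) \tag{9} \]

Parameter update:

\[ U_k \sim G_0(U_k) \prod_{i \mid i \in A_k} f(y_i \mid U_k) \tag{10} \]

Here $G_0$ denotes the base distribution, written H in the other equations of this section.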
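To illustrate equations (8) and (9), here is a sketch that computes normalized assignment probabilities for one pixel. Everything specific is an assumption made for the example: scalar pixel values, a Gaussian likelihood f with known variance `sigma2`, a Gaussian base distribution H with mean `mu0` and variance `s02`, a Potts potential with constant `beta`, and a multiplicative `new_weight` standing in for the factor $\gamma / Z_\Phi$ of equation (7).

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def assignment_probabilities(y_i, neighbour_labels, sizes, means,
                             beta=1.0, sigma2=1.0, mu0=0.0, s02=10.0,
                             new_weight=1.0):
    """Return [q_i0, q_i1, ..., q_iK], normalized to sum to one.

    sizes[k] is m_k^{-i}, means[k] is U_k, and neighbour_labels lists the
    cluster indices (0-based) of the neighbours of pixel i.
    """
    # Eq. (8): with a Gaussian f and a Gaussian H the integral is N(mu0, sigma2 + s02).
    q = [new_weight * normal_pdf(y_i, mu0, sigma2 + s02)]
    for k, (m_k, u_k) in enumerate(zip(sizes, means)):
        same = sum(1 for z in neighbour_labels if z == k)  # neighbours already in cluster k
        # Eq. (9) with the Potts potential: exp(-Phi) = exp(beta * same).
        q.append(m_k * math.exp(beta * same) * normal_pdf(y_i, u_k, sigma2))
    total = sum(q)
    return [value / total for value in q]
```

The first entry is the probability of opening a new cluster; the spatial term `exp(beta * same)` is what distinguishes this rule from the plain Chinese Restaurant Process.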

Inference: Swendsen-Wang algorithm

Swendsen-Wang algorithm: principle

* Estimation based on the joint posterior $p(\theta, Z \mid Y)$
* Intractable ⇒ Monte Carlo (MCMC)

Problem: very slow convergence
Goal: sample the partition of the image faster

– Introduction of a new set of latent variables r (the links) such that:

\[ p(\Pi, r) = p(\Pi) \, p(r \mid \Pi), \qquad p(r \mid \Pi) = \prod_{1 \le i < j \le N} p(r_{ij} \mid \Pi) \]

\[ p(r_{ij} = 1 \mid \Pi) = 1 - \exp(-\beta_{ij} \delta_{ij} \mathbf{1}_{z_i = z_j}) \]

The marginal posterior $p(\theta, Z \mid Y)$ is unchanged
– The links define the so-called spin-clusters


– Update the labels of the spin-clusters
This operation simultaneously updates the labels of all the pixels in a spin-cluster

FIGURE: Example of label update for spin-clusters


– $r_{ij} \sim \mathrm{Ber}(1 - \exp(-\beta_{ij} \delta_{ij} \mathbf{1}_{z_i = z_j}))$, where Ber(.) is the Bernoulli distribution

Let $S = \{S_1, \dots, S_p\}$ be the set of spin-clusters.

– $\Pi_{-l} = \{A_1^{-l}, \dots, A_{K_{-l}}^{-l}\}$ is the partition obtained by removing all the pixels of spin-cluster $S_l$, and $m_k^{-l} = |A_k^{-l}|$
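The link-sampling step can be sketched as follows: links are drawn between neighbouring pixels that share a label, and the spin-clusters are the connected components of the linked pixels. The sketch assumes the link probability is $1 - \exp(-\beta_{ij} \delta_{ij} \mathbf{1}_{z_i = z_j})$, with a negative exponent so that it lies in [0, 1], and uses constant `beta` and `delta` for all pairs. The function name and the union-find bookkeeping are illustrative choices.

```python
import math
import random

def sample_spin_clusters(labels, edges, beta, delta, rng=random):
    """Sample the links r_ij and return the spin-clusters as lists of pixel indices.

    labels[i] is z_i; edges lists the neighbouring pairs (i, j).
    Assumption: the same beta and delta are used for every pair.
    """
    parent = list(range(len(labels)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        if labels[i] != labels[j]:
            continue  # the indicator is 0, so r_ij = 0 with probability one
        p_link = 1.0 - math.exp(-beta * delta)
        if rng.random() < p_link:
            parent[find(i)] = find(j)  # r_ij = 1: i and j join the same spin-cluster

    clusters = {}
    for i in range(len(labels)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```

Because links are only drawn between pixels with equal labels, every spin-cluster lies inside a single cluster of the current partition, and relabelling it moves all its pixels at once.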


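For l = 1, ..., p:

* The probability of assigning the pixels of spin-cluster $S_l$ to cluster k is:

\[ q_{lk} \propto \Psi(m_1^{-l}, \dots, m_k^{-l} + |S_l|, \dots, m_{K_{-l}}^{-l}) \; p(y_{S_l} \mid y_{A_k^{-l}}) \prod_{\{(i,j) \,\mid\, i \in S_l,\ r_{ij} = 0\}} \exp(\beta_{ij} (1 - \delta_{ij}) \mathbf{1}_{z_i = z_j}) \]

* The probability of assigning the pixels of spin-cluster $S_l$ to a new cluster is:

\[ q_{l0} \propto \Psi(m_1^{-l}, \dots, m_{K_{-l}}^{-l}, |S_l|) \; p(y_{S_l}) \]

with $p(y_{A_k}) = \int \prod_{i \in A_k} f(y_i \mid U_k) H(U_k) \, dU_k$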
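The marginal likelihood $p(y_{A_k})$ has a closed form when f and H are conjugate. The sketch below assumes, purely for illustration, scalar observations with $f = N(U_k, \sigma^2)$ and $H = N(\mu_0, s_0^2)$; the predictive term $p(y_{S_l} \mid y_{A_k^{-l}})$ is then obtained as a ratio of two marginals. Function and parameter names are hypothetical.

```python
import math

def log_marginal(ys, sigma2=1.0, mu0=0.0, s02=10.0):
    """log p(y_A) for y_i ~ N(U, sigma2) with U ~ N(mu0, s02) integrated out.

    Jointly, the y_i are Gaussian with mean mu0 and covariance
    sigma2 * I + s02 * (all-ones matrix).
    """
    n = len(ys)
    if n == 0:
        return 0.0
    d = [y - mu0 for y in ys]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    # Quadratic form and log-determinant of the covariance, in closed form.
    quad = (sum_d2 - s02 * sum_d ** 2 / (sigma2 + n * s02)) / sigma2
    log_det = (n - 1) * math.log(sigma2) + math.log(sigma2 + n * s02)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

def log_predictive(y_spin, y_cluster, **kwargs):
    """log p(y_S | y_A) = log p(y_S and y_A together) - log p(y_A)."""
    joint = log_marginal(list(y_spin) + list(y_cluster), **kwargs)
    return joint - log_marginal(y_cluster, **kwargs)
```

Working in the log domain avoids underflow when a spin-cluster contains many pixels; the probabilities $q_{lk}$ can be normalized after subtracting the largest log-value.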

Hierarchical segmentation with shared classes

Principle

Proposed idea

– Different levels of classification can be considered
  – Coarse categories: urban, sub-urban, forest, etc.
  – Sub-classes shared between the categories: trees, roads, buildings

Taking into account the fact that classes are shared between categories can help estimate their parameters and thereby improve the segmentation

HDP theory

Solution : Hierarchical DP

Let J be the number of categories

$G_0 \mid \gamma, H \sim \mathrm{DP}(\gamma, H)$

$G_j \mid \alpha_0, G_0 \sim \mathrm{DP}(\alpha_0, G_0)$ for $j = 1, \dots, J$, with $\alpha_0 \in \mathbb{R}_+^*$

$G_0$ is a discrete distribution

Discreteness of G0 ⇒ clusters shared among categories


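\[ G_0 = \sum_{k=1}^{\infty} \tau_k \delta_{U_k} \tag{11} \]

where $\tau \mid \gamma \sim \mathrm{Stick}(\gamma)$, $\tau = \{\tau_1, \tau_2, \dots\}$ and $U_k \mid H \sim H$

\[ G_j = \sum_{k=1}^{\infty} \pi_{jk} \delta_{U_k} \tag{12} \]

with $\pi_j \mid \alpha_0, \tau \sim \mathrm{DP}(\alpha_0, \tau)$ and $\pi_j = \{\pi_{j1}, \pi_{j2}, \dots\}$

\[ \phi_{ji} \mid G_j \sim G_j \tag{13} \]

So, samples of the processes $G_0$ and $G_j$ can be seen as countably infinite mixtures of Dirac measures with respective weights $\tau$ and $\pi_j$.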
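The relation between the global weights $\tau$ and the category weights $\pi_j$ can be sketched numerically. By the definition of a DP on a finite partition of the atoms, the first `n_atoms` weights of $\pi_j$ together with the mass of all remaining atoms follow a Dirichlet distribution with parameters $\alpha_0$ times the corresponding masses of $\tau$. The function name, the grouping of the tail into one entry, and the small numerical floor on the Gamma shape are assumptions of this sketch.

```python
import random

def sample_hdp_weights(gamma, alpha0, n_groups, n_atoms, rng=random):
    """Return (base, pis): global and per-category weights over n_atoms + 1 entries.

    The last entry of each vector is the total mass of the atoms beyond n_atoms.
    """
    # Global weights tau by stick-breaking, as in tau ~ Stick(gamma).
    tau, remaining = [], 1.0
    for _ in range(n_atoms):
        fraction = rng.betavariate(1.0, gamma)
        tau.append(fraction * remaining)
        remaining *= 1.0 - fraction
    base = tau + [remaining]

    pis = []
    for _ in range(n_groups):
        # Dirichlet(alpha0 * base) drawn as normalized Gamma variables;
        # the floor 1e-12 only guards against a zero shape parameter.
        draws = [rng.gammavariate(max(alpha0 * b, 1e-12), 1.0) for b in base]
        total = sum(draws)
        pis.append([g / total for g in draws])
    return base, pis
```

All categories place their mass on the same atoms, with different weights; a large `alpha0` keeps each $\pi_j$ close to $\tau$, while a small one lets the categories differ strongly.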

Principle - Chinese Restaurant Franchise

NOTATIONS

– J restaurants

– Same menu for all restaurants: $U_1, U_2, \dots$

– $T_j$ is the number of tables in restaurant j

– $\theta_{jt}$ is the t-th table of restaurant j

– $\phi_{ji}$ is the i-th customer in restaurant j

– $n_{jt}$ is the number of customers at table t of restaurant j

– $\eta_{jk}$ is the number of tables in restaurant j that have chosen dish $U_k$, and $\eta_k = \sum_j \eta_{jk}$

FIGURE: Chinese Restaurant Franchise with three restaurants sharing the menu $U_1, U_2, U_3, \dots$; in each restaurant j, the customers $\phi_{ji}$ sit at tables $\theta_{jt}$, and each table serves one dish of the menu

Example: Restaurant 1

FIGURE: Example for restaurant 1, with customers $\phi_{11}, \dots, \phi_{15}$ seated at tables $\theta_{11}, \theta_{12}, \theta_{13}$ (table counts include $n_{11} = 2$ and $n_{12} = 1$) and the menu $U_1, U_2, U_3$ with counts $\eta_1 = 3$, $\eta_2 = 4$, $\eta_3 = 1$; the figure also shows a new table $\theta_{14}$ and a new dish $U_4$, annotated with $\alpha_0$ and $\gamma$

Chinese Restaurant Franchise

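\[ \phi_{ji} \mid \phi_{j1}, \dots, \phi_{j,i-1}, \alpha_0, G_0 \sim \sum_{t=1}^{T_j} \frac{n_{jt}}{i - 1 + \alpha_0} \delta_{\theta_{jt}} + \frac{\alpha_0}{i - 1 + \alpha_0} G_0 \tag{14} \]

\[ \theta_{jt} \mid \theta_{11}, \dots, \theta_{21}, \dots, \theta_{j,t-1}, \gamma, H \sim \sum_{k=1}^{K} \frac{\eta_k}{\sum_{k'} \eta_{k'} + \gamma} \delta_{U_k} + \frac{\gamma}{\sum_{k'} \eta_{k'} + \gamma} H \tag{15} \]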

Y. W. Teh, M. I. Jordan, M. J. Beal & D. M. Blei Hierarchical Dirichlet Processes, JASA, 2006
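Equations (14) and (15) translate directly into a sequential simulation of the seating process. The following is a minimal sketch in which restaurants are processed one after the other and only the table and dish indices are tracked, not the dish values $U_k$; the function names are hypothetical.

```python
import random

def _draw(weights, rng):
    """Draw an index with probability proportional to `weights`."""
    u = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if u < cumulative:
            return index
    return len(weights) - 1

def simulate_franchise(customers_per_restaurant, alpha0, gamma, rng=random):
    """Return (table_counts, table_dishes, dish_tables).

    table_counts[j][t] is n_jt, table_dishes[j][t] is the dish index of
    table t in restaurant j, and dish_tables[k] is eta_k.
    """
    dish_tables = []
    all_counts, all_dishes = [], []
    for n_customers in customers_per_restaurant:
        counts, dishes = [], []
        for _ in range(n_customers):
            # Eq. (14): existing table t with weight n_jt, new table with weight alpha0.
            t = _draw(counts + [alpha0], rng)
            if t == len(counts):
                # Eq. (15): the new table picks dish k with weight eta_k,
                # or a new dish with weight gamma.
                k = _draw(dish_tables + [gamma], rng)
                if k == len(dish_tables):
                    dish_tables.append(0)
                dish_tables[k] += 1
                counts.append(0)
                dishes.append(k)
            counts[t] += 1
        all_counts.append(counts)
        all_dishes.append(dishes)
    return all_counts, all_dishes, dish_tables
```

Dishes chosen in one restaurant become more likely in the others through the shared counts `dish_tables`, which is how classes end up shared between categories.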

Conclusion and perspective

Conclusion
– Spatial constraints: Potts model
– Flexibility: DP model
– Speed: Swendsen-Wang algorithm
– Sharing: HDP

Perspective
– Efficient sampling algorithm


Thank you for your attention
