Unsup ervised Progressive Parsing of Poisson Fields
Using Minimum Description Length Criteria
y
Rob ert D. Nowak Mario A. T. Figueiredo
Dept. of Electr. and Comp. Eng. Instituto de Telecomunicac~oes, and
Rice University Instituto Sup erior Tecnico
Houston, TX, U.S.A. 1049-001 Lisb oa, Portugal
Abstract
This paper describes novel methods for estimating
piecewise homogeneous Poisson elds based on mini-
mum description length (MDL) criteria. By adopt-
ing a coding-theoretic approach, our methods are able
to adapt to the the observed eld in an unsupervised
(a) (b) (c)
manner. We present a parsing scheme basedon xed
multiscale trees (binary, for 1D, quad, for 2D) and an
Figure 1: Illustration of the problem addressed in this
adaptive recursive partioning algorithm, both guided
pap er: (a) piecewise-constantintensity function (Pois-
by MDL criteria. Experiments show that the recursive
son intensities 0.05, 0.2, and 0.4); (b) scatter-plot of
scheme outperforms the xedtree approaches.
observed photon events; (c) intensity eld parsing by
MDL-based recursive algorithm.
1 Intro duction
MDL criteria without recourse to asymptotic approx-
Consider a realization from a spatial Poisson p oint
imations. In contrast, most applications of MDL in-
pro cess whose underlying intensity function is (or can
volve Gaussian statistics (which are real-valued) and
be approximated as) piecewise constant (as exempli-
require asymptotic arguments. Hence, our application
ed in Figure 1(a)). This typ e of data arises in photon-
of MDL here is esp ecially simple and well motivated.
limited imaging, particle and astronomical physics,
This work is a co ding-theoretic alternative to related
computer trac network analysis, and many other ap-
Bayesian estimation schemes [2, 3,4].
plications involving counting statistics. In particular,
photon-limited images are formed by detecting and
2 Mininum Description Length
counting individual photon events (e.g., in gamma-ray
The MDL principle addresses the following ques-
astronomyornuclear medicine).
tion: given a set of generation mo dels, which b est
From an observed realization (see Figure 1(b)), our
explains the observed data? To get a handle on
goal is to parse,orsegment, the observation space into
the notion of \b est," Rissanen employed the follow-
regions of (roughly) homogeneous intensity (Figure
ing gedankenexperiment. Supp ose that we wish to
1(c)); i.e., regions of space in which the distribution of
transmit the observed data x to a hyp othetical re-
points is well mo deled by a spatial Poisson distribution
ceiver. Given a (probabilistic) generation mo del for
with constantintensity. In this pap er, we prop ose two
the data, say p(xj ), the Shannon-optimal co de length
new metho ds for unsup ervised progressive parsing of
is log p(xj ). Of course, the receiver would also need
Poisson elds based on Rissanen's minimum descrip-
to know the mo del parameters to deco de the trans-
tion length (MDL) principle [1]. One of the interesting
mission; then, if is a priori unknown, we also need
asp ects of our development is that, since the Poisson
to estimate it, co de it, and transmit it. Now, consider
K
data (counts) are integer-valued, we are able to derive
a set of K comp eting mo del classes fp (xj )g . In
i i
i=1
each class i, the \b est" mo del is the one that gives the
Supp orted by the National Science Foundation, grantno.
minimum co de length,
MIP{9701692.
y
Supp orted bytheFoundation for Science and Technology
b
= arg min f log p (xj )g = arg max p (xj );
i i i i i
(Portugal), grant Praxis-XXI T.I.T.-1580.
i i
this is simply the maximum likelihood (ML) estimate counts x and x , resp ectively. So,
j;k j +1;2k
within mo del class i. But if the class is a priori
p(x j x ; ) = B i(x j x ; )
j +1;2k j;k j;k j +1;2k j;k j;k
unknown, the \b est" overall mo del is the one that
x
x
leads to the minimum description length: the sum of
j;k
j +1;2k
x
j +1;2k +1
= (1 ) : (2)
j;k
j;k
x