In Proc. IEEE International Conference on [...], Brisbane, Australia, 1998

Multi-feature Hierarchical Template Matching

Using Distance Transforms

D.M. Gavrila

Daimler-Benz AG, Research and Technology

Wilhelm Runge St. 11

89081 Ulm, Germany

[email protected]

Abstract

We describe a multi-feature hierarchical algorithm to efficiently match N objects (templates) with an image using distance transforms (DTs). The matching is under translation, but it can cover more general transformations by generating the various transformed templates explicitly. The novel part of the algorithm is that, in addition to a coarse-to-fine search over the translation parameters, the N templates are grouped off-line into a template hierarchy based on their similarity. This way, multiple templates can be matched simultaneously at the coarse levels of the search, resulting in various speed-up factors. Furthermore, in matching, features are distinguished by type and separate DTs are computed for each type (e.g. based on edge orientations). These concepts are illustrated in the application of traffic sign detection.

1 Introduction

Matching is a central problem in pattern recognition and [...]. A common application is object detection and tracking. The various matching methods that have been proposed can be distinguished by what type of features are used [12]. At the one end there are pixel-based methods, which fit models directly to (filtered) image pixels. At the other end there are symbolic matching methods which operate on a few high-level features (e.g. parts of objects and their relations) and apply graph matching methods to establish correspondence.

In this paper, we consider methods for image matching using distance transforms (DTs). Matching using DTs involves intermediate-level features [2] which are extracted locally at various image locations, e.g. edge points. A DT converts the binary image, which consists of feature and non-feature pixels, into a DT image where each pixel denotes the distance to the nearest feature pixel. Similarly, the object of interest is represented by a binary template using the same feature representation. Matching proceeds by correlating the template against the DT image; the correlation value is a measure of similarity in image space.

Previous work on DT-based matching [1] [2] [7] [3] [11] [5] [10] [6] has dealt with the case of matching one template against an image, allowing certain geometrical transformations (e.g. translation, rotation, affine). Here we consider a more general case of matching N templates with an image under translation. Matching of one template under more general transformations can be seen as a special case when all the transformed templates are generated explicitly. In addition to a coarse-to-fine search over the translation parameters, the N templates are grouped off-line into a template hierarchy based on their similarity. Multiple templates can be matched simultaneously at the coarse levels of the search, resulting in various speed-up factors.

The outline of the paper is as follows. Section 2 reviews previous work on distance transforms, distance measures and matching strategies. Section 3 discusses the proposed extensions to the DT matching scheme, which involve the use of multiple features and an efficient match strategy by means of a template hierarchy. Section 4 lists experiments in the application of traffic sign detection. Finally, we conclude in Section 5.

2 Previous Work

2.1 Distance Transforms

A distance transform (DT) converts a binary image, which consists of feature and non-feature pixels, into an image where each pixel value denotes the distance to the nearest feature pixel. DTs approximate global distances by propagating local distances at image pixels. Particular DT algorithms depend on a variety of factors. One factor is whether they result in a Euclidean distance metric or not (EDTs vs. WDTs) [8] [13]. Figure 1 illustrates an EDT. WDTs define various approximations of the "true" Euclidean distance measure. One such approximation is the chamfer-2-3 metric [1] [2] [13], used in our experiments. Another factor is how the distances are
propagated over the image, whether in a raster scan or a contour scan fashion. Most algorithms use a raster scan fashion where the propagation of distances is in a manner independent of the feature locations in the image, with a mask of fixed size and shape. Contour scan algorithms propagate the distances from the feature locations. Some DT approaches also weigh the distances from features by their salience, where salient features (e.g. edge strength, length, curvature) result in comparably lower "distance" values [10]. Finally, there are sequential and parallel DT algorithms [4].

    4.2  3.6  2.8  2.2  2.0  2.2  2.8  3.6  4.2
    3.6  2.8  2.2  1.4  1.0  1.4  2.2  2.8  3.6
    2.8  2.2  1.4  1.0  0.0  1.0  1.4  2.2  2.8
    2.2  1.4  1.0  0.0  1.0  0.0  1.0  1.4  2.2
    1.4  1.0  0.0  1.0  1.4  1.0  0.0  1.0  1.4
    1.0  0.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0
    1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0
    1.4  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.4
    2.2  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.2

Figure 1: A binary pattern and its Euclidean Distance Transform

2.2 Match Measures and Strategies

Matching with DT is illustrated schematically in Figure 2. It involves two binary images, a segmented template T and a segmented image I, which we'll call "feature template" and "feature image". The "on" pixels denote the presence of a feature and the "off" pixels the absence of a feature in these binary images. What the actual features are does not matter for the matching method. Typically, one uses edge and corner points. The feature template is given off-line for a particular application, and the feature image is derived from the image of interest by feature extraction.

Figure 2: Matching using a DT (diagram blocks: Feature Image (binary), Feature Template (binary), DT, DT Image, DT Template, correlation)

Matching T and I involves computing the distance transform of the feature image I. The template T is transformed (e.g. translated, rotated and scaled) and positioned over the resulting DT image of I; the matching measure $D(T, I)$ is determined by the pixel values of the DT image which lie under the "on" pixels of the template. These pixel values form a distribution of distances of the template features to the nearest features in the image. The lower these distances are, the better the match between image and template at this location. There are a number of matching measures that can be defined on the distance distribution. One possibility is to use the average distance to the nearest feature. This is the chamfer distance.

$$D_{chamfer}(T, I) \equiv \frac{1}{|T|} \sum_{t \in T} d_I(t) \qquad (1)$$

where $|T|$ denotes the number of features in T and $d_I(t)$ denotes the distance between feature t in T and the closest feature in I. The chamfer distance thus consists of a correlation between T and the distance image of I, followed by a division. Other more robust measures reduce the effect of missing features (i.e. due to occlusion or segmentation errors) by using the average truncated distance or the f-th quantile value (the Hausdorff distance) [7] [11]. In applications, a template is considered matched at locations where the distance measure $D(T, I)$ is below a user-supplied threshold $\theta$

$$D(T, I) < \theta \qquad (2)$$

Figure 3 illustrates the matching scheme of Figure 2 for the typical case of edge features. Figure 3a-b shows an example image and template. Figure 3c-d shows the edge image and the DT transformation of the edge image. The distances in the DT image are intensity-coded; lighter colors denote increasing distance values.

The advantage of matching a template (Figure 3b) with the DT image (Figure 3d) rather than with the edge image (Figure 3c) is that the resulting similarity measure will be smoother as a function of the template transformation parameters. This enables the use of various efficient search algorithms to lock onto the correct solution, as will be discussed shortly. It also allows more variability between a template and an object of interest in the image. Matching with the unsegmented (gradient) image, on the other hand, typically provides strong peak responses but rapidly declining off-peak responses.

Figure 3: (a) original image (b) template (c) edge image (d) DT image

A number of extensions have been proposed to the basic DT matching scheme. Some deal with hierarchical approaches to improve match efficiency and use multiple image resolutions [2]. Others use a pruning [3] [7] or a coarse-to-fine approach [11] in the parameter space of relevant template transformations. The latter approaches take advantage of the smooth similarity measure associated with DT-based matching; one need not match a template for each location, rotation or other transformation. Other extensions involve the use of an undirected ("symmetric") similarity measure between image and a template [7] [5]. In this case, a DT is applied on both the image and template. Matching takes place with the feature image and feature template, and vice versa, as seen in Figure 2.

Here is a summary of various aspects covered in past work on DT-based matching:

• features: edge points, corner points
• multi-typing: none
• distance metric: chamfer-2-3, chamfer-3-4, Euclidean
• computation of DT image: serial vs. parallel, salience weighing
• match measures: Euclidean vs. robust measures, directed vs. undirected measures
• matching N templates: none
• global search algorithms: exhaustive vs. hierarchical (in transformation space, in image resolution)

3 Extensions

3.1 Multiple Feature-Types: Edge Orientation

So far, no distinction has been made regarding the type of features. All features would appear in one feature image (or template), and subsequently, in one DT image. If there are several feature types, and one considers the match of a template at a particular location of the DT image, it is possible that the DT image entries reflect shortest distances to features of non-matching type. The similarity measure would be too optimistic, increasing the number of false positives one can expect from matching.

A simple way to take advantage of the possibility to distinguish feature types is to use separate feature images and DT images for each type. Thus having M distinct feature types results in M feature images and M DT images. Similarly, the "untyped" feature template is separated into M "typed" feature templates. Matching proceeds as before, but now the match measure between image and template is the sum of the match measures between template and DT image of the same type.

We now consider the frequent case of the use of edge points as features. For this case, we propose the use of edge orientation as feature type by partitioning the unit circle into M bins

$$\left\{ \left[ \frac{i}{M} 2\pi, \frac{i+1}{M} 2\pi \right] \;\middle|\; i = 0, \ldots, M-1 \right\} \qquad (3)$$

Thus a template edge point with edge orientation $\phi$ is assigned to the typed template with index

$$\left\lfloor \frac{\phi}{2\pi} M \right\rfloor \qquad (4)$$

We still have to account for measurement error in the edge orientation and the tolerance we'll allow between the edge orientation of template and image points during matching. Let the absolute measurement error in edge orientation of the
template and image points be $\Delta\phi_T$ and $\Delta\phi_I$, respectively. Let the allowed tolerance on the edge orientation during matching be $\Delta\phi_{tol}$. In order to account properly for these quantities, a template edge point is assigned to a range of typed templates, namely those with indices

$$\left\{ \left\lfloor \frac{\phi - \Delta\phi}{2\pi} M \right\rfloor, \ldots, \left\lfloor \frac{\phi + \Delta\phi}{2\pi} M \right\rfloor \right\} \qquad (5)$$

mapped cyclically over the interval $0, \ldots, M-1$, with

$$\Delta\phi = \Delta\phi_T + \Delta\phi_I + \Delta\phi_{tol} \qquad (6)$$

For applications where there is no sign information associated with the edge orientation, a template edge point is also assigned to the typed templates one obtains by substituting $\phi + \pi$ for $\phi$ in Equation (5).

3.2 Matching N Templates: Template Hierarchy

One often encounters the problem of matching N templates with an image. If the N templates bear no relationship to each other, there is little one can do better than match each of the templates separately. If there is some structure in the template distribution, one can do better. The proposed scheme to match the N related templates involves the use of a template hierarchy, in addition to a coarse-to-fine search over the image. The idea is that at a coarse level of search, when the image grid size of the search is large, it would be inefficient to match each of the N objects separately, if they are relatively similar to each other. Instead, one would group similar templates together and represent them by a prototype template; matching would be done with this prototype, rather than with the individual templates, resulting in a (potentially significant) speed-up. This grouping of templates is done at various levels, resulting in a hierarchy, where at the leaf levels there are the N templates one needs to match with, and on intermediate levels there are the prototypes.

To make matters more concrete, consider first the case of a coarse-to-fine search where one matches a single template under translation. Assume there are L levels of search ($l = 1, \ldots, L$), determined by the size $\sigma_l$ of the underlying uniform grid and the distance threshold $\theta_l$ which determines when a template matches sufficiently well to consider matching on a finer grid (in the neighborhood of the promising solution). Let $\theta_{tol}$ denote the allowed tolerance on the distance measure between template and image at a "correct" location. Let $\mu$ denote the distance along the diagonal of a unit grid element. Then by having

$$\theta_l = \theta_{tol} + \frac{1}{2} \mu \sigma_l \qquad (7)$$

one has the desirable property that, using untruncated distance measures such as the chamfer distance, one can assure that the coarse-to-fine approach will not miss a solution. The second term accounts for the (worst) case that the solution lies at the center of the 4 enclosing grid points which form a square.

Now consider the case where the above L-level search is combined with an L-level template hierarchy. Matching can be seen as traversing the tree structure of templates. Each node corresponds to matching a (prototype) template p with the image at node-specific locations. For the locations where the distance measure between template and image is below the user-supplied threshold $\theta_p$, one computes new interest locations for the children nodes (generated by sampling the local neighborhood with a finer grid) and adds the children nodes to the list of nodes to be processed. The matching process starts at the root; the interest locations lie initially on a uniform grid over relevant regions in the image. The tree can be traversed in breadth-first or depth-first fashion. In the experiments, we use depth-first traversal, which has the advantage that one needs to maintain only $L - 1$ sets of interest locations.

Let p be the template corresponding to the node currently processed during the traversal and let $C = \{t_1, \ldots, t_c\}$ be the set of templates corresponding to its children nodes. Let $\delta_p$ be the maximum distance between p and the elements of C.

$$\delta_p = \max_{t_i \in C} D(p, t_i) \qquad (8)$$

Then by having

$$\theta_p = \theta_{tol} + \delta_p + \frac{1}{2} \mu \sigma_l \qquad (9)$$

one has the desirable property that, using untruncated distance measures such as the chamfer distance, one can assure that the coarse-to-fine approach using the template hierarchy will not miss a solution. The thresholds one obtains by Equation (9) are quite conservative; in practice one can use lower thresholds to speed up matching, at the cost of possibly missing a solution (see Experiments).
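
The L-level search and the thresholds of Equations (7) to (9) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the names `Node`, `search`, `chamfer`, `template_distance` and `euclidean_dt` are hypothetical, the DT is a brute-force Euclidean DT rather than the chamfer-2-3 transform, $\mu$ is taken as $\sqrt{2}$ (square grid elements), $D(p, t_i)$ is taken to be the directed chamfer distance between the two point sets, and the neighborhood of a promising location is assumed to extend half the parent grid size in each direction. The toy image and templates are invented for the example.

```python
import math
from dataclasses import dataclass, field

MU = math.sqrt(2.0)  # assumed: diagonal of a unit (square) grid element, mu in Eq. (7)


def euclidean_dt(features, width, height):
    """Brute-force Euclidean DT: distance from every pixel to the nearest feature."""
    return [[min(math.hypot(x - fx, y - fy) for fx, fy in features)
             for x in range(width)] for y in range(height)]


def chamfer(dt, template, x, y):
    """Eq. (1): mean DT value under the template points placed at (x, y)."""
    height, width = len(dt), len(dt[0])
    total = 0.0
    for dx, dy in template:
        px, py = x + dx, y + dy
        if not (0 <= px < width and 0 <= py < height):
            return math.inf  # template sticks out of the image
        total += dt[py][px]
    return total / len(template)


def template_distance(p, t):
    """Assumed form of D(p, t): mean distance from each point of p to the nearest point of t."""
    return sum(min(math.hypot(ax - bx, ay - by) for bx, by in t)
               for ax, ay in p) / len(p)


@dataclass
class Node:
    name: str
    template: list          # feature points as (dx, dy) offsets
    grid: int               # sigma_l: grid size at this node's level
    children: list = field(default_factory=list)

    def delta(self):
        """Eq. (8): largest distance from this prototype to its children (0 for a leaf)."""
        return max((template_distance(self.template, c.template)
                    for c in self.children), default=0.0)


def search(dt, node, locations, theta_tol, hits):
    """Depth-first traversal using the conservative threshold of Eq. (9)."""
    theta = theta_tol + node.delta() + 0.5 * MU * node.grid
    for x, y in locations:
        d = chamfer(dt, node.template, x, y)
        if d >= theta:
            continue  # Eq. (2) fails: prune this location for the whole subtree
        if not node.children:
            hits.append((d, x, y, node.name))
            continue
        r = node.grid // 2  # assumed neighborhood half-width
        for child in node.children:
            finer = [(x + i, y + j)
                     for j in range(-r, r + 1, child.grid)
                     for i in range(-r, r + 1, child.grid)]
            search(dt, child, finer, theta_tol, hits)
    return hits


# Toy data (invented): a 3x3 square outline placed at (9, 5) in a 16x16 image.
SQUARE = [(0, 0), (1, 0), (2, 0), (0, 1), (2, 1), (0, 2), (1, 2), (2, 2)]
DIAGONAL = [(0, 0), (1, 1), (2, 2)]
dt_image = euclidean_dt([(9 + dx, 5 + dy) for dx, dy in SQUARE], 16, 16)

root = Node("prototype", SQUARE, grid=4, children=[
    Node("square", SQUARE, grid=1),
    Node("diagonal", DIAGONAL, grid=1),
])
coarse = [(x, y) for y in range(0, 16, 4) for x in range(0, 16, 4)]
hits = search(dt_image, root, coarse, theta_tol=0.1, hits=[])
best = min(hits)  # smallest chamfer distance first
print(best)
```

On this toy input the smallest distance is obtained for the square template at its true position (9, 5). Because the threshold of Equation (9) is conservative, several coarse locations pass the first level; lowering it trades completeness for speed, as noted above.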

4 Experiments

To illustrate the proposed matching method we apply it to the detection of circular and triangular (up/down) signs, as seen on highways and secondary roads. For the moment, we do not consider traffic signs which appear tilted and/or skewed in the image; the only shape parameter considered is scale. Edge points are used as features, further differentiated by their edge orientation. The edge orientations are discretized in 8 values. We use templates for circles and triangles with radii in the range of 7-18 pixels (the images are of size 360 by 288 pixels). This leads to a total of 36 templates, for which a template tree is specified "manually" as in Figure 4. The tree has three levels (not counting the root level, which contains no template). The root node has six children corresponding to two prototypes for each of the three main shapes to be matched: circle, triangle-up, triangle-down. The prototypes at the first level of the hierarchy are simply the templates with radii equal to the median value of intervals [7-12] and [13-18], namely 9 and 15. The prototypes at the second level are the templates with radii equal to the median value of intervals [7-9], [10-12], [13-15] and [16-18]. Each template (or prototype) is partitioned into 8 typed templates based on edge orientation. The direction of the edge orientation is specified; we only search for circles and triangles with a "light-inside-dark-outside" contour characteristic.

Figure 4: Template hierarchy (first-level prototypes C(9), C(15), Tu(9), Tu(15), Td(9) and Td(15), covering radii 7-12 and 13-18; the subtree of C(9) is shown expanded into second-level prototypes C(8) and C(11) and leaves C(7) to C(12); C(R), Tu(R) and Td(R) denote the circle, triangle-up and triangle-down templates of radius R)

Matching uses a depth-first traversal over the template tree, in the manner described by Subsection 3.2. Coarse-to-fine sampling uses a grid size of $\sigma = 8, 4, 1$ for the three levels of the template tree. We used distance thresholds $\theta_l = 3.5, 1.35, 0.6$ pixels for the three levels, respectively.

The experiments involved both off-line and on-line tests. Off-line, we used a database of 1000 images, taken during day-time (sunny, rainy) and night-time. We obtained single-image detection rates of about 90%, when allowing solutions to deviate by 2 pixels and by radius 1 from the values obtained by a human. Typically, there were 4-6 false positives per image (in a later pictograph classification phase, more than 95% of these were rejected using an RBF network). Figure 5 illustrates the followed hierarchical approach. The white dots indicate locations where the match between image and a (prototype) template of the template tree was good enough to consider matching with more specific templates (the children), on a finer grid. The final detection result is also shown. More detection results are given in Figure 6, including some false positives. The traffic signs of the database that were not detected had low contrast, or were tilted or skewed. Improvement of the detection rate can thus be achieved in a relatively straightforward manner, by lowering the edge threshold and by adding more templates.

Given image width W, image height H, and K templates, a non-hierarchical matching algorithm would require $W \times H \times K$ correlations between template and image. In the presented hierarchical approach both factors $W \times H$ and K are pruned (by a coarse-to-fine approach in image space and in template space). It is not possible to provide an analytical expression for the speed-up, because it depends on the actual image data and template distribution. Typically, we have observed speed-up factors in the range of 200-400.

5 Conclusion

In this paper we proposed two extensions to DT-based matching. The first extension dealt with differentiating the features by type (i.e. by edge orientation) and the second dealt with matching using a template hierarchy. We observed that this approach can result in a significant speed-up when compared to the exhaustive approach, on the order of two orders of magnitude. Some interesting problems lie ahead regarding the automatic generation of the template hierarchy.
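
As a rough illustration of the exhaustive baseline mentioned above, the following sketch counts the $W \times H \times K$ correlations for the experimental setup of Section 4. It assumes that every pixel position is a candidate location (border effects are ignored) and that the reported speed-up factors translate directly into correlation counts, which the text does not state; the variable names are illustrative only.

```python
# Figures taken from Section 4; everything else is an illustrative assumption.
W, H = 360, 288                          # image size in pixels
SHAPES = ["circle", "triangle-up", "triangle-down"]
RADII = range(7, 19)                     # radii 7..18 pixels

K = len(SHAPES) * len(RADII)             # number of templates
exhaustive = W * H * K                   # one correlation per position and template

print("templates:", K)
print("exhaustive correlations:", exhaustive)
for speedup in (200, 400):               # reported range of speed-up factors
    print("speed-up", speedup, "->", round(exhaustive / speedup), "correlations")
```

Under these assumptions the exhaustive search needs about 3.7 million correlations per image, and a speed-up of 200-400 corresponds to roughly 9,000 to 19,000 correlations.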

Figure 5: Traffic sign detection: (a) day and (b) night (white dots denote intermediate results; the locations matched during hierarchical search)

Figure 6: More detection results

References

[1] H. Barrow et al. Parametric correspondence and chamfer matching: Two new techniques for image matching. In International Joint Conference on Artificial Intelligence, pages 659-663, 1977.

[2] G. Borgefors. Hierarchical chamfer matching: A parametric edge matching algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6):849-865, November 1988.

[3] D.W. Paglieroni, G.E. Ford, and E.M. Tsujimoto. The position-orientation masking approach to parametric search for template matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(7):740-747, 1994.

[4] H. Embrechts and D. Roose. A parallel Euclidean distance transformation algorithm. CVIU, 63(1):15-26, January 1996.

[5] D. M. Gavrila and L. S. Davis. 3-D model-based tracking of humans in action: a multi-view approach. In IEEE Conference on Computer Vision and Pattern Recognition, pages 73-80, San Francisco, 1996.

[6] D. Huttenlocher. Monte Carlo comparison of distance transform based matching measures. In ARPA Image Understanding Workshop, pages 1179-1183, 1997.

[7] D. Huttenlocher, G. Klanderman, and W.J. Rucklidge. Comparing images using the Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):850-863, 1993.

[8] F. Leymarie and Martin D. Levine. Fast raster scan distance propagation on the discrete rectangular lattice. Computer Vision, Graphics, and Image Processing. Image Understanding, 55(1):84-94, January 1992.

[9] D. Mumford. Mathematical theories of shape: Do they model perception? In SPIE Vol. 1570 Geometric Methods in Computer Vision, pages 2-10, 1991.

[10] P.L. Rosin and G.A.W. West. Salience distance transforms. GMIP, 57(6):483-521, November 1995.

[11] W. Rucklidge. Locating objects using the Hausdorff distance. In International Conference on Computer Vision, pages 457-464, 1995.

[12] P. Suetens, P. Fua, and A. Hanson. Computational strategies for object recognition. ACM Computing Surveys, 24(1):6-61, 1992.

[13] E. Thiel and A. Montanvert. Chamfer masks: Discrete distance functions, geometrical properties, and optimization. In International Conference on Pattern Recognition, pages 244-247, The Hague, 1992.