IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL.–, NO.–, MONTH YEAR

Deformation Based Curved Shape Representation

Girum G. Demisse, Djamila Aouada, and Björn Ottersten

Abstract—In this paper, we introduce a deformation based representation space for curved shapes in R^n. Given an ordered set of points sampled from a curved shape, the proposed method represents the set as an element of a finite dimensional matrix Lie group. Variations due to scale and location are filtered in a preprocessing stage, while shapes that vary only in rotation are identified by an equivalence relationship. The use of a finite dimensional matrix Lie group leads to a similarity metric with an explicit geodesic solution. Subsequently, we discuss some of the properties of the metric and its relationship with a deformation by least action. Furthermore, invariance to reparametrization, or the estimation of point correspondence between shapes, is formulated as the estimation of a sampling function. Thereafter, two possible approaches are presented to solve the point correspondence estimation problem. Finally, we propose an adaptation of k-means clustering for shape analysis in the proposed representation space. Experimental results show that the proposed representation is robust to uninformative cues, e.g., local shape perturbation and displacement. In comparison to state of the art methods, it achieves a high precision on the Swedish and the Flavia leaf datasets and a comparable result on the MPEG-7, Kimia99 and Kimia216 datasets.

Index Terms—Shape representation, similarity metric, shape matching, deformation.

1 INTRODUCTION

SHAPE analysis and modelling finds many applications in computer vision and engineering [19], [27], [34], [48]. As a result, different kinds of feature detectors have been designed in an effort to define a robust similarity metric and to train classifiers [7], [60]. However, features are usually finite dimensional while shapes are continuous objects. Hence, in [52] a finite dimensional object is argued to be theoretically inadequate to represent an infinite dimensional object. Alternatively, shapes can be treated as parametrized functions and a shape space as a Hilbert space [30], [37], [38], [52], [64]. In general, the parametrization approach leads to a nonlinear and infinite dimensional shape space which, in most cases, is complemented with a metric and treated as a Riemannian manifold. There are several advantages in using the Riemannian framework. The first advantage is the treatment of the shape space as a smooth manifold, which is natural considering the nonlinearity of shapes. Secondly, under the Riemannian framework, a smoothly varying metric can be defined on a nonlinear space. Such a metric is essential to describe distance, area, and other geometric notions in a nonlinear space. Lastly, given the Riemannian framework, a nonlinear shape space can be linearised, at least locally, using the notion of geodesic connections. As a result, shape analysis problems can be treated with statistical models. Despite all the described advantages, there are many potential metrics one can define over a nonlinear shape space, leading to different computational and interpretation challenges. For instance, the L2 metric in an infinite dimensional shape space gives a zero distance between different shapes. Consequently, a wide variety of metrics have been proposed in an infinite dimensional shape space [37], [38], [39]. Although some of the proposed metrics have an explicit geodesic equation, some lead to a nontractable geodesic equation [38]. Moreover, almost all of the proposed metrics depend on the differentiability of the considered curved shapes. Thus, self-intersecting shapes or shapes that are not smooth are often excluded.

Fig. 1: Illustration of the proposed representation. Given a fixed starting point p_1, the curve is reconstructed by the successive action of (g_1, ..., g_{z-1}).

In this paper, we extend the deformation based curved shape representation proposed in [15], which does not impose smoothness constraints while leading to a simple and explicit geodesic equation.^1 In [15], a new curved shape representation is formulated on the curve's deformation space, which leads to a much simpler metric with an explicit solution to its geodesic equation. The metric has a close relationship with what is sometimes referred to as the effort functional [44], [63]. Hence, it measures, as a distance metric, the cost of optimally deforming one curved shape to another. Moreover, curves are explicitly represented in the deformation space, which is formulated as a finite dimensional Lie group. Unlike previous deformation based approaches [14], [19], [21], [22], [61], the proposed representation does not refer to a template shape to compute deformations. Instead, the intrinsic characteristics of a deforming curve are represented by finitely many rigid transformation matrices. Intuitively, these transformation matrices capture how a curved shape bends and stretches through space, see Fig. 1. The main advantages of representing curves with the proposed approach are the simplification of the geodesic distance and path between shapes, which are expressed explicitly. Consequently, the optimality and computational cost of the distance computation are completely dependent on the point correspondence estimation method. Furthermore, the group operator of the deformation space can be used to either extract or apply curve deformations, since the curves themselves are represented as elements of the deformation space. Lastly, the induced metric is left invariant and relatively robust to local noise. Nevertheless, there are some drawbacks. The estimation of point correspondence is formulated in terms of computing an optimal sampling function. Hence, the accuracy of the computed geodesic distance depends on the sample size (number of points). In general, a large sample size leads to a better performance. This drawback embodies the fundamental assumption of the approach, which is that any curved shape can be distinctively described by an arbitrarily large number of discrete points. Computational cost, on the other hand, is a polynomial function of the sample size.

• All authors are with the Interdisciplinary Center for Security, Reliability, and Trust of the University of Luxembourg, Luxembourg, L-2721. E-mail: {girum.demisse, djamila.aouada, bjorn.ottersten}@uni.lu. Manuscript received July 19, 2016.
1. The complete MATLAB code can be found at https://github.com/DemisseG/Curved shape representation

The main contributions of this paper are summarized as follows:

1) A relatively simple yet effective representation for curved shapes, described in [15], is introduced. Particular to this representation, curves are treated as elements of the deformation space, which is a finite dimensional Lie group. The representation space assumes only continuity and not differentiability. Hence, it is inclusive of curves that are continuous yet not differentiable everywhere.
2) A similarity metric with an explicit geodesic distance that is equivalent to the L2-norm is proposed, hence simplifying distance computation. Furthermore, important practical properties of the metric are discussed and a relationship with a deformation by least action is established.
3) Given the proposed deformation based representation of curves, two possible approaches for addressing point correspondence estimation are proposed. Characteristics and advantages of one over the other are discussed.
4) The proposed metric is used for learning a fairly simple classifier with k-means clustering.
5) Finally, a comparison of the proposed approach with other shape representation approaches, including the square root velocity (SRV) framework [54], is provided on several datasets.

The rest of the paper is organized as follows: in Section 2 an overview of related works is presented, followed by a detailed description of the proposed shape representation and the metric it induces in Section 3. Properties and interpretations of the metric are presented in Section 4. In Section 5, two possible approaches for estimating point correspondence between shapes are presented. In Section 6, the metric is used for learning a k-means classifier. Experimental scenarios and results are presented in Section 7. The paper ends with concluding remarks in Section 8.

2 RELATED WORK

In this section, we present a brief literature review on shape representations and similarity metrics defined on them. For a complete and more extensive survey see [64].

In [28], Kendall defines shape as what remains after removing rotation, translation and uniform scaling. Consequently, almost all shape representations have a long standing goal to be invariant to the mentioned shape preserving transformations. To that end, the pioneering shape representation is a landmark-based shape representation [28]. Landmark-based methods represent a curved shape by a selected set of points. These points are believed to be distinctive of the shape and are called landmarks or features. Subsequently, a curved shape S embedded in R^n and represented by m selected landmarks will be a point in R^{mn}. In such a setting, invariance to translation and scaling is achieved by centring the m vectors to zero and scaling p ∈ R^{mn} to unit norm [53]. Such a space, according to Kendall, is called a pre-shape space. Later on, invariance to rotation is achieved by defining an equivalence relationship between a point p in a pre-shape space and all of its orientations. Thus, the resulting final space, called shape space, is described by modding out the orientations, R^{mn}/SO(n). Here, SO(n) describes the group of n-dimensional rotations. The major drawback of landmark-based approaches is the need for identifying landmarks, which are often selected subjectively. This leads to inconsistent representations, as the same shape can be represented by two completely different sets of landmark points. Nevertheless, in [10], [17], [28], an elegant statistical framework in R^{mn}/SO(n) was introduced for landmark based approaches.

Alternatively, in [30], [38], [54], [61], curved shapes were represented by a parametrized function, leading to an infinite dimensional space. Similar to landmark-based approaches, shape preserving transformations are filtered from the representation space. Furthermore, invariance to reparametrization is achieved by modding out shape preserving diffeomorphisms. In the infinite dimensional setting, a shape space is usually given as Imm(S^1, R^n)/Diff(S^1), where Imm(S^1, R^n) is the space of all parametrized functions immersed in R^n and defined on the 1-dimensional circle S^1, while Diff(S^1) is the group of diffeomorphisms acting on S^1. In such a space, the distance between shapes is defined by the length of the geodesic curve connecting them. Consequently, the simplest possible metric is L^2(a, b) = ∫⟨a, b⟩ ds, where ⟨·, ·⟩ is a smoothly varying metric tensor, a and b are vector fields tangent to a curve in the shape space, and the integral is taken with respect to the arc length. Regardless, the geodesic equation of the L^2 metric is difficult to solve. Furthermore, a distance function based on the L^2 metric can, potentially, lead to abnormal results; the distance between two different shapes can be zero [38]. Subsequently, to avoid such abnormalities, a first order Sobolev metric was introduced in [37], [38], with numerical solutions. In [54], an isomorphism from Imm(S^1, R^n)/Diff(S^1), with a first order Sobolev metric, to a Hilbert manifold by a mapping function called the square root velocity (SRV) was proposed. SRV leads to a geodesic equation with an explicit solution for open curves, while the geodesic distance for closed curves is computed iteratively with a method called path-straightening. In general, in almost all parametrization approaches, shapes are assumed to be C^∞ (infinitely differentiable) or at least C^2, since most metrics in these shape spaces are defined based on differential quantities, e.g., a first order Sobolev metric. Consequently, curves that are self-intersecting and not smooth are excluded. Furthermore, this leads to algorithms that are sensitive to noise, making a pre-smoothing stage a necessity [36].

Alternative to both landmark and parametrization approaches, primarily in the work of Grenander et al. [3], [21], [22], [61], [62], shapes were studied from a different perspective; they were studied based on how they deform from one to another. In [61], a metric defined on a deformation space is shown to induce a distance metric on a shape space, given that the deformation space is a group and acts transitively on elements of the shape space. Subsequently, the similarity between two shapes is measured by a deformation that minimizes an objective function called the effort functional: the distance between the group's identity and the given deformation, see [61], [63]. Nevertheless, different approaches consider different deformation spaces. In [57], [61], the considered deformation space is a set of C^d functions, for some d ≥ 0. Together with a composition operation, the space of C^d functions is shown to be an infinite dimensional Lie group. In this deformation space, elastic shape matching is done by solving a discrete version of the effort functional with dynamic programming. In [22], a general pattern theory that analyses configurations generated by geometric units (lines, points, etc.) based on the deformation from a template configuration was presented. In this case, the considered deformation space is a direct product of matrix Lie groups, see [20] for a brief overview. Recently, in [19], 3D shapes are modelled based on how they deform from a template shape by the action of a finite dimensional matrix Lie group. However, in [19], point correspondence is assumed, which is not trivial to achieve. In an approach more similar to what is presented here, in [29], the Frenet-Serret frame has been used to represent deformations of a curved shape. However, the Frenet-Serret frame is applicable only to curves with a non-vanishing curvature. Later in [23], a representation where the Frenet-Serret frame is replaced by the action of a general linear group is presented.

In summary, most deformation based approaches follow a paradigm called deformable template, where geometric variations are modelled with respect to a given template shape. In this paper, we present yet another deformation based shape representation without a reference template shape. Consequently, shapes are represented as elements of the deformation space, which in our case is a finite dimensional matrix Lie group. Later in the paper, we will detail similarities between our approach and what was proposed in [61].

3 PROPOSED SHAPE REPRESENTATION

Let S̄ be a continuous curved shape, and let S be an approximation of it with z discrete points. For now, we will assume these points to be sampled with a uniform arc length interval. In addition to being a set of points, S carries an order, i.e. (p_1, ..., p_z) ≠ (p_z, ..., p_1). However, the order and the starting point p_1 of a shape are selected a priori, for now. Later in the paper, we will discuss the impact of the starting point and the ordering direction; this is similar to what is referred to as reparametrization in the literature [37], [54], [63]. Furthermore, a given curved shape S^1 is assumed to be able to deform to another curved shape S^2. We denote the space of such shapes by C. However, not all deformations are shape altering. In particular, translation with respect to a fixed frame of reference, uniform scaling and rotation preserve both the shape and the order of points. Consequently, shape preserving deformations cause redundancy in C and need to be filtered out.

Similarly to [28], translation and uniform scaling of a given shape S = (p_1, ..., p_z), where p_i ∈ R^n, are filtered out as follows:

    S* = (p*_1, ..., p*_z) = ((p_1 − p̄)/h, ..., (p_z − p̄)/h),   (1)

where p̄ = (1/z) Σ_{i=1}^z p_i ∈ R^n, h = sqrt(Σ_{i=1}^z ||p_i − p̄||_2^2) ∈ R, and ||·||_2 denotes the L2-norm. Deformation due to rotation, however, cannot be filtered in a preprocessing stage, see [53] for further details. Consequently, variations due to rotation are identified by an equivalence relationship. Let R ∈ SO(n) be an n-dimensional rotation matrix; here, SO(n) denotes the special orthogonal group. Subsequently, the deformation of a given shape S^j by any R will define an equivalence class [S^j] in C. Thus, [S^j] is the set of all shapes that are generated by rotating S^j. However, if the correct point correspondence between S^k and S^j is given, then R can be computed by optimizing

    arg min_{R ∈ SO(n)} ||R S^j − S^k||_2^2.   (2)

Subsequently, we identify a given shape S ∈ C by the following mapping function:

    f(S) = { G = (g_1, ..., g_z)       if S is a closed curve,
           { G = (g_1, ..., g_{z−1})   if S is an open curve,   (3)

such that

    g_i × p_i = p_{i+1},   (4)

where g_{i=1,...,z} are transformation matrices that map one point to another with respect to a fixed coordinate frame. In this work, we will only consider Euclidean transformations without reflection, and thus G is an element of SE(n)^z, a direct product of the Special Euclidean group. Although (4) is underdetermined, in this work the g_{i=1,...,z} refer to optimal transformations that preserve the fixed global frame, see the Appendix on computational details. Intuitively, the representation f(·) attempts to encode every curve S ∈ C as the path of a travelling particle from a fixed starting point, see Fig. 1.
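The preprocessing of Eq. (1) and the mapping f(·) of Eqs. (3)-(4) can be sketched in NumPy for planar open curves. Since Eq. (4) is underdetermined and the optimal choice of g_i is deferred to the Appendix, the sketch below resolves it with one plausible convention: each R_i is the turning rotation between successive edge directions, and v_i is then fixed so that g_i p_i = p_{i+1} holds exactly. The function names (`preprocess`, `represent`) are illustrative, not from the paper's released code.

```python
import numpy as np

def preprocess(S):
    """Eq. (1): filter translation and scale by centring at the mean and
    dividing by h = sqrt(sum_i ||p_i - p_bar||^2)."""
    S = np.asarray(S, dtype=float)
    p_bar = S.mean(axis=0)
    h = np.sqrt(((S - p_bar) ** 2).sum())
    return (S - p_bar) / h

def rot2d(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def represent(S_star):
    """Map an ordered planar point set to G = (g_1, ..., g_{z-1}), each g_i
    in SE(2) stored as a 3x3 homogeneous matrix with g_i p_i = p_{i+1}
    (Eq. (4)). Eq. (4) is underdetermined; choosing R_i as the turning
    between successive edge directions is one convention, not necessarily
    the paper's optimal one."""
    S_star = np.asarray(S_star, dtype=float)
    edges = np.diff(S_star, axis=0)
    angles = np.arctan2(edges[:, 1], edges[:, 0])
    G = []
    for i in range(len(S_star) - 1):
        theta = angles[i] - angles[i - 1] if i > 0 else 0.0
        R = rot2d(theta)
        v = S_star[i + 1] - R @ S_star[i]   # forces g_i p_i = p_{i+1} exactly
        g = np.eye(3)
        g[:2, :2], g[:2, 2] = R, v
        G.append(g)
    return G
```

Whatever convention is used for R_i, the translation component guarantees that successive application of the g_i reproduces the normalized point sequence.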

The inverse of the mapping function, for closed curves, is defined as

    f^{-1}(G) = (p_1, g_1 p_1, g_2 g_1 p_1, ..., (Π_{i=1}^z g_i) p_1).   (5)

The inverse for open curves can be defined similarly. Moreover, the representation f(·) also preserves the equivalence relationship defined in C.

Proposition 1. If G^a and G^b are the representations of S^a, S^b ∈ [S^j], then G^a is equivalent to G^b by conjugacy, G^a ∼ G^b.

Proof. Let S^a = (p^a_1, ..., p^a_z) and S^b = (p^b_1, ..., p^b_z). Since S^a, S^b ∈ [S^j], we can write S^b = R S^a, where R ∈ SO(n). Rotation is an isometry, thus R S^a = (R p^a_1, ..., R p^a_z). The representation f(S^b) = (g^b_1, ..., g^b_z), such that

    g^b_i × p^b_i = p^b_{i+1} = R × p^a_{i+1},

can then be expressed in terms of f(S^a) as follows:

    g^b_i × R × p^a_i = R × g^a_i × p^a_i
    g^b_i = R × g^a_i × R^{-1}.

Thus, f(S^b) and f(S^a) are equivalent by conjugacy, i.e., f(S^b) = R f(S^a) R^{-1}. □

As a result, the equivalence relationship defined in C carries over to the representation space SE(n)^z. Computationally, if two given shapes belong to the same shape class, then the corresponding eigenvalues of the transformation matrices in f(S^b) and f(S^a) are similar. Finally, the representation space of equivalence classes is described by modding out the rotations as SE(n)^z/SO(n), for closed curves, and SE(n)^{z−1}/SO(n), for open curves. Note that the representation space still assumes point correspondence or parametrization. Thus, neither SE(n)^z/SO(n) nor SE(n)^{z−1}/SO(n) is invariant to reparametrization; we will address this issue in Section 5. Moreover, any given shape that is represented in SE(n)^z/SO(n) is deformable to any other representation in SE(n)^z/SO(n).

Let f(S^1) be the representation of S^1 that we want to deform to S^2, represented by f(S^2). Given the point correspondence between S^1 and S^2, we can extract the deformation of f(S^1) to f(S^2) as

    G^L = f(S^2) f(S^1)^{-1},   (6)

so that G^L f(S^1) = f(S^2). Note that f(S^1)^{-1} is the element-wise inversion of the representation, unlike f^{-1}(S^1), which is a map from the representation space to C. Since SE(n)^z/SO(n) is a group, the deformation G^L is an element of SE(n)^z/SO(n) acting from the left. Alternatively, a deformation that acts from the right is defined as

    G^R = f(S^1)^{-1} f(S^2).   (7)

Henceforth, we will use G^L and G^R to describe deformations that act from the left and deformations that act from the right, respectively. Consequently, the deformation space under consideration is the same as the proposed shape representation space.

3.1 Distance in SE(n)^z

The proposed representation space SE(n)^z is a matrix Lie group, which is not a Euclidean space. As a result, the usual definition of distance as a straight line does not generalize to SE(n)^z. Alternatively, techniques from differential geometry can be used to define distance in a curved space. In this subsection, we will overview concepts from Lie theory and differential geometry to later develop a distance function in SE(n) and its direct product group SE(n)^z.

A Lie group is a smooth manifold with smooth group operations; that is, the group's binary operator (x, y) ↦ xy^{-1} is C^∞. The tangent space at the identity e of the group is an algebra called the Lie algebra, which we will denote by g. The smooth and invertible binary operator of a Lie group enables one to define a diffeomorphism onto itself. For instance, consider a left translation of a Lie group defined as L_a : G → aG, a ∈ G, where G denotes the Lie group. Meanwhile, the differential structure, due to Lie groups being smooth manifolds, enables one to do calculus on a Lie group. To compute distance, volume and other geometric notions, however, an additional structure called a metric is needed. Consequently, a Lie group G can be complemented with a smoothly varying metric tensor ⟨·, ·⟩, making it a Riemannian manifold. The metric tensor ⟨·, ·⟩ is defined at the tangent space T_g G as ⟨·, ·⟩_g : T_g G × T_g G → R for every g ∈ G, see [16], [42] for a detailed discussion. Moreover, a metric tensor on a Lie group is said to be left invariant if the left translation diffeomorphism is an isometry, i.e., if the following is true:

    ⟨x, y⟩_e = ⟨dL_a x, dL_a y⟩_a,  ∀x, y ∈ g, ∀a ∈ G,   (8)

where dL_a is the derivative of the left translation L_a. As a result, a left invariant Riemannian metric is identified with ⟨·, ·⟩_e through the pullback map dL_a^{-1}. Subsequently, the length of a given curve γ defined on a subset of R, γ : [t_0, t_1] → G, is given as

    ℓ(γ) = ∫_{t_0}^{t_1} sqrt(⟨γ̇(t), γ̇(t)⟩_{γ(t)}) dt,   (9)

where γ̇(·) ∈ T_{γ(t)} G is the derivative of the curve. There are several possible curves that start at γ(t_0) ∈ M and end at γ(t_1) ∈ M; the shortest curve among them is called the geodesic curve. The distance between γ(t_0) and γ(t_1) is defined as the length of the geodesic curve:

    d(γ(t_0), γ(t_1)) = Inf{ℓ(γ)}.   (10)

The derivatives of a given curve γ define a vector field γ̇ on the manifold. More interestingly, if a vector field γ̇(h) defined on a Lie group is left invariant, i.e., if the following is true for h ∈ G:

    dL_a γ̇(h) = γ̇(ah) ∈ T_{ah} G,   (11)

then it is parallel along the curve, and its integral curve γ(t) = exp(t γ̇) is a geodesic. If any two points on a manifold can be connected by a geodesic curve, then the manifold is called geodesically complete or compact.

Similarly, geodesic curves and the geodesic distance can be defined in SE(n). However, SE(n) is not a compact group but a semidirect product of a compact group, SO(n), and R^n,
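Assuming representations are held as lists of homogeneous SE(n) matrices (as sketched earlier), the inverse map of Eq. (5) and the deformation extraction of Eqs. (6)-(7) are direct to implement; a minimal sketch with illustrative names:

```python
import numpy as np

def reconstruct(G, p1):
    """Inverse map of Eq. (5): rebuild the discrete curve from a fixed
    starting point p_1 by the successive action of the g_i."""
    p = np.append(np.asarray(p1, dtype=float), 1.0)  # homogeneous coords
    points = [p[:-1].copy()]
    for g in G:
        p = g @ p
        points.append(p[:-1].copy())
    return np.array(points)

def left_deformation(G1, G2):
    """Eq. (6): G^L = f(S^2) f(S^1)^{-1}, element-wise, so that
    G^L f(S^1) = f(S^2)."""
    return [g2 @ np.linalg.inv(g1) for g1, g2 in zip(G1, G2)]

def right_deformation(G1, G2):
    """Eq. (7): G^R = f(S^1)^{-1} f(S^2), element-wise."""
    return [np.linalg.inv(g1) @ g2 for g1, g2 in zip(G1, G2)]
```

Because the representation space is a group, a deformation extracted from one pair of curves can be re-applied to another representation with a plain element-wise product.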

which can be described in homogeneous coordinates as follows:

    g_i = [ R_i  v_i ]
          [ 0    1  ],  s.t.  R_i ∈ SO(n), v_i ∈ R^n.   (12)

Subsequently, geodesic curves that connect two points in SO(n) and R^n, respectively, are defined as

    φ_R(t) = R_1 (R_1^{-1} R_2)^t,   (13)
    φ_v(t) = v_1 + (v_2 − v_1) t,   (14)

where t ∈ [0, 1]. It can easily be checked that φ_R(t) is geodesic in SO(n), though not necessarily unique [8], [41], whereas φ_v(t) is clearly geodesic since R^n is a vector space. Consequently, in [65] the following curve in SE(n) is proven to be geodesic:

    φ(t) = [ φ_R(t)  φ_v(t) ]
           [ 0       1      ],   (15)

where t ∈ [0, 1]. Moreover, we can define a Riemannian metric on the Lie algebra of SE(n) as

    ⟨(R_1, v_1), (R_2, v_2)⟩_e = ⟨R_1, R_2⟩ + ⟨v_1, v_2⟩,   (16)

where R ∈ so(n), the Lie algebra of SO(n). Subsequently, using (15) and (16), the length of a geodesic curve connecting g_1, g_2 ∈ SE(n) can be computed by transporting the tangent vectors with the pullback to the Lie algebra. The geodesic distance, in this case, reads as

    d(g_1, g_2) = ∫_0^1 sqrt(⟨dL_{φ(t)}^{-1}(φ̇(t)), dL_{φ(t)}^{-1}(φ̇(t))⟩_e) dt.   (17)

Since φ(t) is a geodesic curve, the tangent vectors φ̇(t) are parallel along φ(t). Hence, the geodesic distance given in (17) is simplified as

    d(g_1, g_2) = (1 − 0) × sqrt(⟨φ̇(t), φ̇(t)⟩_e)
                = (||log(R_1^T R_2)||_F^2 + ||v_2 − v_1||_2^2)^{1/2},   (18)

where ||·||_F denotes the Frobenius norm. Subsequently, we extend the geodesic curve φ(t) given in (15) to the direct product space SE(n)^z = SE(n) × ··· × SE(n) as follows:

    ζ(G_1, G_2) = (φ(t)_1, ..., φ(t)_z),   (19)

such that φ(t)_i is the geodesic curve between g_i^1 ∈ G_1 and g_i^2 ∈ G_2. It can be shown that (19) is a geodesic curve in the product group, see [16]. A geodesic distance in SE(n)^z follows directly from (19) and is given as

    d(G_1, G_2) = (d(g_1^1, g_1^2)^2 + ··· + d(g_z^1, g_z^2)^2)^{1/2}.   (20)

In summary, the geodesic path and distance between the representations of two shapes f(S^1) and f(S^2) can be computed using (19) and (20), respectively. Nevertheless, the defined geodesic distance (20) is not invariant to parametrization.

4 PROPERTIES AND INTERPRETATION

In this section, we present some of the important properties of the proposed representation and its metric, followed by a discussion of its interpretation and relationship with earlier works.

Fig. 3: The first set of shapes shows two examples where S^1 deforms to S^1'. The second set shows the transported deformation to their similar objects S^2 to give S^2', respectively.

4.1 Properties

1) The proposed distance metric is left invariant. In general, a metric ρ that is defined on a group G is said to be left invariant if ρ(a, b) = ρ(qa, qb), ∀a, b, q ∈ G, and the defined metric (20) is left invariant, see [40] for further details. The implication of this is that the distance between two shapes is invariant to a deformation acting on both shapes from the left. More concretely, let G^0 ∈ SE(n)^z be a deformation acting on the representations of two distinct shapes, f(S^1) and f(S^2); then d(f(S^1), f(S^2)) = d(G^0 f(S^1), G^0 f(S^2)). This fact can be observed by plugging the action of G^0 into (20), in which case it will cancel itself out. As discussed in [15], this property is particularly important in transporting a deformation between two similar shapes. To clarify further, we consider below a deformation transportation problem discussed in [54]. Let S^1 and S^1' be shape contours representing exactly the same real world object, denoted by O^1, except that S^1' is deformed under some unknown external factor. For instance, S^1 and S^1' can be contours of O^1 from different viewing angles. Furthermore, let O^2 denote a similar, but not identical, object to O^1 with S^2 as its shape contour. Subsequently, given S^1, S^1', and S^2, the problem of estimating S^2', a deformed S^2 under the same external factor, can be framed as a deformation transportation problem. In [54], the deformation is estimated by transporting the vector field along the geodesic curve connecting S^1 and S^1'. Alternatively, in our framework, the deformation due to an external factor can be factored out as G^L = f(S^1') f(S^1)^{-1}. Consequently, since our metric is left invariant, d(f(S^1), f(S^2)) = d(G^L f(S^1), G^L f(S^2)). Thus, f(S^2') = G^L f(S^2), see Fig. 3.

2) The proposed metric does not compute differential quantities of curved shapes. Most infinite dimensional representations define a similarity metric based on differential quantities [37], [54]. However, differential quantities are highly sensitive to noise and local perturbations. Consequently, a pre-smoothing stage is necessary before analysing a shape. On the other hand, pre-smoothing a shape can potentially filter legitimate features, see Fig. 2. Although the proposed representation is based on the relative transformation matrices between neighbouring points, it is not as severely sensitive to local perturbations as curvature is, for example [36]. Hence, it needs no pre-smoothing.

3) The proposed representation encodes a given curve by the relative deformation of points as opposed to the absolute deformation of points.
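The closed-form distance of Eq. (18) and its product extension of Eq. (20) can be sketched directly, assuming shapes are given as lists of homogeneous SE(n) matrices; `scipy.linalg.logm` supplies the principal matrix logarithm (the helper names are illustrative):

```python
import numpy as np
from scipy.linalg import logm

def dist_se_n(g1, g2):
    """Eq. (18): d(g1, g2) = (||log(R1^T R2)||_F^2 + ||v2 - v1||^2)^{1/2},
    for g1, g2 in SE(n) given as (n+1)x(n+1) homogeneous matrices."""
    n = g1.shape[0] - 1
    R1, v1 = g1[:n, :n], g1[:n, n]
    R2, v2 = g2[:n, :n], g2[:n, n]
    L = np.real(logm(R1.T @ R2))  # principal log of a rotation matrix
    return np.sqrt(np.linalg.norm(L, 'fro') ** 2
                   + np.linalg.norm(v2 - v1) ** 2)

def dist_product(G1, G2):
    """Eq. (20): geodesic distance in the direct product SE(n)^z."""
    return np.sqrt(sum(dist_se_n(a, b) ** 2 for a, b in zip(G1, G2)))
```

A quick numerical check of the left invariance property of Section 4.1 is to verify that pre-multiplying both arguments by the same group element leaves the distance unchanged.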

Fig. 2: Shapes along the geodesic path between the initial shape (first column) and the target shape (last column). The odd rows show results from our approach, while the even rows are results from [54]. All shapes are represented by 100 uniformly sampled and normalized points. We note that results from [54] are smoothed and lose local features of the shapes.

To further elaborate, let S^1 = (p^1_1, ..., p^1_z) and S^2 = (p^2_1, ..., p^2_z) be two shapes that are scaled and aligned, see Fig. 4. Subsequently, we define the absolute deformation of points (AD) as the transformation of every p^1_i ∈ S^1 to match its corresponding point p^2_i ∈ S^2. In earlier works, such deformations are modelled by displacement vector fields [2] and, in some cases, by the action of a matrix Lie group [19]. If we formulate the matching of each point, from S^1 to S^2, by the action of rigid transformation matrices, then the AD is given by G = (T_1, ..., T_z) ∈ SE(n)^z such that

    S^2 = G S^1 = (T_1 p_1, ..., T_z p_z).   (21)

Alternatively, rather than aligning every matching point, as is done with AD, one can align the intrinsic properties of the curves: these are properties of a curve that are mainly invariant to displacement. In this paper, the intrinsic properties of a given curve are approximated by the relative deformation of the curve's points, as defined in (3) by f(·). Hence, the alignment of S^1 and S^2 under the relative deformation of points (RD) is given as f(S^2) = f(S^1) G^R, where G^R ∈ SE(n)^z, see Section 3. Subsequently, using (18), the cost of deformation under RD is given by

    d_r(S^1, S^2) = (Σ_{i=1}^z d(e, g_i^R)^2)^{1/2}
                  = (Σ_{i=1}^z ||log((R_i^1)^T R_i^2)||_F^2 + ||v_i^2 − v_i^1||_2^2)^{1/2},   (22)

where e is the identity of SE(n). The cost of deformation under AD, in turn, is given as

    d_a(S^1, S^2) = (Σ_{i=1}^z d(e, T_i)^2)^{1/2}
                  = (Σ_{i=1}^z ||log(R_i)||_F^2 + ||v_i||_2^2)^{1/2}.   (23)

Note that d_r is the same as (20). The main difference between d_r and d_a is tolerance to displacement: d_r emphasizes intrinsic differences between curves, while d_a emphasizes the cost of displacement per point, see Fig. 4.

Fig. 4: Two pairs of curves. Under AD, the second pair of curves is more similar than the first, while under RD the first pair is more similar than the second.

4.2 Interpretation

The proposed distance function, as derived in (20), has a direct relationship to what is called the effort functional [57], [61], [63]. In general, a distance between shapes based on the effort functional is written as

    d_s(S̄^1, S̄^2) = min_{ψ ∈ Ψ} {d_Ψ(e, ψ) : S̄^1 = ψ(S̄^2)},   (24)

where Ψ is the space of allowed deformations for matching S̄^2 to S̄^1, and d_s and d_Ψ are distance metrics in the shape and deformation spaces, respectively. Such a formulation leads to a natural interpretation of distance as a measure of deformation by least action, hence the term effort functional. For further mathematical details see [45], [57]. Similarly, a version of (24) under the proposed representation can be formulated by explicitly defining the space of deformations Ψ for matching S̄^1 and S̄^2. To define Ψ more precisely, we introduce monotonic and injective maps from a discrete parameter space to the arc length parametrization as ξ_i : [1, z] → [0, ℓ]. Here, ℓ denotes the length of a curve, and i is a variable over the positive integers, used as an identifier for different mappings. Given a fixed z, the mappings have a similar effect to what are referred to as shape and orientation preserving diffeomorphisms [63], although in this case there is no guarantee for a mapping to preserve shape, see Fig. 5.
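To make the contrast concrete, the two costs can be sketched as follows. `d_relative` implements Eq. (22) on precomputed representations; `d_absolute` sketches Eq. (23) under the simplest admissible choice of matching transforms, T_i = (I, p_i^2 − p_i^1), a pure per-point translation. Eq. (21) leaves the T_i underdetermined, so this is one illustrative choice, under which d_a reduces to the summed point displacements.

```python
import numpy as np
from scipy.linalg import logm

def d_relative(G1, G2):
    """Eq. (22): RD cost, computed from the representations
    G1 = f(S^1), G2 = f(S^2) of two matched curves."""
    total = 0.0
    for g1, g2 in zip(G1, G2):
        n = g1.shape[0] - 1
        L = np.real(logm(g1[:n, :n].T @ g2[:n, :n]))
        total += (np.linalg.norm(L, 'fro') ** 2
                  + np.linalg.norm(g2[:n, n] - g1[:n, n]) ** 2)
    return np.sqrt(total)

def d_absolute(S1, S2):
    """Eq. (23), sketched with T_i = (I, p_i^2 - p_i^1): pure per-point
    translations (one admissible resolution of Eq. (21)), under which
    d_a is the root of the summed squared displacements."""
    D = np.asarray(S2, dtype=float) - np.asarray(S1, dtype=float)
    return np.sqrt((D ** 2).sum())
```

Under this sketch, translating an entire curve changes `d_absolute` but leaves the relative transforms, hence `d_relative`, untouched, which is exactly the tolerance-to-displacement contrast drawn in Fig. 4.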

In other words, the mappings are sampling functions that select z ordered points from a continuous curve. Subsequently, S_i^j = S̄^j ∘ ξ_i denotes ordered sample points from S̄^j obtained by using ξ_i. Finally, we define the set of deformations between S̄^1 and S̄^2, for a given z, as Ψ = {G^R | ∃ ξ_i, ξ_j : f(S̄^1 ∘ ξ_i)G^R = f(S̄^2 ∘ ξ_j)}. Consequently, under the framework of the proposed representation, the distance between two given shapes is given as

d_s(S̄^1, S̄^2) = \min_{G^R ∈ Ψ} d(f(S_i^1), f(S_j^2)).    (25)

Moreover, using the left invariance property of the metric and the definition of Ψ, (25) can be simplified as follows

d_s(S̄^1, S̄^2) = \min_{G^R ∈ Ψ} d(f(S_i^1), f(S_i^1)G^R) = d(e, G^R).    (26)

Hence, a version of the effort functional (24) is described as (26) in the proposed representation space. However, if ξ_i and ξ_j are given a priori, Ψ will be a singleton– since G^R can be computed directly, as described in (7). For instance, we have so far assumed a perfect point-to-point correspondence between shapes; thus, ξ_i and ξ_j are assumed to be known. Consequently, the proposed metric can be viewed as measuring the optimal deformation between two shapes given an optimal point correspondence. In this context, optimal deformation is to be interpreted as the deformation with the least action, while optimal point correspondence stands for the estimation of ξ_i and ξ_j such that (25) is minimized.

Fig. 5: Different samplings of a given shape S̄: S̄, S̄ ∘ ξ_1, S̄ ∘ ξ_2, S̄ ∘ ξ_∗. The red dots denote sampled points. The last sampling, ξ_∗, is a uniform sampling. Note that ξ_1 and ξ_2 do not preserve shape.

5 POINT CORRESPONDENCE ESTIMATION

In general, point correspondence estimation is given by the solution of the following

\arg\min_{ψ ∈ Ψ} { d_s(ψ(S̄^1), S̄^2) + E(ψ) },    (27)

where Ψ denotes a deformation space, d_s is a dissimilarity metric, and E(·) is an energy term that measures the cost of the deformation. The objective of (27) is to solve not only for an optimal point matching but for smooth deformations as well. There are several works on the formulation and optimization of both d_s and E(·). In the deformation based approaches, the energy term is usually described in terms of the effort functional given in (24) [61]. However, there are several elaborate formulations of the energy term based on elasticity theory [45], [63]. Similarly, extensive work has been done in formulating a dissimilarity function. Particular to shape matching, in [12] a dissimilarity function that is based on curvature is formulated. Later, in [55] and [50], a symmetric version of what was proposed in [12] is presented and solved with dynamic programming. Regardless of the dissimilarity function, in [49] it is shown that restricting the matching solutions to those that preserve the order of points leads to a more accurate result than the unrestricted case. In this section, we formulate the point correspondence estimation problem as an estimation of a sampling function and propose two possible approaches to solve it. The general form of the shape matching cost (27) is given as a measure of shape similarity and deformation cost. However, as shown in (26), the proposed similarity metric is equivalent to an effort functional. Hence, it measures the cost of deformation as a dissimilarity between shapes. Subsequently, since G^R = f(S_i^1)^{-1} f(S_j^2), we describe (26) in terms of the sampling functions as

\min_{ξ_i, ξ_j} d(e, f(S_i^1)^{-1} f(S_j^2)).    (28)

However, the cost functional given in (28) assumes the sampling functions preserve shape. On the contrary, if the sampling function does not preserve shape then (28) will deviate from the target shape, see Fig. 5. In this work, instead of insisting on sampling functions that preserve shape, we add a penalty term to (28) so that we get a constrained objective functional that attempts to balance optimal point matching and shape preservation. To that end, let ξ_∗ denote a uniform sampler and assume that it preserves shape. Subsequently, the alteration of a given shape S̄^1, when sampled by ξ_i, from a uniformly sampled one is quantified as

ρ_{S̄^1}(ξ_∗, ξ_i) = \sum_{j=2}^{z} A_{S̄^1}(p_{j-1}^∗, p_j^∗) − \sum_{j=2}^{z} A_{S̄^1}(p_{j-1}^i, p_j^i).    (29)

Each term of (29) computes the enclosed area of S̄^1 when sampled by ξ_∗ and ξ_i using Green's theorem. Note that the curves are oriented. Consequently, ρ_{S̄^1}(·,·) is used to penalize a sampler that does not preserve area– in the case of non-planar curves one can replace the area preservation requirement with a length preservation requirement, which is less restrictive. Subsequently, the constrained point correspondence cost functional is given as

\min_{ξ_i, ξ_j} { α × d(e, f(S_i^1)^{-1} f(S_j^2)) + β × (ρ_{S̄^1}(ξ_∗, ξ_i) + ρ_{S̄^2}(ξ_∗, ξ_j)) }.    (30)

We have added weighting factors α and β to emphasize the effect of one over the other– assigning a large value to β leads to an objective functional that prefers samplers that preserve area even at a high deformation cost, and vice versa for a small β value, see Fig. 7. The objective functional given in (30) is symmetric. Moreover, its solution can be estimated via dynamic programming, since it can be decomposed into linearly sequential sub-problems [6]. Subsequently, we propose two possible methods for estimating point correspondence between shapes based on (30). In the next subsections, we consider the optimization of (30) with dynamic programming, and later its optimization constrained to uniform sampling.

5.1 Optimal sampling

In this subsection, we describe a dynamic programming based solution for optimal point correspondence estimation between open curves; the case of closed curves is addressed later.
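Before turning to the optimization, note that the area terms of (29) reduce to shoelace computations under Green's theorem. The sketch below is a hedged illustration: it assumes each A_{S̄}(p_{j−1}, p_j) is the Green's-theorem contribution of the chord between consecutive sampled points, so that the sums in (29) compare the areas enclosed by the two sampled polygons; the helper names are hypothetical.

```python
import numpy as np

def polygon_area(pts):
    """Signed enclosed area of the polygon through the sampled points,
    via Green's theorem (shoelace formula); the sign encodes orientation."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

def rho(curve, idx_uniform, idx_candidate):
    """Area penalty in the spirit of (29): difference between the area of
    the uniformly sampled polygon and that of a candidate-sampled polygon.
    (29) is written as a signed difference; an absolute value may be
    intended in practice."""
    return polygon_area(curve[idx_uniform]) - polygon_area(curve[idx_candidate])
```

A candidate sampler that bunches its points on one side of the curve encloses less area than the uniform sampler and is penalized accordingly.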

Fig. 6: Intermediate values of dynamic programming. (a) shows two input shapes: S_∗^1, coloured in green, is uniformly sampled and S̄^2, coloured in red, is to be sampled optimally. (b) shows the search space, defined by the charts U_i, and the cost of selecting a point from S̄^2 for the ith position with the color coding. (c)–(e) show three optimal sampling paths based on (32) for different values of α and β: (c) α = 1, β = 10; (d) α = 1, β = 1; (e) α = 10, β = 1. The green path is optimal for α = 1 and β = 0, the red path is optimal for α = 0 and β = 1, and the yellow path is an optimal sampling for the α and β values shown for each panel.

Fig. 7: Geodesic paths for different values of α and β (geodesic distances: α = 10, β = 1: 20.9151; α = 1, β = 1: 3.6601; α = 1, β = 10: 2.0915). Note the impact of a large β value on the deformation (geodesic) and the final result.

Fig. 8: Geodesic paths between two shapes under different elasticity constraints η. Note that an objective functional with an appropriate β value gives a consistent result regardless of the elasticity constraint.

As indicated earlier, the objective functional given in (30) can be solved with dynamic programming in polynomial time. To show this, we first describe the z sample points of a shape in a more general form as

S̄^k ∘ ξ_j = (x_1, x_2, ··· , x_z) : x_i ∈ U_i,    (31)

where x_{i=1,···,z} are variables that can take points as values from their respective domains U_{i=1,···,z}, while the U_i ⊂ S̄^k are charts of the curve S̄^k. Hence, the size of U_i introduces a restriction on the search space of the corresponding x_i. In this work, we define the charts U_{i=1,···,z} by sliding a pre-defined window size to cover the whole curve, see Fig. 6a. The ratio of the window sizes, defined on the two given shapes, determines the constraint on the elasticity of a deformation from one to the other; we denote this ratio by η. For instance, if w_1 and w_2 are the window sizes defined for S̄^1 and S̄^2, respectively, then we approximate the elasticity constraint by η = w_2/w_1, see Fig. 8. Moreover, we fix the sampling of one of the argument shapes to uniform sampling. Fixing one of the samplings converts what was a symmetric objective functional (30) into an asymmetric one. This is computationally efficient when aligning a set of shapes: with a fixed reference shape, aligning n shapes requires n − 1 alignment operations, whereas a direct one-to-one alignment requires (n−1)! alignment operations. Moreover, we have not seen a significant variation in performance due to the choice of a particular curve as a reference, see the results in Subsection 7.1 where each shape is used as a reference.

Subsequently, given a uniformly sampled reference curve S_∗^1, let its representation be f(S_∗^1) = (g_1^∗, ··· , g_z^∗). We then solve for an optimal sampler of a given curve S̄^2, with respect to S_∗^1, by optimizing (30), which is rewritten as

\arg\min_{ξ_j} \sum_{i=2}^{z} φ_i(x_{i-1}, x_i),    (32)

where φ_i is defined as

φ_i(x_{i-1}, x_i) = α × d(g_i^∗, g_i) + β × ( A_{S̄^2}(p_{i-1}^∗, p_i^∗) − A_{S̄^2}(x_{i-1}, x_i) ),    (33)

where x_{i-1} and x_i are sampled points of S̄^2 under a candidate sampler for the (i−1)th and ith positions, such that g_i x_{i-1} = x_i. Moreover, only the penalty term of S̄^2 is added in (32); since the sampling of S_∗^1 is fixed to uniform sampling, its penalty is zero. Subsequently, (32) is identified as a composition of linearly sequential sub-problems. As a result, its solution is estimated by computing the minimum and the minimizer of (32) sequentially, see Appendix. Assuming the same search space size, i.e. ∀i, s = |U_i|, the optimal sampling function can be estimated in O(s²z) time for open curves.
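The sequential minimization described above can be sketched as a standard dynamic program over the charts. The snippet below is illustrative: `charts[i]` lists the candidate point indices of U_i, and `cost` stands in for φ_i of (33).

```python
def optimal_sampling(charts, cost):
    """Dynamic program for (32).

    charts[i] enumerates the candidate points of U_i; cost(i, a, b) returns
    phi_i(x_{i-1} = a, x_i = b).  The minimum and minimizer are computed
    sequentially, in O(s^2 z) time for s = |U_i|."""
    z = len(charts)
    best = {a: 0.0 for a in charts[0]}    # cost-to-reach each candidate of U_0
    back = [dict() for _ in range(z)]     # back-pointers for path recovery
    for i in range(1, z):
        nxt = {}
        for b in charts[i]:
            val, arg = min((best[a] + cost(i, a, b), a) for a in charts[i - 1])
            nxt[b], back[i][b] = val, arg
        best = nxt
    # recover the optimal sampling path by backtracking
    b = min(best, key=best.get)
    total = best[b]
    path = [b]
    for i in range(z - 1, 0, -1):
        b = back[i][b]
        path.append(b)
    return path[::-1], total
```

Each position i only looks back at the candidates of U_{i−1}, which is exactly the linear sequential decomposition that makes (32) tractable.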

The point correspondence estimation for closed curves follows the same procedure. However, unlike for open curves, the starting point is not known a priori. Hence, the optimal sampling for every point as a starting point is computed, and the point with the least cost is chosen as the starting point along with its sampling solution.

Fig. 9: Optimal vs uniform sampling. For the optimal case, the green curves are sampled uniformly, and the curves in red optimally. Note that the optimal sampler adjusts the sampling rate while the uniform sampling does not.

5.2 Uniform sampling

In this subsection, we present point correspondence estimation for uniformly sampled shapes, as discussed in [15]. The approach restricts shape sampling functions to a group of z-cyclic permutations, given that the shapes to be aligned are uniformly sampled. To elaborate, given two uniformly sampled closed curves S_∗^1 and S_∗^2, the estimation of point correspondence is formulated as estimating the starting point and the orientation of S_∗^2; since the sampling rate is uniform, it is not optimized. Subsequently, a family of uniform samplers is defined as ξ_∗^i : [0, z] ∋ j → (i + j) mod z ∈ [0, z], where mod represents the modulo operation. As a result, for a fixed orientation the optimal starting point is estimated by running through all possible starting points and selecting the one that minimizes the following

I(f(S_∗^1), f(S_∗^2)) = \min_{i ∈ [1, z]} d(f(S_∗^1), f(S_∗^2 ∘ ξ_∗^i)).    (34)

Note that (34) is a restricted version of (30). Furthermore, the approach considers orientation reversing matchings by generalizing (34) to the following objective functional

\arg\min ( I(f(\overrightarrow{S_∗^1}), f(\overrightarrow{S_∗^2})), I(f(\overrightarrow{S_∗^1}), f(\overleftarrow{S_∗^2})) ),    (35)

where → and ← represent ordering in the clockwise and anti-clockwise directions, respectively. Meanwhile, the point correspondence between two given open curves is done by testing only the different end points as starting points. The solution of (35) is estimated iteratively using nested loops. Thus, the time-complexity of aligning closed curves is O(z²). Moreover, it is worth noting that if f(\overrightarrow{S_∗^2}) = (g_1, ··· , g_z) then f(\overleftarrow{S_∗^2}) = (g_z^{-1}, ··· , g_1^{-1}). In comparison, if optimal sampling is used for estimating the point correspondence between closed shapes, the time-complexity would be O(s²z²). Although computationally efficient, uniform sampling gives a meaningful matching only for shapes with small deformations or linearly elastic deformations. On the contrary, the optimal sampling scheme gives a consistently meaningful matching even under occlusion and missing parts, see Fig. 9.

Fig. 10: Examples of clustering results for k = 2. (a) based on optimal sampling. (b) based on uniform sampling.

6 APPLICATION IN SHAPE ANALYSIS

As discussed in earlier sections, the proposed representation space, SE(n)^z, is a curved space. Nevertheless, it is possible to fit statistical models and do further shape analysis using the defined distance metric (20). Particularly, in [14] an inhomogeneous time Markov process is used to capture the statistical properties of a deformable shape represented in a matrix Lie group. Grouping a collection of data into distinct clusters is an alternative data analysis technique that is extensively studied in a wide range of problems. Apart from the well-known k-means algorithm, there are several clustering algorithms emphasizing different aspects of the problem [46], [59]. However, the performance of a clustering algorithm depends on a goal specific similarity metric, among other factors. In that regard, we propose (20) or (35), depending on the assumption, as a similarity metric to cluster shapes. To that end, we discuss the modelling of a shape class, represented in SE(n)^z, by k-means clustering.

6.1 K-means for shape clustering

In k-means clustering, the central idea is to estimate k points such that the data points around these k centroids are more similar to each other under the metric being considered. Moreover, the k points are thought of as the means of the clusters [35]. Consequently, we will mainly discuss the computation of means in SE(n)^z. Over the past years, considerable work has been done on characterizing the mean of several matrices [4], [9], [25]. For our purpose, however, we will consider the Fréchet mean and its local construction called the Karcher mean [26], [47]. The Fréchet mean is a generalization of a distribution's centroid in any metric space U; it is defined as

\min_{x̃ ∈ U} \sum_{y ∈ U} δ(x̃, y)^2,    (36)

where δ(·,·) denotes a distance metric. Essentially, it characterizes the mean of a distribution, under a given metric, as a value that minimizes the dispersion of the dataset. Subsequently, by replacing the distance metric δ(·,·) with (18), we get the following objective function for computing the mean of m rigid transformation matrices:

\min_{R̃ ∈ SO(n)} \sum_{j=1}^{m} ||\log(R̃^T R_j)||_F^2 + \min_{ṽ ∈ R^n} \sum_{j=1}^{m} ||v_j − ṽ||_2^2.    (37)
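Returning briefly to the alignment of Subsection 5.2: the starting-point search (34) is an exhaustive loop over cyclic shifts. A minimal sketch, with `dist` standing in for the product metric d and plain sequences standing in for the tuples of group elements:

```python
def cyclic_align(f1, f2, dist):
    """Search (34): evaluate every cyclic shift of f2 against f1 and keep
    the best.  z shifts, each with an O(z) metric evaluation, giving the
    O(z^2) complexity noted in the text."""
    z = len(f2)
    best_i, best_cost = 0, float("inf")
    for i in range(z):
        shifted = f2[i:] + f2[:i]   # f2 composed with the sampler xi_*^i
        c = dist(f1, shifted)
        if c < best_cost:
            best_i, best_cost = i, c
    return best_i, best_cost
```

The full objective (35) additionally evaluates the orientation-reversed representation (g_z^{-1}, ··· , g_1^{-1}) and keeps the better of the two matchings.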

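For intuition about the Karcher mean used below, here is a minimal sketch of the rotation term of (37) in the planar case (n = 2), where a rotation is identified with its angle. This is an illustrative stand-in with hypothetical names, not the paper's implementation.

```python
import numpy as np

def karcher_mean_angle(thetas, iters=100, tol=1e-12):
    """Local Frechet (Karcher) mean of planar rotations given as angles:
    gradient descent on the dispersion (36), moving the estimate by the
    average of the log-maps (angle differences wrapped to (-pi, pi])."""
    mu = float(thetas[0])
    for _ in range(iters):
        step = float(np.mean(np.angle(np.exp(1j * (np.asarray(thetas) - mu)))))
        mu = float(np.angle(np.exp(1j * (mu + step))))  # keep mu in (-pi, pi]
        if abs(step) < tol:
            break
    return mu
```

The translation term of (37) needs no iteration: it is minimized by the arithmetic mean of the v_j, as the text explains next.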
The second term of (37) is satisfied by taking the arithmetic mean, since R^n is a vector space. For the first term, however, an optimal global solution is not ensured. Thus, the local solution of the Fréchet mean (the Karcher mean) is computed [1], [47]– a direct product of (37) is optimized to compute the mean of shape representations. Subsequently, given prior knowledge of k, an expectation-maximization (EM) algorithm is used to estimate the k mean points and the clusters around them, see Fig. 10. Note that the results of the EM algorithm depend on the initialization. If k is not known a priori, alternative assumptions can be made on the distribution of the data to automatically estimate k [24]. In general, the k-means of shape clusters are estimated by optimizing the following with the EM-algorithm

\min \sum_{j=1}^{m} \sum_{i=1}^{k} r_{ij} d(f(S̃^i), f(S^j))^2,    (38)

where f(S̃^i) is the representation of the estimated mean shape of the ith cluster, and r_{ij} is 1 if and only if the jth shape is assigned to the ith cluster and otherwise is 0.

7 EXPERIMENTAL RESULTS

In this section, we present two main experimental scenarios and a concrete computational time cost of the approach. In Subsection 7.1, an evaluation of the accuracy and representativeness of the proposed approach is presented. In Subsection 7.2, the impact of occlusion, sampling point variation, and local perturbation on the proposed approach is presented. In Subsection 7.3, the computational cost of our approach and of [54] is presented.

7.1 Shape retrieval

Given a query shape, a shape retrieval system tries to rank shapes in a given database according to their similarity to the query. Usually, such systems are solely based on a distance metric rather than a mathematical model of shape categories. Hence, the performance of a shape retrieval system highlights the representativeness of the used distance metric. As a result, we present the evaluation of a shape retrieval system that is based on the proposed distance metric, (34) and (32), on different datasets.

Flavia leaf dataset: The dataset contains 32 types of leaf species with a total of 1907 examples, see Fig. 11a. In [31] a leave-one-out test scenario was performed on the dataset to evaluate an elastic similarity metric derived from the SRV-framework [54]. Leave-one-out is a setup where every leaf is used as a query against the rest of the dataset. To compare our approach with other methods, we also replicate the leave-one-out scenario with Mean Average Precision (MAP) used as a performance measure. For this experiment, every leaf shape is represented by z = 200 points that are uniformly sampled from its boundary. Table 3 summarizes the result of our approach and the results reported in [31] and [43]. Although our method achieved a high MAP, it is not necessarily inclusive of all relevant information; precision drops as recall goes to 1, see Fig. 12a. Nonetheless, it outperformed the elastic shape metric and the Gaussian elastic metric, discussed in [31], in terms of MAP. One possible reason for this is that we do not pre-smooth the data and thus local details are more likely to be captured with our method.

Swedish leaf: The Swedish leaf dataset contains 1125 example leaves from 15 leaf types. The example shapes are distributed uniformly; there are 75 example leaves from each species. To compare our approach with other methods, we follow the same experimental scenario discussed in [31], [43], which is nearest neighbour classification. We randomly select 25 leaf shapes from each type and use the left out 50 for testing. We repeat the experiment 100 times and take the average classification rate. In all of the 100 runs, the shapes are represented by z = 200 uniformly sampled points. Our method achieved an average classification rate of 99.50 with a standard deviation of 0.01287. Table 4 shows the comparison of our result with other shape matching methods. We note that the performance of a nearest neighbour classifier is not an explicit measure of a similarity metric's performance, as the classification is based on the best result among the elements of a group. Consequently, in Fig. 12a we show the precision-recall (PR) curve of a retrieval result obtained by performing the leave-one-out experimental scenario on the dataset. Note that the area under the PR curve is smaller in the Swedish leaf case as compared to the area under the PR curve for the Flavia leaf dataset. In general, we observe that the inter-class similarity in the Flavia leaf dataset is high as compared to the Swedish leaf dataset, making discrimination in Flavia difficult, see Fig. 11. However, the intra-class variation in the Swedish dataset is mainly due to nonlinear elastic deformations, whereas in the Flavia it is mainly due to a combination of rotation and scaling. Hence, a distance metric based on uniform sampling, see Subsection 5.2, does not perform as well on the Swedish leaf dataset as on the Flavia leaf dataset, in terms of precision and recall.

Generic shapes: To evaluate the impact of optimal sampling based point correspondence estimation, we tested our approach on generic shape datasets: MPEG-7 [32], Kimia99 and Kimia216 [51]. The MPEG-7 dataset contains 70 shape classes each containing 20 elements, Kimia99 is composed of 9 shape classes with 11 examples each, and Kimia216 contains 18 classes of shapes with 12 examples each. All the datasets are composed of shape categories with variations due to different view points and/or large deformations. As a result, these datasets pose a significantly more challenging problem than either the Flavia or the Swedish leaf dataset. We evaluate our approach, on all datasets, by replicating the experimental scenario described in [5] and [58], which is similar to leave-one-out. In the case of Kimia99 and Kimia216, retrieval accuracy is measured by counting, among the top 10 (in Kimia99) and top 11 (in Kimia216) retrievals, those from the same class as the query, excluding the query shape [58]. Meanwhile, retrieval accuracy on the MPEG-7 dataset is measured based on what is called the "Bull's eye score"– the Bull's eye score takes the overall percentage of retrieval results, among the first 40, that belong to the same class as the query [58]. In all datasets, we evaluate our approach for z = 100 using both uniform and optimal sampling– for the optimal case, the query shape is sampled uniformly and each element from the dataset is sampled optimally using (32). More importantly, in both cases, matching starting points and shape orientations are estimated as discussed in Sections 5.2 and 3, respectively. Furthermore, β = 0.1, α = 1, and η = 20 in all of the optimal sampling based point matchings.

Methods                              1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
SC [7]                                97  91  88  85  84  77  75  66  56  37
Shock graph [51]                      99  99  99  98  98  97  96  95  93  82
Height functions [58]                 99  99  99  99  98  99  99  96  95  88
Symbolic representation [13]          99  99  99  98  99  98  98  95  96  94
Our method with optimal sampling      99  99  97  97  97  98  93  90  79  53
Our method with uniform sampling      91  81  73  75  63  57  51  44  35  30

TABLE 1: Retrieval results on Kimia99 shape dataset.
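The MAP measure used in the leaf experiments can be sketched generically as a leave-one-out evaluation over a precomputed pairwise distance matrix D (an illustrative sketch, not the paper's code):

```python
import numpy as np

def mean_average_precision(D, labels):
    """Leave-one-out MAP: every shape queries the rest; average precision is
    computed over the ranks at which same-class shapes are retrieved."""
    n = len(labels)
    aps = []
    for q in range(n):
        order = [i for i in np.argsort(D[q]) if i != q]  # exclude the query
        hits, precisions = 0, []
        for rank, i in enumerate(order, start=1):
            if labels[i] == labels[q]:
                hits += 1
                precisions.append(hits / rank)
        aps.append(np.mean(precisions) if precisions else 0.0)
    return float(np.mean(aps))
```

A metric that ranks every same-class shape ahead of all others attains a MAP of 1.0; late retrievals of relevant shapes pull the score down, which is why precision near recall 1 matters in the discussion above.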

Methods                              1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th 11th
SC [7]                               214 209 205 197 191 178 161 144 131 101  78
Shock graph [51]                     216 216 216 215 210 210 207 204 200 187 163
Skeleton graph matching [5]          216 216 215 216 213 210 210 207 205 191 177
Our method with optimal sampling     216 216 208 205 203 193 192 178 169 162 123
Our method with uniform sampling     212 210 188 181 174 165 159 151 141 132 120

TABLE 2: Retrieval results on Kimia216 shape dataset.
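Similarly, the Bull's eye protocol used for MPEG-7 can be sketched as follows (generic sketch; in the actual benchmark window = 40 and every class has 20 members):

```python
import numpy as np

def bulls_eye(D, labels, window=40):
    """Bull's eye score: for each query, count retrievals from the query's
    class among the top `window` (query itself included), normalized by the
    maximum attainable number of same-class hits."""
    n = len(labels)
    class_size = {}
    for l in labels:
        class_size[l] = class_size.get(l, 0) + 1
    hits, possible = 0, 0
    for q in range(n):
        top = np.argsort(D[q])[:window]
        hits += sum(1 for i in top if labels[i] == labels[q])
        possible += min(window, class_size[labels[q]])
    return hits / possible
```

Because the query itself is counted among the top retrievals, a perfect metric scores exactly 1.0.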

Fig. 11: Examples of different leaf types. (a) from the Flavia leaf dataset. (b) from the Swedish leaf dataset.

Fig. 12: (a) PR curves on the Flavia and the Swedish leaf dataset. (b) PR curves under different noise magnitudes (σ = 0, 0.5, 1.5, 2.5) introduced to the original fighter jet planes dataset.

In Tables 1 and 2 we compare our results with other approaches evaluated on Kimia99 and Kimia216. On MPEG-7, retrieval based on the uniform sampling approach achieved a bull's eye score of 68.02%, while retrieval based on the optimal sampling approach achieved 84.17%. We note that retrieval results based on optimal samplings achieve a better score than results based on uniform samplings, in all three datasets. In general, given a large η and a small β (relative to α), retrieval based on an optimal sampling should perform better than retrieval based on a uniform sampling. Nevertheless, we stress that the reported optimal sampling based results rely on estimates of the starting point and the shape orientation alignment from the uniform sampling approach. Hence, the optimal sampling inherits misalignment errors from the uniform sampling, especially in cases of large deformations, which leads to performance degradation.

Fig. 13: Impact of β on partial shape matching (rows alternate between α = 1, β = 5 and α = 1, β = 1). In all of the above examples, η = 37.46. Each row shows the optimal sampling of a shape, shown in red, to match the uniformly sampled shape shown in green. Note that a large value of β leads to a matching that favours area preservation.

Methods                                      MAP
Angle function [30]                          45.87
Shape context [7]                            47.00
TSLA [43]                                    69.93
Elastic metric with 200 points [31]          81.86
Gaussian elastic metric with 200 points [31] 92.37
Our method with 200 points                   94.11

TABLE 3: MAP on the Flavia leaf dataset. Our result is highlighted at the bottom.

Fig. 14: Impact of sampling size under the optimal and the uniform sampling based point correspondence estimation.

Methods                              Recognition rate
Fourier descriptors [33]             89.60
Shape-Tree [18]                      96.28
TSLA [43]                            96.53
Elastic metric with 200 points [31]  99.18
Our approach with 200 points         99.50

TABLE 4: Nearest-neighbour recognition rate on the Swedish leaf dataset. Our result is highlighted at the bottom.

7.2 Robustness

In this subsection, we test the robustness of the proposed similarity metric to local noise and sampling size variations, and the effectiveness of optimal sampling under occlusion. To that end, we use shapes from the fighter jets dataset [56], Mythological creatures [11], and the 1070-Shape dataset, which can be found at http://vision.lems.brown.edu/content/available-software-and-databases.

Local shape perturbations: In general, local perturbation of a curved shape with a noise that does not alter the shape degrades the performance of a shape alignment and retrieval system. Consequently, we evaluate the tolerance of the proposed approach to local perturbations on the fighter jets dataset [56]. The dataset contains 7 types of fighter jets, each with 30 examples. The main causes of intra-class variation are deforming parts of the plane and rotation. We begin our experiment by introducing an additive white Gaussian noise to the shapes of the fighter jets. We denote the standard deviation of the noise by σ. For all subsequent experiments, the contour of each shape is approximated by z = 200 uniformly sampled points. Next, we perform a leave-one-out test scenario where the original (noise free) dataset is queried by every shape from datasets corrupted by noise with different distributions. Table 5 summarizes the computed MAP values and Fig. 12b shows their respective precision-recall curves. In general, the proposed similarity metric is tolerant to local perturbations that do not alter the shape significantly.

σ     0     0.5   1.5   2.5
MAP   97.11 96.72 89.95 83.27

TABLE 5: MAP on the fighter jets dataset with a Gaussian noise of different standard deviations.

Correspondence estimation under occlusion: Shapes can exhibit partial similarity due to occlusions or nonlinear elastic deformations, see Fig. 13. Thus, a point correspondence estimation method has to be able to handle nonlinear elastic deformations. For instance, the uniform sampling, discussed in Subsection 5.2, is constrained to linear elastic deformations; thus it fails to match shapes that are partially similar, see Fig. 9. Alternatively, the optimal sampling, presented in Subsection 5.1, can be tuned to perform well by adjusting β in (32). To illustrate this, we select example shapes with partial similarity from the Mythological creatures and 1070-shape datasets. Subsequently, in Fig. 13 we show matching results for different values of β. In all the matching scenarios, the elasticity constraint is held constant at η = 37.46. Hence, the solution search space is fixed. Nevertheless, in the case of occlusion or partial matching, we note that forcing the sampler to preserve area when that is clearly not the case leads to point mismatching; hence, a small value should be assigned to β.

Effects of sampling size: The proposed approach casts the point correspondence estimation problem as a sampling function estimation problem; hence the sampling function is optimized and not given. However, the number of points z is selected a priori. Here, we investigate the impact of different values of z on the optimal point correspondence estimation and its accuracy. In general, approximating a shape, especially one with an intricate structure, by a small number of points leads to a less detailed result. Nevertheless, it is desirable to have a consistent distance for different sample sizes. To that end, Fig. 14 shows how the distance between two open curves, shown in Fig. 15, varies under different sampling sizes for both the uniform and the optimal sampling. The result is obtained by varying z while the rest of the parameters are fixed to α = 1, β = 10. We observe that the distance under the optimal sampling is more consistent relative to the uniform sampling based distance.

7.3 Computational cost

As discussed earlier, the geodesic distance between two given curves can be computed explicitly. Hence, the computational cost of a geodesic distance is completely dependent on the computational cost of the point correspondence estimation approaches, which are O(s²z²) and O(z²) for the optimal and the uniform sampling, respectively. The entire proposed approach, from the computation of the representation matrices up to the optimal sampling estimation, is implemented in MATLAB R2014a running on an Intel Core i7-3540M with 3.0 GHz × 4 processing speed and 7.7 GB RAM with the Ubuntu 14.04 64-bit operating system. Henceforth, the reported computation cost is for shapes approximated by z = 100 sampled points and includes the cost of computing the representation f(·) and the preprocessing stage. Subsequently, the geodesic distance (20) computation, assuming an established point correspondence, takes 0.1045 seconds. Point correspondence estimation, under the uniform sampling, between the two closed curves shown in the first row of Fig. 2 takes 11.1346 seconds.

The same matching problem is estimated with a C++ implementation of the SRV framework [54] running on the same machine in 2.0298 seconds. Meanwhile, the optimal sampling of open curves, or of closed curves with a known starting point and orientation, has taken 64.5833 seconds for η = 7.013. In general, the computational cost of the optimal sampling is much higher than that of both SRV and the uniform sampling.

Fig. 15: A geodesic path from the first to the last curve. These two curves are used in estimating the effects of sample size.

8 CONCLUSION

We have presented a deformation based representation approach for curved shapes. The approach considers curves to be elements of a finite dimensional matrix Lie group. Such a representation led to an explicit geodesic equation and a left invariant distance metric. As a result, the geodesic curve and distance between shapes can be computed in closed form, and statistical analysis of shapes can be done in a finite dimensional matrix space. Furthermore, there is less restriction on the family of curves being considered as compared to other parametrization approaches; ours does not require curves to be differentiable as long as there is continuity in a topological sense. However, curves with changing topology are beyond the scope of the representation. Moreover, two possible solutions for point correspondence estimation are presented. In both solutions, the problem of point correspondence is treated as a sampling function estimation problem. Although computationally expensive, the optimal sampling estimation provides explicit control over the elasticity constraint through the weighting constants and window sizes. Experimental results show that the approach gives a discriminative representation that is robust to local perturbation, displacement and occlusion. Mainly, in comparison to the SRV framework, the MATLAB implementation of the approach is slower.

REFERENCES

[1] P-A Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2009.
[2] Yali Amit. A nonlinear variational problem for image matching. SIAM Journal on Scientific Computing, 15(1):207–224, 1994.
[3] Yali Amit, Ulf Grenander, and Mauro Piccioni. Structural image restoration through deformable templates.
[7] Serge Belongie, Jitendra Malik, and Jan Puzicha. Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(4):509–522, 2002.
[8] Rajendra Bhatia. Positive definite matrices. Princeton University Press, 2009.
[9] Dario Bini, Beatrice Meini, and Federico Poloni. An effective matrix geometric mean satisfying the Ando-Li-Mathias properties. Mathematics of Computation, 79(269):437–452, 2010.
[10] Fred L Bookstein. Size and shape spaces for landmark data in two dimensions. Statistical Science, pages 181–222, 1986.
[11] Alexander M Bronstein, Michael M Bronstein, Alfred M Bruckstein, and Ron Kimmel. Analysis of two-dimensional non-rigid shapes. International Journal of Computer Vision, 78(1):67–88, 2008.
[12] Isaac Cohen, Nicholas Ayache, and Patrick Sulger. Tracking points on deformable objects using curvature information. In Computer Vision—ECCV'92, pages 458–466. Springer, 1992.
[13] Mohammad Reza Daliri and Vincent Torre. Robust symbolic representation for shape recognition and retrieval. Pattern Recognition, 41(5):1782–1798, 2008.
[14] Girum Demisse, Djamila Aouada, and Björn Ottersten. Template-based statistical shape modelling on deformation space. In 22nd IEEE International Conference on Image Processing, 2015.
[15] Girum Demisse, Djamila Aouada, and Björn Ottersten. Similarity metric for curved shapes in Euclidean space. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[16] Manfredo Perdigão do Carmo Valero. Riemannian geometry. 1992.
[17] Ian L Dryden and Kanti V Mardia. Statistical shape analysis, volume 4. Wiley Chichester, 1998.
[18] Pedro F Felzenszwalb and Joshua D Schwartz. Hierarchical matching of deformable shapes. In Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, pages 1–8. IEEE, 2007.
[19] Oren Freifeld and Michael J Black. Lie bodies: A manifold representation of 3D human shape. In Computer Vision–ECCV 2012, pages 1–14. Springer, 2012.
[20] Ulf Grenander. Geometries of knowledge. Proceedings of the National Academy of Sciences, 94(3):783–789, 1997.
[21] Ulf Grenander and Daniel Macrae Keenan. On the shape of plane images. SIAM Journal on Applied Mathematics, 53(4):1072–1094, 1993.
[22] Ulf Grenander and Michael I Miller. Representations of knowledge in complex systems. Journal of the Royal Statistical Society. Series B (Methodological), pages 549–603, 1994.
[23] Ulf Grenander and Michael I Miller. Pattern theory: from representation to inference. Oxford University Press, 2007.
[24] Greg Hamerly and Charles Elkan. Learning the k in k-means. Advances in Neural Information Processing Systems, 16:281, 2004.
[25] Jeffrey Ho, Guang Cheng, Hesamoddin Salehian, and Baba Vemuri. Recursive Karcher expectation estimators and geometric law of large numbers. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, pages 325–332, 2013.
[26] Hermann Karcher. Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics, 30(5):509–541, 1977.
[27] Michael Kass, Andrew Witkin, and Demetri Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1(4):321–331, 1988.
[28] David G Kendall. Shape manifolds, procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 16(2):81–121, 1984.
[29] N Khaneja, Michael I. Miller, and Ulf Grenander. Dynamic programming generation of curves on brain surfaces. IEEE Transac-
Journal of the American tions on Pattern Analysis and Machine Intelligence, 20(11):1260–1265, Statistical Association, 86(414):376–387, 1991.3 1998.3 [4] T Ando, Chi-Kwong Li, and Roy Mathias. Geometric means. [30] Eric Klassen, Anuj Srivastava, Washington Mio, and Shantanu H Linear algebra and its applications, 385:305–334, 2004.9 Joshi. Analysis of planar shapes using geodesic paths on shape [5] Xiang Bai and Longin Jan Latecki. Path similarity skeleton graph spaces. Pattern Analysis and Machine Intelligence, IEEE Transactions matching. IEEE transactions on pattern analysis and machine intelli- on, 26(3):372–383, 2004.1,2, 12 gence, 30(7):1282–1292, 2008. 10, 11 [31] Hamid Laga, Sebastian Kurtek, Anuj Srivastava, and Stanley J [6] Richard E Bellman and Stuart E Dreyfus. Applied dynamic program- Miklavcic. Landmark-free statistical analysis of the shape of plant ming. Princeton university press, 2015.7 leaves. Journal of theoretical biology, 363:41–52, 2014. 10, 12 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL.–, NO.–, MONTH YEAR 14

[32] Longin Jan Latecki, Rolf Lakämper, and T. Eckhardt. Shape descriptors for non-rigid shapes with a single closed contour. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2000), volume 1, pages 424–429. IEEE, 2000.
[33] Haibin Ling and David W. Jacobs. Shape classification using the inner-distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(2):286–299, 2007.
[34] Wei Liu, Anuj Srivastava, and Jinfeng Zhang. Protein structure alignment using elastic shape analysis. In Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, pages 62–70. ACM, 2010.
[35] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.
[36] Siddharth Manay, Daniel Cremers, Byung-Woo Hong, Anthony J. Yezzi, and Stefano Soatto. Integral invariants for shape matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1602–1618, 2006.
[37] Andrea C. G. Mennucci. Metrics of curves in shape optimization and analysis. In Level Set and PDE Based Reconstruction Methods in Imaging, pages 205–319. Springer, 2013.
[38] Peter W. Michor and David Mumford. Riemannian geometries on spaces of plane curves. arXiv preprint math/0312384, 2003.
[39] Peter W. Michor and David Mumford. An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Applied and Computational Harmonic Analysis, 23(1):74–113, 2007.
[40] John Milnor. Curvatures of left invariant metrics on Lie groups. Advances in Mathematics, 21(3):293–329, 1976.
[41] Maher Moakher. Means and averaging in the group of rotations. SIAM Journal on Matrix Analysis and Applications, 24(1):1–16, 2002.
[42] Frank Morgan and James F. Bredt. Riemannian geometry: a beginner's guide. Jones and Bartlett, 1993.
[43] Sofiene Mouine, Itheri Yahiaoui, and Anne Verroust-Blondet. A shape-based approach for leaf classification using multiscale triangular representation. In Proceedings of the 3rd ACM Conference on International Conference on Multimedia Retrieval, pages 127–134. ACM, 2013.
[44] David Mumford. Pattern theory: a unifying perspective. In First European Congress of Mathematics, pages 187–224. Springer, 1994.
[45] David Mumford and Agnès Desolneux. Pattern theory: the stochastic analysis of real-world signals. CRC Press, 2010.
[46] Andrew Y. Ng, Michael I. Jordan, Yair Weiss, et al. On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 2:849–856, 2002.
[47] Xavier Pennec. Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision, 25(1):127–154, 2006.
[48] Xavier Pennec. Statistical computing on manifolds: from Riemannian geometry to computational anatomy. In Emerging Trends in Visual Computing, pages 347–386. Springer, 2009.
[49] Clayton Scott and Robert Nowak. Robust contour matching via the order-preserving assignment problem. IEEE Transactions on Image Processing, 15(7):1831–1838, 2006.
[50] Thomas B. Sebastian, Philip N. Klein, and Benjamin B. Kimia. On aligning curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(1):116–125, 2003.
[51] Thomas B. Sebastian, Philip N. Klein, and Benjamin B. Kimia. Recognition of shapes by editing their shock graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5):550–571, 2004.
[52] Eitan Sharon and David Mumford. 2D-shape analysis using conformal mapping. International Journal of Computer Vision, 70(1):55–75, 2006.
[53] Christopher G. Small. The Statistical Theory of Shape. Springer Series in Statistics. Springer-Verlag, New York, 1996.
[54] Anuj Srivastava, Eric Klassen, Shantanu H. Joshi, and Ian H. Jermyn. Shape analysis of elastic curves in Euclidean spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(7):1415–1428, 2011.
[55] Hemant D. Tagare. Shape-based nonrigid correspondence with application to heart motion analysis. IEEE Transactions on Medical Imaging, 18(7):570–579, 1999.
[56] Ninad Thakoor, Jean Gao, and Sungyong Jung. Hidden Markov model-based weighted likelihood discriminant for 2-D shape classification. IEEE Transactions on Image Processing, 16(11):2707–2719, 2007.
[57] Alain Trouvé. An infinite dimensional group approach for physics based models in pattern recognition. Preprint, 1995.
[58] Junwei Wang, Xiang Bai, Xinge You, Wenyu Liu, and Longin Jan Latecki. Shape matching and classification using height functions. Pattern Recognition Letters, 33(2):134–143, 2012.
[59] Mingrui Wu and Bernhard Schölkopf. A local learning approach for clustering. In Advances in Neural Information Processing Systems, pages 1529–1536, 2006.
[60] Mingqiang Yang, Kidiyo Kpalma, Joseph Ronsin, et al. A survey of shape feature extraction techniques. Pattern Recognition, pages 43–90, 2008.
[61] Laurent Younes. Computable elastic distances between shapes. SIAM Journal on Applied Mathematics, 58(2):565–586, 1998.
[62] Laurent Younes. Optimal matching between shapes via elastic deformations. Image and Vision Computing, 17(5):381–389, 1999.
[63] Laurent Younes. Parametrized plane curves. Springer, 2010.
[64] Laurent Younes. Spaces and manifolds of shapes in computer vision: An overview. Image and Vision Computing, 30(6):389–397, 2012.
[65] Miloš Žefran, Vijay Kumar, and Christopher B. Croke. On the generation of smooth three-dimensional rigid body motions. IEEE Transactions on Robotics and Automation, 14(4):576–589, 1998.

Girum G. Demisse received a BSc from Hawassa University, Ethiopia, and an MSc in computer science (smart systems) from Jacobs University Bremen, Germany, in 2008 and 2013, respectively. He is currently studying for a PhD degree in computer science at the Interdisciplinary Centre for Security, Reliability, and Trust, University of Luxembourg. His main research interests lie at the intersection of mathematical modelling, computer vision and machine learning.

Djamila Aouada (S'05–M'09) received the State Engineering degree in electronics in 2005 from the École Nationale Polytechnique (ENP), Algiers, Algeria, and the PhD degree in electrical engineering in 2009 from North Carolina State University (NCSU), Raleigh, NC. She is a Research Scientist at the Interdisciplinary Centre for Security, Reliability, and Trust (SnT) at the University of Luxembourg. Dr. Aouada has been leading the computer vision activities at SnT since 2009. She has worked as a consultant for multiple renowned laboratories (Los Alamos National Laboratory, Alcatel-Lucent Bell Labs, and Mitsubishi Electric Research Labs). Her research interests span the areas of signal and image processing, computer vision, pattern recognition and data modelling. She is a co-recipient of two IEEE best paper awards.

Björn Ottersten (S'87–M'89–SM'99–F'04) was born in Stockholm, Sweden, in 1961. He received the MS degree in electrical engineering and applied physics from Linköping University, Linköping, Sweden, in 1986, and the PhD degree in electrical engineering from Stanford University, Stanford, CA, in 1989. Dr. Ottersten has held research positions at the Department of Electrical Engineering, Linköping University; the Information Systems Laboratory, Stanford University; the Katholieke Universiteit Leuven, Leuven; and the University of Luxembourg. During 1996/97 he was Director of Research at ArrayComm Inc., a start-up in San Jose, California, based on Ottersten's patented technology. He has co-authored journal papers that received the IEEE Signal Processing Society Best Paper Award in 1993, 2001, 2006, and 2013, and 3 IEEE conference papers that received Best Paper Awards. In 1991 he was appointed Professor of Signal Processing at the Royal Institute of Technology (KTH), Stockholm. From 1992 to 2004 he was head of the Department for Signals, Sensors, and Systems at KTH, and from 2004 to 2008 he was dean of the School of Electrical Engineering at KTH. Currently, Dr. Ottersten is Director of the Interdisciplinary Centre for Security, Reliability and Trust at the University of Luxembourg. His research interests include security and trust, reliable wireless communications, and statistical signal processing.