
An efficient Exact-PGA algorithm for constant curvature manifolds

Rudrasis Chakraborty¹, Dohyung Seo², and Baba C. Vemuri¹
¹Department of CISE, University of Florida, FL 32611, USA
²U-Systems, A GE Healthcare Company, CA, USA
{rudrasis, vemuri}@cise.ufl.edu, dhseo.118@gmail.com

Abstract

Manifold-valued datasets are widely encountered in many computer vision tasks. A non-linear analog of the PCA algorithm, called the Principal Geodesic Analysis (PGA) algorithm, suited for data lying on Riemannian manifolds, was reported in literature a decade ago. Since the objective function in the PGA algorithm is highly non-linear and hard to solve efficiently in general, researchers have proposed a linear approximation. Though this linear approximation is easy to compute, it lacks accuracy, especially when the data exhibits a large variance. Recently, an alternative called the exact PGA was proposed which tries to solve the optimization without any linearization. For general Riemannian manifolds, though it yields better accuracy than the original (linearized) PGA, for data that exhibit large variance, the optimization is not computationally efficient. In this paper, we propose an efficient exact PGA algorithm for constant curvature Riemannian manifolds (CCM-EPGA). The CCM-EPGA algorithm differs significantly from existing PGA algorithms in two aspects: (i) the distance between a given manifold-valued data point and the principal submanifold is computed analytically and thus no optimization is required as in the existing methods; (ii) unlike the existing PGA algorithms, the descent into codimension-1 submanifolds does not require any optimization but is accomplished through the use of the Riemannian inverse Exponential map and the parallel transport operations. We present theoretical and experimental results for constant curvature Riemannian manifolds depicting favorable performance of the CCM-EPGA algorithm compared to existing PGA algorithms. We also present data reconstruction from the principal components, which has not been reported in literature in this setting.

1. Introduction

Principal Component Analysis (PCA) is a widely used dimensionality reduction technique in Science and Engineering. PCA however requires the input data to lie in a vector space. With the advent of new technologies and widespread use of sophisticated feature extraction methods, manifold-valued data have become ubiquitous in many fields including but not limited to Computer Vision, Medical Imaging and Machine Learning. A nonlinear version of PCA, called the Principal Geodesic Analysis (PGA), for data lying on Riemannian manifolds was introduced in [8].

Since the objective function of PGA is highly non-linear and hard to solve in general, researchers proposed a linearized version of the PGA [8]. Though this linearized PGA, hereafter referred to as PGA, is computationally efficient, it lacks accuracy for data with large spread/variance. In order to solve the objective function exactly, Sommer et al. [25] proposed to solve the original objective function (not the approximation) and called it exact PGA. While exact PGA attempts to solve this complex nonlinear optimization problem, it is however computationally inefficient. Though it is not possible to efficiently and accurately solve this optimization problem for a general manifold, for manifolds with constant curvature we formulate an efficient and exact PGA algorithm, dubbed CCM-EPGA. It is well known in geometry, by virtue of the Killing-Hopf theorem [4], that any non-zero constant curvature manifold is isomorphic to either the hypersphere (S^N) or the hyperbolic space (H^N); hence in this work, we present the CCM-EPGA formulation for S^N and H^N. Our formulation has several applications to Computer Vision and Statistics including directional data [21] and color spaces [19]. Several other applications are shape analysis [30], Electrical Impedance Tomography, Geoscience Imaging [28], Brain Morphometry [29], Catadioptric Vision [3] etc.

In order to depict the effectiveness of our proposed CCM-EPGA algorithm, we use the average projection error as defined in [25]. We also report the computational time comparison of the CCM-EPGA with the PGA [8] and the exact PGA [25] algorithms respectively. Several variants of the PGA exist in literature and we briefly mention a few here. In [23], the authors computed the principal geodesics (without approximation) only for a special Lie group, SO(3). Geodesic PCA (GPCA) [14, 13] solves a different optimization function, namely, optimizing the projection error along the geodesics. The authors in [13] minimize the projection error instead of maximizing the variance in geodesic subspaces (defined later in the paper). GPCA does not use a linear approximation, but it is restricted to manifolds where a closed form expression for the geodesics exists. More recently, a probabilistic version of PGA called PPGA was presented in [31], which is a nonlinear version of PPCA [27]. None of these methods attempt to compute the solution to the exact PGA problem defined in [25]. Another recent work in [11] reports a non-linear generalization of PGA, namely the principal geodesic curves, and argues about its usefulness over PGA.

The rest of the paper is organized as follows. In Section 2, we present the formulation of PGA. We also discuss the details of the linearized version of PGA [8] and exact PGA [25]. Our formulation of CCM-EPGA is also presented in Section 2. Experimental results for the CCM-EPGA algorithm along with comparisons to exact PGA and PGA are presented in Section 3. In addition to synthetic data experiments, we present the comparative performance of CCM-EPGA on two real data applications. In Section 4, we present the formulation for the reconstruction of data from principal directions and components in this nonlinear setting. Finally, in Section 5, we draw conclusions.

2. Principal Geodesic Analysis

Principal Component Analysis (PCA) [17] is a well known and widely used statistical method for dimensionality reduction. Given a vector valued dataset, it returns a sequence of linear subspaces that maximize the variance of the projected data. The kth subspace is spanned by the principal vectors {v_1, v_2, ..., v_k}, which are mutually orthogonal. PCA is well suited for vector-valued data sets but not for manifold-valued inputs. A decade ago, the nonlinear version, called the Principal Geodesic Analysis (PGA), was developed to cope with manifold-valued inputs [8]. In this section, first, we briefly describe this PGA algorithm; then, we show the key modification performed in [25] to arrive at what they termed as the exact PGA algorithm. We then motivate and present our approach, which leads to an efficient and novel algorithm for exact PGA on constant curvature manifolds (CCM-EPGA).

Let M be a Riemannian manifold. Let us suppose we are given a dataset X = {x_1, ..., x_n}, where x_j ∈ M. Let us assume that the finite sample Fréchet mean (FM) [9] of the data set exists, and let it be denoted by µ. Let V_k be the space spanned by mutually orthogonal vectors (principal directions) {v_1, ..., v_k}, v_j ∈ T_µM, ∀j. Let S_k be the kth geodesic subspace of T_µM, i.e., S_k = Exp_µ(V_k), where Exp is the Riemannian exponential map (see [4] for definition). Then, the principal directions v_i are defined recursively by

    v_i = argmax_{‖v‖=1, v ∈ V_{i-1}^⊥} (1/n) ∑_{j=1}^{n} d²(µ, Π_{S_i}(x_j))    (1)

    S_i = Exp_µ(span(V_{i-1}, v_i))    (2)

where d(x, y) is the geodesic distance between x ∈ M and y ∈ M, and Π_S(x) is the point in S closest to x ∈ M. The PGA algorithm on M is summarized in Alg. 1.

Algorithm 1 The PGA algorithm on manifold M
1: Given a data set X = {x_1, ..., x_n} ⊂ M, and 1 ≤ L ≤ dim(M)
2: Compute the FM, µ, of X [1]
3: Set k ← 1
4: Set {x̄_1^0, ..., x̄_n^0} ← {x_1, ..., x_n}
5: while k ≤ L do
6:   Solve v_k = argmax_{‖v‖=1, v ∈ T_µM, v ∈ V_{k-1}^⊥} (1/n) ∑_{j=1}^{n} d²(µ, Π_{S_k}(x̄_j^{k-1})) as in Eq. (1).
7:   Project {x̄_1^{k-1}, ..., x̄_n^{k-1}} to a co-dimension one submanifold Z of M which is orthogonal to the current geodesic subspace.
8:   Set the projected points to {x̄_1^k, ..., x̄_n^k}
9:   k ← k + 1
10: end while

2.1. PGA and exact PGA

In Alg. 1 (lines 6-7), the projection operator Π is hard to compute, hence a common alternative is to locally linearize the manifold. This approach [8] maps all data points on to the tangent space at µ, and as the tangent plane is a vector space, one can use PCA to compute the principal directions. This simple scheme is an approximation to the PGA and naturally raises the following question: Is it possible to do PGA (solve Eq. (1)) without any linearization? The answer is yes. But computation of the projection operator Π_S(x), i.e., the closest point to x in S, is computationally expensive.
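For concreteness, the linearized scheme described above admits a short sketch: map the data to T_µS^N with the inverse Exponential map and run ordinary PCA there. This is our own illustration (helper names such as sphere_log are ours, and this is not the code of [8]), assuming points are stored as unit vectors in the ambient space:

```python
# Minimal sketch of linearized PGA (tangent PCA) on the sphere, assuming the
# Frechet mean mu has already been computed. Illustration only, not [8]'s code.
import numpy as np

def sphere_log(mu, x):
    """Inverse Exponential map: tangent vector at mu pointing towards x."""
    ct = np.clip(mu @ x, -1.0, 1.0)
    theta = np.arccos(ct)                    # geodesic distance d(mu, x)
    if theta < 1e-12:
        return np.zeros_like(x)
    return theta / np.sin(theta) * (x - mu * ct)

def linearized_pga(X, mu, L):
    """Map data to T_mu S^N, then run ordinary PCA there."""
    V = np.stack([sphere_log(mu, x) for x in X])   # n x (N+1) tangent vectors
    _, _, Vt = np.linalg.svd(V, full_matrices=False)
    return Vt[:L]                                  # top-L principal directions in T_mu
```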

In [25], Sommer et al. give an alternative formulation for the PGA by minimizing the average squared reconstruction error, i.e., d²(x_j, Π_{S_i}(x_j)) instead of d²(µ, Π_{S_i}(x_j)) in Eq. (1). They use an optimization scheme to compute this projection. Further, they termed their algorithm exact PGA, as it does not require any linearization. However, their optimization scheme is in general computationally expensive, and for a data set with large variance, convergence is not guaranteed. Hence, for large variance data, their exact PGA is still an approximation, as it might not converge. This motivated us to formulate an accurate and computationally efficient exact PGA, at least in cases where it is feasible to do so.

2.2. Efficient and accurate exact PGA

In this paper, we present an analytic expression for the projected point and design an effective way to project data points on to the co-dimension k submanifold (as in Alg. 1, line 7). An analytic expression is in general not possible to derive for arbitrary Riemannian manifolds. However, for constant curvature Riemannian manifolds, i.e., S^N (positive constant curvature) and H^N (negative constant curvature), we derive an analytic expression for the projected point and devise an efficient algorithm to project data points on to a co-dimension k submanifold. Both these manifolds are quite commonly encountered in Computer Vision [10, 20, 3, 29, 30] as well as in many other fields of Science and Engineering, the former more so than the latter. Even though there are applications that can be naturally posed in hyperbolic spaces (e.g., color spaces in Vision [19], catadioptric images [3] etc.), their full potential has not yet been exploited in Computer Vision research as much as in the former case.

We first present some background material for the N-dimensional spherical and hyperbolic manifolds and then derive an analytical expression for the projected point.

2.2.1 Basic Riemannian Geometry of S^N

• Geodesic distance: The geodesic distance between ψ, ψ̄ ∈ S^N is given by d(ψ, ψ̄) = arccos(ψ^t ψ̄).

• Exponential Map: Given a vector v ∈ T_ψS^N, the Riemannian Exponential map on S^N is defined as Exp_ψ(v) = cos(|v|)ψ + sin(|v|) v/|v|. The Exponential map gives the point which is located on the great circle along the direction defined by the tangent vector v at a distance |v| from ψ.

• Inverse Exponential Map: The tangent vector v ∈ T_ψS^N directed from ψ to ψ̄ is given by Exp_ψ^{-1}(ψ̄) = (θ / sin(θ)) (ψ̄ − ψ cos(θ)), where θ = d(ψ, ψ̄).

Figure 1: Projection of a data point on to a geodesic submanifold of the sphere.

2.2.2 Basic Riemannian Geometry of H^N

The hyperbolic N-dimensional manifold can be embedded in R^{N+1} using any of three different models. In this paper, we use the hyperboloid model [15]. In this model, H^N is defined as H^N = {x = (x_1, ..., x_{N+1})^t ∈ R^{N+1} | ⟨x, x⟩_H = −1, x_1 > 0}, where the inner product on H^N, denoted by ⟨x, y⟩_H, is defined as ⟨x, y⟩_H = −x_1 y_1 + ∑_{i=2}^{N+1} x_i y_i.

• Geodesic distance: The geodesic distance between ψ, ψ̄ ∈ H^N is given by d(ψ, ψ̄) = cosh^{-1}(−⟨ψ, ψ̄⟩_H).

• Exponential Map: Given a vector v ∈ T_ψH^N, the Riemannian Exponential map on H^N is defined as Exp_ψ(v) = cosh(|v|)ψ + sinh(|v|) v/|v|.

• Inverse Exponential Map: The tangent vector v ∈ T_ψH^N directed from ψ to ψ̄ is given by Exp_ψ^{-1}(ψ̄) = (θ / sinh(θ)) (ψ̄ − ψ cosh(θ)), where θ = d(ψ, ψ̄).

For the rest of this paper, we consider the underlying manifold M to be a constant curvature Riemannian manifold, i.e., M is diffeomorphic to either S^N or H^N, where N = dim(M) [4]. Let ψ, ψ̄ ∈ M and v ∈ T_ψ̄M. Further, let y(v, ψ) be defined as the projection of ψ on the geodesic submanifold defined by ψ̄ and v. Now, we will derive a closed form expression for y(v, ψ) in the case of S^N and H^N.
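The closed-form maps listed above translate directly into code. The following is our minimal sketch under the paper's conventions (points stored as vectors in R^{N+1}; for H^N the first coordinate carries the minus sign in ⟨·,·⟩_H; all function names are ours):

```python
# Exponential and inverse Exponential maps on S^N and H^N, transcribed from
# the formulas above; a sketch, not an official implementation.
import numpy as np

def sphere_exp(psi, v):
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return psi.copy()
    return np.cos(nv) * psi + np.sin(nv) * v / nv

def sphere_log(psi, psi_bar):
    theta = np.arccos(np.clip(psi @ psi_bar, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros_like(psi)
    return theta / np.sin(theta) * (psi_bar - psi * np.cos(theta))

def minkowski(x, y):
    return -x[0] * y[0] + x[1:] @ y[1:]          # <x, y>_H

def hyper_exp(psi, v):
    nv = np.sqrt(max(minkowski(v, v), 0.0))      # |v| in the hyperboloid metric
    if nv < 1e-12:
        return psi.copy()
    return np.cosh(nv) * psi + np.sinh(nv) * v / nv

def hyper_log(psi, psi_bar):
    theta = np.arccosh(max(-minkowski(psi, psi_bar), 1.0))
    if theta < 1e-12:
        return np.zeros_like(psi)
    return theta / np.sinh(theta) * (psi_bar - psi * np.cosh(theta))
```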

2.3. Analytic expression for y(v, ψ) on S^N

Theorem 1. Let ψ ∈ S^N and v ∈ T_ψ̄S^N. Then the projection of ψ on the geodesic submanifold defined by ψ̄ and v, i.e., y(v, ψ), is given by:

    y(v, ψ) = cos(arctan(⟨v, ψ⟩ / (|v| ⟨ψ, ψ̄⟩))) ψ̄ + sin(arctan(⟨v, ψ⟩ / (|v| ⟨ψ, ψ̄⟩))) v/|v|    (3)

Proof. Consider the spherical triangle △ψ̄ψy_ψ shown in Fig. 1, where y_ψ = y(v, ψ). Let a = d(ψ̄, y_ψ), b = d(y_ψ, ψ) and c = d(ψ, ψ̄). Also, let A, B, C be the angles at the vertices ψ, ψ̄ and y_ψ respectively. Clearly, since y_ψ is the projected point, C = π/2. So,

    cos B = ⟨ (c/sin c)(ψ − ψ̄ cos c), v ⟩ / (c |v|) = ( ⟨v, ψ⟩/sin c − cot c ⟨v, ψ̄⟩ ) / |v|    (4)

Here, ⟨·, ·⟩ denotes the Euclidean inner product, where both ψ and v are viewed as points in R^{N+1}, i.e., the ambient space. Note that ⟨v, ψ̄⟩ = 0, as v ∈ T_ψ̄S^N. From spherical trigonometry, we know that tan a = cos B tan c.

    ∴ cos B tan c = ⟨v, ψ⟩ / (|v| ⟨ψ, ψ̄⟩)

    ∴ a = arctan( ⟨v, ψ⟩ / (|v| ⟨ψ, ψ̄⟩) )    (5)

Hence, using the Exponential map, we can show that y_ψ is given by

    y_ψ = cos(a) ψ̄ + sin(a) v/|v|    (6)

□

Analogously, we can derive the formula for y(v, ψ) on H^N, v ∈ T_ψ̄H^N.

Theorem 2. Let ψ ∈ H^N and v ∈ T_ψ̄H^N. Then the projection of ψ on the geodesic submanifold defined by ψ̄ and v, i.e., y(v, ψ), is given by:

    y_ψ = cosh(a) ψ̄ + sinh(a) v/|v|    (7)

where

    a = tanh^{-1}( ⟨v, ψ⟩_H / (−|v| ⟨ψ, ψ̄⟩_H) )

Proof. As before, consider the hyperbolic triangle △ψ̄ψy_ψ, where y_ψ = y(v, ψ). Let a = d(ψ̄, y_ψ), b = d(y_ψ, ψ) and c = d(ψ, ψ̄). Also, let A, B, C be the angles at the vertices ψ, ψ̄ and y_ψ respectively. Clearly, since y_ψ is the projected point, C = π/2. Then, B is the angle between Log_ψ̄(ψ) and v. Hence,

    cos B = ( ⟨v, ψ⟩_H / sinh c − coth c ⟨v, ψ̄⟩_H ) / |v|    (8)

Then, from hyperbolic trigonometry, as tanh a = cos B tanh c, we get

    a = tanh^{-1}( ⟨v, ψ⟩_H / (−|v| ⟨ψ, ψ̄⟩_H) )    (9)

Note that, since the arc length between ψ̄ and y_ψ is a, using the Exponential map, we can show that y_ψ = cosh(a) ψ̄ + sinh(a) v/|v|. □

Given the closed form expression for the projected point, we are now in a position to develop an efficient projection algorithm (for line 7 in Alg. 1), which is presented in Alg. 2. Note that, using Alg. 2, data points on the current submanifold are projected to a submanifold of dimension one less, which is needed by the PGA algorithm in line 6. Also note that, in order to ensure existence and uniqueness of the FM on S^N, we have restricted the data points to be within a geodesic ball of convexity radius < π/2 [1]. On H^N, the FM exists and is unique everywhere.
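For reference, Theorems 1 and 2 yield the following direct implementations of y(v, ψ). This is our illustrative sketch (names ours), assuming the arctan/arctanh arguments are well defined for the data at hand:

```python
# Closed-form projections of Theorems 1 and 2; sketch only.
import numpy as np

def project_sphere(psi, psi_bar, v):
    """y(v, psi) on S^N, Eq. (3): project psi onto the geodesic submanifold
    through psi_bar with direction v (v in T_{psi_bar}S^N)."""
    nv = np.linalg.norm(v)
    a = np.arctan((v @ psi) / (nv * (psi @ psi_bar)))     # Eq. (5)
    return np.cos(a) * psi_bar + np.sin(a) * v / nv       # Eq. (6)

def minkowski(x, y):
    return -x[0] * y[0] + x[1:] @ y[1:]

def project_hyperboloid(psi, psi_bar, v):
    """y(v, psi) on H^N, Eq. (7), with a from Eq. (9)."""
    nv = np.sqrt(minkowski(v, v))
    a = np.arctanh(minkowski(v, psi) / (-nv * minkowski(psi, psi_bar)))
    return np.cosh(a) * psi_bar + np.sinh(a) * v / nv
```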

Algorithm 2 Algorithm for projecting the data points to a co-dimension one submanifold
1: Input: a data point x_i ∈ S^N (H^N); a geodesic submanifold defined by µ and v ∈ T_µS^N (T_µH^N); and y(v, x_i), which is the projection of x_i on to the geodesic submanifold.
2: Output: x̄_i, which is the projection of the data point x_i to a subspace S^{N-1} (H^{N-1}) that is orthogonal to the current geodesic submanifold.
3: Step 1. Evaluate the tangent vector v_i ∈ T_{y(v,x_i)}S^N (T_{y(v,x_i)}H^N) directed towards x_i using the inverse Exponential map. It is clear that v_i is orthogonal to v.
4: Step 2. Parallel transport v_i to µ. Let v_i^µ denote the parallel transported vector. The geodesic submanifold defined by µ and v_i^µ is orthogonal to the geodesic submanifolds obtained from the previous steps in Alg. 1.
5: Step 3. Set x̄_i ← y(v_i^µ, x_i)

Note that in order to descend to the codimension-1 submanifolds, we use Step 1 and Step 2 instead of the optimization method used in the exact PGA algorithm of [25].
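A sketch of Alg. 2 for the sphere case is given below (our illustration; the H^N case is analogous with the Minkowski inner product). The paper only names the parallel transport operation; the closed form used in parallel_transport is the standard one for transporting a tangent vector along the connecting great-circle arc, and all names are ours:

```python
# Sketch of Alg. 2 on the sphere: project x_i to the codimension-one
# submanifold orthogonal to the current geodesic subspace.
import numpy as np

def sphere_log(p, q):
    theta = np.arccos(np.clip(p @ q, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros_like(p)
    return theta / np.sin(theta) * (q - p * np.cos(theta))

def parallel_transport(u, p, q):
    """Transport u in T_p S^N to T_q S^N along the connecting geodesic."""
    w = sphere_log(p, q)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return u.copy()
    e = w / theta
    c = e @ u
    return u + (np.cos(theta) - 1.0) * c * e - np.sin(theta) * c * p

def project_sphere(psi, psi_bar, v):
    """y(v, psi) of Eq. (3)."""
    nv = np.linalg.norm(v)
    a = np.arctan((v @ psi) / (nv * (psi @ psi_bar)))
    return np.cos(a) * psi_bar + np.sin(a) * v / nv

def project_to_codim1(x_i, mu, v, y_i):
    """Steps 1-3 of Alg. 2; y_i = y(v, x_i) from the current iteration."""
    v_i = sphere_log(y_i, x_i)                 # Step 1: tangent at y_i towards x_i
    v_i_mu = parallel_transport(v_i, y_i, mu)  # Step 2: transport v_i to mu
    return project_sphere(x_i, mu, v_i_mu)     # Step 3: x_bar_i = y(v_i^mu, x_i)
```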

3. Experimental Results

In this section, we present experiments demonstrating the performance of CCM-EPGA compared to PGA [8] and exact PGA [25]. We used the average projection error, defined in [25], as a measure of performance in our experiments. The average projection error is defined as follows. Let {x_i}_{i=1}^n be data points on a manifold M. Let µ be the mean of the data points. Let v be the first principal direction and S = Exp_µ(v). Then the error E is defined as follows:

    E = (1/n) ∑_{i=1}^{n} d²(x_i, Π_S(x_i))    (10)

where d(·, ·) is the geodesic distance function on M. We also present the computation time for each of the three algorithms. All the experimental results reported here were obtained on a desktop with a single 3.33 GHz Intel-i7 CPU and 24 GB RAM.

3.1. Comparative performance of CCM-EPGA on Synthetic data (S^N example)

In this section, we present the comparative performance of CCM-EPGA on several synthetic datasets. For each of the synthetic datasets, we report the average projection error and computation time for all three PGA algorithms in Table 1. All four datasets are on S² and the Fréchet mean is at the "north pole". For all the datasets, samples are in the northern hemisphere to ensure that the Fréchet mean is unique. Data 1 and Data 2 are generated by taking samples along a geodesic with a slight perturbation. The last two datasets are constructed by drawing random samples on the northern hemisphere. In addition, data points from Data 1 are depicted in Fig. 2. The first principal direction is also shown (black for CCM-EPGA, blue for PGA and red for exact PGA). Further, we also report the data variance for these synthetic datasets.

Figure 2: Synthetic data (Data 1) on S².

By examining the results, it is evident that for data with low variance, the significance of CCM-EPGA in terms of projection error is marginal, while for high variance data, CCM-EPGA yields significantly better accuracy. Also, CCM-EPGA is computationally very fast in comparison to exact PGA. The results in Table 1 indicate that CCM-EPGA outperforms exact PGA in terms of efficiency and accuracy. Although the required time for PGA is less than that of CCM-EPGA, in terms of accuracy, CCM-EPGA dominates PGA.

3.2. Comparative performance on point-set data

In this section, we present the comparative performance of CCM-EPGA on 2D point-set data. The database is called the GatorBait-100 dataset [22]. This dataset consists of 100 images of shapes of different fish. From each of these images of size 20 × 200, we first extract boundary points, then we apply the Schrödinger distance transform [7] to map each of these point sets on a hypersphere (S^3999). Hence, this data consists of 100 point-sets, each of which lies on S^3999. As before, we have used the average projection error [25] to measure the performance of the algorithms in the comparisons. Additionally, we report the computation time for each of these PGA algorithms. We used the code available online for exact PGA [24]. This online implementation is not scalable to a large (even moderate) number of data points, and further requires the computation of the Hessian matrix in the optimization step, which is computationally expensive. Hence, for this real data application on the high-dimensional hypersphere, we could not report the results for the exact PGA algorithm.
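For concreteness, the evaluation metric of Eq. (10) used throughout this section can be sketched as follows (sphere case; `project` stands for the projection Π_S computed by whichever of the three algorithms is being evaluated; names are ours):

```python
# Average projection error of Eq. (10); sketch only.
import numpy as np

def sphere_dist(x, y):
    return np.arccos(np.clip(x @ y, -1.0, 1.0))

def avg_projection_error(X, project):
    """E = (1/n) * sum_i d^2(x_i, Pi_S(x_i))."""
    return np.mean([sphere_dist(x, project(x)) ** 2 for x in X])
```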

Data     Var.      CCM-EPGA err   Time(s)   PGA err   Time(s)   exact PGA err   Time(s)
Data 1   2.16      1.13e-04       0.70      0.174     0.46      2.54e-02        14853
Data 2   0.95      5.87e-02       0.27      0.59      0.12      0.59            84.38
Data 3   7.1e-03   2.33e-03       0.19      0.55      0.05      0.55            16.87
Data 4   5.9e-02   0.27           0.33      0.37      0.14      0.37            71.84

Table 1: Comparison results on synthetic datasets

Though one can use a sparse matrix version of the exact PGA code, along with efficient parallelization, to make the exact PGA algorithm suitable for moderately large data, we would like to point out that since our algorithm does not need such modifications, this clearly gives CCM-EPGA an advantage over exact PGA from a computational efficiency perspective. In terms of accuracy, it can be clearly seen that CCM-EPGA outperforms exact PGA from the results on the synthetic datasets. Both the average projection error and the computational time on the GatorBait-100 dataset are reported in Table 2. This result demonstrates the accuracy of CCM-EPGA over the PGA algorithm, with a marginal sacrifice in efficiency but significant gains in accuracy.

Method      avg. proj. error   Time(s)
CCM-EPGA    2.83e-10           0.40
PGA         9.68e-02           0.28

Table 2: Comparison results on the GatorBait-100 database

3.3. PGA on a population of Gaussian distributions (H^N example)

In this section, we propose a novel scheme to compute principal geodesic submanifolds for the manifold of Gaussian densities. Here, we use concepts from information geometry presented in [2], specifically the Fisher information matrix [18], to define a metric on this manifold [6]. Consider a normal density f(·|θ) in an n-dimensional space, with parameters represented by θ. Then the ijth entry of the n × n Fisher matrix, denoted by g_ij, is defined as follows:

    g_ij(θ) = ∫_R f(x|θ) (∂ ln f(x|θ)/∂θ_i) (∂ ln f(x|θ)/∂θ_j) dx    (11)

For example, for a univariate normal density f(·|µ, σ), the Fisher information matrix is

    (g_ij(µ, σ)) = [ 1/σ²   0  ;  0   2/σ² ]    (12)

So, the metric is defined as follows:

    ⟨u, v⟩ = u^t G v    (13)

where G = (g_ij) is the Fisher information matrix. Now, consider the parameter space for the univariate normal distributions. The parameter space is H_F = {(µ, σ) ∈ R² | σ > 0}, i.e., the positive half space, which is the hyperbolic space modeled by the Poincaré half-plane, denoted by P². We can define a bijection F : H_F → P² as F(µ, σ) = (µ/√2, σ). Hence, the univariate normal distributions can be parameterized by the 2-dimensional hyperbolic space. Moreover, there exists a diffeomorphism between P² and H² (the mapping is analogous to the stereographic projection for S^N); thus, we can readily use the formulation in Section 2 to compute the principal geodesic submanifold on the manifold of univariate normal distributions.

Motivated by the above formulation, we ask the following question: Does there exist a similar relation for multivariate normal distributions? The answer is no in general. But if the multivariate distributions have a diagonal covariance matrix (i.e., independent uncorrelated variables in the multivariate case), the above relation between P² and H² can be generalized. Consider an N-dimensional normal distribution parameterized by (µ, Σ), where µ = (µ_1, ..., µ_N)^t and Σ is a diagonal positive definite matrix (i.e., Σ_ij = σ_i if i = j, else Σ_ij = 0). Then, analogous to the univariate normal distribution case, we can define a bijection F_N : H_F^N → P^{2N} as follows:

    F_N(µ, Σ) = (µ_1/√2, σ_1, ..., µ_N/√2, σ_N)    (14)

Hence, we can use our formulation in Section 2, since there is a diffeomorphism between P^{2N} and H^{2N}. But for the space of general non-diagonal N-dimensional covariance matrices, SPD(N), the above formulation does not hold. This motivated us to go one step further and search for a parameterization of SPD(N) where we can use the above formulation.

Method      avg. proj. error   Time(s)
CCM-EPGA    7.73e-03           0.09
PGA         0.14               0.05
exact PGA   0.09               732

Table 3: Comparison results on the Brodatz database

In [16], the authors have used the Iwasawa coordinates to parameterize SPD(N). Using the Iwasawa coordinates [26], we can get a one-to-one mapping between SPD(N) and the product manifold of PD(N) and U(N−1), where PD(N) is the manifold of N-dimensional diagonal positive definite matrices and U(N−1) is the space of (N−1)-dimensional upper triangular matrices, which is isomorphic to R^{N(N+1)/2}. We have used the formulation in [26], as discussed below.

Let Y = V_N ∈ SPD(N). Then we can use the Iwasawa decomposition to represent V_N as a tuple (V_{N−1}, x_{N−1}, w_{N−1}), by repeating the following partial Iwasawa decomposition:

    V_N = [ I  x_{N−1} ; 0  1 ]^T [ V_{N−1}  0 ; 0  w_{N−1} ] [ I  x_{N−1} ; 0  1 ]    (15)

where w_{N−1} > 0 and x_{N−1} ∈ R^{N−1}. We get the following vectorized expression: V_N ↦ (((w_0, x_1^t, w_1), x_2^t, w_2), ..., x_{N−1}^t, w_{N−1}). Note that as each of the w_i is > 0, we can construct a positive definite diagonal matrix with the ith diagonal entry being w_i. And as each x_i is in R^i, we can arrange them column-wise to form an upper triangular matrix. Thus, for PD(N), we can use our formulation for the hyperboloid model of the hyperbolic space given in Section 2, and the standard PCA can be applied for R^{N(N+1)/2}.
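A minimal sketch of this recursion is given below: it peels off (x_{N−1}, w_{N−1}) with a Schur complement, recurses on the leading block, and collects the diagonal part (the w_i) and the x_i. This is our illustration of Eq. (15), not the implementation of [16] or [26]:

```python
# Recursive partial Iwasawa decomposition of an SPD matrix; sketch only.
import numpy as np

def iwasawa_coordinates(V):
    """SPD matrix V (N x N) -> (w, xs): w in R_{>0}^N and xs a list of x_i in
    R^i, following V_N = M^T diag(V_{N-1}, w_{N-1}) M with M = [I, x; 0, 1]."""
    N = V.shape[0]
    ws, xs = [], []
    A = V.copy()
    for k in range(N - 1, 0, -1):
        B, b, c = A[:k, :k], A[:k, k], A[k, k]
        x = np.linalg.solve(B, b)          # x_k = V_k^{-1} b, with V_k the leading block
        ws.append(c - b @ x)               # w_k = c - b^T V_k^{-1} b > 0 (Schur complement)
        xs.append(x)
        A = B                              # recurse on the leading (k x k) block
    ws.append(A[0, 0])                     # w_0
    return np.array(ws[::-1]), xs[::-1]    # (w_0, ..., w_{N-1}), (x_1, ..., x_{N-1})
```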

We now use the above formulation to compute the principal geodesic submanifolds for a covariance descriptor representation of the Brodatz texture dataset [5]. Similar to our previous experiment on point-set data, in this experiment we report the average projection error and the computation time. We adopt a procedure similar to that in [12] to derive the covariance descriptors for the texture images in the Brodatz database. Each 256 × 256 texture image is first partitioned into 64 non-overlapping 8 × 8 blocks. Then, for each block, the covariance matrix (FF^T) is summed over the blocks. Here, the covariance matrix is computed from the feature vector F = (I, |∂I/∂x|, |∂I/∂y|, |∂²I/∂x²|, |∂²I/∂y²|)^t. We make the covariance matrix positive definite by adding a small positive diagonal matrix. Then, each image is represented as a normal distribution with zero mean and this computed covariance matrix. We then used the above formulation to map each normal distribution on to H^10. The comparative results of CCM-EPGA with PGA and exact PGA are presented in Table 3. The results clearly demonstrate the efficiency and accuracy of CCM-EPGA over the PGA and the exact PGA algorithms.

4. Data Reconstruction from principal directions and coefficients

In this section, we present a recursive scheme to approximate an original data point with principal directions and coefficients. We present a reconstruction method for data on S^N; the reconstruction for data points on H^N can be done in an analogous manner. Let x_j ∈ S^N be the jth data point and v_k the kth principal vector. x̄_j^k is the kth principal component of x_j. Note that on S^N, the kth principal component of a data point is y(v_k, x_j). Let x_j^k be the approximation of x_j from the first k principal components. Let w_j^{k-1} be Log_µ x_j^{k-1} / |Log_µ x_j^{k-1}|. Let w̄_j^{k-1} be the vector w_j^{k-1} parallel transported from µ to x̄_j^k. Let v̄_k be the parallel transported version of v_k to x_j^{k-1}. We refer readers to Fig. 3 for a geometric interpretation.

Figure 3: Approximation of x_j from the first k principal components.

Now, we will formulate a recursive scheme to reconstruct x_j. Let us reconstruct the data using the first (k−1) principal components. Then, the kth approximated point x_j^k is the intersection of two geodesics defined by x_j^{k-1}, v̄_k and x̄_j^k, w̄_j^{k-1}. Let these two great circles be denoted by

    G_1(t) = cos(t) x_j^{k-1} + sin(t) v̄_k    (16)

    G_2(u) = cos(u) x̄_j^k + sin(u) w̄_j^{k-1}    (17)

At t = α_1 and u = α_2, let G_1(α_1) = G_2(α_2) = x_j^k. Since v̄_k and w̄_j^{k-1} are mutually orthogonal, we get

    tan(α_1) tan(α_2) = ⟨x_j^{k-1}, w̄_j^{k-1}⟩ ⟨x̄_j^k, v̄_k⟩    (18)

Note that, as our goal is to solve for α_1 or α_2 to get x_j^k, we need two equations. The second equation can be derived as follows:

    d(µ, G_1(α_1)) = d(µ, G_2(α_2))

This leads to, References

    cos(α_2) = cos(α_1) ⟨µ, x_j^{k-1}⟩ / ⟨µ, x̄_j^k⟩    (19)

Then, by solving Eqs. (18) and (19), we get

    a cos⁴(α_1) + b cos²(α_1) + d = 0    (20)

where

    a = ⟨µ, x_j^{k-1}⟩² ⟨x_j^{k-1}, w̄_j^{k-1}⟩² ⟨x̄_j^k, v̄_k⟩² − ⟨µ, x_j^{k-1}⟩²
    b = ⟨µ, x̄_j^k⟩² + ⟨µ, x_j^{k-1}⟩²
    d = −⟨µ, x̄_j^k⟩²

By solving equation (20), we get

    α_1 = arccos( sqrt( (−b + sqrt(b² − 4ad) sgn(a)) / (2a) ) )    (21)

where sgn(·) is the signum function. Hence, x_j^k = G_1(α_1). This completes the reconstruction algorithm. Our future efforts will be focused on using this reconstruction algorithm in a variety of applications mentioned earlier.
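One reconstruction step of this scheme can be sketched as follows (our illustration; inputs follow the notation above, with all points stored as unit vectors in the ambient space, and assuming the argument of the square root in Eq. (21) lies in [0, 1]):

```python
# One step of the recursive reconstruction on the sphere, Eqs. (16)-(21);
# sketch only, variable names are ours.
import numpy as np

def reconstruct_step(mu, x_prev, x_bar_k, vk_bar, wj_bar):
    """Solve a*cos^4(a1) + b*cos^2(a1) + d = 0 (Eq. (20)) and return
    x_j^k = G1(alpha_1) = cos(alpha_1) x_prev + sin(alpha_1) vk_bar."""
    m1, m2 = mu @ x_prev, mu @ x_bar_k
    g = (x_prev @ wj_bar) * (x_bar_k @ vk_bar)                # RHS of Eq. (18)
    a = m1**2 * g**2 - m1**2
    b = m2**2 + m1**2
    d = -m2**2
    c2 = (-b + np.sqrt(b**2 - 4*a*d) * np.sign(a)) / (2*a)    # cos^2(alpha_1)
    alpha1 = np.arccos(np.sqrt(c2))                           # Eq. (21)
    return np.cos(alpha1) * x_prev + np.sin(alpha1) * vk_bar  # Eq. (16)
```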

5. Conclusions

In this paper, we presented an efficient and accurate exact-PGA algorithm for (non-zero) constant curvature manifolds, namely the hypersphere S^N and the hyperbolic space H^N. We presented an analytic expression for the projection of a data point on a geodesic submanifold, which is required in the PGA algorithm and in general involves solving a difficult optimization problem. Using these analytic expressions, we achieved a much more accurate and efficient solution for PGA on constant curvature manifolds, which are frequently encountered in Computer Vision, Medical Imaging and Machine Learning tasks. We presented comparison results on synthetic and real data sets demonstrating favorable performance of our algorithm in comparison to the state-of-the-art.

Acknowledgement

This research was funded in part by the NSF grant IIS-1525431 to Prof. B. C. Vemuri. We thank Dr. Stefan Sommer of the University of Copenhagen for providing us the code for his exact PGA algorithm.

References

[1] B. Afsari. Riemannian L^p center of mass: Existence, uniqueness, and convexity. Proceedings of the American Mathematical Society, 139(2):655–673, 2011.
[2] S.-i. Amari and H. Nagaoka. Methods of Information Geometry, volume 191. American Mathematical Society, 2007.
[3] I. Bogdanova, X. Bresson, J.-P. Thiran, and P. Vandergheynst. Scale space analysis and active contours for omnidirectional images. IEEE Transactions on Image Processing, 16(7):1888–1901, 2007.
[4] W. M. Boothby. An Introduction to Differentiable Manifolds and Riemannian Geometry, volume 120. Academic Press, 1986.
[5] P. Brodatz. Textures: A Photographic Album for Artists and Designers. Dover Publications, 1966.
[6] S. I. Costa, S. A. Santos, and J. E. Strapasson. Fisher information distance: a geometrical reading. Discrete Applied Mathematics, 2014.
[7] Y. Deng, A. Rangarajan, S. Eisenschenk, and B. C. Vemuri. A Riemannian framework for matching point clouds represented by the Schrödinger distance transform. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3756–3761, 2014.
[8] P. T. Fletcher, C. Lu, S. M. Pizer, and S. Joshi. Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Transactions on Medical Imaging, 23(8):995–1005, 2004.
[9] M. Fréchet. Les éléments aléatoires de nature quelconque dans un espace distancié. In Annales de l'institut Henri Poincaré, volume 10, pages 215–310. Presses universitaires de France, 1948.
[10] R. Hartley, J. Trumpf, Y. Dai, and H. Li. Rotation averaging. International Journal of Computer Vision, 103(3):267–305, 2013.
[11] S. Hauberg. Principal curves on Riemannian manifolds. 2015.
[12] J. Ho, Y. Xie, and B. Vemuri. On a nonlinear generalization of sparse coding and dictionary learning. In Proceedings of the 30th International Conference on Machine Learning, pages 1480–1488, 2013.
[13] S. Huckemann, T. Hotz, and A. Munk. Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric Lie group actions. Statistica Sinica, 2010.
[14] S. Huckemann and H. Ziezold. Principal component analysis for Riemannian manifolds, with an application to triangular shape spaces. Advances in Applied Probability, pages 299–319, 2006.
[15] B. Iversen. Hyperbolic Geometry, volume 25. Cambridge University Press, 1992.
[16] B. Jian and B. C. Vemuri. Metric learning using Iwasawa decomposition. In IEEE 11th International Conference on Computer Vision (ICCV), pages 1–6, 2007.
[17] I. Jolliffe. Principal Component Analysis. Wiley Online Library, 2002.
[18] E. L. Lehmann and G. Casella. Theory of Point Estimation, volume 31. Springer Science & Business Media, 1998.
[19] R. Lenz, P. L. Carmona, and P. Meer. The hyperbolic geometry of illumination-induced chromaticity changes. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–6, 2007.
[20] R. Lenz and G. Granlund. If I had a fisheye I would not need SO(1, n), or is hyperbolic geometry useful in image processing. In Proc. of the SSAB Symposium, Uppsala, Sweden, pages 49–52, 1998.
[21] K. Mardia and I. Dryden. Shape distributions for landmark data. Advances in Applied Probability, pages 742–755, 1989.
[22] A. Rangarajan. https://www.cise.ufl.edu/~anand/publications.html.
[23] S. Said, N. Courty, N. Le Bihan, and S. J. Sangwine. Exact principal geodesic analysis for data on SO(3). In 15th European Signal Processing Conference (EUSIPCO-2007), pages 1700–1705. EURASIP, 2007.
[24] S. Sommer. https://github.com/nefan/smanifold, 2014.
[25] S. Sommer, F. Lauze, S. Hauberg, and M. Nielsen. Manifold valued statistics, exact principal geodesic analysis and the effect of linear approximations. In Computer Vision–ECCV 2010, pages 43–56. Springer, 2010.
[26] A. Terras. Harmonic Analysis on Symmetric Spaces and Applications. Springer, 1985.
[27] M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3):611–622, 1999.
[28] G. Uhlmann. Inverse Problems and Applications: Inside Out II, volume 60. Cambridge University Press, 2013.
[29] Y. Wang, W. Dai, X. Gu, T. F. Chan, S.-T. Yau, A. W. Toga, and P. M. Thompson. Teichmüller shape space theory and its application to brain morphometry. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009, pages 133–140. Springer, 2009.
[30] W. Zeng, D. Samaras, and X. D. Gu. Ricci flow for 3D shape analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(4):662–677, 2010.
[31] M. Zhang and P. T. Fletcher. Probabilistic principal geodesic analysis. In Advances in Neural Information Processing Systems, pages 1178–1186, 2013.
