<<

Florida State University Libraries

2015 The One-and Two-Sample Problem for Data on Hilbert with Applications to Shape Analysis Mingfei Qiu

Follow this and additional works at the FSU Digital Library. For more information, please contact [email protected] FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

THE ONE- AND TWO-SAMPLE PROBLEM FOR DATA ON HILBERT MANIFOLDS WITH

APPLICATIONS TO SHAPE ANALYSIS

By

MINGFEI QIU

A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy

2015

Copyright c 2015 Mingfei Qiu. All Rights Reserved.

Mingfei Qiu defended this dissertation on November 10, 2015. The members of the supervisory committee were:

Victor Patrangenaru Professor Directing Dissertation

Xiuwen Liu University Representative

Elizabeth Slate Committee Member

Adrian Barbu Committee Member

Robert Clickner Committee Member

Robert Paige Committee Member

The Graduate School has verified and approved the above-named committee members, and certifies that the dissertation has been approved in accordance with university requirements.

ii For my dear parents and grandparents! The courage, character, enthusiasm you instilled in me changed my world!

iii ACKNOWLEDGMENTS

I would sincerely thank my major professor, Dr. Vic Patrangenaru, for opening the door of a fantastic world of Statistics to me. His guidance led me to the success of research, his advice helped me to make correct decisions, and his encouragement grew me up to be a strong person. Thanks to the research assistantship position he offered, I got a chance to devote my time and work ethic in shape analysis. Thanks to his travel support, I was able to present my research to the elite scientists in the academic conferences. Thanks to the various opportunities he provided, I am able to research in such a great area. I also would like to thank Dr. Robert Paige for conducting the research in my second journal paper; thank Dr. Xiuwen Liu for instructing me the 3D projective reconstruction; thank Dr. Elizabeth Slate, Dr. Adrian Barbu, and Dr. Robert Clickner for the help in my study and future career. Thank all of them for kindly being my committee, and providing me a lot of advice about my research topic and presentation skill. Moreover, I would like to thank Dr. Marius Buibas for the help in coding the reconstruction; thank Dr. Leif Ellingson for illuminating the knowledge in shape analysis to me; thank Dr. Daniel Osborne and Dr. Michael Crane for the help in my paper and career. Thank my peers, Mr. Ruite Guo, Mr. David Lester, Mr. Hui Wang and Mr. Kouadio Yao for unconditionally supporting my academic activities and competitions. The last but never the least, I would like to thank National Science Foundation for the financial support, Statistical and Applied Mathematical Sciences Institute (SAMSI) for supporting my travel expenses to the conferences, American Mathematical Society (AMS) for awarding my research in Shape Analysis area. Also I thank Florida State University and its Department of Statistics and the Department of Mathematics for the awards, fellowship and scholarship for supporting my academic performance.

iv TABLE OF CONTENTS

List of Tables ...... vii List of Figures ...... viii Abstract ...... x

1 Introduction 1

2 High Level Electronic Imaging 4

3 Background 11 3.1 Landmark Extraction ...... 11 3.1.1 Methodology of Landmark Extraction ...... 11 3.1.2 Examples of Landmark Extraction ...... 12 3.2 3D Reconstruction ...... 15 3.2.1 Method of 3D Reconstruction ...... 15 3.2.2 Examples of 3D Reconstruction ...... 18 3.3 Projective Geometry ...... 19 3.3.1 Methodology of Projective Geometry ...... 19 3.3.2 Example of Projective Transformation ...... 25

4 Statistics on Manifolds 27 4.1 Veronese-Whitney Means ...... 27 4.2 Hilbert ...... 30

5 Hypothesis Tests for Finite Dimensional Configurations 32 5.1 One-sample Hypothesis Test for Finite Dimensional Configurations ...... 32 5.1.1 Methodology of One-sample Hypothesis Test for Finite Dimensional Configurations 32 5.1.2 Example - Analysis of Image Data for One Polyhedral ...... 34 5.2 Two-sample Hypothesis Test on ...... 36 5.2.1 Methodology of Two-sample Hypothesis Test on Lie Group ...... 36 5.2.2 Examples of Two-sample Hypothesis Tests on Lie Group ...... 40

6 Hypothesis Tests for Infinite Dimensional Configurations 45 6.1 One-sample Hypothesis Test for Infinite Dimensional Configurations ...... 45 6.1.1 Test Statistic for the One-sample Neighborhood Hypothesis ...... 45 6.1.2 Neighborhood hypothesis for the mean shape of an Euclidean contour ...... 46 6.1.3 Example - How far is the Corpus Callosum of an Average Individual from Albert Einstein’s? ...... 48 6.1.4 Note ...... 50 6.2 Two-sample Hypothesis Test for Infinite Dimensional Configurations ...... 50 6.2.1 Methodology of Two-sample Hypothesis Test for Infinite Dimensional Configurations 50 6.2.2 Example - Two-sample Neighborhood Hypothesis Test for the Leaf Data ...... 53

v 6.2.3 Note ...... 56

Appendix A Coordinates of the Landmarks 57

References ...... 59 Biographical Sketch ...... 63

vi LIST OF TABLES

5.1 Coordinates of the blueprint...... 34

5.2 Simultaneous confidence affine intervals for mean projective shape change - BBC actor data . 43

5.3 Simultaneous confidence affine intervals for mean projective shape change - statue data . . . . 44

6.1 The confidence intervals for the total sample variance of landmark 5 of Leaf A and B . . . . . 54

A.1 Corner coordinates from paired images in Figure 2.3 ...... 57

A.2 Coordinates of facial landmarks for BBC actor data in Figure 2.4 ...... 58

A.3 Statue data landmark coordinates Figure 2.5 ...... 58

vii LIST OF FIGURES

1.1 Examples of similarity, affine and projective shape...... 1

2.1 Blueprint of an object...... 4

2.2 3D polyhedral surface. Corners are used as landmarks...... 5

2.3 Top box slightly larger than in Figure 2.2...... 6

2.4 BBC actor data (top 8 images form one sample, bottom 6 form the other sample)...... 7

2.5 Epicurus bust images (top 16 images are from one sample, bottom 8 from another sample). . . 8

2.6 The brain image and the corpus callosum contour of Einstein...... 9

2.7 Corpus callosum midsagittal sections shape data, in subjects age - 64 to 83...... 9

2.8 Original images of leaf A (top 20 figures) and leaf B (bottom 20 figures)...... 10

2.9 Contours extracted from the original images...... 10

3.1 Landmark of the polyhedral in Figure 2.2...... 13

3.2 Landmarks for actor data in Figure 2.4...... 13

3.3 Landmarks - Epicurus bust images in Figure 2.5...... 14

3.4 Matched landmarks on midsagittal sections in for CC data (Einstein’s is the upper left CC). . . 14

3.5 Matched leaf A contours (top 20 figures) and leaf B contours (bottom 20 figures)...... 15

3.6 3D Reconstructions for configurations of corners in Figure 2.3...... 18

3.7 Leaf A 3D reconstruction (top 10 figures) and leaf B 3D reconstruction (bottom 10 figures). . 19

3.8 Original 2D scene and its registered projective coordinates in their spherical representation. . . 21

3.9 Projective coordinates for m=2...... 22

3.10 Leaf A (top 10 figures) and leaf B (bottom 10 figures) contours after projective transformation. 26

4.1 Extrinsic mean...... 29

5.1 Cutoffs of the T ∗ marginal bootstrap statistics ...... 41

5.2 Simultaneous confidence regions for the mean of the five axial marginals (affine coordinates). 42

5.3 Simultaneous confidence regions for the statue data...... 44

viii 6.1 Superimposed icons for 2D direct similarity shapes of CC midsections : sample mean (red) vs Albert Einstein’s (blue)...... 49

6.2 95% bootstrap confidence region for the extrinsic mean CC contour by 1000 resamples. . . . . 50

6.3 5 selected landmarks of Leaf A (left) and Leaf B (right)...... 54

6.4 Icons of changes between the projectively transformed leaves A and B...... 55

6.5 Icons of changes within the projectively transformed leaf A (top 5) and leaf B (bottom 5). . . 56

ix ABSTRACT

This dissertation is concerned with high level imaging analysis. In particular, our focus is on extracting the projective shape information or the similarity shape from digital camera images or Magnetic Resonance Imaging(MRI). The approach is statistical without making any assumptions about the distributions of the random object under investigation. The data is organized as points on a Hilbert manifold. In the case of projective shapes of finite dimensional configuration of points, we consider testing a one-sample null hy- pothesis, while in the infinite dimensional case, we considered a neighborhood hypothesis testing methods. For 3D scenes, we retrieve the 3D projective shape, and use the Lie group structure of the projective shape space. We test the equality of two extrinsic means, by introducing the mean projective shape change. For 2D MRI of midsections of Corpus Callosum contours, we use an automatic matching technique that is necessary in pursuing a one-sample neighborhood hypothesis testing for the similarity shapes. We conclude that the mean similarity shape of the Corpus Callosum of average individuals is very far from the shape of Albert Einstein’s, which may explain his geniality. Another application of our Hilbert manifold methodology is two-sample testing problem for Veronese-Whitney means of projective shapes of 3D contours. Particularly, our data consisting comparing 3D projective shapes of contours of leaves from the same tree species.

x CHAPTER 1

INTRODUCTION

Object data analysis is the statistical analysis of data sets of complex objects [51]. For the statistical analysis of object data the reader is referred to Patrangenaru and Ellingson (2015)[41]. The observed objects could be numbers, vectors, curves, images, shapes of surfaces and so on. Shape analysis is object data analysis for shapes of configurations of points; it is an interdisciplinary subject that combines geometry, statistical inference, computer vision and machine learning. Considering a configuration in Rm, shape deals with the residual structure of this configuration when certain transformations that form a pseudogroup, are filtered out [33]. If two images of the same 2D scene are captured by an ideal pinhole camera, the corresponding transformation between the two images is the composition of two central projections, which is a projective transformation. The affine transformation requires that the two central projections be well approximated by parallel projections ( the scene pictured is very far from the camera, or very small ). And a similarity transformation has even more restrictions, as the parallel projections must be orthogonal projections on the plane of the camera. Similarity transformation is the compositions of rotations, translations and scaling. Figure 1.1 describes an idea about the differences between these three kinds of transformations.

(a) Two squares have the same simi- (b) A square and a rectangular have (c) A square and a arbitrary convex larity, affine and projective shape. the same affine and projective shape. quadrilateral have only the same pro- jective shape.

Figure 1.1: Examples of similarity, affine and projective shape.

In other words, similarity, which was first proposed by Kendall (1984) [28], means the residual structure where translation, scale and rotation have been filtered out [8]. In the history of statistics, ’shape’ primarily

1 referred to similarity shape. Later on, affine shape was also considered. The distributions of affine shape data have been considered by Berthilsson and Heyden (1999)[3]. Patrangenaru and Mardia (2003)[43] defined the affine transformation in terms of group actions (see Sugathadasa (2006)[50] for more details), which is equivalent to the work of Heyden (1995)[25]. Projective shape was considered more recently [33, 45]. Maybank and Stephen (1994) [34] as well as Goodall and Mardia (1999)[20] did considerable work on finding the projective invariant. In 2005, Mardia and Patrangenaru (2005)[33] proposed the projective frame approach, based on projective coordinates, which helped to construct a convenient projective shape space. Thanks to the progress in computer vision technology due to Longuet-Higgins (1981)[31], Faugeras (1992)[16], Hartley et al. (1992)[22] and others, Patrangenaru et al. (2012)[38] concluded that all we see are 3D projective shapes. The shape analysis of 3D configurations from 2D digital camera images started with the work of Patrangenaru et al. (2010)[42]. The most popular distance used between projective shapes is the chord distance via the Veronese- Whitney (VW) embedding of RP k−m−2 in a product of spaces of symmetric matrices. The VW mean of a random projective shape is its Frechet´ mean [19] relative to this extrinsic distance. Asymptotic test statistics for the equality of the extrinsic means of independent random objects (r. o.’s) on a manifold em- bedded in the Euclidean space on manifolds were first considered in Hendriks and Landsman (1998)[24] and more recently in Bhattacharya (2008)[5] and Bhattacharya and Bhattacharya (2011)[4]. In an attempt to address the case of matched pairs, Bhattacharya (2008)[5] also derived a large sample test statistic under the more general assumption of correlated pairs of r. o.s when the sample sizes are equal. Patrangenaru et al. (2010)[42] derived the asymptotic and nonparametric bootstrap distribution for the extrinsic mean projective shapes, which is a one-sample hypothesis testing problem. Beran and Fisher (1998)[2] were the first to use group actions in hypothesis testing problems, a technique later used in Mardia and Patrangenaru (2005)[33]. Motivated by these works, Crane and Patrangenaru (2011)[10] proposed a two-sample hypothesis test based on the mean change of 3D projective shape in a Lie group, which transforms the two-sample problem into a one-sample case, to which they applied the method to the analysis of 3D projective shape of the eye fundus medical images. Recall that for two matched random vectors (X,Y ), one can test if X and Y have difference in means by estimating the mean of difference vector D = X Y and then employing one-sample testing methods. On the manifolds, for Lie group-valued − variables X and Y , mean change is defined by a Lie group multiplication C = X−1 Y . Patrangenaru et ⊙

2 al. (2014)[45] extended statistical inference to the confidence region for the mean 3D projective shape (also see [1] for the 2D case). Munk et al. (2008)[36] show that the well known Hotelling T 2 statistic breaks down in infinite dimen- sional settings. In addition, they point out that for infinite dimensional data, the mean shapes from two different populations can never be identical. As a result, they develop one- and multi-sample neighborhood hypothesis testing methodologies. Here a neighborhood or equivalence null hypothesis replaces the usual “point” null hypothesis which states that two or more infinite dimensional mean shapes are identical. The more appropriate neighborhood approach has the additional benefit of much simpler asymptotic analysis. While the mean and covariance used in Munk et al. (2008)[36] were for real-valued variables, Ellingson et al. (2013)[14] extended their one-sample neighborhood hypothesis testing method by employing extrinsic parameters on Hilbert manifolds. Qiu et al. (2014)[48] and Osborne et al. (2015)[37] applied this method- ology to test how far the mean Corpus Callosum similarity shape of average individuals is from Albert Einstein’s. Inspired by aforementioned works, Qiu et al. (2015)[47] used an extrinsic approach to develop a neighborhood hypothesis test for the mean change in infinite dimensional 3D projective shapes (also see [41, 44, 39]). This dissertation is organized as this: CHAPTER 2 describes digital imaging data obtained from uncal- ibrated cameras or magnetic resonance imaging (MRI); CHAPTER 3 provides a background on labeling, landmark, 3D projection, projective shape and projective frame; in CHAPTER 4, the Veronese-Whitney mean, which is an example of extrinsic mean, is illustrated, as well as its corresponding extrinsic covariance matrix, and the notion of Hilbert manifold is briefly introduced; CHAPTER 5 is on one- and two-sample hypothesis testing for mean 3D projective shapes; and CHAPTER 6 shows the neighborhood one-sample hypothesis test for the mean 2D similarity shapes, and the two-sample neighborhood hypothesis test for the mean 3D projective shapes.

3 CHAPTER 2

HIGH LEVEL ELECTRONIC IMAGING

Imaging data is integral to the human experience-from our own vision, or from the personal photographs that we take every day, to the professional ones used in science, communication, security, transportation, education, space exploration, medical imaging, artistic expression, and entertainment. In daily life, people take pictures without knowing any information about the internal camera parameters or camera position. This leads to large amounts of data that could be used to test if the images taken are from the same person. These kind of ’everyday pictures’ are known as uncalibrated camera images. Patrangenaru et al. (2010)[42] developed nonparametric methodology for analyzing the 3D configurations from their 2D uncalibrated camera images; in this paper, a one-sample hypothesis test was derived. Crane and Patranenaru (2011)[10] introduced the two-sample hypothesis test based on the Lie group structure of the projective shape space. Motivated by the pioneering work, Patrangenaru et al. (2014)[45] applied the two-sample hypothesis test to quality control, facial recognition and scene detection. In industry, quality control managers need to know if the product matches its blueprint to an acceptable degree. For example, if we have the blueprint as shown in Figure 2.1, and test for whether the shape of the projective object in Figure 2.2 is the same as that blueprint, by the method in [42], one could perform a one-sample hypothesis test based on several landmarks.

Figure 2.1: Blueprint of an object.

4 Figure 2.2: 3D polyhedral surface. Corners are used as landmarks.

In Figure 2.1, we display the blueprint of a 3D object. And in Figure 2.2, we test if the polyhedral has the same shape of the blueprint. If we have several camera images of another comparable object (Figure 2.3), and want to test if the new object has the same shape of the one in Figure 2.2, this leads to a two-sample hypothesis testing problem ([10] and [42]).

5 Figure 2.3: Top box slightly larger than in Figure 2.2.

Of course, shape analysis does not only focus on visual quality control problems. We could also apply the methodology to facial recognition area. The face data consisting in 14 digital camera images of an actor in different disguises (BBC data) is displayed in Figure 2.4. This facial imaging data set was used in the context of 2D projective shape by Mardia and Patrangenaru (2005)[33]; the need for a 3D projective shape analysis of this data was mentioned in Patrangenaru et al. (2010)[42], where the authors gave a graphic argument for the equality of the mean 3D projective shapes of a group of facial landmarks, from the frontal images, respectively from the one quarter images. We used 8 frontal images, and 6 one quarter images, as shown in Figure 2.4. The goal is to identify the actor from the projective shape of a 3D configuration of facial landmarks. Figure 3.2 shows one of these images with the landmarks used in our analysis, numbered from 1 to 10. Note that, for this two-sample hypothesis testing problem, the size of the samples are different.

6 Figure 2.4: BBC actor data (top 8 images form one sample, bottom 6 form the other sample).

It happens that in the surveillance videos, images of criminals are only some parts of the face are visible, which makes it hard for the police to identify the correct suspect. Projective shape analysis could be very effective in addressing this problem. For example, the statue data in Figure 2.5 consists of 24 photos taken of statues of Epicurus, who was an ancient Greek philosopher as well as the founder of the school of philosophy called Epicureanism. The first 16 images were taken from a one-head statue, the others were taken from a double-head one, on one side of which is Epicurus and on the other side is Metrodorus of Lampsacus, the most distinguished of the disciples of Epicurus. These pictures were taken at a museum in Rome, Italy. Notice that in the double headed images, most of the left face of Epicurus is hidden. 9 landmarks were selected (see Figure 3.3) from the right side of the statue.

7 Figure 2.5: Epicurus bust images (top 16 images are from one sample, bottom 8 from another sample). 8 Another example of object data is contour data, for which, one extracts infinitely many landmarks. We consider contours, which are the boundaries of 2D topological disks in the plane. For example, the contour of the midsegittal section of the corpus callosum of Albert Einstein and a sample of some fairly typical elderly persons are shown in Figure 2.6(a) and Figure 2.7, and contribute to a very similar one-sample hypothesis testing problem as the blueprint and polyhedral object. Here, how to efficiently obtain infinitely many matched landmarks from the contours is a hard topic. Ellingson et al. (2013)[14] and Qiu et al. (2014)[48] address this problem by randomly sampling the landmarks.

(a) Right hemisphere of Einstein’s brain including CC mid- (b) Contour of Figure 2.6(a). sagittal section.

Figure 2.6: The brain image and the corpus callosum contour of Einstein.

Figure 2.7: Corpus callosum midsagittal sections shape data, in subjects age - 64 to 83.

Since the one-sample hypothesis testing problem for similarity shape contours had already been ad- dressed, one would naturally consider the development of a method for the two-sample problem. Leaf A and Leaf B shown in Figure 2.8 are the leaves cut from a plum tree on the Florida State University campus. We extract their contours via a Wacom tablet, which is a tool that artists use to create cartoons. Their con-

9 tours are displayed in Figure 2.9. Based upon the theorems in Section 3.2, we will address the problem of reconstructing the shapes of the 3D contours from such pairs as in Qiu et al. (2015)[47].

Figure 2.8: Original images of leaf A (top 20 figures) and leaf B (bottom 20 figures).

Figure 2.9: Contours extracted from the original images.

10 CHAPTER 3

BACKGROUND

The analysis of shapes is based on the landmark coordinates. The method discussed in Ellingson et al. (2013)[14] and Qiu et al. (2014)[48] addressed the problem of approximating infinitely landmarks from imaging data, if they are labeled points on a contour; if they are not on contours, one could set the most characteristic point of the object as a landmark. From the 2D images, the 3D projective reconstruction tools retrieve the 3 dimensional projective shape of the object, which is the natural shape space of the object which is essentially 3D. A key idea in finding a representation for the space of projective shapes is the projective frame, which transforms the coordinate of the points into points on a manifold. Mardia and Patrangenaru (2005)[33] proposed the projective frame to facilitate the analyssi of projective shape data as multidimensional axial data.

3.1 Landmark Extraction

Landmarks are the points of key interest on an object. Each landmark is also a point of correspondence on each object that matches between and within populations [11]. A qualified landmark should satisfy two conditions: (1) the point should appear in all data images; (2) the point should be easy to find out. A label is a name or number associated with a landmark, and identifies which pairs of landmarks correspond when comparing two objects. Landmark correspondence is the key to the shape analysis.

3.1.1 Methodology of Landmark Extraction

For contours, we need algorithms to automatically pair the landmarks; while for non-contour objects, this process can be perform manually. We consider contours, which from a geometrical point of view are boundaries of 2D topological disks. To keep the data analysis stable, and to assign a unique labeling, we make the generic assumption that across the population there is a unique anatomical or geometrical landmark starting point p0 on such a contour of perimeter one, so that the label of any other point p on the contour is the

11 ’counterclockwise’ direction at constant speed from p0 to p.A regular planar contour γ˜ is regarded as the range of a piecewise differentiable regular arclength parameterized function γ : [0,L] C,γ(0) = γ(L), → that is one-to-one on [0,L). Two contours γ˜1, γ˜2 have the same direct similarity shape if there is a direct similarity transformation S : C C, such that S(˜γ ) =γ ˜ . Two regular contours γ˜ , γ˜ have the same → 1 2 1 2 reg similarity shape if their centered counterparts satisfy γ˜ , = λγ˜ , , for some λ C 0. Therefore Σ , 2 0 1 0 ∈ \ 2 set of all direct similarity shapes of regular contours, is a dense and open subset of P (H), the projective space corresponding to the H of all square integrable centered functions from S1 to C, here S1 = x R2, x = 1 . (see Ellingson et al. (2013)[14]). { ∈ k k } Algorithm 3.1.1. (Qiu et al. (2014)[48]) This algorithm randomly selects k matched sampling points from the uniform distribution over [0,Lj) for a sample of n contours, where Lj is the perimeter of contour j and j = 1, 2,...,n.

Step 1 Select a common starting point for all n contours such that this represents the contour at time s1.

Step 2 Generate s ,s ,...,sk Uniform(0, 1). Sort s ,s ,...,sk in increasing order and relabel 2 3 ∼ { 1 2 } them as t1,t2,...,tk.

Step 3 Obtain matched sampling points for each contour.

For j=1:n

Evaluate contour j at times t Lj,t Lj,...,tk Lj to obtain z(t Lj), z(t Lj), . . . , z(tk 1 ∗ 2 ∗ ∗ 1 ∗ 2 ∗ ∗ Lj), where z(tj) denote the jth ordered vertex. End

The coordinates in Figure 3.4 were extracted by Algorithm 3.1.1. The performance of this algorithm could be improved by manually selecting p landmarks (p > 1) before Step 2, then apply Step 2 and Step 3 to every two consecutive points of those p points. By selecting those p points, the algorithm would work better for matching landmarks on the contours, especially for projective shape contours.

3.1.2 Examples of Landmark Extraction

Landmarks for Polyhedral Data. For the polyhedral data, only the 19 corners who appear in all the images are selected as the landmark (Figure 3.1).

12 Figure 3.1: Landmark of the polyhedral in Figure 2.2.

Landmarks for Actor Data. For the actor data, the edges of the eyes, nose, mouse and chin are selected as the landmarks (Figure 3.2).

Figure 3.2: Landmarks for actor data in Figure 2.4.

Landmarks for Statue Data. For the statue data in Figure 2.5, some images does not contain the left face, so the landmarks are selected only from the right side of the face (Figure 3.3).

13 Figure 3.3: Landmarks - Epicurus bust images in Figure 2.5.

Landmarks for Corpus Callosum Data. The corpus callosum data is contour data, so we determine the matched landmarks with Algorithm 3.1.1 (Figure 3.4).

Landmark of P11 Landmark of P1 Landmark of P2 Landmark of P3 0.06 0.06 0.06 0.06

0.04 0.04 0.04 0.04 90 90 60 0.02 0.02 240 0.02 90 0.02 90 120 60 60 60 210 120 240 120 240 30 210 240 30 30 30 210 0 0 270 0 210 270 0 270 270 120 180 180 −0.02 150 −0.02 −0.02 150 −0.02 150 1 180 1 180 1 −0.04 −0.04 150 1 −0.04 −0.04

−0.06 −0.06 −0.06 −0.06 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 Landmark of P4 Landmark of P5 Landmark of P6 Landmark of P7 0.06 0.06 0.06 0.06

0.04 0.04 0.04 0.04

90 90 90 0.02 0.02 60 0.02 60 0.02 90 60 120 60 30 210 240 30 120 120 210 240 120 210 240 30 210 240 30 0 270 0 0 270 0 270 180 270 −0.02 −0.02 −0.02 150 180 −0.02 150 150 180 150 180 1 1 1 1 −0.04 −0.04 −0.04 −0.04

−0.06 −0.06 −0.06 −0.06 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 Landmark of P8 Landmark of P9 Landmark of P10 0.06 0.06 0.06

0.04 0.04 0.04

90 90 90 0.02 60 0.02 0.02 60 120 240 120 60 120 210 30 210 240 270 30 210 240 30 0 0 0 270 270 150 −0.02 −0.02 180 −0.02 150 150 180 1 180 1 −0.04 1 −0.04 −0.04

−0.06 −0.06 −0.06 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 −0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1

Figure 3.4: Matched landmarks on midsagittal sections in for CC data (Einstein’s is the upper left CC).

14 Landmarks for Leaf Data. The leaf data has higher resolution than the corpus callosum data, so we are able to obtain more matched landmarks.

Figure 3.5: Matched leaf A contours (top 20 figures) and leaf B contours (bottom 20 figures).

3.2 3D Reconstruction

For the uncalibrated camera images, projective shape is a more natural approach than affine shape and similarity shape, since the internal parameters of the camera are all unknown. And the methods of tak- ing picutre will often have residual information beyond which can be explained with similarity or affine transformations.

3.2.1 Method of 3D Reconstruction

The projective geometry describes the physics of ideal pinhole camera image acquisition from a 3D ob- ject to 2D camera film. It also provides a justification for the reconstruction of a 3D object from two or more 2D camera images. 3D reconstruction is a hot topic in computer vision. Jiang et al. (2005)[27] proposed automatic 3D face reconstruction with variant pose, illumination and expression (PIE). Ma et al. (2006)[32] and Hartley & Zisserman (2004)[23] reconstruct the 3D object by a simpler way, and consequently, achieve less precise reconstruction effect. We choose the later methodology, because there is no strict requirements for the 3D reconstruction. Suppose f is the focal length that measures the distance from the pinhole to the image plane. p = (X,Y,Z)T R3 is a point in 3D dimension; x = (u,v)T R2 is its projective point in the image plane; ∈ ∈ x =(u,v, f)T is the projection of p relative to the reference frame centered at O. The projection from p 0 −

15 to x is π : R3 R2, where x = π(p) is given by ←→ X Y u = f , v = f . (3.1) − Z − Z For the pinhole camera model,

u X f 0 X Zx = Z = f = − , (3.2) v − Y 0 f Y      −   which is equivalent to

X X u f 0 0 0 − Y Y Zx˜ = Z v = 0 f 0 0   , B   , (3.3)    −  Z Z 1 0 0 1 0  1   1              where Z is usually unknown. Therefore the projective map may be represented by a matrix B where

f 0 0 0 − B = 0 f 0 0 . (3.4)  −  0 0 1 0   In addition to the projective map (3.4), image formation in digital cameras assumes a composition with matrices accounting for camera internal calibration parameters, such as pixel aspect ratio, skew parameter, origin of image coordinates in the principal plane (principal point) and for a change of coordinates between two camera positions involving a roto-translation (R,T ) SO(3) R3. The projective map of pinhole ∈ × camera image acquisition π˜, in homogeneous coordinates, is associated with the matrix:

ku kc u f 0 0 0 0 − R T B˜ = KintBC = 0 kv v 0 f 0 0 = KC, (3.5)  0  −  0T 1 0 0 1 0 0 1 0  3      −1 where ku and kv are scale factors of the image plane in units of the focal length f, and θ = cot kc is the skew, and (u0,v0) is the principal point. K contains internal parameters and perspective (3.4), while C contains the external parameters. The matrix B˜ can be decomposed into a 3 3 matrix P and a 3 1 vector × × p B˜ =( P p ) , so that P = AR and p = AT. (3.6)

Consider a pair of cameras viewing a point [p] RP 3. This point projects onto two image planes, first ∈ to m = [x ] RP 2 and then to m = [x ] RP 2. Since we are working in homogeneous coordinates, 1 1 ∈ 2 2 ∈ [u] is represented by a 4 1 column vector, and m , m are each represented by 3 1 column vectors. Also, × 1 2 × since our camera is uncalibrated, its internal parameters are unknown.

16 The epipolar constraint refers to the fact that the vector from the first camera’s optical center to the first imaged point, the vector from the second optical center to the second imaged point, and the vector from one optical center to the other are all coplanar. If we use only one coordinate system, say the coordinate system of the second camera, the vector from the first camera’s optical center to the first imaged point is T + Rx1, where T is the vector from one optical center to the other; and the vector from the second camera’s optical center to the point is x2. The change of coordinates between the Euclidean frames of the two cameras is given by a roto-translation (R,t) ∈ SO(3) R3. The epipolar constrain can be expressed via a zero exterior product x (T + Rx ) T = 0, × 2 ∧ 1 ∧ which is equivalent to xT (T (Rx )) = 0. (3.7) 2 × 1 Define T as the matrix associated with the linear operator y T y rewrite (3.7) as × → × T T x2 (T×(Ru1)) = x2 Ex1 = 0, (3.8) where E = T×R is the essential matrix. When the camera is uncalibrated, the matrices A1 and A2 from (3.6) containing the internal parameters of the two cameras are needed to transform the camera bound Euclidean coordinates into pixel coordinates:

v1 = A1x1, v2 = A2x2. (3.9)

This yields the equation:

(A−1v )T (T RA−1v )= vT A−1(T RA−1v ) = 0, (3.10) 2 2 × 1 1 2 2 × 1 1 then we obtain T v2 Fv1 = 0, (3.11)

−1 T −1 where F = (A2 ) EA1 is the fundamental matrix. F depends only on the relative position of the two cameras, as on their internal parameters. It has rank two, depending on seven real constants.

Matrix F can be estimated by the configurations of at least eight points, pa,i, a = 1, 2, i = 1,...,k, k ≥ 8, whose stacked homogeneous coordinates are the k 3 matrices ya, a = 1, 2. The linear system is × T y2 Fy1 = 0. (3.12)

Generally, f, the vectorized form of F , is more frequently used,

f T Y = 0. (3.13)

17 If k is large the linear homogeneous system is overdetermined and the optimal estimated solution fˆ can be obtained using a simple least squares algorithm by minimizing Y T f 2 subject to f = 1. k k k k The problem of the reconstruction of a configuration of points in 3D from two ideal noncalibrated camera images with unknown camera parameters, is equivalent to the following: given two camera images R 2 R 2 P1 , P2 of unknown relative position and internal camera parameters and two matching sets of labeled 2 points pa, ,...,pa,k RP ,a = 1, 2, find all the sets of points in space p ,...,pk such that there { 1 } ⊂ a 1 R 2 R 2 exist two positions of the planes P1 , P2 and internal parameters of the two cameras ca,a = 1, 2 with the property that the ca-image of pj is pa,j, a = 1, 2,j = 1,...,k. The reconstruction problem for two ∀ noncalibrated camera images has a solution in terms of the realization of the fundamental matrix F . Any two solutions can be obtained from each other by a projective transformation in RP 3. In absence of obstructions, the reconstructed 3D configuration ′ from a pair of matched configurations C in uncalibrated cameras views of a 3D configuration , and have the same projective shape. In other words, C C given a number of uncalibrated digital camera images of a 3D scene, the best one can do, is to retrieve the projective shape of that scene.

3.2.2 Examples of 3D Reconstruction

Figure 3.6 displays the 3D reconstructions of the polyhedral in Figure 2.3, based on the 2D coordinates in Table A.1. Here, the first and second images in Figure 2.3 are used to reconstruct the first image in Figure 3.6, and so on for the rest 14 images in Figure 2.3.

Figure 3.6: 3D Reconstructions for configurations of corners in Figure 2.3.

Next a reconstruction similar to that of the polyhedral is considered, and 816 landmarks are utilized to reconstruct the 3D structure of the leave contours in Figure 2.9.

18 Figure 3.7: Leaf A 3D reconstruction (top 10 figures) and leaf B 3D reconstruction (bottom 10 figures).

3.3 Projective Geometry 3.3.1 Methodology of Projective Geometry

The real projective space RP m is the set of the axes going through the origin of Rm+1. Unlike similar- ities or affine transformations or Rm, projective transformations (see (3.18)) do not have a group structure under composition, since the domain of definition of a projective transformation depends on the transfor- mation, and the maximal domain of a composition has to be restricted accordingly. To avoid such unwanted situations, rather than considering projective shapes of configurations in Rm, one may consider configura- tions in the real projective space RP m, with the projective general linear group (GL) action. If x =(x1,...,xm+1) Rm+1 0 , then ∈ \{ }

[x]=[x1 : x2 : ... : xm+1]= λx,λ = 0 (3.14) { 6 } is a projective point in RP m. Alternatively, a point p RP m is given by the normalized form p =[z1 : z2 : ∈ : zm+1], where m+1(zi)2 = 1. In particular, a projective line l is a set associated with a vector plane ··· i=1 V in Rm+1, l = [xP],x V 0 . A number of points in RP m are colinear if they lie on a projective line. { ∈ \ }

19 The Euclidean space Rm can be embedded in RP m, while preserving colinearity. Such a standard affine embedding, missing only a hyperplane at infinity, is

x =(x1,...,xm) [x1 : : xm : 1]. (3.15) → ···

Then for a point p = [X]=[X1 : : Xm : Xm+1] RP m, Xm+1 = 0, the affine coordinate ··· ∈ 6 (imhomogeneous coordinate) is

X1 Xm (x1,x2,...,xm)= ,..., . (3.16) Xm+1 Xm+1   conversely, the homogeneous coordinate of p is (X1,...,Xm+1), which is defined up to a multiplicative constant only. However, the coordinates of interest in projective shape analysis are neither affine nor ho- mogeneous. We need coordinates that are invariant with respect to the group of projective (general linear) transformationPGL(m).A projective transformation α of RP m is defined in terms of an (m+1) (m+1) × nonsingular matrix A GL(m + 1, R) by ∈

α([x1 : : xm+1])=[A(x1,...,xm+1)T ]. (3.17) ···

j There is an nonsingular matrix A = ((ai )i,j=1,..., m+1)) (j denotes for rows, i for columns), for which the projective transformation in Rm is given as

aj + m ajxi vj = m+1 i=1 i , j = 1, . . . , m. (3.18) m+1 m m+1 i ∀ am+1 + Pi=1 ai x The linear span of a subset of RP m is the smallestP linear variety containing that subset. Note that k points in RP m with k m + 2 are in general position if their linear span is RP m. ≥ A projective frame, also known as projective basis, in RP m is an ordered (m + 2) tuple of points π = − 2 (p1,...,pm+2), any m+1 of which are in general position. For example, in RP , 4 points define a projective frame, where any 3 of them can not be colinear; for RP 3 case, 5 points define a projective frame when any 4 of them are not coplanar. There is one-to-one correspondence between projective frame and projective

m+1 transformation. Let (e1,...,em+1) be the standard basis of R , then the standard projective frame is ([e ],..., [em ], [e + + em ]), where the last point is the unit point. Given a projective frame 1 +1 1 ··· +1 π =(p ,...,pm ), there is a unique α PGL(m) with α([ej]) = pj, j = 1, . . . , m + 1, α([e + + 1 +2 ∈ 1 ··· em+1]) = pm+2. Buibas el. al. (2012)[9] gave an example of projective data registration in 2D.

20 Figure 3.8: Original 2D scene and its registered projective coordinates in their spherical representation.

1 m Suppose x1,...,xm+2 are points in general position where x = (x ,...,x ) is an arbitrary point in Rm, set x˜ = (x1,...,xm, 1). In this notation, the superscripts are reserved for the components of a point, m whereas the subscripts are for the labels of points. As such (x1,...,xm+2) is a projective frame in R only if m (˜x ,..., x˜m ) is the projective frame in RP . Consider the (m+1) (m+1) matrix Um = [˜x ,..., x˜m ], 1 +2 × 1 +1 T whose jth column is x˜j = (xj, 1) , j = 1, . . . , m + 1. Define an intermediate system of homogeneous coordinates −1 v(x)= Um x,˜ (3.19) and write v(x)=(v1(x),...,vm+1(x))T . Set

j j j y (x)= v (x)/v (xm+2), j = 1, . . . , m + 1, (3.20)

zj(x)= yj(x)/ y(x) , j = 1, . . . , m + 1, (3.21) k k where [z1(x): z2(x): : zm+1] is the projective axis of a point x. If zm+1(x) = 0, the affine representa- ··· 6 tive of this point with respect to the last coordinate is (ξ1(x),...,ξm(x)), where

zj(x) ξj(x)= ,j = 1, ..., m. (3.22) zm+1(x)

21 Figure 3.9: Projective coordinates for m=2.

By way of example, in Figure 3.9, there are 5 points p ,...,p R2, we choose projective frame 1 5 ∈ T π = (p1,...,p4), where each of the three points are not colinear, and p˜j = (pj, 1) , j = 1,..., 5. [˜p , p˜ , p˜ ] construct the 3 3 matrix 1 2 3 × 1 2 4 U = 3 2 5 ,   1 1 1   then by (3.19), (3.20), (3.21) and (3.22) one could obtain the affine coordinate for each point. Also after the projective transformation, the projective frame becomes the standard basis.

Let G(k, m) denote the set of all ordered systems of k points (p1,p2,...,pk) for which p1,p2,...,pm+2 is a projective frame, k > m + 2. PGL(m) acts on G(k, m) by α(p1,p2,...,pk)=(αp1,...,αpk). The projective shape space P Σk is a manifold homeomorphic with (RP m)q, where q = k m 2. Assume m − − x(t),t I is a curve in Rm, such that t I, zm+1(x(t)) = 0. Such framed curves will be said to be in a ∈ ∀ ∈ 6 convenient position relative to the projective frame π associated with (x1,...,xm+2).

THEOREM 3.3.1. (Faugeras (1992)[16], Hartley et al. (1992)[22]) There is a one to one correspondence between the set of projective shapes of framed curves x(t),t I in a convenient position relative to π, and ∈ curves in Rm. In this correspondence, framed closed curves in a convenient position relative to π correspond to closed curves in Rm.

For instance, in two dimensions we consider framed closed curves in the planar projective plane, avoid- ing a projective line. In particular if we assume that the (x(t),y(t)),t [0, 1] is a closed planar curve, ∈ then [x(t) : y(t) : 1],t [0, 1] is such a projective curve, and using a projective frame π we associate ∈ with this curve the affine affine representative (ξ(t),η(t)),t [0, 1] of its curve of projective coordinates ∈ [x(t) : y(t) : 1]π, which yield another planar curve. If two curves are obtained from a planar curve viewed from different perspective points, then the associated affine curves are the same. This affine representative of the projective curve of a ( closed ) curve is used in this dissertation. Here we are concerned with closed curve recognition, that is a closed curve

γ(t)=(ξ(t),η(t)),t [0, 1], (ξ(0),η(0)) = (ξ(1),η(1)) (3.23) ∈

22 that is observed with random errors

Γ(t)=(ξ(t),η(t))+(ǫX (t),ǫY (t)),t [0, 1], (3.24) ∈ where ǫX (t),ǫY (t) are stochastic independent error processes, (ǫX (0),ǫY (0)) = (ǫX (1),ǫY (1)). so that the observed curve can, for instance be considered as a random element of the Hilbert space H = L2(S1, R2). The problem of the reconstruction of a configuration of points in 3D from two ideal uncalibrated ca- mera images with unknown camera parameters, is equivalent to the following: given two camera images R 2 R 2 P1 , P2 of unknown relative position and internal camera parameters and two matching sets of labeled 2 points pa, ,...,pa,k RP ,a = 1, 2, find all the sets of points in space p ,...,pk in such that there { 1 } ⊂ a 1 R 2 R 2 exist two positions of the planes P1 , P2 and internal parameters of the two cameras ca,a = 1, 2 with the property that the ca-image of pj is pa,j, a = 1, 2,j = 1,...,k. ∀

m DEFINITION 3.3.1. Two sets of labeled points pa, ,...,pa,k RP ,a = 1, 2, have the same projec- { 1 } ⊂ a m m tive shape if there is a projective transformation β : RP RP , such that β(p ,j)= p ,j, j = 1,...k. → 1 2 ∀ The reconstruction algorithm was therefore reformulated as follows by Patrangenaru et al. (2010)[42]:

THEOREM 3.3.2. (Patrangenaru et al. (2010)[42]) In absence of occlusions, any two 3D reconstructed configurations , ′ obtained from a pair of 2D matched configurations in uncalibrated cameras images of R R a 3D configuration , have the same projective shape. C Note that the solution of the reconstruction problem, from a pair of 2D images depends on a landmark correspondence. Kent and Mardia (2012)[29] considered a different approach to the projective shape analysis, which is an extension of statistical shape analysis of Dryden and Mardia (1998)[11]. They use the cross-ratio τ to T identify the feature of the landmarks that are projective invariant. For 4 landmarks x = (x1, x2, x3, x4) in 1 dimension, their cross-ratio is

(x x )(x x ) τ = τ(x)= 2 − 1 4 − 3 . (3.25) (x x )(x x ) 3 − 1 4 − 2 In order to compare two different projective shapes quantitatively, it is important to have a concept of distance. However, the Euclidean distance between cross-ratios is not an efficient, because it is sensitive to

23 the landmark relabeling. Kent and Mardia also defined the Procrustes distance between the shapes of any two configuratoins, which is invariant under relabeling the points. There are 3 steps to define the distance. Step 1 Standardization. Suppose X is a k m configuration matrix, there is p = m + 1 that extend 0 × X to a m p augmented configuration X˜ = [X 1k], denote X as the configuration after standardization, 0 × 0 which is known as Taylor standardization.

DEFINITION 3.3.2. An augment configuration matrix X˜ is Tyler regular if vector space V Rp, with ∀ ⊆ 1 dim(V ) p, the number of points from X lying in V is less than (k/p)dim(V ). ≤ ≤ If X˜ is in general position, it is Tyler regular.

THEOREM 3.3.3. (Kent and Mardia (2012)[29]) If X˜ is Tyler regular, it is possible to choose B and Q so that X = DXB˜ satisfies k xT x =1(i = 1,...,k), XT X = I . (3.26) i i p p A configuration matrix satisfying (3.26) is Tyler standardized. And for any two configuration matrices,

X1 and X2 are related by

X2 = SX1Q, (3.27) where the sign matrix S is a diagonal matrix with 1 in each diagonal entry representing a sign change for ± each row, and Q (p p) is an orthogonal matrix. Such standardized representations are known as preshapes. × Step 2 Embedding. An embedding is a map from the space of preshapes to a Euclidean space that removes either some, in the case of a partial embedding, or all, in the case of a full embedding, of the transformation parameters remaining after Step 1. An embedding must be injective so that preshapes from different equivalence classes remain different after embedding. Kent and Mardia defined partial embedding via inner products, angle doubling, square and absolute inner products. Different kinds embedding lead to various Procrustes distances. Step 3 Optimization. The Euclidean distance between two embedded preshapes is minimized over any transformation parameters remaining after Step 2. One example of the Procrustes distances is by the inner product embedding, suppose M = XXT is a (k k) matrix, the inner product distance is ×

2 p 2 p dM = min M1 SM2S = k tr(M1SM2S). (3.28) 2k S k − k − k

24 Note that other projective Procrustes distances defined in Ken and Mardia requires the minimization of the space of orthogonal matrices, as well as over the space of sign matrices, and as such are computationally intensive to compute. While Mardia and Patrangenaru analyzed the projective shape by projective frame, Kent and Mardia analyzed it with the cross-ratio and Procrustes distance, which is named the geometric approach. The ad- vantages of the projective frame comparing to the geometric approach are (1) they are easy to compute; (2) the points after projective transformation are visualizable; (3) it is easier to add additional landmarks. Consequently, the main disadvantage of the projective frame approach is that when the landmarks are re- labeled, it may fail. For example, for points (x ,...,x ) R2 0 , suppose π = (x ,...,x ) is the 1 5 ∈ \{ } 1 4 projective frame, which means any three of the points are not collinear. When we relabel the points, the new x1new ,...,x4new is possibly x1,x2,x4,x5 with the previous label. Maybe there are three of the points are collinear, which make the four points not suitable to construct a projective frame.

3.3.2 Example of Projective Transformation

In this example, landmark 1, 102, 360, 596 and 767 of the leaf contours are selected as the projec- tive frame, see Figure 3.10. Comparing with the 3D reconstruction of the leaf contour in Figure 3.7, the projective transformed 3D contours have more similar shape to the original leaf contours.

25 Figure 3.10: Leaf A (top 10 figures) and leaf B (bottom 10 figures) contours after projective transformation.

26 CHAPTER 4

STATISTICS ON MANIFOLDS

4.1 Veronese-Whitney Means

A finite dimensional differentiable manifold is a metric space M with the following property: if x M, ∈ then there is some neighborhood U of x and some integer m 0 such that U is homeomorphic to Rm and ≥ the transition maps are differentiable [49].

DEFINITION 4.1.1. A chart on a separable metric space ( , ρ) is a one to one ϕ : U M → ϕ(U) defined on an open subset U of to a Hilbert space H. A Hilbert manifold is a separable metric M −1 space , that admits an open covering by domain of charts, such that the transition maps ϕV ϕ : M ◦ U ϕU (U V ) ϕV (U V ) are differentiable. ∩ → ∩ Assume J : M RN is an embedding of the d-dimensional complete manifold M. In [6] Bhattacharya → and Patrangenaru defined the extrinsic mean µJ of J-nonfocal random object Y on M is defined as

−1 µJ =: J (PJ (µ)), (4.1)

c where µ = E(J(Y )) is the mean vector of J(Y ) and PJ : F J(M) is the ortho-projection on → J(M) defined on the complement of the set F of focal points of J(M). If f : M M is a 1 → 2 differentiable function defined from the manifold M to the manifold M and x M , the differential of 1 2 ∈ 1 the function f at x is labeled Dxf. Consequently, the extrinsic covariance matrix of Y was defined by

Bhattacharya and Patrangenaru (2005) [6], with respect to a local frame field y (f (y),...,fd(y)) for → 1 N which (DyJ(f1(y),...,DyJ(fd(y))) are othonormal vectors in R . The extrinsic covariance mathrix of Y with respect to (f1(µJ ),...,fd(µJ )) is

d d T ΣE = DµPJ (eb) ea(PJ (µ))ea(PJ (µ)) Σ DµPJ (eb) ea(PJ (µ))ea(PJ (µ)) , · · "a=1 #b=1,...,N "a=1 #b=1,...,N X X (4.2) where Σ is the covariance matrix of J(Y ); PJ is differentiable at µ; (e1(p),e2(p)....eN (p)) is a local N orthoframe field defined on an open neighborhood U R of PJ (M), which is adapted to the embedding ⊆

27 −1 k m q J if y J (U), er(J(y)) = DyJ(fr(y)), r = 1,...,d. Since P Σ is homeomorphic to (RP ) , q = ∀ ∈ m k m 2, and RP m is equivariantly embedded in the space S(m + 1) of (m + 1) (m + 1) symmetric − − × matrices via j : RP m S(m + 1), → j([x]) = xxT . (4.3)

Mardia and Patrangenaru (2005) [33] expended j to jk, where

k m q q J = jk : P Σ = (RP ) (S(m + 1)) m ∼ → defined by

jk([x1],..., [xq])=(j([x1]),...,j([xq])), (4.4)

m+1 T where xs R , x xs = 1, s = 1,...,q. ∈ s A random projective shape Y of a k-ad in RP m is given in axial representation by the multivariate random axes

(Y 1,...,Y q), Y s =[Xs], (Xs)T Xs = 1, s = 1,...,q = k m 2. (4.5) ∀ − −

Then the extrinsic mean projective shape of (Y 1,...,Y q) exists if s = 1,...,q, the largest eigenvalue of ∀ s s T E(X (X ) ) is simple. The mean µjk is given by

µjk = ([γ1(m + 1)],..., [γq(m + 1)]), (4.6) where λs(a) and γs(a), a = 1, . . . , m + 1, are the eigenvalues in increasing order and the corresponding unit eigenvector of E(Xs(Xs)T ).

Let Yr, r = 1,...,n be the independent identically distributed random objects (i.i.d.r.o’s) from a pop- ulation of projective shapes (in the multi-axial representaion), for which the mean shape µjk exists. From ¯ [6], the extrinsic sample mean Yjk,n is a strongly consistent estimator of µjk . In the multivariate axial representation, 1 q s T s Yr = ([Xr ,...,Xr ]), (Xr ) Xr = 1, s = 1,...,q. (4.7)

Let Js be the random symmetric matrix given by

n −1 s s T Js = n Xr (Xr ) , s = 1,...,q. (4.8) r=1 X

28 Figure 4.1: Extrinsic mean.

For Js, let ds(a) and gs(a) be the eigenvalues in increasing order and the corresponding unit eigenvector of

Js, a = 1, . . . , m + 1, and ¯ Yjk,n = ([g1(m + 1)],..., [gq(m + 1)]). (4.9)

Figure 4.1 illustrates the fact that points Xs are projected to an Euclidean matrix space by j, for which one compute the mean of E(j(X)) = E(Xs(Xs)T ), where the extrinsic mean is the eigenvector which corresponds to the largest eigenvalue of E(j(X)). The extrinsic sample covariance matrix is a (mq) (mq) symmetric matrix × n −1 −1 −1 T s T t T s T t G = n (ds(m+1) ds(a)) (dt(m+1) dt(b)) (gs(a) X )(gt(b) X )(gs(m+1) X )(gt(m+1) X ), n(s,a)(t,b) − − × r r r r r=1 X (4.10) where s,t = 1,...,q; a,b = 1, . . . , m. The extrinsic sample covariance matrix is the estimator of extrinsic covariance ΣE.

29 4.2 Hilbert Manifold

Let (Ω, , P) be an underlying probability space, H a separable Hilbert space over the real numbers W with inner product , and norm , and H the σ-field generated by the open subsets of H. A random h• •i ||•|| B element in H is a mapping X : Ω H which is ( , H ) - measurable. Let us write PX = P for the → W B induced probability measure on (H, H). B The probability measure P is uniquely determined by its characteristic functional

P˜(x)= E eihx,Xi = eihx,yiP (dy),x H. (4.11) H ∈ Z DEFINITION 4.2.1. A function f defined on an U of a Hilbert space H is Frechet´ differentiable at a point x U, if there is a linear bounded operator T : H H, such that if we set ∈ →

ωx(h)= f(x + h) f(x) T (h), − − then

ωx(h) lim k k = 0 h→0 h k k

T in Definition 4.2.1 is unique, it is called the differential of f at x and is denote by dxf.

DEFINITION 4.2.2. An embedding of a Hilbert manifold in a Hilbert space H is a one-to-one differ- M entiable function j : H, such that for each x , the differential dxj is one to one, and the range M → ∈ M j( ) is a closed subset of H and the topology of is induced via j by the topology of H. M M

We consider the Veronese-Whitney (VW) embedding j : P (H) in HS = H H, introduced in the L ⊗ finite dimensional case by Kent (1992)[30], given by

1 j([γ]) = γ γ∗, [γ] P (H). (4.12) γ 2 ⊗ ∈ k k DEFINITION 4.2.3. If j : H is an embedding and given a random object X on , the associated M → M 2 Frechet´ function is j(x) = E( j(X) j(x) ). The set of all minimizers of j is the extrinsic mean set F k − k F of X. If the extrinsic mean set has one element only, that element is called the extrinsic mean and is labeled

µj.

30 LEMMA 4.2.1. (Patrangenaru (1998)[46]) Consider a random object X on and assume j(X) has the M mean vector µ. Then the extrinsic mean set is the set of all points x , such that j(x) is at minimum ∈ M distance from µ. (iii) In particular, µj exists if there is a unique point on j(M) at minimum distance from −1 µ, the projection Pj(µ) of µ on j(M), and in this case µj = j (Pj(µ)).

The Veronese-Whitney mean ( VW mean) is the extrinsic mean for a random object X = [Γ] on P (H) with respect to the VW embedding, and it exists if and only if E( 1 Γ Γ∗) has a simple largest eigenvalue. kΓk2 ⊗ In this case, the VW mean is µj =[γ], where γ is an eigenvector for this eigenvalue.

31 CHAPTER 5

HYPOTHESIS TESTS FOR FINITE DIMENSIONAL CONFIGURATIONS

Most organisms including insects, cephalopods, and vertebrates have bilateral vision. Light from the sur- rounding environment is gathered and projected onto two retinas, where it is translated into electrical im- pulses and sent to brain regions for further processing. The mechanism of 3D vision within the bounded domain of a brain or of a machine is very complex and, until recently, has been poorly understood. In the Geometry and Optics, progress has been made with results in Computer Vision by Longuet-Higgins (1981)[31], Faugeras (1992)[16], Hartley et al. (1992)[22] and others. These results, and human visual perception led Patrangenaru et al. (2012)[38] to conclude that all we see are 3D projective shapes. As a follow up to this result, the main objective of this chapter is to develop one- and two-sample tests statistics for extrinsic mean 3D projective shapes, based on data extracted from digital camera images. Our methodology is a form nonparametric bootstrap on projective shape spaces.

5.1 One-sample Hypothesis Test for Finite Dimensional Configurations

Asymptotic test statistics for the equality of the extrinsic means of independent random objects on a manifold embedded in the Euclidean space on manifolds were first considered in Hendriks and Landsman (1998)[24] and more recently in Bhattacharya (2008)[5] and Bhattacharya and Bhattacharya (2011)[4]. In an attempt to address matched pairs, Bhattacharya (2008)[5] also derived a large sample test statistic under the more general assumption of correlated pairs of r. o.s when the sample sizes are equal.

5.1.1 Methodology of One-sample Hypothesis Test for Finite Dimensional Configurations

Consider the simple one-sample hypothesis test for µ_{j_k}

H_0 : µ_{j_k} = µ_0  vs.  H_a : µ_{j_k} ≠ µ_0. (5.1)

Set D_s = (g_s(1),...,g_s(m)), s = 1,...,q; the Hotelling's T²-type statistic is

T(Ȳ_{j_k,n}; µ) = n (γ_1^T D_1, ..., γ_q^T D_q) G_n^{-1} (γ_1^T D_1, ..., γ_q^T D_q)^T. (5.2)

From [42], T(Ȳ_{j_k,n}; µ) converges weakly to χ²_{mq}. Let Y_r = ([X_r^1],...,[X_r^q]), (X_r^s)^T X_r^s = 1, s = 1,...,q, r = 1,...,n, be the i.i.d.r.o.'s from a j_k-nonfocal distribution on (ℝP^m)^q which has a nonzero absolutely continuous component, and with Σ_E > 0. For a random resample (Y_1^*,...,Y_n^*) from (Y_1,...,Y_n), consider the eigenvalues d_s^*(a) of (1/n) ∑_{r=1}^n X_r^{s*}(X_r^{s*})^T in increasing order and the corresponding unit eigenvectors g_s^*(a), a = 1,...,m+1. Let G_n^* be the matrix obtained from G_n by substituting all the entries with *-bootstrap entries. Then the bootstrap distribution function of the statistic

T(Y_{j_k}^*; Ȳ_{j_k}) = n (g_1^*(m+1)^T D_1^*, ..., g_q^*(m+1)^T D_q^*) G_n^{*-1} (g_1^*(m+1)^T D_1^*, ..., g_q^*(m+1)^T D_q^*)^T (5.3)

approximates the true distribution of T(Ȳ_{j_k,n}; µ) in (5.2), with an error of order o_p(n^{-2}). From the bootstrap, the

100(1 − α)% confidence region for µ_{j_k} is

C_{n,α}^* := j_k^{-1}(U_{n,α}^*), (5.4)

with U_{n,α}^* given by

U_{n,α}^* = {µ ∈ j_k((ℝP^m)^q) : T(Ȳ_{j_k,n}; µ) ≤ c_{1−α}^*}, (5.5)

where c_{1−α}^* is the upper 100(1 − α)% point of the values of T(Y_{j_k}^*; Ȳ_{j_k}), with order o_p(n^{-2}).

If Σ_E is singular and all the marginal axial distributions have positive definite extrinsic covariance matrices, one may use simultaneous confidence ellipsoids to estimate µ_{j_k}. Let Y_r = ([X_r^1],...,[X_r^q]), (X_r^s)^T X_r^s = 1, s = 1,...,q, r = 1,...,n, be the i.i.d.r.o.'s from a j_k-nonfocal distribution on (ℝP^m)^q. For each s = 1,...,q, let Σ_s be the extrinsic covariance matrix of Y_r^s, let Ȳ_{j,n}^s and G_{s,n} be the extrinsic sample mean and the extrinsic sample covariance matrix of the s-th marginal axial, and assume the probability measure of Y_r^s has a nonzero absolutely continuous component w.r.t. the volume measure on ℝP^m. Then one may consider

T_s = T_s(Ȳ_{j,n}^s, [γ_s]) = n γ_s^T D_s G_{s,n}^{-1} D_s^T γ_s, (5.6)

and the corresponding bootstrap distribution:

T_s^* = T_s(Ȳ_j^{s*}, Ȳ_{j,n}^s) = n g_s^*(m+1)^T D_s^* G_{s,n}^{*-1} D_s^{*T} g_s^*(m+1). (5.7)

T_s has asymptotically a χ²_m distribution. Now let c_{s,1−β}^* be the upper 100(1 − β)% point of the values of T_s^* in (5.7). Set

C_{s,n,β}^* := {µ ∈ (ℝP^m)^q, µ = (µ_1,...,µ_q) : T_s(Ȳ_{j,n}^s; µ_s) ≤ c_{s,1−β}^*}. (5.8)

Then

R_{n,α}^* = ⋂_{s=1}^q C_{s,n,α/q}^* (5.9)

is a region of approximately at least 100(1 − α)% confidence for µ_{j_k}, with convergence error o_p(n^{-2}).

5.1.2 Example - Analysis of Image Data for One Polyhedral

In quality control, there is always a blueprint of an object, and one has to test if a final product has the same length, width and height as the blueprint. This application illustrates that there is no need to physically measure a finished product to assess whether it is close enough to its blueprint. There are 16 polyhedral images, taken from different positions; 1 blueprint (Figure 2.1) for a polyhedral; and 19 landmarks selected for the hypothesis test (Figure 3.1). The coordinates of the blueprint are given in

Table 5.1, denoted as µ_0. Using the Ma et al. (2006)[32] reconstruction algorithm to obtain the 3D projective reconstruction, we select landmarks 8, 12, 17, 18 and 19 to construct the projective frame; the remaining landmarks yield sample points in (ℝP³)^{14}, where 14 = 19 − 3 − 2. The goal of the hypothesis test is to decide whether the polyhedral object in the 16 images is identical to the blueprint. This yields the null and alternative hypotheses

H_0 : µ_{j_{19}} = µ_0  vs.  H_a : µ_{j_{19}} ≠ µ_0. (5.10)

We compute the extrinsic sample mean for the 19 landmarks of the 16 polyhedral images, and use nonparametric bootstrap with 20,000 resamples from the original sample to reach a reliable conclusion. In [42], Patrangenaru et al. presented the confidence intervals for the 14 landmarks after the projective transformation. For α = 0.05, µ_0 is in the 95% confidence region for µ_{j_{19}} if, for each s = 1,...,14, the value of T_s = T(Ȳ_{j,n}^s; µ_0) in (5.2), corresponding to the s-th marginal, is between the 72nd ranked and the 19928th ranked observation of the corresponding bootstrap distribution (values of degenerate G_{s,n}^* have been omitted). The results are as follows:

Table 5.1: Coordinates of the blueprint.

Landmark No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 x 0 4 4 0 4 4 0 6 6 0 6 6 0 10 10 0 10 10 0 y 0 0 4 4 0 4 4 0 6 6 0 6 6 0 10 10 0 10 10 z 20 20 20 20 16 16 16 16 16 16 10 10 10 10 10 10 0 0 0

- First marginal (Landmark 1): T_1 = 3.0279647168E+00 is between rank 6301 (T* = 3.0243210949E+00) and rank 6302 (T* = 3.0294218108E+00).
- Second marginal (Landmark 2): T_2 = 2.6459766362E+00 is between rank 3942 (T* = 3.0243210949E+00) and rank 3943 (T* = 2.6920988816E+00).
- Third marginal (Landmark 3): T_3 = 1.5175491E−01 is between rank 397 (T* = 1.4510789E−01) and rank 398 (T* = 1.5271147E−01).
- Fourth marginal (Landmark 4): T_4 = 3.7407490E+00 is between rank 7379 (T* = 3.7288447E+00) and rank 7380 (T* = 3.7464786E+00).
- Fifth marginal (Landmark 5): T_5 = 2.6168385E+00 is between rank 5355 (T* = 2.6166643704E+00) and rank 5356 (T* = 2.6216985741E+00).
- Sixth marginal (Landmark 6): T_6 = 1.7898784E+00 is between rank 3294 (T* = 1.7859106E+00) and rank 3295 (T* = 1.7914946E+00).
- Seventh marginal (Landmark 7): T_7 = 3.9364703E+00 is between rank 7194 (T* = 3.9191776E+00) and rank 7195 (T* = 3.9388019E+00).
- Eighth marginal (Landmark 9): T_8 = 1.5700171E+00 is between rank 4432 (T* = 1.5687626E+00) and rank 4433 (T* = 1.5748148E+00).
- Ninth marginal (Landmark 10): T_9 = 5.0491394E+00 is between rank 8507 (T* = 5.0407173E+00) and rank 8508 (T* = 5.0521943E+00).
- Tenth marginal (Landmark 11): T_10 = 1.3735517E+01 is between rank 15155 (T* = 1.3706638E+01) and rank 15156 (T* = 1.3750192E+01).
- Eleventh marginal (Landmark 13): T_11 = 2.0352336E+00 is between rank 4198 (T* = 2.0327469E+00) and rank 4199 (T* = 2.0412641E+00).
- Twelfth marginal (Landmark 14): T_12 = 3.9488573860E+00 is between rank 6837 (T* = 3.9442688E+00) and rank 6838 (T* = 3.9573099E+00).
- Thirteenth marginal (Landmark 15): T_13 = 3.6973595941E+00 is between rank 7857 (T* = 3.6946285E+00) and rank 7858 (T* = 3.6986865E+00).
- Fourteenth marginal (Landmark 16): T_14 = 2.8730067E+00 is between rank 5065 (T* = 2.8723605E+00) and rank 5066 (T* = 2.8770536E+00).

These results show that we fail to reject H_0 for any reasonable level α, indicating that the projective shape of the object follows the projective shape of the blueprint closely.
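As an illustration of the simultaneous decision rule behind (5.8)-(5.9), the following minimal Python sketch (assuming numpy; the arrays T_obs and T_boot are hypothetical placeholders for the observed marginal statistics and their precomputed bootstrap values) compares each marginal against its Bonferroni-adjusted upper bootstrap cutoff:

import numpy as np

def simultaneous_marginal_test(T_obs, T_boot, alpha=0.05):
    # T_obs: (q,) observed marginal statistics T_s at the hypothesized mean
    # T_boot: (q, B) bootstrap values T_s^* for each marginal (hypothetical precomputed arrays)
    q, B = T_boot.shape
    cutoffs = np.quantile(T_boot, 1 - alpha / q, axis=1)   # upper alpha/q cutoff per marginal, as in (5.9)
    reject = T_obs > cutoffs
    return cutoffs, bool(np.any(reject))                   # fail to reject H0 if no marginal exceeds its cutoff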

5.2 Two-sample Hypothesis Test on Lie Group

5.2.1 Methodology of Two-sample Hypothesis Test on Lie Group

For the two-sample hypothesis test, the methodology is different from the regular multivariate case. Recall that a group (𝒢, ⊙) that has in addition an m dimensional manifold structure, such that the group multiplication ⊙ : 𝒢 × 𝒢 → 𝒢 and the inversion I : 𝒢 → 𝒢, I(g) = g^{-1}, are differentiable maps between manifolds, is called a Lie group. For a comprehensive introduction to manifolds and Lie groups, including tangent spaces and tangent maps associated to differentiable maps between manifolds, embeddings, the tangent and normal component of a vector relative to the tangent space at a point of a manifold, vector fields on a manifold, the Lie algebra of a Lie group, the exponential map in a Lie group, etc., see Spivak (2005)[49]. Recall that a local frame field on an m dimensional manifold 𝒢 is an ordered set of local vector fields (e_1,...,e_m) on an open subset U, e_i : U → TU, such that for any point u ∈ U, (e_1(u),...,e_m(u)) is a basis of T_uU = T_u𝒢. For a large sample of observations from a matched pair (X, Y) of random vectors in ℝ^m, one may estimate the difference vector D = Y − X to eliminate much of the influence of extraneous unit to unit variation (Johnson and Wichern (2007, p. 274)), without increasing the dimensionality. Crane and Patrangenaru (2011)[10] extended this technique to paired r.o.'s on an embedded Lie group that is not necessarily commutative. Assume X and Y are paired r.o.'s on a Lie group (𝒢, ⊙). The change from X to Y was defined to be the r.o. C =: X^{-1} ⊙ Y. A test for no mean change from X to Y is one for the null hypothesis

H_0 : µ_J = 1_𝒢, (5.11)

where 1_𝒢 is the identity of 𝒢 and µ_J is the extrinsic mean of C with respect to an embedding J : 𝒢 → ℝ^N, which is given by (4.1) as µ_J =: J^{-1}(P_J(µ)). Here µ is the mean of J(C) and P_J(µ) is its projection onto J(𝒢), the point on J(𝒢) that is closest to µ, which is assumed to be unique (C is J-nonfocal). We assume J(C) has finite moments of sufficiently high order. If C_1,...,C_n are i.i.d.r.o.'s on 𝒢, their extrinsic sample mean is the extrinsic mean of their empirical distribution. Given an embedding J : ℳ → ℝ^N, we consider the chord distance on ℳ, given by d(x_1, x_2) = d_0(J(x_1), J(x_2)), where d_0 is the Euclidean distance in ℝ^N. If X_{a,j_a}, j_a = 1,...,n_a, a = 1,2, are i.i.d.r.o.'s drawn from distributions Q_a, a = 1,2, on ℳ, and if we denote by µ_a the mean of the induced probability Q_a ∘ J^{-1} and by Σ_a its covariance matrix (a = 1,2), then the extrinsic mean of Q_a is µ_{a,J} = J^{-1}(P(µ_a)), assuming Q_a is J-nonfocal, and the extrinsic sample mean is X̄_{a,J} = J^{-1}(P(Ȳ_a)). Here,

again, P is the projection from ℝ^N to J(ℳ), and if we write Y_{a,j_a} = J(X_{a,j_a}), j_a = 1,...,n_a, a = 1,2, then Ȳ_a, a = 1,2, is the corresponding sample mean.

DEFINITION 5.2.1. An action of a Lie group 𝒢 on a manifold ℳ is a differentiable function α : 𝒢 × ℳ → ℳ, such that

α(1_𝒢, x) = x,  ∀ x ∈ ℳ,
α(g, α(h, x)) = α(g ⊙ h, x),  ∀ g ∈ 𝒢, ∀ h ∈ 𝒢, ∀ x ∈ ℳ. (5.12)

ℳ has a simply transitive Lie group of isometries 𝒢 if there is a Lie group action α : 𝒢 × ℳ → ℳ by isometries with the property that, given x ∈ ℳ, for any object y ∈ ℳ there is a unique g ∈ 𝒢 such that α(g, x) = y.

Then let H : ℳ² → 𝒢 be defined by

H(x_1, x_2) = (α^{x_1})^{-1}(x_2). (5.13)

For the projective points, a chart ϕ can be taken to be

ϕ([x_1],...,[x_q]) = (ϕ_{m+1}([x_1]),...,ϕ_{m+1}([x_q])). (5.14)
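A small sketch of how such a chart could be evaluated in practice, assuming ϕ_{m+1} denotes the standard affine chart [x¹ : … : x^{m+1}] ↦ (x¹/x^{m+1}, …, x^m/x^{m+1}); this reading of ϕ_{m+1} and the function names are our own assumptions:

import numpy as np

def affine_chart(x):
    # x: (m+1,) homogeneous coordinates of a point [x] in RP^m, assumed to have x[-1] != 0
    return x[:-1] / x[-1]

def phi(points):
    # points: (q, m+1) array of representatives; apply the chart componentwise, as in (5.14)
    return np.stack([affine_chart(p) for p in points])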

THEOREM 5.2.1. (Patrangenaru et al. (2014)[45]) Assume X_{a,j_a}, j_a = 1,...,n_a, are independent identically distributed random objects (i.i.d.r.o.'s) from the independent probability measures Q_a, a = 1,2, with finite extrinsic moments of order s, s ≤ 4, on the m dimensional manifold ℳ on which the Lie group 𝒢 acts simply transitively. Let n = n_1 + n_2 and assume lim_{n→∞} n_1/n = π ∈ (0,1). Let ϕ : 𝒢 → ℝ^m and L_δ be, respectively, a chart with ϕ(1_𝒢) = 0, and the left translation by δ ∈ 𝒢. Then, under H_0,

(i.) The sequence of random vectors

√n (ϕ ∘ L_δ^{-1}(H(X̄_{n_1,J}, X̄_{n_2,J}))) (5.15)

converges weakly to 𝒩_m(0_m, Σ_J), for some covariance matrix Σ_J that depends linearly on the extrinsic covariance matrices Σ_{a,J} of Q_a, a = 1,2.

(ii.) If (i.) holds and ΣJ is positive definite, then the sequence

n (ϕ ∘ L_δ^{-1}(H(X̄_{n_1,J}, X̄_{n_2,J})))^T Σ_J^{-1} (ϕ ∘ L_δ^{-1}(H(X̄_{n_1,J}, X̄_{n_2,J}))) (5.16)

converges weakly to a χ²_m distribution.

If the distributions are unknown and the samples are small, an alternative approach is to use Efron's nonparametric bootstrap (Efron (1982)[13]).

THEOREM 5.2.2. (Patrangenaru et al. (2014)[45]) Under the hypotheses of Theorem 5.2.1 (i.), assume in addition that, for a = 1,2, the support of the distribution of X_{a,1} and the extrinsic mean µ_{a,J} are included in the domain of the chart ϕ, and that ϕ(X_{a,1}) has an absolutely continuous component and finite moments of sufficiently high order. Then the joint distribution of

V = √n (ϕ ∘ L_δ^{-1}(H(X̄_{n_1,J}, X̄_{n_2,J})))

can be approximated by the bootstrap joint distribution of

V* = √n (ϕ ∘ L_δ^{-1}(H(X̄*_{n_1,J}, X̄*_{n_2,J}))) (5.17)

with an error O_p(n^{-1/2}), where, for a = 1,2, X̄*_{n_a,J} are the extrinsic means of the bootstrap resamples X*_{a,j_a}, j_a = 1,...,n_a, given X_{a,j_a}, j_a = 1,...,n_a.

Note that, as shown by Crane and Patrangenaru (2011)[10], unlike in other dimensions, the projective shape manifold PΣ₃^k, k ≥ 5, has a Lie group structure, derived from the quaternion multiplication. Recall that if a real number x is identified with (0,0,0,x) ∈ ℝ⁴, and if we label the quadruples (1,0,0,0), (0,1,0,0), respectively (0,0,1,0) by i⃗, j⃗, respectively k⃗, then the multiplication table given by

⊙   i⃗    j⃗    k⃗
i⃗   −1   k⃗   −j⃗
j⃗   −k⃗   −1   i⃗
k⃗   j⃗   −i⃗   −1

where a ⊙ b, the product of a on the first column with b on the top row, is listed in the row of a and column of b, extends by linearity to a multiplication ⊙ of ℝ⁴. Note that (ℝ⁴, +, ⊙) has the structure of a noncommutative field, the field of quaternions, usually labeled by ℍ. Note that if h, h′ ∈ ℍ, then ‖h ⊙ h′‖ = ‖h‖ ‖h′‖, and the three dimensional sphere inherits a group structure, the group of quaternions of norm one. Moreover, since ℝP³ is the quotient S³/(x ∼ −x),

[x] ⊙ [y] =: [x ⊙ y], (5.18)

is a well defined Lie group operation on ℝP³, called the group of p-quaternions. Note that if h = t + x i⃗ + y j⃗ + z k⃗, its conjugate is h̄ = t − x i⃗ − y j⃗ − z k⃗, and the inverse of h is given by

h^{-1} = ‖h‖^{-2} h̄. (5.19)

As a manifold, PΣ₃^k is diffeomorphic to (ℝP³)^q, where q = k − 5. With this identification, PΣ₃^k ≅ (ℝP³)^q inherits a Lie group structure from the group structure of p-quaternions on ℝP³, with the multiplication given by

([h_1],...,[h_q]) ⊙ ([h′_1],...,[h′_q]) := ([h_1] ⊙ [h′_1],...,[h_q] ⊙ [h′_q]) = ([h_1 ⊙ h′_1],...,[h_q ⊙ h′_q]). (5.20)

The identity element is given by 1_{(ℝP³)^q} = ([0:0:0:1],...,[0:0:0:1]), and given a point h = ([h_1],...,[h_q]) ∈ (ℝP³)^q, from (5.19), its inverse is h^{-1} = h̄ = ([h̄_1],...,[h̄_q]). For the two-sample hypothesis test, let the element ([0:0:0:1],...,[0:0:0:1]) in (ℝP³)^q be the unit in our group and be labeled 1_q. Given two paired r.o.'s H_1, H_2 in their spherical representation on (ℝP³)^q, we set Y = H̄_1 ⊙ H_2, and let µ_{j_k} be the extrinsic mean of Y. Then testing the existence of mean 3D projective shape change from H_1 to H_2 amounts to the hypothesis testing problem

H_0 : µ_{j_k} = 1_q  vs.  H_1 : µ_{j_k} ≠ 1_q. (5.21)

Assume (H_{1,r}, H_{2,r}), r = 1,...,n, are i.i.d.r.o.'s from paired distributions on (ℝP³)^q, such that Y_1 = H̄_{1,1} ⊙ H_{2,1} has a j_k-nonfocal probability distribution on (ℝP³)^q. Testing hypothesis (5.21) in the case m = 3, at level

α, amounts to finding a 1 − α confidence region for µ_{j_k}, as in the one-sample hypothesis test case, and checking if 1_q is in it. If the sample is small and the extrinsic sample covariance matrix is degenerate, checking if 1_q is in a 1 − α confidence region amounts to finding the upper α/q cutoffs for the bootstrap distributions of the test statistics T_s^*, s = 1,...,k−5, and checking if the values of T_s, for µ_{j_k} = 1_q, are all in the corresponding confidence intervals. That is:

REMARK 5.2.1. For m = 3, s = 1,...,q = k−5, let c_{s,1−β}^* be the upper 100(1 − β)% point of the values of T_s^* given by (5.7). We set

C_{s,n,β}^* := j^{-1}(U_{s,n,β}^*) (5.22)

where

U_{s,n,β}^* = {µ_s ∈ ℝP³ : T_s(Ȳ_{j,n}^s; µ_s) ≤ c_{s,1−β}^*}. (5.23)

Then

R_{n,α}^* = ∏_{s=1}^q C_{s,n,α/q}^*, (5.24)

with C_{s,n,β}^*, U_{s,n,β}^* given by (5.22)-(5.23), is a region of approximately at least 100(1 − α)% confidence for µ_{j_k}. We fail to reject at level α the hypothesis of no mean change in the 3D projective shapes H_1, H_2 if 1_q ∈ R_{n,α}^*.

For the two-sample hypothesis test, however, the sample sizes need not be the same. In a more general test H_0 : µ_{1,j_k} = µ_{2,j_k}, we could consider the equivalent testing problem

H_0 : µ_{1,j_k}^{-1} ⊙ µ_{2,j_k} = 1_q. (5.25)

If the sample sizes n1,n2 are small, it suffices to compute the bootstrap distribution of

D* = ϕ(H(X̄*_{n_1,j_k}, X̄*_{n_2,j_k})), (5.26)

where H(h, k) = h^{-1} ⊙ k and ϕ is given by (5.14).

REMARK 5.2.2. Given that ϕ(1_q) = 0 ∈ (ℝ^m)^q, testing the hypothesis (5.25) at level α is equivalent to testing if 0 is inside a 100(1 − α)% bootstrap confidence region for ϕ(µ). Since the group multiplication in ((ℝP³)^q, ⊙) is a product of projective quaternion multiplications (5.18), one may use simultaneous bootstrap confidence intervals, based on the q affine marginal bootstrap distributions (D_1^*,...,D_q^*) = D* in (5.26). From the Bonferroni inequalities, for each j = 1,...,q, we obtain a 100(1 − α/q)% confidence region C_j^*, that can be visualized as a 3D box, the product of three 100(1 − α/(3q))% simultaneous confidence intervals.

If max(n_1, n_2) ≤ m/2, one may use a nonpivotal bootstrap methodology for the two-sample problem H_0 (see Bhattacharya and Ghosh (1978)[7], Hall and Hart (1990)[21], Fisher et al. (1996)[17] or Hall (1997)).
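To make the change computation concrete, here is a minimal Python sketch (assuming numpy; the function names and array layout are our own choices, not from the text) of the p-quaternion operations (5.18)-(5.20) and of the change h_1^{-1} ⊙ h_2 between two configurations in their spherical representation on (ℝP³)^q:

import numpy as np

def qmul(a, b):
    # Hamilton product; following the convention above, a quaternion is stored as (x, y, z, t) = x i + y j + z k + t,
    # with the real part in the LAST slot, so that the identity of (RP^3)^q is (0, 0, 0, 1)
    x1, y1, z1, t1 = a
    x2, y2, z2, t2 = b
    return np.array([t1*x2 + x1*t2 + y1*z2 - z1*y2,
                     t1*y2 + y1*t2 + z1*x2 - x1*z2,
                     t1*z2 + z1*t2 + x1*y2 - y1*x2,
                     t1*t2 - x1*x2 - y1*y2 - z1*z2])

def qconj(a):
    # conjugate; for a unit quaternion this is also its inverse, by (5.19)
    return np.array([-a[0], -a[1], -a[2], a[3]])

def change(h1, h2):
    # h1, h2: (q, 4) arrays of unit quaternions (spherical representatives of points in (RP^3)^q)
    # returns the change C = h1^{-1} (.) h2, computed componentwise as in (5.20)
    return np.array([qmul(qconj(a), b) for a, b in zip(h1, h2)])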

5.2.2 Examples of Two-sample Hypothesis Tests on Lie Group

Analysis of Image Data of Two Polyhedral. We consider an application for matched pairs of 3D projective shapes from digital images. The theory for such a two-sample test (a test for mean projective shape change) is developed in Crane and Patrangenaru (2011)[10], where it was applied to stereo medical imaging. Here we consider a toy example consisting of two random samples of polyhedral objects. The first sample was considered in Patrangenaru et al. (2010)[42], and consists of 16 digital images of a polyhedral surface taken by an uncalibrated digital camera (see Figure 2.2). A second data set, consisting of 16 digital images of a related polyhedral scene obtained by a slight modification of the first polyhedral object, is displayed in Figure 2.3. Using the Ma et al. (2006)[32] algorithm, we obtained the 3D reconstructions from the uncalibrated camera images of the polyhedral surface (Figure 3.6).

There are 19 landmarks (visible corners), with labels as in Patrangenaru et al. (2010)[42], 5 of which form a projective frame. Therefore, in this example the projective shape data is on PΣ₃^{19} = (ℝP³)^{14}. Using Crane and Patrangenaru (2011)[10], from the bootstrap distribution of Veronese-Whitney sample means, we compute the 14 marginal T* statistics on the Lie group ℝP³.

For s = 1,...,14, the values of the statistics T_s under the null hypothesis, which are all larger than the corresponding bootstrap cutoffs T_s^* for the 95% simultaneous confidence sets (shown in Figure 5.1 as cutoffs), are:

T_1 = 1735771.3, T_2 = 2234801.4, T_3 = 24260037.4, T_4 = 949014.2, T_5 = 942757.9, T_6 = 148967185.2, T_7 = 15847127.4, T_8 = 3342761.1, T_9 = 8042772.6, T_10 = 15528559.7, T_11 = 3800842.3, T_12 = 35097853.3, T_13 = 24107515.0, T_14 = 7085996.9.

On the other hand, the corresponding values of the bootstrap cutoffs T_s^*, s = 1,...,14, are:
T_1^* = 23.9831, T_2^* = 38.9948, T_3^* = 441.3134, T_4^* = 44.4325, T_5^* = 25.1901, T_6^* = 305.9000, T_7^* = 74.7575, T_8^* = 24.2130, T_9^* = 35.1296, T_10^* = 204.4511, T_11^* = 42.3008, T_12^* = 40.7353, T_13^* = 113.6289, T_14^* = 26.3761.

The equality of the two population mean projective shapes is rejected at level α = 0.05. We infer that the two polyhedral objects are not the same.


Figure 5.1: Cutoffs of the T_s^* marginal bootstrap statistics.

Two-sample Test for Means of Independent Pairs. Our data consists of 14 digital camera images of an artist in different disguises (BBC data). The face images data set was used in the context of 2D projective shape in Mardia and Patrangenaru (2005)[33]; the need for a 3D projective shape analysis of this data was mentioned in Patrangenaru et al. (2010)[42], where the authors gave a graphic argument for the equality of the mean 3D projective shapes of a group of facial landmarks from the frontal images, respectively from the one quarter images. We used 8 frontal images and 6 one quarter images, as shown in Figure 2.4. The goal is to test if the projective shape of a 3D configuration of landmarks on the actor's face, as extracted from frontal images, is, on average, the same as its projective shape when one quarter views are used in its 3D reconstruction. Figure 3.2 in the Appendix displays the actor image with the landmarks used in our analysis, numbered from 1 to 10. The reconstructed 3D configurations are posted at www.stat.fsu.edu/~vic/MCAP.

From the 3D configuration of 10 facial landmarks, we selected landmarks 1-5 to construct the projective frame and computed the nonparametric bootstrap distribution of H(X̄*_{n_1,j_k}, X̄*_{n_2,j_k}) in equation (5.17). The affine coordinates of the 10 − 5 = 5 projective coordinates of the bootstrap VW means for 350 bootstrap resamples are displayed in Figure 5.2, and the 95% confidence intervals based on 20,000 bootstrap resamples of each coordinate are displayed in Table 5.2.


Figure 5.2: Simultaneous confidence regions for the mean of the five axial marginals (affine coordinates).

Recall from Remark 5.2.2 that the null hypothesis in this case amounts to all affine coordinates of µ being zero. Since (0,0,0) is inside all the 95% simultaneous marginal affine confidence intervals listed in Table 5.2, there is no significant mean projective shape change of the facial landmark configurations between the frontal and one quarter view images. Therefore, we cannot reject the hypothesis that the frontal images and quarter views are from the same person. In order to demonstrate that this allows one to accurately identify unknown persons, one would need to calculate or simulate the power of the test under the alternative.

Table 5.2: Simultaneous confidence affine intervals for mean projective shape change - BBC actor data

Bootstrap simultaneous confidence intervals for facial landmarks 6 to 10
Coordinate      6                7                8                9                10
x          (−1.26, 1.37)   (−1.28, 1.33)   (−1.56, 1.70)   (−1.29, 1.22)   (−1.43, 1.36)
y          (−1.30, 1.28)   (−1.27, 1.27)   (−1.71, 2.18)   (−1.41, 1.36)   (−1.35, 1.28)
z          (−1.34, 1.18)   (−1.31, 1.20)   (−1.62, 1.86)   (−1.19, 1.27)   (−1.48, 1.23)
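A sketch of how such simultaneous intervals could be read off the bootstrap draws, assuming the draws of D* from (5.26) are stored in a (B, q, 3) numpy array (the array name, layout, and function name are hypothetical):

import numpy as np

def bonferroni_intervals(D_boot, alpha=0.05):
    # D_boot: (B, q, 3) bootstrap affine coordinates of the change; Bonferroni split over q landmarks x 3 axes
    B, q, _ = D_boot.shape
    tail = alpha / (3 * q) / 2                       # each interval at level 1 - alpha/(3q), as in Remark 5.2.2
    lo = np.quantile(D_boot, tail, axis=0)
    hi = np.quantile(D_boot, 1 - tail, axis=0)
    accept = np.all((lo <= 0.0) & (0.0 <= hi))       # fail to reject H0 if 0 lies in every interval
    return lo, hi, accept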

Two-sample Test for Means of Half Bust Landmark Configurations. In this example the data consists of twenty-four photos taken of busts of the Greek philosopher Epicurus. These are displayed in Figure 2.5. Sixteen of the images are from a one-head statue; the others, in the third row, are from a double-head statue that also includes one disciple of Epicurus. Nine landmarks, displayed in Figure 3.2, were selected from the right half of the face of the statues. The landmark coordinates and the reconstructed 3D configurations obtained from 2D matched configurations in pairs of images are posted at www.stat.fsu.edu/~vic/MCAP. Landmarks 1, 4, 5, 6, 8 were utilized to construct the projective frame. For the confidence region, we computed 2,000,000 bootstrap VW sample means, based on landmarks 2, 3, 7, 9.


Figure 5.3: Simultaneous confidence regions for the statue data.

Table 5.3: Simultaneous confidence affine intervals for mean projective shape change - statue data

Bootstrap simultaneous confidence intervals for statue landmarks 2, 3, 7, 9
Coordinate       2                  3                  7                9
x          (−11.52, 11.61)   (−28.65, 30.81)   (−1.96, 1.86)   (−46.36, 42.62)
y          (−11.29, 12.04)   (−32.15, 32.57)   (−1.84, 1.88)   (−47.20, 46.36)
z          (−12.32, 12.49)   (−24.10, 26.31)   (−1.22, 1.41)   (−40.08, 37.08)

For the 4 landmarks, the point (0,0,0) is in all 12 simultaneous confidence intervals. Therefore, based on the sample pictures, we fail to reject the null hypothesis that, on average, the projective shapes of the selected landmark configurations are the same.

CHAPTER 6

HYPOTHESIS TESTS FOR INFINITE DIMENSIONAL CONFIGURATIONS

The dimension of the data obtained from digital images nowadays can in fact be very large, and this dimension can even be far greater than the sample size. In contrast, most existing statistical hypothesis testing methods work better for data sets whose dimension is much smaller than the sample size. Furthermore, Munk et al. (2008)[36] show that the well known Hotelling T² statistic breaks down in infinite dimensional settings. In addition, they point out that for infinite dimensional data, the mean shapes from two populations can never be identical. As a result, they develop one- and multi-sample neighborhood hypothesis testing methodologies. Here a neighborhood or equivalence hypothesis replaces the usual "point" null hypothesis, which states that two or more infinite dimensional mean shapes are identical. The more appropriate neighborhood approach has the additional benefit of a much simpler asymptotic analysis. While the mean and covariance used in Munk et al. (2008)[36] were for real-valued variables, Ellingson et al. (2013)[14] extended their one-sample neighborhood hypothesis testing method by employing extrinsic parameters on manifolds. Subsequently, Qiu et al. (2014)[48] applied this methodology to test how far the corpus callosum of an average individual is from Albert Einstein's. Motivated by [10], [14] and [36], Qiu et al. [47] derived the neighborhood hypothesis testing for mean change on infinite dimensional Lie groups for matched contours.

6.1 One-sample Hypothesis Test for Infinite Dimensional Configurations

6.1.1 Test Statistic for the One-sample Neighborhood Hypothesis

Assume Σ_j is the extrinsic covariance operator of a random object X on the Hilbert manifold ℳ, with respect to the embedding j : ℳ → H (see Ellingson et al. (2013)[14]). Let M_0 be a compact submanifold of ℳ. Let ϕ_0 : ℳ → ℝ be the function

ϕ_0(p) = min_{p_0 ∈ M_0} ‖j(p) − j(p_0)‖², (6.1)

and let M_0^δ, B_0^δ be given respectively by

M_0^δ = {p ∈ ℳ, ϕ_0(p) ≤ δ²},  B_0^δ = {p ∈ ℳ, ϕ_0(p) = δ²}. (6.2)

Since ϕ_0 is Fréchet differentiable and, for δ > 0 small enough, δ² is a regular value of ϕ_0, it follows that B_0^δ is a Hilbert submanifold of codimension one in ℳ. Let ν_p be the normal space at a point p ∈ B_0^δ, the orthocomplement of the tangent space to B_0^δ at p. We define B_0^{δ,X}

B_0^{δ,X} = {p ∈ B_0^δ, Σ_j|_{ν_p} is positive definite}. (6.3)

DEFINITION 6.1.1. The neighborhood hypothesis consists of the following two hypotheses:

H_0 : µ_j ∈ M_0^δ ∪ B_0^{δ,X}  vs.  H_a : µ_j ∈ (M_0^δ)^c ∩ (B_0^{δ,X})^c. (6.4)

Munk et al. (2008)[36] show that, in the case of random objects on Hilbert spaces, the test statistic for these types of hypotheses has an asymptotically standard normal distribution for large sample sizes. Here, we consider neighborhood hypothesis testing for the particular situation in which the submanifold

M_0 consists of a point m_0 on ℳ. Set ϕ_0 = ϕ_{m_0}; since T_{m_0}{m_0} = {0}, Ellingson et al. (2013)[14] proved the following theorem.

THEOREM 6.1.1. (Ellingson et al. (2013)[14]) If M_0 = {m_0}, the test statistic for the hypotheses specified in (6.4) has an asymptotically standard normal distribution and is given by:

T_n = √n {ϕ_{m_0}(µ̂_j) − δ²}/s_n,  s_n² = 4 ⟨ν̂, S_{j,n} ν̂⟩, where (6.5)

S_{j,n} = (1/n) ∑_{i=1}^n (tan_{j(µ̂_{j,n})} d_{\overline{j(X)}_n} P_j (j(X_i) − \overline{j(X)}_n)) ⊗ (tan_{j(µ̂_{j,n})} d_{\overline{j(X)}_n} P_j (j(X_i) − \overline{j(X)}_n)) (6.6)

is the extrinsic sample covariance operator for {X_i}_{i=1}^n, and

ν̂ = (d_{µ̂_{j,n}} j)^{-1} tan_{j(µ̂_{j,n})}(j(m_0) − j(µ̂_{j,n})). (6.7)

6.1.2 Neighborhood hypothesis for the mean shape of a Euclidean contour

Given any VW-nonfocal probability measure Q on P(H), from Section 6.1.1 we see that if γ_1,...,γ_n is a sample from Γ, then µ̂_{j,n} is the projective point of the eigenvector corresponding to the largest eigenvalue of (1/n) ∑_{i=1}^n (1/‖γ_i‖²) γ_i ⊗ γ_i^*. Given n i.i.d.r.o.'s from a VW-nonfocal distribution on P(H), the asymptotic distribution of \overline{j(X)}_n is as follows:

√n (\overline{j(X)}_n − µ) →_d 𝒢 as n → ∞, (6.8)

where 𝒢 has a Gaussian distribution N(0, Σ) on ℒ_HS, with zero mean and covariance operator Σ. It follows that the projection P_j : ℒ_HS → j(P(H)) ⊂ ℒ_HS is given by

P_j(A) = ν_A ⊗ ν_A^*, (6.9)

where ν_A is the eigenvector of norm 1 corresponding to the largest eigenvalue δ_1² of A, P_j(µ) = j(µ_j), and

P_j(\overline{j(X)}_n) = j(µ̂_{j,n}). Applying the delta method to (6.8), Ellingson et al. (2013)[14] derived a CLT for the VW extrinsic sample mean µ̂_{j,n}. Because of the infinite dimensionality, in practice a sample estimate for the covariance operator is always degenerate, so studentization does not work. We then reduce the dimensionality via the neighborhood hypothesis methodology. Suppose that j : P(H) → ℒ_HS is the VW embedding in (4.12) and δ > 0 is a given positive number. Using the notation in Section 6.1.1, we can now apply the results above to random shapes of regular contours. Assume x_r = [γ_r], ‖γ_r‖ = 1, r = 1,...,n, is a random sample from a VW-nonfocal probability measure Q. Asymptotically, the tangential component of the VW-sample mean around the VW-population mean has a complex multivariate normal distribution. In particular, if we extend the CLT for VW-extrinsic sample means of Kendall shapes in Bhattacharya and Patrangenaru (2005)[6] to the infinite dimensional case, the j-extrinsic sample covariance operator S_{j,n}, when regarded as an infinite Hermitian complex matrix, has the following entries

S_{j,n,ab} = n^{-1} (δ̂_1² − δ̂_a²)^{-1} (δ̂_1² − δ̂_b²)^{-1} ∑_{r=1}^n ⟨e_a, γ_r⟩ ⟨e_b, γ_r⟩^* |⟨e_1, γ_r⟩|²,  a, b = 2, 3, ..., (6.10)

with respect to the complex orthobasis e_2, e_3, e_4, ... of unit eigenvectors in the tangent space T_{µ̂_{j,n}} P(H). Recall that this orthobasis corresponds via the differential d_{µ̂_{j,n}} with an orthobasis (over ℂ) in the tangent space T_{j(µ̂_{j,n})} j(P(H)); therefore one can compute the components ν̂^a of ν̂ from equation (6.7) with respect to e_2, e_3, e_4, ..., and derive for s_n² in (6.5) the following expression:

s_n² = 4 ∑_{a,b=2}^∞ S_{E,n,ab} ν̂^a ν̂^b, (6.11)

where S_{E,n,ab}, given in equation (6.10), are regarded as entries of a Hermitian matrix. One method for performing inference is through the nonparametric nonpivotal bootstrap (Efron (1979)[12]). The nonparametric bootstrap algorithm for constructing a confidence region for the extrinsic mean contour is given below.

Algorithm 6.1.1.
INPUT x: the normalized coordinates of the contours (x is a k × n complex matrix); k: number of matched points on the contours; n: number of contours (columns); N: number of bootstrap resamples.
OUTPUT CR: confidence region

Step 1 Compute extrinsic mean of x

For i = 1:n
  X_i = x_i x_i^*
End

X¯ = sum(X)/n

µ_VW = eigenvector corresponding to the largest eigenvalue of X̄

Step 2 Bootstrap

For j=1:N

u_1,...,u_n = random integers ~ uniform(1, n)
y_1,...,y_n = x_{u_1},...,x_{u_n}

For i = 1:n
  Y_i = y_i y_i^*
End
Ȳ = sum(Y)/n

µ_BVW = eigenvector corresponding to the largest eigenvalue of Ȳ
φ_BVW = real part of trace((Ȳ − X̄)(Ȳ − X̄)^T)
End

cutoff = 95% quantile of φ

CR = {µ_BVW | ((φ_BVW > cutoff) ∗ µ_BVW) ∩ (µ_BVW ≠ 0)}, j = 1,...,N
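For concreteness, a minimal Python sketch of Algorithm 6.1.1 (assuming numpy; the function name and returned values are our own choices, and we return the bootstrap means together with their squared Hilbert-Schmidt distances φ rather than the masked set CR):

import numpy as np

def vw_bootstrap(x, N=1000, alpha=0.05, seed=None):
    # x: (k, n) complex matrix of normalized contour coordinates; columns are contours
    rng = np.random.default_rng(seed)
    k, n = x.shape
    Xbar = (x @ x.conj().T) / n                    # (1/n) sum_i x_i x_i^*
    mu_vw = np.linalg.eigh(Xbar)[1][:, -1]         # eigenvector of the largest eigenvalue (Step 1)
    mus, phis = [], []
    for _ in range(N):                             # Step 2: bootstrap
        y = x[:, rng.integers(0, n, size=n)]       # resample contours with replacement
        Ybar = (y @ y.conj().T) / n
        mus.append(np.linalg.eigh(Ybar)[1][:, -1])
        diff = Ybar - Xbar
        phis.append(np.real(np.trace(diff @ diff.conj().T)))   # squared distance between mean matrices
    phis = np.asarray(phis)
    cutoff = np.quantile(phis, 1 - alpha)          # 95% quantile when alpha = 0.05
    return mu_vw, np.asarray(mus), phis, cutoff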

6.1.3 Example - How far is the Corpus Callosum of an Average Individual from Albert Einstein's?

Test for 2D Similarity Shape. Einstein's brain was removed shortly after his death (most likely without prior family consent), weighed, dissected and photographed by a pathologist. Among other pictures, a digital scan of a picture of Einstein's half brain taken at the autopsy is displayed below; we extracted the contour of the CC from this image of Einstein's brain, the shape of which would be set as the null hypothesis in our testing problem (see Figure 2.6(a)).

Fletcher (2013)[18] extracted contours of CC midsagittal sections from MRI images to study possible age related changes in this part of the human brain. His study points to certain age related shape changes in the corpus callosum. Given that Einstein passed at 76, we consider a subsample of corpus callosum brain contours from Fletcher (2013)[18], in the age group 64-83, to test how far the average CC contour is from Einstein's. The data is displayed in Figure 2.7. We will use the neighborhood hypothesis test on the manifold of planar contours to test if the average shape of the CC in a population of sixty five to eighty three year old people is close to the shape of Einstein's CC. Data in Figure 2.7 was used to test the hypothesis that the mean CC shape is in a small neighborhood around the shape of Einstein's CC (see Figure 6.2). The closest representatives of the VW sample mean of the shapes of contours of the CC midsections vs the shape of Einstein's CC midsection are displayed in Figure 6.1. The overlaps of the two contours are rare, which visually shows that the average CC contour shape is significantly different from Einstein's.

Figure 6.1: Superimposed icons for 2D direct similarity shapes of CC midsections: sample mean (red) vs Albert Einstein's (blue).

The maximum value for δ where the test is significant was found to be 0.1367, which is quite large taking into account the fact that the diameter of any finite dimensional complex projective space with the V-W embedding is √2. The corresponding neighborhood for this value of δ is shown in Figure 6.2.
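One way to read this number, under the point-null setup of Theorem 6.1.1: the test rejects at asymptotic level α when T_n = √n {ϕ_{m_0}(µ̂_j) − δ²}/s_n exceeds the standard normal quantile z_{1−α}, so the largest δ for which rejection still occurs satisfies δ_max² = ϕ_{m_0}(µ̂_j) − z_{1−α} s_n/√n (with bootstrap quantiles replacing z_{1−α} for small samples); 0.1367 is then the reported value of δ_max.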

Figure 6.2: 95% bootstrap confidence region for the extrinsic mean CC contour, based on 1000 resamples.

Although other recent studies focused more on size rather than shape, they tell the same story: an average brain is not close to A. Einstein's brain, although it weighed less than the average (see Falk et al. (2013)[15] and Men et al. (2014)[35]). Our results, while from a different perspective than in Men et al. (2014)[35], point as well to the fact that A. Einstein's corpus callosum, being relatively thicker, allowed for a better connectivity between the left and right hemispheres of the brain than the connectivity in the average subject in his age group.

6.1.4 Note

For validation purposes, we set M_0 to be the similarity shape of the 7th corpus callosum contour in Figure 2.7, whose subject's age is closest to 76, the age at which Einstein died, and set the rest as the sample. The δ corresponding to this one-sample neighborhood hypothesis testing problem is 0.1096, 20% smaller than 0.1367. By this validation step, we confirm that the distance between Einstein's corpus callosum and the mean 2D similarity shape of the sample is large.

6.2 Two-sample Hypothesis Test for Infinite Dimensional Configurations

6.2.1 Methodology of Two-sample Hypothesis Test for Infinite Dimensional Configurations

Ellingson et al. (2013)[14] developed a one-sample neighborhood hypothesis testing methodology for infinite dimensional r.o.'s. In order to take advantage of their method, we test if a matched pair of r.o.'s are similar by testing for a significant difference in their random change. As in Section 5.2.1, we suppose X and Y are paired r.o.'s on a (possibly) infinite dimensional Lie group 𝒢, which is also a Hilbert manifold. Assume the group structure on 𝒢 is ⊙. We extend the notion of change between two random objects on a finite dimensional manifold from Crane and Patrangenaru (2011)[10] to the infinite dimensional case. The

change from X to Y, r.o.'s on 𝒢, is defined to be the r.o. C = X^{-1} ⊙ Y. In the case of 3D projective shapes of contours in general position, once a projective frame is selected, the other points on the contour are represented in terms of their projective coordinates w.r.t. this projective frame. Since a projective transformation is uniquely determined by its action on a projective frame, w.l.o.g. we may assume that this projective transformation takes the given projective frame to the standard projective frame. Moreover, we will consider only contours in general position, which are contours, for the chosen projective frame, that are matched within and between populations, with the additional property that, after applying this projective transformation, there is a unique point on the contour at the largest distance from the center of mass of the projectively transformed contour. The infinite dimensional Lie group is in this case the completion ℋ of the factor group 𝒢 = 𝒞(S¹, ℝP³)/Diff(S¹), where Diff(S¹) denotes the group of diffeomorphisms from S¹ to itself, and, given a standard parametrization z → γ(z) that is matching the within and between points on the contour, the group multiplication is defined as

([γ_1] ⊙ [γ_2])(z) =: [γ_1(z)] ⊙ [γ_2(z)],  ∀ z ∈ S¹. (6.12)

We identify two functions from S¹ to ℝP³ that differ by a smooth reparametrization; therefore we have to factor 𝒞(S¹, ℝP³) by Diff(S¹), i.e. γ ∼ γ ∘ ϕ, where ϕ ∈ Diff(S¹). A finite dimensional approximation may be obtained by selecting k matched landmarks on S¹ that are assumed to densely fill S¹, as k → ∞. When this is the case, one may approximate the change from X to Y by a system C(q) of q points on the projective shape of the contour C, where q = k − 5, that, along with the 5 points of the projective frame, are the "consecutive" k points on the contour C, relative to this selected projective frame on it. Given that an image of a 3D contour may stretch or contract at different rates along the contour, to start with we select a convenient number, say r, of matched consecutive points on contours, and apply the marching point algorithm from Ellingson et al. (2013)[14] in between these consecutive points. Ultimately we get the set C(q) of q consecutive matched points (excluding projective frame points), C(q) = ([c_1],...,[c_q]), for q = k − 5. Let µ_{E,k} be the extrinsic mean of C(q), G_0 be a compact submanifold of 𝒢, and ϕ_{g_0} : 𝒢 → ℝ be the distance function

ϕ_{g_0}(p) = min_{p_0 ∈ G_0} ‖j(p) − j(p_0)‖², (6.13)

and let G_0^δ, B_0^δ be given respectively by

G_0^δ = {p ∈ 𝒢, ϕ_0(p) < δ²}, (6.14)

B_0^δ = {p ∈ 𝒢, ϕ_0(p) = δ²}. (6.15)

The neighborhood null and alternative hypotheses are

H_δ : µ_E ∈ G_0^δ ∪ B_0^{δ,C},  A_δ : µ_E ∈ (G_0^δ)^c ∩ (B_0^{δ,C})^c. (6.16)

For v ∈ j(P(H)), tan(v) is the tangential component of v with respect to the orthobasis e_a(P_j(µ)). Consider the neighborhood hypothesis test for the particular situation in which the submanifold G_0 consists of a point g_0 on 𝒢. Setting ϕ_0 = ϕ_{g_0} guarantees we have an asymptotic distribution. This follows from a general result for the asymptotic distribution of the squared distance from the extrinsic sample mean on a Hilbert manifold to the hypothesized true mean, given in Ellingson et al. (2013)[14]. In the particular case of ℋ we have the following:

THEOREM 6.2.1. (Qiu et al. (2015)[47]) If G_0 = {g_0}, where g_0 is the identity 1_𝒢 ≡ ([0:0:0:1]^T), where ([0:0:0:1]^T) is a constant function from S¹ to ℝP³, then the test statistic for the hypotheses specified in (6.16) has an asymptotic standard normal distribution and is given by:

T_n = √n {ϕ_{g_0}(µ̂_E) − δ²}/s_n, (6.17)

where

s_n² = 4 ⟨ν̂, S_{E,n} ν̂⟩ (6.18)

and

S_{E,n} = (1/n) ∑_{i=1}^n (tan_{j(µ̂_{E,n})} d_{\overline{j(C)}_n} P_j (j(C_i) − \overline{j(C)}_n)) ⊗ (tan_{j(µ̂_{E,n})} d_{\overline{j(C)}_n} P_j (j(C_i) − \overline{j(C)}_n)) (6.19)

is the extrinsic sample covariance operator for {C_i}_{i=1}^n, and

ν̂ = (d_{µ̂_{E,n}} j)^{-1} tan_{j(µ̂_{E,n})}(j(m_0) − j(µ̂_{E,n})). (6.20)

Here tan_{j(µ̂_{E,n})}(v) is the tangential component of v ∈ H with respect to the orthobasis e_a(j(µ̂_{E,n})), a = 1, 2, ..., and is given as

tan_{j(µ̂_{E,n})}(v) = ∑_{a=1}^∞ (e_a(j(µ̂_{E,n})) · v) e_a(j(µ̂_{E,n})). (6.21)

We compute the ((s,a),(t,b))-th element of S_{E,n} as

S_{E,n,(s,a)(t,b)} = n^{-1} (d_s(4) − d_s(a))^{-1} (d_t(4) − d_t(b))^{-1} ∑_{r=1}^n (g_s(a)^T C_r^s)(g_t(b)^T C_r^t)(g_s(4)^T C_r^s)(g_t(4)^T C_r^t), (6.22)

where s, t = 1,...,q; a, b = 1, 2, 3; and d_s(a) and g_s(a) are the eigenvalues, ranked from the smallest to the largest, and the corresponding unit eigenvectors of (1/n) ∑_{r=1}^n C_r^s ⊗ C_r^s, in their lexicographic order.

If the sample size is small, we will take a nonparametric bootstrap approach to developing the neighborhood hypothesis test. When δ is fixed, the corresponding bootstrap distribution of T_n in (6.17) is

T_n^* = √n {ϕ_{m_0}(µ̂_E^*) − δ²}/s_n^*, (6.23)

where

s_n^{*2} = 4 ⟨ν̂^*, G^* ν̂^*⟩, (6.24)

S^*_{E,n,(s,a)(t,b)} = n^{-1} (d_s^*(4) − d_s^*(a))^{-1} (d_t^*(4) − d_t^*(b))^{-1} ∑_{r=1}^n (g_s^*(a)^T C_r^{s*})(g_t^*(b)^T C_r^{t*})(g_s^*(4)^T C_r^{s*})(g_t^*(4)^T C_r^{t*}) (6.25)

is a finite landmark based approximation of the extrinsic sample covariance operator for the resample {C_i^*}_{i=1}^n, where s, t = 1,...,q; a, b = 1, 2, 3; d_s^*(a) and g_s^*(a) are the eigenvalues and corresponding unit eigenvectors of (1/n) ∑_{r=1}^n c_r^* ⊗ c_r^*, ranked from the smallest to the largest, and

ν̂^* = (d_{µ̂^*_{E,n}} j)^{-1} tan_{j(µ̂^*_{E,n})}(j(m_0) − j(µ̂^*_{E,n})). (6.26)

We also compute the confidence region via a nonparametric nonpivotal bootstrap (Efron (1979)[12]) method. The nonparametric bootstrap algorithm for constructing a confidence region for the extrinsic mean contour is an extension of the one given in Qiu et al. (2014)[48].
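As a computational aside, the finite landmark approximation (6.22) can be assembled directly from the spherical representatives of the change contours. A minimal Python sketch, assuming numpy and that the data sit in an (n, q, 4) array of unit quaternions (the array layout and function name are our own assumptions):

import numpy as np

def extrinsic_cov_entries(C):
    # C: (n, q, 4) array; C[r, s] is the spherical representative C_r^s in R^4 of the r-th change at landmark s
    n, q, _ = C.shape
    d = np.empty((q, 4))
    g = np.empty((q, 4, 4))
    for s in range(q):
        M = np.einsum('ri,rj->ij', C[:, s, :], C[:, s, :]) / n   # (1/n) sum_r C_r^s (C_r^s)^T
        d[s], g[s] = np.linalg.eigh(M)                           # eigenvalues ascending; g[s][:, a] is the a-th eigenvector
    S = np.empty((3 * q, 3 * q))
    for s in range(q):
        for t in range(q):
            for a in range(3):
                for b in range(3):
                    terms = (C[:, s, :] @ g[s][:, a]) * (C[:, t, :] @ g[t][:, b]) \
                          * (C[:, s, :] @ g[s][:, 3]) * (C[:, t, :] @ g[t][:, 3])
                    S[3 * s + a, 3 * t + b] = terms.sum() / (n * (d[s, 3] - d[s, a]) * (d[t, 3] - d[t, b]))
    return S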

6.2.2 Example - Two-sample Neighborhood Hypothesis Test for the Leaf Data

We consider the leaf image data set, which consists of two leaves, call them leaf A and leaf B, from the same plum tree. Twenty pictures are taken of each leaf from different perspectives. Our goal here is to apply the neighborhood hypothesis test to the resulting set of images. As it is difficult to pair a large number of landmarks by hand, the algorithm of Ellingson et al. (2013)[14] is used to automatically pair the landmarks along the contours of the two leaves.

Coplanarity Test. Shape analysis of leaf data is considered in Huckemann (2011)[26] and some of the references therein. In the example considered there, it is assumed from the start that poplar leaves are flat, a very strong assumption that may or may not be true. Certainly this flat leaf assumption drastically simplifies the analysis, since for 2D shape data the analysis could be easily performed on a Kendall shape space (see Kendall (1984)[28]), or on a space of shapes of planar contours (see Ellingson et al. (2013)[14]). In our analysis, nevertheless, we do not make any assumptions about the intrinsic dimensionality of the leaf shape datum. In fact, given a number of digital camera images of a 3D scene, one may select five landmarks on this 3D scene, and, if the landmarks are coplanar, the 2D projective shape of this configuration would be a constant, thus having no variability on the projective space PΣ₂^5 ≡ ℝP² (see Patrangenaru (2001)[40] and Bhattacharya (2008)[5]). Therefore, for our data, we select 5 matched landmarks from the 20 images of each of the two leaves to test if they are flat. This test is necessary because, if a leaf is flat, its 3D projective reconstruction will be trivial. Among those 5 landmarks, the first 4 serve as a 2D projective frame, and landmark 5 is used to run the test for zero extrinsic total variance (see Patrangenaru (2001)[40]).
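A rough sketch of the idea behind such a check (not the exact test of Patrangenaru (2001)[40]): with landmark 5 represented as a unit axial vector in ℝ³ for each image, one can bootstrap the total extrinsic sample variance under the VW embedding and see whether the resulting interval stays away from zero. All names below are our own choices.

import numpy as np

def total_extrinsic_variance(axials):
    # axials: (n, 3) unit vectors, each representing landmark 5's projective coordinate [x] in RP^2
    J = np.einsum('ni,nj->nij', axials, axials)          # VW embedding x -> x x^T
    mu = np.linalg.eigh(J.mean(axis=0))[1][:, -1]        # extrinsic (VW) sample mean axis
    return np.mean(np.sum((J - np.outer(mu, mu))**2, axis=(1, 2)))

def bootstrap_interval(axials, B=2000, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    n = axials.shape[0]
    stats = [total_extrinsic_variance(axials[rng.integers(0, n, n)]) for _ in range(B)]
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])   # interval away from 0 suggests non-coplanarity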


Figure 6.3: 5 selected landmarks of Leaf A (left) and Leaf B (right).

Table 6.1: The confidence intervals for the total sample variance of landmark 5 of Leaf A and B

Leaf                    A                               B
Confidence interval     [3.31 × 10⁻⁴, 3.32 × 10⁻⁴]      [3.93 × 10⁻⁴, 3.94 × 10⁻⁴]

Table 6.1 indicates that neither Leaf A nor Leaf B is flat (each possesses non-zero extrinsic total variance), and so we may continue with the analysis.

Test for 3D Projective Shape Change. For a total of 40 original images of leaves A and B (Figure 2.8), we extract the contour of each leaf, see Figure 2.9. Next, we select points within and between samples, while guaranteeing they are matched. Here, a ’match’ means that the landmark for one contour should have corresponding landmarks in all other contours with the same label. Using the method of Ellingson et al. (2013)[14] we obtained 816 matched landmarks on each leaf contour. Figure 3.5 shows that these points are very well matched between leaves. For example, in all the 40 contours, landmark number 1 is located at the tip of the leaf. With these 816 paired landmarks acquired from each leaf image, we performed a 3D projective reconstruction with the two matched landmark sets. After reconstruction, we transformed each contour to have the same basis via a projective transformation (Figure 3.10). Then we computed the change between the projectively transformed contours of leaves A and B, see Figure 6.4. Recall that the closer the change is to the origin, the more likely that leaf A and leaf B are the same leaf. Figure 6.4 indicates that the change is relatively close to the origin.


Figure 6.4: Icons of changes between the projectively transformed leaves A and B.

We utilize the paired neighborhood hypothesis test to confirm this result. Since we constructed the projective frame with 5 landmarks, there are 816 − 5 = 811 points left for the hypothesis test. It was computationally prohibitive to compute the extrinsic covariance matrix with 811 points, since said matrix has (3 × 811) × (3 × 811) entries. Therefore, we randomly selected 100 points from the change in contours for the test. For α = 0.05 and test statistic (6.17) we obtain δ = 12.9353, which is actually relatively small in light of the fact that the diameter of the projective shape space approximation used in our computations is 100 × √2.

55 6.2.3 Note

In order to confirm that δ = 12.9353 is a relatively small value, we separately conduct the two sample hypothesis tests for leaf A and leaf B.


Figure 6.5: Icons of changes within the projectively transformed leaf A (top 5) and leaf B (bottom 5).

In order to make the δ’s in this section comparable to the one in Section 6.2.2, we also randomly selected 100 points from the change in contours for the test.

For α = 0.05 and test statistic (6.17) we obtain δA = 11.8015 for leaf A, and δB = 8.4770 for leaf B.

Compared to δ_A and δ_B, the value δ = 12.9353 is still larger, but not by much. This objectively reflects the fact that leaf A and leaf B are taken from the same tree: even though their mean 3D projective shapes are different, the difference should not be very large.

APPENDIX A

COORDINATES OF THE LANDMARKS

Table A.1 displays the registered coordinates of the 19 landmarks in each of the sixteen camera images. The camera images were paired, and each pair was used to reconstruct the 3D polyhedral surface in Figure 2.3. Table A.2 shows the coordinates of the 10 landmarks of the BBC actor in Figure 2.4. Table A.3 gives the coordinates of the 9 landmarks in the statue pictures in Figure 2.5.

Table A.1: Corner coordinates from paired images in Figure 2.3

Landmark No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 x 2147 1737 1864 2279 1717 1839 2249 1636 1788 2279 1616 1747 2244 1302 1519 2345 1286 1494 2279 Image 01 y 162 203 425 380 588 815 765 603 876 810 1023 1302 1241 1074 1580 1454 1697 2218 2092 x 2289 1879 2310 2687 1854 2264 2649 1763 2254 2730 1737 2213 2659 1428 2228 2973 1403 2142 2851 Image 02 y 253 476 775 512 937 1231 983 1003 1347 1038 1023 1302 1241 1074 1580 1454 1697 2218 2092 x 2350 1909 2289 2735 1920 2289 2740 1828 2289 2811 1854 2294 2806 1504 2249 3110 1550 2259 3084 Image 03 y 198 395 633 436 912 1140 937 962 1251 998 1535 1839 1560 1712 2259 1778 2598 3160 2659 x 2304 1925 2360 2740 1920 2345 2715 1839 2355 2786 1839 2325 2755 1535 2375 3084 1540 2320 3003 Image 04 y 304 527 775 527 1013 1241 1003 1059 1362 1064 1600 1889 1580 1808 234 178 2613 3145 2573 x 2132 1671 1955 2421 1681 1960 2421 1585 1925 2477 1606 1920 2461 1251 1778 2694 1286 1783 2664 Image 05 y 152 294 542 395 770 1028 866 810 1119 927 1337 1651 1459 1464 2041 1687 2279 2867 2487 x 1940 1560 1925 2304 1570 1925 2294 1499 1925 2355 1514 1914 2340 1231 1894 2608 1266 1899 2548 Image 06 y 309 522 775 557 927 1175 942 988 1292 1008 1418 1732 1443 1616 2198 1646 2244 2781 2304 x 2477 1970 2269 2781 1965 2264 2765 1864 2223 2826 1879 2208 2801 1489 2046 3049 1519 2051 3003 Image 07 y 56 208 486 324 729 1008 836 760 1109 907 1337 1687 1479 1469 2107 1727 2335 2983 2598 x 2396 2021 2467 2846 2051 2487 2836 1965 2497 2917 2011 2502 2902 1732 2568 3241 1793 2563 3176 Image 08 y 512 820 1069 760 1241 1489 1180 1307 1621 1236 1752 2061 1687 2006 2598 1904 2634 3191 2517 x 2436 1914 2112 2649 1869 2082 2603 1768 2011 2644 1732 1950 2588 1332 1687 2755 1276 1631 2659 Image 09 y 46 111 375 304 623 891 826 638 972 886 1226 1570 1474 1292 1899 1722 2198 2836 2639 x 2340 1980 2269 2624 1970 2249 2608 1889 2228 2659 1889 2213 2634 1611 2147 2857 1611 2122 2796 Image 10 y 360 491 699 562 891 1109 957 932 1195 1003 1373 1636 1449 1504 1996 1646 2168 2659 2299 x 2396 1864 2006 2553 1808 1950 2492 1697 1864 2517 1646 1808 2446 1231 1464 2568 1160 1388 2456 Image 11 y 248 268 547 517 780 1048 1028 785 1124 1089 1347 1707 1661 1393 2011 1920 2274 2927 2826 x 2153 1773 1965 2350 1757 1940 2330 1676 1904 2370 1676 1879 2340 1383 1722 2492 1383 1712 2446 Image 12 y 263 334 562 466 729 937 856 750 1008 896 1170 1433 1327 1241 1727 1519 1894 2391 2158 x 2325 1884 2234 2669 1864 2198 2634 1773 2183 2700 1768 2137 2639 1423 2056 2907 1418 2001 2806 Image 13 y 370 557 861 663 1003 1297 1109 1043 1423 1175 1524 1894 1636 1687 2365 1899 2391 3059 2578 x 2259 1722 1909 2456 1712 1894 2446 1606 1828 2477 1600 1813 2456 1190 1514 2603 1185 1509 2563 Image 14 y 122 198 456 370 714 977 896 734 1059 952 1317 1656 1550 1393 1996 1793 2310 2938 2725 x 2624 2193 2502 2943 2158 2451 2897 2071 2436 2953 2036 2380 2892 1697 2269 3130 1651 2198 3029 Image 15 y 334 456 704 572 927 1170 1038 962 1266 1099 1484 1798 1621 1606 2168 1849 2421 2998 2659 x 2122 1788 2228 2558 1803 2249 2548 1737 2254 2629 1752 2259 2618 1499 2360 2958 1524 2345 2907 Image 16 y 506 755 942 689 1200 1398 1135 1276 1499 1190 1763 2006 1676 1980 2446 1854 2745 3196 2598

Table A.2: Coordinates of facial landmarks for BBC actor data in Figure 2.4

Landmark → 1 2 3 4 5 6 7 8 9 10 Image↓ 1 265, 236 302, 232 341, 227 377, 221 302, 179 315, 186 328, 176 279, 149 338, 143 303, 103 2 299, 214 334, 210 368, 206 400, 203 334, 161 344, 169 358, 161 315, 134 371, 131 338, 90 3 266, 253 305, 251 330, 248 365, 242 296, 209 311, 212 321, 204 286, 185 336, 181 308, 125 4 201, 254 221, 250 250, 247 228, 242 209, 203 215, 212 233, 204 208, 179 257, 176 234, 131 5 264, 234 298, 234 332, 236 361, 235 302, 194 312, 200 321, 193 285, 165 338, 165 302, 120 6 367, 237 395, 227 422, 218 449, 209 393, 182 404, 184 412, 176 356, 167 406, 147 365, 119 7 327, 246 354, 244 387, 239 416, 241 358, 206 367, 212 377, 206 343, 176 399, 176 365, 132 8 273, 248 290, 248 319, 247 353, 246 288, 211 284, 219 305, 209 284, 179 332, 181 304, 140 9 269, 218 300, 219 332, 217 365, 218 302, 170 312, 174 328, 170 287, 150 347, 149 314, 109 10 197, 227 217, 232 239, 236 275, 238 215, 188 211, 196 231, 191 220, 159 266, 167 244, 121 11 304, 221 339, 216 371, 212 401, 210 335, 169 344, 173 359, 167 316, 149 374, 146 338, 106 12 247, 253 274, 252 302, 251 335, 249 265, 211 271, 219 284, 209 260, 182 310, 180 281, 132 13 282, 238 305, 236 337, 232 367, 232 298, 191 301, 200 316, 191 294, 161 344, 158 314, 120 14 346, 222 373, 218 403, 218 435, 218 374, 172 386, 180 398, 170 359, 146 413, 142 378, 98

Table A.3: Statue data landmark coordinates Figure 2.5

Landmark → 1 2 3 4 5 6 7 8 9 One-head image↓ 1 (772, 1594) (852, 1578) (930, 1622) (852, 1636) (1074, 1624) (962, 1962) (1040, 1938) (892, 2088) (1036, 2089) 2 (1246, 1392) (1340, 1372) (1392, 1406) (1340, 1436) (1586, 1438) (1460, 1762) (1604, 1732) (1302, 1900) (1525, 1860) 3 (580, 1794) (618, 1788) (720, 1830) (632, 1854) (772, 1926) (680, 2250) (696, 2282) (698, 2398) (853, 2403) 4 (1134, 1230) (1210, 1220) (1278, 1252) (1218, 1272) (1422, 1288) (1306, 1548) (1404, 1534) (1222, 1650) (1356, 1649) 5 (606, 1486) (720, 1490) (810, 1512) (734, 1544) (1032, 1552) (860, 1944) (1022, 1946) (718, 2070) (847, 2046) 6 (1340, 1006) (1454, 980) (1516, 1042) (1454, 1046) (1732, 1062) (1600, 1402) (1744, 1348) (1446, 1574) (1624, 1564) 7 (1144, 1596) (1264, 1576) (1342, 1624) (1272, 1642) (1578, 1660) (1416, 2056) (1588, 2018) (1252, 2204) (1441, 2198) 8 (936, 1788) (1028, 1776) (1168, 1818) (1038, 1840) (1350, 1868) (1206, 2286) (1334, 2284) (1120, 2424) (1312, 2413) 9 (1030, 992) (1114, 956) (1228, 1006) (1112, 1024) (1322, 1002) (1256, 1388) (1296, 1314) (1202, 1574) (1360, 1542) 10 (1252, 1108) (1356, 1078) (1400, 1142) (1344, 1144) (1606, 1176) (1468, 1472) (1596, 1428) (1322, 1620) (1488, 1621) 11 (1062, 698) (1128, 676) (1198, 722) (1130, 726) (1284, 712) (1222, 952) (1262, 900) (1156, 1082) (1271, 1066) 12 (898, 2216) (1006, 2212) (1078, 2220) (1006, 2258) (1330, 2292) (1148, 2610) (1300, 2610) (980, 2708) (1170, 2664) 13 (880, 1462) (996, 1458) (1088, 1496) (1000, 1518) (1304, 1558) (1136, 1916) (1266, 1914) (1006, 2040) (1199, 2019) 14 (842, 1542) (892, 1538) (964, 1574) (894, 1580) (1052, 1618) (970, 1844) (1032, 1836) (916, 1918) (1031, 1921) 15 (1330, 1708) (1420, 1696) (1464, 1706) (1406, 1736) (1644, 1756) (1492, 2024) (1636, 2012) (1340, 2140) (1521, 2089) 16 (1426, 1306) (1518, 1288) (1544, 1330) (1502, 1342) (1730, 1366) (1618, 1640) (1748, 1616) (1456, 1778) (1623, 1753) Two-head image↓ 17 (1364, 1358) (1432, 1346) (1440, 1386) (1420, 1385) (1585, 1410) (1503, 1633) (1601, 1673) (1395, 1737) (1488, 1709) 18 (1144, 1204) (1228, 1182) (1276, 1216) (1223, 1228) (1449, 1248) (1352, 1512) (1443, 1533) (1216, 1616) (1384, 1576) 19 (804, 1497) (923, 1481) (1020, 1533) (922, 1542) (1150, 1518) (1059, 1942) (1144, 1991) (962, 2021) (1128, 2028) 20 (955, 1566) (1029, 1636) (1119, 1690) (1022, 1658) (1249, 1772) (1056, 2139) (1107, 2303) (1031, 2200) (1256, 2276) 21 (1473, 1066) (1568, 1054) (1565, 1112) (1540, 1112) (1755, 1163) (1658, 1466) (1797, 1497) (1486, 1615) (1609, 1599) 22 (1413, 1707) (1464, 1687) (1456, 1736) (1438, 1754) (1583, 1754) (1488, 2064) (1607, 2070) (1350, 2239) (1371, 2221) 23 (1280, 1413) (1389, 1376) (1444, 1433) (1362, 1481) (1628, 1440) (1532, 1737) (1646, 1761) (1407, 1882) (1580, 1868) 24 (576, 1195) (668, 1115) (796, 1167) (671, 1243) (911, 1116) (847, 1513) (923, 1504) (792, 1710) (938, 1667)

REFERENCES

[1] Getulio J.A. Amaral, Ian L. Dryden, Vic Patrangenaru, and Andrew T.A. Wood. Bootstrap confidence regions for the planar mean shape. Journal of Statistical Planning and Inference, 140(11):3026–3034, 2010.

[2] Rudolf Beran and Nicholas I. Fisher. Nonparametric comparison of mean directions or mean axes. Annals of Statistics, pages 472–493, 1998.

[3] Rikard Berthilsson and Anders Heyden. Recognition of planar objects using the density of affine shape. Computer Vision and Image Understanding, 76(2):135–145, 1999.

[4] A. Bhattacharya and R. Bhattacharya. Nonparametric statistics on manifolds with applications to shape spaces. IMS Monograph Series, 2011.

[5] Abhishek Bhattacharya. Statistical analysis on manifolds: A nonparametric approach for inference on shape spaces. Sankhyā: The Indian Journal of Statistics, Series A (2008-), pages 223–266, 2008.

[6] Rabi Bhattacharya and Vic Patrangenaru. Large sample theory of intrinsic and extrinsic sample means on manifolds. II. Annals of Statistics, pages 1225–1259, 2005.

[7] Rabi N. Bhattacharya and Jayanta K. Ghosh. On the validity of the formal Edgeworth expansion. Annals of Statistics, 6(2):434–451, 1978.

[8] Fred L. Bookstein. Morphometric tools for landmark data: geometry and biology. Cambridge University Press, 1997.

[9] M. Buibas, M. Crane, L. Ellingson, and V. Patrangenaru. A projective frame based shape analysis of a rigid scene from noncalibrated digital camera imaging outputs. In JSM Proceedings, pages 4730–4744, 2011.

[10] M. Crane and Vic Patrangenaru. Random change on a Lie group and mean glaucomatous projective shape change detection from stereo pair images. Journal of Multivariate Analysis, 102(2):225–237, 2011.

[11] Ian L. Dryden and Kanti V. Mardia. Statistical shape analysis, volume 4. John Wiley & Sons, New York, 1998.

[12] Bradley Efron. Bootstrap methods: another look at the jackknife. The Annals of Statistics, pages 1–26, 1979.

[13] Bradley Efron. The jackknife, the bootstrap and other resampling plans, volume 38. SIAM, 1982.

[14] Leif Ellingson, Vic Patrangenaru, and Frits Ruymgaart. Nonparametric estimation of means on Hilbert manifolds and extrinsic analysis of mean shapes of contours. Journal of Multivariate Analysis, 122:317–333, 2013.

[15] Dean Falk, Frederick E. Lepore, and Adrianne Noe. The cerebral cortex of Albert Einstein: a description and preliminary analysis of unpublished photographs. Brain, 136(4):1304–1327, 2013.

[16] Olivier D. Faugeras. What can be seen in three dimensions with an uncalibrated stereo rig? In Computer Vision-ECCV’92, pages 563–578. Springer, 1992.

[17] Nicholas I. Fisher, Peter Hall, Bing-Yi Jing, and Andrew T.A. Wood. Improved pivotal methods for constructing confidence regions with directional data. Journal of the American Statistical Association, 91(435):1062–1070, 1996.

[18] P. Thomas Fletcher. Geodesic regression and the theory of least squares on Riemannian manifolds. International Journal of Computer Vision, 105(2):171–185, 2013.

[19] Maurice Fréchet. Les éléments aléatoires de nature quelconque dans un espace distancié. In Annales de l'institut Henri Poincaré, volume 10, pages 215–310, 1948.

[20] Colin R. Goodall and Kanti V. Mardia. Projective shape analysis. Journal of Computational and Graphical Statistics, 8(2):143–168, 1999.

[21] Peter Hall and Jeffrey D. Hart. Bootstrap test for difference between means in nonparametric regression. Journal of the American Statistical Association, 85(412):1039–1049, 1990.

[22] Richard Hartley, Rajiv Gupta, and Tom Chang. Stereo from uncalibrated cameras. In Proceedings of the 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '92), pages 761–764. IEEE, 1992.

[23] Richard Hartley and Andrew Zisserman. Multiple view geometry in computer vision. Cambridge University Press, 2nd edition, 2004.

[24] Harrie Hendriks and Zinoviy Landsman. Mean location and sample mean location on manifolds: Asymptotics, tests, confidence regions. Journal of Multivariate Analysis, 67(2):227–243, 1998.

[25] Anders Heyden. Geometry and algebra of multiple projective transformations. Lund Institute of Technology, Department of Mathematics, 1995.

[26] Stephan F. Huckemann et al. Intrinsic inference on the mean geodesic of planar shapes and tree discrimination by leaf growth. The Annals of Statistics, 39(2):1098–1124, 2011.

[27] Dalong Jiang, Yuxiao Hu, Shuicheng Yan, Lei Zhang, Hongjiang Zhang, and Wen Gao. Efficient 3D reconstruction for face recognition. Pattern Recognition, 38(6):787–798, 2005.

[28] David G. Kendall. Shape manifolds, Procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 16(2):81–121, 1984.

[29] John T. Kent and Kanti V. Mardia. A geometric approach to projective shape and the cross ratio. Biometrika, 99(4):833–849, 2012.

[30] J.T. Kent. New directions in shape analysis. The Art of Statistical Science, pages 115–127, 1992.

[31] H. Christopher Longuet-Higgins. A computer algorithm for reconstructing a scene from two projections. Nature, 293:133–135, 1981.

[32] Yi Ma, Stefano Soatto, Jana Kosecka, and S. Shankar Sastry. An invitation to 3-d vision: from images to geometric models, volume 26. Springer Science & Business Media, 2012.

[33] Kanti V. Mardia and Vic Patrangenaru. Directions and projective shapes. Annals of Statistics, pages 1666–1699, 2005.

[34] Stephen J. Maybank. Classification based on the cross ratio. In Applications of Invariance in Computer Vision, pages 451–472. Springer, 1994.

[35] Weiwei Men, Dean Falk, Tao Sun, Weibo Chen, Jianqi Li, Dazhi Yin, Lili Zang, and Mingxia Fan. The corpus callosum of Albert Einstein’s brain: another clue to his high intelligence? Brain, page awt252, 2013.

[36] Axel Munk, Robert Paige, J. Pang, Vic Patrangenaru, and F. Ruymgaart. The one- and multi-sample problem for functional data with application to projective shape analysis. Journal of Multivariate Analysis, 99(5):815–833, 2008.

[37] Daniel E. Osborne, Vic Patrangenaru, Mingfei Qiu, and Hilary W. Thompson. Nonparametric data analysis methods in medical imaging. Geometry Driven Statistics, page 182, 2015.

[38] V. Patrangenaru, M.A. Crane, X. Liu, X. Descombes, G. Derado, W. Liu, V. Balan, V.P. Patrangenaru, and H.W. Thompson. Methodology for 3D scene reconstruction from digital camera images. In Proceedings of the international conference of Differential Geometry and Dynamical Systems (DGDS-2011), volume 19, pages 110–124, 2012.

[39] V. Patrangenaru and M. Qiu. Neighborhood tests for mean change on infinite dimensional Lie groups and 3D projective shape analysis of contours. Technical report, Florida State University, Department of Statistics, 2014.

[40] Vic Patrangenaru. New large sample and bootstrap methods on shape spaces in high level analysis of natural images. Communications in Statistics-Theory and Methods, 30(8-9):1675–1693, 2001.

[41] Vic Patrangenaru and Leif Ellingson. Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis. Chapman & Hall/CRC, 2015. ISBN-13: 978-1439820506.

[42] Vic Patrangenaru, Xiuwen Liu, and Samanmalee Sugathadasa. Nonparametric 3D projective shape estimation from pairs of 2D images - I, in memory of W.P. Dayawansa. Journal of Multivariate Analysis, 2010.

[43] Vic Patrangenaru and Kanti V Mardia. Affine shape analysis and image analysis. In 22nd Leeds Annual Statistics Research Workshop, 2003.

[44] Vic Patrangenaru, Robert Paige, K. David Yao, Mingfei Qiu, and David Lester. Projective shape analysis of contours and finite 3D configurations from digital camera images. Submitted, 2015.

[45] Vic Patrangenaru, Mingfei Qiu, and Marius Buibas. Two sample tests for mean 3D projective shapes from digital camera images. Methodology and Computing in Applied Probability, 16(2):485–506, 2014.

[46] Victor Patrangenaru. Asymptotic statistics on manifolds and their applications. Ph.D. dissertation, Indiana University, 1998.

[47] Mingfei Qiu, Robert Paige, and Vic Patrangenaru. Neighborhood hypothesis testing for mean change on infinite dimensional Lie groups and 3D projective shape analysis of matched contours. Submitted, 2015.

[48] Mingfei Qiu, Vic Patrangenaru, and Leif Ellingson. How far is the corpus callosum of an average individual from Albert Einstein’s? In COMPSTAT Proceedings, pages 403–410, 2014.

[49] Michael Spivak. A comprehensive introduction to differential geometry. Publish or Perish, volume I, 2005.

[50] M.S. Sugathadasa. Affine and projective shape analysis with applications. Ph.D. dissertation, Texas Tech University, 2006.

[51] Haonan Wang and J.S. Marron. Object oriented data analysis: Sets of trees. The Annals of Statistics, 35(5):1849–1873, 2007.

BIOGRAPHICAL SKETCH

The author was born in Fangchenggang, a young and energetic harbor city in China. She joined Florida State University as a graduate student in 2011. After four and a half years of study, she expects to graduate in December 2015 with both a Ph.D. in Statistics and an M.S. in Financial Mathematics. Her research interests are hypothesis testing, nonparametric statistics, image processing, and shape analysis.

Her work with Vic Patrangenaru and Marius Buibas, 'Two Sample Tests for Mean 3D Projective Shapes from Digital Camera Images', appeared as a journal paper in Methodology and Computing in Applied Probability in 2014; the paper coauthored with Vic Patrangenaru and Leif Ellingson, 'How Far is the Corpus Callosum of an Average Individual from Albert Einstein's?', appeared in the COMPSTAT 2014 proceedings; and the paper 'Nonparametric Data Analysis Methods in Medical Imaging', written with Daniel Osborne, Vic Patrangenaru, and Hilary Thompson, was published in the Wiley Series in Probability and Statistics as a chapter of the book 'Geometry Driven Statistics'. The author has two submitted papers: 'Neighborhood Hypothesis Testing for Mean Change on Infinite Dimensional Lie Groups and 3D Projective Shape Analysis of Matched Contours' (joint work with Robert Paige and Vic Patrangenaru), and 'Projective Shape Analysis of 3D Scenes from Digital Camera Images' (coauthored with Vic Patrangenaru, Robert Paige, K. David Yao, and David Lester). She also helped edit two books, 'Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis' and 'A Course in Mathematical Statistics and Large Sample Theory' (Springer, Statistics Series), and has given 13 oral or poster presentations at domestic and international conferences. In addition, the author was a member of 'Data analysis on Hilbert manifolds and their applications', a working group of the Low-dimensional Structure in High-dimensional Systems (LDHD) program at the Statistical and Applied Mathematical Sciences Institute (SAMSI).

With the kind support of her committee, the author received the 'Yongyuan and Anna Li Award' from the FSU Department of Statistics for the best graduate presentation in May 2015, a 'Travel Award' from the American Mathematical Society for outstanding contributions to the area of statistics on manifolds in April 2014, and the 'Outstanding Scientific Presentation' award from the FSU Department of Statistics for oral and poster presentation skill in November 2013. She also presented her doctoral research in three minutes to an audience of more than 300 people and more than five media outlets with various backgrounds at the FSU 'Three Minute Thesis' research communication competition in February 2014.
