PROJECTIVE VISUAL HULLS Beckman CVR Technical Report 2002–01
Svetlana Lazebnik, M.S.
Department of Computer Science
University of Illinois at Urbana-Champaign, 2002
Jean Ponce, Advisor
PROJECTIVE VISUAL HULLS
BECKMAN CVR TECHNICAL REPORT 2002–01
BY SVETLANA LAZEBNIK
B.S., DePaul University, 2000
THESIS
Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2002
Urbana, Illinois

© Copyright by Svetlana Lazebnik, 2002

ABSTRACT
This thesis presents an image-based method for computing the visual hull of an object bounded by a smooth surface and observed by a finite number of perspective cameras. The essential structure of the visual hull is projective: to compute an exact topological
(combinatorial) description of its boundary, we do not need to know the Euclidean properties of the input cameras or of the scene. Unlike most existing visual hull computation methods, ours requires only a projective reconstruction of the camera matrices, or equivalently, the epipolar geometry between each pair of cameras in the scene. Starting with a rigorous theoretical framework of oriented projective geometry and projective differential geometry, we develop a suite of algorithms to construct the visual hull and associated data structures. The thesis discusses our implementation of the algorithms, and presents experimental results on synthetic and real data sets.
To Max
ACKNOWLEDGMENTS
First and foremost, thanks are due to my advisor, Jean Ponce, for finding my research interesting, holding it to an exacting standard, and constantly telling me to be more positive. I gratefully acknowledge the National Science Foundation for supporting this research under the grant IRI-990709, and the Computer Science Department and the College of Engineering for supporting me with the SURGE fellowship and various awards.
I would also like to thank Edmond Boyer for providing the gourd data set, and Steve Sullivan for providing the squash and the Steve data sets. Both Edmond and Steve were responsible for inspiring the research that eventually developed into this thesis. Fred Rothganger also deserves a mention for taking time for idle conversations and for carrying on the constant uphill battle to keep the lab machines up and running.
Thanks are due to my family: my grandma and parents for constantly asking when the thesis will be done, and my sister Maria for her silent brand of commiseration. Good luck in grad school, Maria! Finally, I must say that this thesis would never have been completed on time without the help and loving care of my husband, Dr. Max Raginsky. I hope you will be there for me when I am writing my Ph.D. thesis!
TABLE OF CONTENTS

1 Introduction
  1.1 Defining the Visual Hull
  1.2 Previous Work: Computing Discrete Visual Hulls
    1.2.1 Volume Intersection
    1.2.2 Shape from Deforming Contours
    1.2.3 Applications of Visual Hulls
  1.3 Mathematical Ingredients
  1.4 Overview
2 Oriented Projective Geometry
  2.1 Basics
    2.1.1 Oriented Projective Space
    2.1.2 Flats
    2.1.3 Join, Meet, and Relative Orientation
    2.1.4 Oriented Projective Transformations
  2.2 Computing with Flats
    2.2.1 Oriented Projective Frames
    2.2.2 Simplex Orientation
    2.2.3 Representing General Flats
    2.2.4 Representing Projective Transformations
  2.3 Imaging Geometry of a Single Camera
  2.4 Oriented Multi-View Geometry
    2.4.1 Fundamental Matrix
    2.4.2 Oriented Trifocal Tensor
    2.4.3 Oriented Transfer
      2.4.3.1 Transfer Using Epipolar Geometry
      2.4.3.2 Transfer Using the Trifocal Tensor
  2.5 Oriented Projective Reconstruction
3 Projective Differential Geometry
  3.1 Curves
    3.1.1 Differential Equations of Curves
    3.1.2 Osculating Spaces
    3.1.3 Order of Contact
  3.2 Surfaces
    3.2.1 Order of Contact of Surfaces
    3.2.2 Developable Surfaces
    3.2.3 Conjugate Nets
    3.2.4 Asymptotic Directions
    3.2.5 Alternative Definitions of Conjugacy
    3.2.6 Local Shape
  3.3 Orienting Curves and Surfaces
    3.3.1 Orienting Plane Curves
    3.3.2 Orienting Surfaces
4 Visual Hulls
  4.1 Properties of Rims and Apparent Contours
  4.2 Frontier Points
  4.3 The Rim Mesh
    4.3.1 Oriented Structure of the Rim Mesh
    4.3.2 Reconstructing the Rim Mesh
    4.3.3 Combinatorial Complexity of the Rim Mesh
  4.4 Intersection Curves
    4.4.1 Geometric Properties of Intersection Curves
    4.4.2 Tracing Intersection Curves
  4.5 The 1-Skeleton of the Visual Hull
    4.5.1 Clipping Intersection Curves
    4.5.2 Intersection Points
    4.5.3 An Incremental Algorithm
  4.6 Computing the Faces of the Visual Hull
    4.6.1 Ray Intervals
    4.6.2 Vertical Decomposition
    4.6.3 Convex Objects: The Visual Hull and the Rim Mesh
5 Implementation and Results
  5.1 Implementation Details
    5.1.1 Discrete Contour Representation
    5.1.2 General Position Assumptions
    5.1.3 3D Reconstruction
    5.1.4 Efficiency
  5.2 Rim Mesh Results
  5.3 Visual Hull Results
6 Conclusion
  6.1 Summary
  6.2 Future Work
APPENDIX A Oriented Formulas
  A.1 Formulas for T2
  A.2 Formulas for T3
  A.3 Algebraic and Infinitesimal Properties of Join and Meet
REFERENCES
CHAPTER 1
Introduction
Suppose that we have taken a few snapshots of an object from a few known camera viewpoints and then extracted the silhouette of the object from each photograph. Clearly, we have lost most of the information about the 3D shape of the object. However, we can still try our best to reverse the imaging process by reconstructing this shape. We can imagine each camera as a slide projector emitting a cone of rays from its pinhole through the silhouette on its image plane. The object is then constrained to lie in the region of space that falls inside the cone due to each camera. This region, called the visual hull, is the least-committed estimate of the shape of the object based on silhouette data alone.

We can ask, what is the accuracy of the reconstruction provided by the visual hull? From a theoretical point of view, we may be interested in reconstruction in the limit: if we can observe all possible silhouettes of the object, does the visual hull reconstruct its shape exactly? As it turns out, the answer is negative: for instance, "dents" or concave parts of the surface will never appear on its silhouettes. From the practical point of view, we are concerned with discretization: since we can take only a finite number of pictures, what can we say about the shape of the computed discrete visual hull? How does it depend on the input camera positions? What does it tell us about the shape of the actual 3D object? How does it differ from the theoretical limiting case? In this thesis, we will attempt to answer some of these questions.
1.1 Defining the Visual Hull
Our point of departure is the standard theoretical definition of a visual hull introduced by Laurentini [29]:
Definition 1.1 (Visual Hull). The visual hull V of an object Ω relative to a viewing region R is a region of space consisting of all points X such that for each viewpoint (camera center) O in R, the visual ray L starting at O and passing through X contains at least one point of the object Ω.
Even though we are interested in reconstructing Ω based on the images taken by the cameras with centers in R, Definition 1.1 does not actually mention the image planes of the cameras. To arrive at an image-based definition of the visual hull (and also to clarify Definition 1.1), consider a point X that belongs to V. Let x be the projection of X in the image plane of the camera with center O. Because the ray L passes through Ω, the image point x must also be a projection of some point of Ω, so it must belong to the silhouette, or the two-dimensional region on the image plane occupied by the projection of Ω. Therefore, we can rephrase Definition 1.1 to state that V is the set of all points X that project inside the silhouette of Ω in the image plane of a perspective camera centered at any point O in R. In other words, the visual hull is the maximal (largest) shape that produces the same silhouettes as Ω when seen from all points in R.

In the opening, we mentioned the limiting visual hull, which can be obtained by observing the silhouettes of Ω from every possible viewpoint (for technical reasons, we have to require that the viewing region R be outside the convex hull of Ω [29]). This is the external visual hull, denoted V∗, and it is contained in any other visual hull of Ω. In theoretical study, then, the external visual hull can be thought of as the visual hull. There exist algorithms for computing the external visual hull of polyhedral and smooth objects, in 2D and in 3D [29, 30, 48], but they assume that an exact model of the target object is available.
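To make this image-based definition concrete, here is a minimal membership test in Python; the inputs (a list of 3 × 4 projection matrices and binary silhouette masks as NumPy arrays) and all names are hypothetical conventions for this sketch, not part of the thesis.

```python
import numpy as np

def in_visual_hull(X, cameras, silhouettes):
    """Test whether a 3D point X lies in the discrete visual hull.

    X           -- length-3 array, Euclidean coordinates of the query point
    cameras     -- list of 3x4 projection matrices, one per view
    silhouettes -- list of 2D boolean masks; True marks the silhouette region
    """
    Xh = np.append(np.asarray(X, dtype=float), 1.0)   # homogeneous coordinates
    for P, mask in zip(cameras, silhouettes):
        x = P @ Xh
        if x[2] <= 0:                 # behind this camera's focal plane
            return False
        u, v = x[0] / x[2], x[1] / x[2]
        i, j = int(round(v)), int(round(u))
        if not (0 <= i < mask.shape[0] and 0 <= j < mask.shape[1]):
            return False              # projects outside the image
        if not mask[i, j]:
            return False              # falls outside this silhouette
    return True                       # inside every silhouette: X is in V
```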
Figure 1.1 illustrates the concept of visual hulls on a simple 2D example. On the left, we see the imaging setup for a bean-shaped object Ω and a three-camera viewing region
R = {O1, O2, O3}. Note that in 2D, the retinas of the cameras are lines, and the projections of Ω (silhouettes) are one-dimensional. On the right, we see the discrete visual hull V formed by intersecting three back-projected cones K1, K2, and K3. Figure 1.2 shows the relationship between the object Ω, the external visual hull V∗, and the discrete visual hull V. Note that
V∗ is contained in V, but is larger than Ω itself.
Figure 1.1 (a) Observing a 2D object Ω from a viewing region R = {O1, O2, O3}. (b) Reconstructing Ω: the three visual cones K1, K2, K3 and the discrete hull V as their intersection.
As mentioned earlier, in this work we are concerned with the discrete visual hull, where R is a finite set of isolated viewpoints. For this reason, we find it useful to state a definition of the visual hull based on volume intersection:
Definition 1.2. Let R = {O1, O2, ..., On} be the viewing region consisting of a finite number of camera centers. Consider the solid visual cone Ki of rays that originate at Oi and pass through any point on the surface Ω. Then the visual hull V is equal to the (possibly unbounded) solid formed by the intersection of the viewing cones:

V = ⋂_{i=1}^{n} K_i .

Figure 1.2 The relationship between the object Ω, the external visual hull V∗, and the discrete visual hull V, where R = {O1, O2, O3} as in Figure 1.1. Note that V∗ does not capture the concavity on one side of Ω (the subset of the boundary of V∗ that does not coincide with Ω is actually a bitangent to the boundary of Ω).
Although the visual hull as an intersection of visual cones is a familiar notion in computer vision, the issue of computing its exact geometric and topological structure has received little attention in previous work. In the current thesis, we attempt to fill this gap. To this end, we represent the visual hull as a generalized or topological polyhedron [17, 23]:
Definition 1.3. A generalized polyhedron M is a 3D solid (bounded or unbounded) whose boundary ∂M is a union of faces, edges,andvertices subject to the following constraints:
• Each face is a 2-manifold with boundary;
• The intersection of two faces is either an edge, a vertex, or empty;
• The intersection of two edges is either a vertex or empty;
• Each vertex has a neighborhood of ∂M homeomorphic to a disk.
Figure 1.3 Terminology introduced so far: visual cone, silhouette, outline.
If we regard V as a topological polyhedron formed by the intersection of a finite number of visual cones, then the three components of its boundary description are characterized as follows (a schematic data structure is sketched after this list):

Face: a maximal region of ∂V that belongs to a single visual cone and has the topology of a 2-manifold with boundary.

Edge: a maximal connected subset of an intersection curve between two cones that has the topology of a 1-manifold with boundary.
Vertex: an isolated point on ∂V that is the intersection of three or more faces.
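Purely as an illustration, the boundary description above could be carried by a data structure along the following lines; all names are hypothetical, and the representations actually used in this thesis are developed in Chapter 4.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    position: tuple      # 3D (projective) coordinates of the vertex
    kind: str            # "intersection point" or "frontier point" (Section 1.3)
    cones: tuple         # indices of the visual cones meeting at this vertex

@dataclass
class Edge:
    cones: tuple         # pair (i, j): the edge lies on the curve cone_i ∩ cone_j
    endpoints: tuple     # the two bounding Vertex objects (1-manifold with boundary)

@dataclass
class Face:
    cone: int                                     # the single cone containing the face
    boundary: list = field(default_factory=list)  # bounding Edges, one loop per boundary

@dataclass
class VisualHullBoundary:
    vertices: list       # all Vertex records
    edges: list          # all Edge records
    faces: list          # all Face records; together they describe ∂V
```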
To avoid confusion, we explain here our usage of the term exact. Namely, we say that a representation of the visual hull is exact when it correctly captures (up to arbitrary projective transformations) the topological and geometric features of the solid that results from intersecting a finite number of viewing cones, as defined above. By contrast, from a theoretical point of view, the external visual hull would be considered exact since it represents the "limit" of all possible visual hulls.
1.2 Previous Work: Computing Discrete Visual Hulls
There exist two conceptually different strategies for computing discrete visual hulls. The first strategy, which is the oldest, is to directly implement volume intersection; the second is to assume a smooth object and a continuous camera motion and to reconstruct the object as the envelope of its tangent planes.
1.2.1 Volume Intersection
The main advantage of volume intersection algorithms is that they (in principle) work with any combination of calibrated input viewpoints and make no assumptions about the object shape, e.g., smoothness or topology. One common way to implement volume intersection is to approximate visual cones by polyhedra. The oldest such algorithm dates back to Baumgart's 1974 PhD thesis [2]. In this work, a polyhedral visual hull is constructed by intersecting the viewing cones associated with polygonal approximations to the extracted silhouettes. Since 1974, many volume intersection systems have continued to rely on 3D polyhedral intersections, which can be tricky to implement. The main difficulties involve handling degenerate special cases and dealing with numerical instabilities that arise when intersecting polygons that are nearly tangent. Some reconstruction systems implement volume intersection using commercial software packages or publicly available generic solid modeling libraries [34].
Possibly the most popular technique for visual hull construction is to approximate visual cones by voxel volumes. For example, Szeliski [57] has introduced an efficient voxel-based algorithm that relies on hierarchical spatial data structures. Because of its robustness, simplicity, and speed, this basic approach remains popular. A practical modeling system today is likely to be similar to the one outlined by Wong and Cipolla [62], which uses octree-based carving, combined with the Marching Cubes algorithm [37] to extract a triangle mesh out of the voxel volume. Voxel carving is not susceptible to numerical difficulties that can arise in exact computations with polyhedra. Moreover, its running time depends only on the number of input cameras and on the resolution of the volumetric representation, not on the intrinsic complexity of the visual hull. However, volumetric methods suffer from artifacts resulting from the quantization of the voxel volume.
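For concreteness, here is a minimal voxel-carving sketch in the spirit of these methods, under the same hypothetical camera/silhouette conventions as before; it is a toy illustration of Definition 1.2, not the algorithm proposed in this thesis.

```python
import numpy as np

def carve_voxels(cameras, silhouettes, bounds, res=64):
    """Approximate the visual hull on a res^3 voxel grid (cf. Definition 1.2).

    bounds -- ((xmin, xmax), (ymin, ymax), (zmin, zmax)) of the working volume.
    Returns a boolean occupancy grid: a voxel survives only if its center
    projects inside every silhouette.
    """
    axes = [np.linspace(lo, hi, res) for lo, hi in bounds]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    pts = np.stack([X, Y, Z, np.ones_like(X)], axis=-1).reshape(-1, 4)
    occupied = np.ones(len(pts), dtype=bool)
    for P, mask in zip(cameras, silhouettes):
        x = pts @ P.T                               # project all voxel centers
        z = x[:, 2]
        in_front = z > 0                            # must lie before the focal plane
        u = np.round(x[:, 0] / np.where(in_front, z, 1.0)).astype(int)
        v = np.round(x[:, 1] / np.where(in_front, z, 1.0)).astype(int)
        valid = (in_front & (v >= 0) & (v < mask.shape[0])
                 & (u >= 0) & (u < mask.shape[1]))
        hit = np.zeros(len(pts), dtype=bool)
        hit[valid] = mask[v[valid], u[valid]]       # inside this silhouette?
        occupied &= hit                             # carve: intersect the cones
    return occupied.reshape(res, res, res)
```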
For decades, 3D polyhedral intersections and voxel carving have remained the only approaches to computing visual hulls. In the last few years, vision and graphics researchers have become more interested in image-based rendering and have become more aware of multi-view geometry [14, 21]. This "paradigm shift" has resulted in new efficient algorithms that avoid general 3D intersections by taking advantage of epipolar geometry. Matusik et al. [38] describe a fast, simple algorithm that involves image-based sampling along visual rays emanating from a virtual camera. The main limitation of this algorithm is that its output is view-dependent: if one wishes to render the visual hull from a new virtual viewpoint, one must re-run the construction algorithm. Another disadvantage is that image-based visual hulls require custom-built rendering routines. For many applications, it is preferable to generate a standard polygonal model that can be displayed using standard graphics hardware.
Two later papers from the same research group [39, 54] extend the idea behind image-based visual hulls to produce view-independent polyhedral models. The most important contribution of these newer algorithms is the reduction of 3D polyhedral intersections to 2D. This is an important algorithmic advance, one which we extend in this thesis. Perhaps the most important way in which we go beyond state-of-the-art visual hull algorithms is in producing a representation of the visual hull in terms of its intrinsic topological features, instead of artifacts of discretization like polyhedra, voxels, or irregularly sampled points. We will also show how to compute visual hulls while making minimal assumptions about the form of the input and the imaging geometry. A common thread in recent research involves using the visual hull as an initial step in surface optimization based on criteria other than silhouette-consistency. Sullivan and
Ponce [56] describe a system that creates a rough model using polyhedral intersection, and then optimizes this model using triangular splines. In this work, the optimization component seeks to minimize the average ray-surface distance in 3D (as opposed to the image-plane distance between the input silhouettes and the reprojected model). An important optimization criterion introduced recently is photo-consistency, embodied in the photo hull [27]. Whereas the visual hull is required to conform only to the silhouettes in the original pictures, the photo hull must exactly reproduce all input images of the target object. Cross and Zisserman [10] discuss an "optimal" reconstruction algorithm that uses the visual hull as the initial input to an optimization routine based on photo-consistency. The idea of combining silhouette- and photo-consistency criteria in automatic object reconstruction holds much promise for future research. The algorithm proposed in this thesis relies on epipolar geometry, uses only two-dimensional computations, and, if necessary, constructs a completely image-based representation of the visual hull. For these reasons, it provides a good starting point for optimization using photo-consistency constraints.
1.2.2 Shape from Deforming Contours
Volume intersection is the most general approach to computing discrete visual hulls, requiring no assumptions about the configuration of the input viewpoints, the geometry of the surface, or its topology. However, this approach is rather ill-adapted to handling large numbers of almost coincident visual cones — a situation that arises when we wish to reconstruct an object from a video clip taken by a camera following a continuous trajectory.
If we assume that the object is smooth, we can use the elegant differential techniques of shape from deforming contours [9]. One of the key ideas is to consider the apparent contour as the image of a critical set of points on the surface at which the projection map folds [16].
The critical set is also known as the contour generator or the rim (see figure 1.4). Observing a smooth point on the apparent contour in the image allows us to reconstruct the tangent
plane to the object in space. If the camera moves continuously, the rim gradually "slips" along the surface, and the surface may be reconstructed as the envelope of its tangent planes.
This is volume intersection in an infinitesimal sense, and the shape obtained as a result of such an algorithm is an approximation to the visual hull. This approach typically assumes that the target objects are smooth, and requires all outlines to be non-singular (conceptual difficulties arise when some parts of apparent contours become occluded by the surface, or when contours change topology as they evolve). In practice, finite-difference approximations have to replace derivative computations, introducing numerical instability.
Figure 1.4 The rim is the set of all points X on the surface for which the tangent plane passes through the camera center. The apparent contour in the image is formed by all points x that are projections of rim points.
Koenderink [24] was among the first to elucidate the relationship between the local geometry of the 3D surface and the geometry of the 2D contour in a single image. Giblin and Weiss [16] have introduced a mathematical framework for the problem of smooth surface reconstruction from apparent contours, along with a reconstruction algorithm for three-dimensional objects assuming orthographic projection and camera motion in a great circle. Subsequent approaches generalize this framework to handle perspective projection and general (known) camera motions. Cipolla and Blake [7] derive formulas for depth and Gaussian curvature of the object surface along the evolving apparent contours. Based on extensive experimental results, the authors conclude that computing depth and curvature by taking finite-difference approximations of first and second derivatives of the spatiotemporal surface (the surface formed by apparent contours that evolve in time as the camera moves) is inherently sensitive to noise and computation error. Vaillant and Faugeras [58] present another treatment of the reconstruction problem, using a surface parametrization based on the Gauss map. They also note that computing depth and curvature requires derivatives of image measurements up to second order, and propose more robust solutions. Boyer and Berger [4] report a discrete algorithm based on a local approximation of the surface by an osculating quadric. All three approaches assume known camera motion and perspective projection (fully calibrated), make extensive use of differential geometry, and focus on estimating local properties of the surface, such as Gaussian curvature. The present work inherits some of these features. We assume smooth surfaces and use techniques of differential geometry to establish local properties of visual hulls. However, our reconstruction algorithm does not rely on approximations of local surface shape or curvature — the entity we compute is exactly the discrete visual hull.
1.2.3 Applications of Visual Hulls
Though applications per se are not the subject of the current thesis, it must be mentioned that visual hulls have considerable practical utility. Visual hulls are conceptually simple and robust, can be built from a very small number of views (≈ 10), and require as input only silhouettes and camera parameters — information that can be readily obtained in controlled lab or studio environments. We are aware of at least one commercial system [44] that has used visual hulls for automatic 3D model construction.
One of the most popular applications for visual hulls is virtual reality. Lok [35, 36] describes a system that relies on visual hulls to render avatars, or graphical representations of users in immersive virtual spaces. Leibe et al. [34] use visual hulls to automatically reconstruct objects placed on a special table by the users. The idea is that such objects can serve as natural components of a novel “wire-free” interface for human/computer interaction. In this work, the visual hull is constructed not from silhouettes in images taken by several cameras, but from shadows cast on the table by several strategically placed infrared light sources. Though this setup may seem unusual, the principle behind visual hull construction remains exactly the same.
If visual hull technology is to be useful for spontaneous human interaction with virtual environments, it needs to be fast enough to run in real time. This issue is addressed by
Matusik et al. [38], who speed up the reconstruction process by introducing a view-dependent sampled representation and by applying clever optimizations to the computationally intensive operation of intersecting reprojected visual rays with silhouettes.
Besides virtual reality, another important application of visual hulls is 3D photography, or acquisition of high-quality geometric and photometric models of real-world objects. For example, Shlyakhter et al. [54] use visual hulls to build models of trees. A recent state-of-the-art system for 3D photography is based on opacity hulls [40], which are essentially visual hulls augmented with transparency information. In this work, silhouettes are extracted using sophisticated matting techniques that involve photographing the object against a background of plasma monitors displaying specially calibrated sinusoidal patterns, and visual hulls are computed using a variant of the real-time image-based algorithm [38]. In the future, 3D photography is likely to be extended to dynamic objects and increasingly complex environments.
1.3 Mathematical Ingredients
In this thesis, our first task is to discover the “true” nature of the visual hull — that is, to
find the largest set of transformations that does not change the structure of its surface as a topological polyhedron (recall Section 1.1). The key features of the visual hull surface are its edges and vertices. As mentioned earlier, edges of the visual hull lie on intersection curves between pairs of cones. An edge point is a point common to two visual cones, or equivalently, the intersection of two visual rays formed by back-projecting two points on two different outlines. In Chapter 4, we will show that visual hull vertices can be of two types:
Intersection Point: a point common to three visual cones. Equivalently, it is the point of intersection of three visual rays formed by back-projecting three outline points in
three different views.
Frontier Point: a point where a plane passing through two of the camera centers is tangent to the object.
Though the technical details of the definitions of edges and vertices are not important at this stage, it is easy to see that these definitions are based on the notion of contact of lines and planes with a surface in 3D. Such contacts are the domain of projective geometry, as they remain invariant under all (smooth) transformations that leave lines and planes intact.
Having concluded that the true nature of the visual hull is projective, we state our primary goal: to develop algorithms for reconstructing the visual hull based on projective information alone. However, things are not so simple: projective geometry happens to lack many basic notions without which one cannot compute the visual hull. In a purely projective framework, we cannot define a ray or a line segment; such intuitive relations as front/back and inside/outside have no meaning. Thus, in Chapter 2 we are forced to explore oriented projective geometry [55], an elegant extension of standard projective geometry that will give us just enough additional expressive power to describe the structure of visual hulls.
12 Besides oriented projective geometry, we will also need to explore projective differential geometry, an area of mathematics that deals with projectively-invariant properties of smooth curves and surfaces. Chapter 3 will deal with the fundamentals of this subject, and state several important results which will be used in our main derivations in Chapter 4.
Finally, let us say a word about the key assumptions underlying our work. We assume that the object Ω is bounded by a smooth surface. The justification for the smoothness assumption comes from Koenderink, who insists that "real-life tolerances make everything smooth" [25]. In addition, we assume that Ω is generic. Informally, an object is generic with respect to some property if this property also holds for any object that is obtained by an infinitesimal perturbation. The genericity assumption also extends to the outlines seen by all input cameras: namely, outlines must remain topologically stable under infinitesimal movements of the camera center. This requirement rules out viewpoints located on one-dimensional critical event surfaces that form the cell boundaries in perspective aspect graphs [47]. Finally, a general position assumption applies to the configuration of the cameras in the viewing region. Specifically, no four cameras can be coplanar and no three cameras can be located in the same tangent plane to the surface.
The main purpose of the genericity assumption is to restrict our attention to properties that we expect to find in “typical” situations. For our purposes, generic objects and camera configurations are precisely those that persist under small perturbations that can easily be introduced by noise and numerical error in real-world applications.
1.4 Overview
The rest of the thesis is organized as follows. Chapter 2 presents the basics of oriented projective geometry and applies these basics to analyze the geometry of single- and multi-view configurations of perspective cameras. Appendix A includes a reference table of useful formulas derived using the framework of Chapter 2. Chapter 3 is a survey of projective differential geometry for curves in 2D and 3D, and surfaces in 3D. Chapter 4 is the heart of the thesis: it applies the mathematical framework of Chapters 2 and 3 to describe the geometric properties of visual hulls. In addition to several results of mostly theoretical interest, this chapter gives algorithms for constructing the visual hull and related data structures. The implementation of these algorithms is described in Chapter 5, which also contains results on four different data sets (one synthetic and three real). The thesis concludes in Chapter 6 with a summary and a discussion of future research directions. A preliminary version of this research [32] was presented at the IEEE Conference on
Computer Vision and Pattern Recognition in December 2001.
CHAPTER 2
Oriented Projective Geometry
The study of oriented projective geometry (OPG) was inaugurated by Stolfi [55], who argued for the adoption of this framework in the fields of computational geometry and computer graphics. Laveau and Faugeras [31] were probably the first to show an interest in OPG in the field of computer vision. Hartley [20] has built on these oriented notions to develop his ideas of quasi-affine reconstruction and chirality (these will be briefly mentioned in Section 2.5). Werner and Pajdla [60, 61] have described oriented matching constraints that are mathematically equivalent to the epipolar consistency constraints described in Section 2.4.
This chapter consists of two parts. The first part, Sections 2.1-2.2, presents a summary of Stolfi’s framework. This summary, though necessarily brief and dense, is intended to be self-contained (at least for the purposes of this thesis). The second part applies OPG to perspective cameras. Section 2.3 deals with the “anatomy” of a single camera, and Section
2.4 deals with multi-view geometry, in particular, the fundamental matrix and the trifocal tensor. To our knowledge, the formulas for oriented transfer (Section 2.4.3) are novel. Section
2.5 briefly considers the subject of oriented reconstruction. Also, refer to Appendix A for a list of oriented formulas for computing with flats in 2D and 3D.
2.1 Basics
2.1.1 Oriented Projective Space
Let us begin by recalling the standard definition of an unoriented projective space:
Definition 2.1 (Projective Space). The n-dimensional projective space Pn is formed from the real vector space Rn+1 by taking away the null vector 0 and identifying all vectors that are non-zero scalar multiples of each other. In other words, Pn is the quotient of Rn+1 \ {0} under the relation ∼, defined as

x ∼ y iff x = λy for some λ ≠ 0 .

Using the usual notation for a quotient, we can write the above briefly as

Pn = (Rn+1 \ {0}) / ∼ .
An oriented projective space Tn is defined in the same way, except that we only identify vectors that are positive multiples of each other:

Definition 2.2 (Oriented Projective Space). The n-dimensional oriented projective space Tn is the quotient

Tn = (Rn+1 \ {0}) / ≈ , where

x ≈ y iff x = λy for some λ > 0 .
In the following, we will use the analytic or vector space model of Tn [55, p. 16]. In this model, we can refer to points in Tn by their representative vectors in Rn+1. The statement X = p(x) says that X is the unique point of Tn that is the equivalence class consisting of the vector x and all its positive multiples. For any two vectors x and x′ in Rn+1, we have p(x) = p(x′) if and only if x ≈ x′. The points represented by vectors x and −x are called antipodal. We will denote the antipode of point X as ¬X.
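As a small computational aside (not from the thesis), the vector space model is easy to mimic numerically: normalize representative vectors to unit length, keeping the sign, so that two vectors represent the same point of Tn exactly when they are positive multiples of each other.

```python
import numpy as np

def normalize(x):
    """Canonical representative of p(x): rescale to unit norm, preserving sign."""
    x = np.asarray(x, dtype=float)
    n = np.linalg.norm(x)
    assert n > 0, "the null vector does not represent a point of T^n"
    return x / n

def same_point(x, y, tol=1e-9):
    """True iff p(x) = p(y), i.e., x and y are positive multiples of each other."""
    return bool(np.allclose(normalize(x), normalize(y), atol=tol))

def antipode(x):
    """Representative of the antipodal point: the negated coordinate vector."""
    return -np.asarray(x, dtype=float)
```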
Vector space                Oriented projective space
---------------------------------------------------------
Ray                         Point
Linear subspace             Flat set
Direct sum of subspaces     Join of flats
Intersection of subspaces   Meet of flats
Linear map                  Projective map

Table 2.1 Structures and operations in the vector space Rn+1, together with the structures and operations they induce in the oriented projective space Tn.
In the vector space model, the geometric structure of the oriented projective space is induced by the structure of the underlying vector space (see Table 2.1). In the following section, we will define flats of Tn, which correspond to linear subspaces of Rn+1.
A Note On Topology. In the rest of this document, we will assume that Rn+1 has the standard topology induced by the Euclidean metric¹. Equivalently, this is the topology whose basis consists of sets of the form A1 × ... × An+1, where each Ai is an open interval of the real line R [43]. Then the topology of Pn (resp. Tn) is simply the quotient topology induced on Rn+1 by the quotient map that takes each vector of Rn+1 to its equivalence class under the relation ∼ (resp. ≈). That is, the open sets of Pn (resp. Tn) are exactly the sets whose pre-images under the respective quotient map are open in Rn+1. An important topological fact is that both Pn and Tn are n-dimensional manifolds — that is, every point in these spaces has a neighborhood homeomorphic to an open set of Rn. Even though Pn is more familiar to us, the oriented space Tn is actually simpler: it is homeomorphic to Sn, the n-dimensional sphere embedded in Rn+1.
1Actually, the metric induced by any p-norm will do.
17 2.1.2 Flats
Before defining an oriented flat, we will first introduce a simpler concept of unoriented
flat sets.
Definition 2.3 (Flat Set). A set F of points of Tn is a d-dimensional flat set of Tn if there exists a (d + 1)-dimensional vector subspace S of Rn+1 such that

F = {p(x) | x ∈ S \ {0}} .
Flat sets of dimensions 0, 1, and 2 are unoriented points, lines, and planes, respectively. A line is uniquely determined by two non-coincident points, and a plane is uniquely determined by three non-collinear points. Overall, a set of d+1 points in general position, called a proper simplex, forms a basis for a d-dimensional flat set, in the same way that a set of d+1 linearly independent vectors forms a basis for a (d + 1)-dimensional vector subspace.
Definition 2.4 (Simplex). A d-simplex is an ordered tuple of d + 1 vertices, or points in Tn. The simplex is proper if the vectors representing these points are linearly independent; otherwise, it is degenerate. The span of a simplex is the unique flat set of minimum dimension that contains it.
If a flat set is spanned by a simplex with vertices (X1,...,Xd+1), we will denote that
flat set as [X1,...,Xd+1]. Next, we need to formulate a notion of equivalence for any two simplices spanning the same flat set. Intuitively, two simplices are equivalent if one can be continuously transformed into the other such that all intermediate simplices span the same flat set. To make this idea more precise, we first need to define basis equivalence for vector spaces.
Definition 2.5 (Basis Equivalence). Let x1,...,xd+1 and y1,...,yd+1 be two ordered bases for a (d + 1)-dimensional vector subspace V of Rn+1. Then there exists a unique linear transformation L : V → V such that yi = Lxi for i = 1,...,d + 1. The two bases are called equivalent (equivalently oriented) if L has positive determinant.
18 The above equivalence relation partitions the set of all ordered bases for the same vector subspace into two classes (that there are exactly two classes can be shown using the product rule for determinants and the fact that the determinant of L must be either positive or negative).
Definition 2.6 (Simplex Equivalence). Let (X1,...,Xd+1) and (Y1,...,Yd+1) be two proper simplices spanning the same d-dimensional flat set of Tn. Let x1,...,xd+1 and y1,...,yd+1 be any two sets of vectors such that Xi = p(xi) and Yi = p(yi) for i = 1,...,d + 1. Then (X1,...,Xd+1) and (Y1,...,Yd+1) are equivalently oriented if x1,...,xd+1 and y1,...,yd+1 are equivalently oriented.
It is easy to verify that multiplying any vector xi or yi by a positive scale factor does not change the determinant of the transformation L in Definition 2.5. Therefore, Definition 2.6 of simplex equivalence does not depend on the particular choice of vector representatives for the simplices (X1,...,Xd+1) and (Y1,...,Yd+1).
Proposition 2.1. Let σ be a permutation of the integers 1,...,d + 1. Then (X1,...,Xd+1) and (Xσ(1),...,Xσ(d+1)) are in the same equivalence class if and only if σ is an even permutation. Also, the simplices (X1,...,Xi,...,Xd+1) and (X1,...,¬Xi,...,Xd+1) are in different equivalence classes.

One particularly useful oriented notion is the interior of a simplex. Without this notion, we cannot define what it means for a point to belong to a segment, a triangle, or a tetrahedron.
Definition 2.7 (Simplex Interior). The interior of a simplex (X1,...,Xd+1) is the set of all points X that produce an equivalent simplex when substituted for any of the vertices
Xi. That is, the simplex (X1,...,Xi−1,X,Xi+1,...,Xd+1) is equivalent to (X1,...,Xd+1) for i =1,...,d+1.
All the simplices spanning a given flat set of Tn form two equivalence classes, identified with two orientations of that flat set. In this way, each flat set gives rise to two oppositely oriented flats, corresponding to the two simplex classes.
Definition 2.8 (Flat). An oriented flat is a flat set to which an orientation has been assigned by naming one equivalence class of simplices that span it. Given any oriented flat X,there exists an opposite flat, denoted ¬X, consisting of the same points, only taken with a different orientation.
Let F denote the set of all oriented flats of Tn, and F d the set of flats of dimension d.
We will postulate the existence of the set F −1, consisting of two flats Λ, the positive vacuum, and ¬Λ, the negative vacuum (the bases for Λ and ¬Λ consist of zero points). Similarly, the set F n consists of two flats Υ, the positive universe,and¬Υ, the negative universe.An
(n + 1)-simplex will be called positive if it spans Υ, and negative if it spans ¬Υ.
An important comment about the members of F 0 is in order. By definition, a zero- dimensional flat set is an unordered set of antipodal points. This flat set gives rise to two oriented flats, each of which consists of the same two antipodal points, but with a different point singled out as the “positive” (spanning) simplex. Thus, zero-dimensional flats can be identified with the points of Tn. Nevertheless, it is important to keep in mind the technical distinction between the two types of entities. For instance, the operations of join, meet, and relative orientation, which will be presented in the next section, are defined for flats, not for points.
2.1.3 Join, Meet, and Relative Orientation
In the following, F = [X1,...,Xd+1] and G = [Y1,...,Ye+1] will denote d- and e-dimensional oriented flats of Tn, respectively².
2Even though we will denote unoriented flat sets and oriented flats in the same way, the context will always make it clear which entity is being discussed.
Definition 2.9 (Join). The join of two disjoint flats F and G is the flat spanned by a simplex formed by concatenating the simplices that span F and G:
F ∨ G =[X1,...,Xd+1] ∨ [Y1,...,Ye+1]
=[X1,...,Xd+1,Y1,...,Ye+1] .
Note that the join operation is not defined if (X1,...,Xd+1,Y1,...,Ye+1) is not a proper simplex. Whenever F ∨ G is undefined, we write F ∨ G = 0 (recall that 0 does not actually exist in the oriented projective space Tn, so we can use the null symbol to denote an "undefined" result of an operation).
Proposition 2.2. Join has the following properties:
1. dim(F ∨ G)=d + e +1.
2. (¬F ) ∨ G = F ∨ (¬G)=¬(F ∨ G).
3. Identity: Λ ∨ F = F ∨ Λ=F , (¬Λ) ∨ F = F ∨ (¬Λ) = F .
4. Associativity: (F ∨ G) ∨ H = F ∨ (G ∨ H).
5. Commutativity: F ∨ G = G ∨ F if (d + 1)(e + 1) is even, and F ∨ G = ¬(G ∨ F) if (d + 1)(e + 1) is odd. Briefly, we write F ∨ G = ¬^{(d+1)(e+1)} (G ∨ F).
The definition of the meet operator is a bit more tricky. In particular, the meet of two
flats F and G is defined only with respect to a flat U of smallest dimension that contains both F and G.
Definition 2.10 (Meet). Let P, Q, R, and U be flats such that F = P ∨ Q, G = Q ∨ R, and P ∨ Q ∨ R = U. Then Q is the result of the meet of F and G with respect to U:
F ∧U G = Q.
21 In other words, for any three flats P , Q,andR such that P ∨ Q ∨ R = U, we have by definition
(P ∨ Q) ∧U (Q ∨ R)=Q.
In the future, we will write F ∧ G instead of F ∧Υ G.
Proposition 2.3. Meet has the following properties:
1. dim(F ∧ G)=d + e − n .
2. (¬F ) ∧ G = F ∧ (¬G)=¬(F ∧ G).
3. Identity: Υ ∧ F = F ∧ Υ=F , (¬Υ) ∧ F = F ∧ (¬Υ) = F .
4. Associativity: (F ∧ G) ∧ H = F ∧ (G ∧ H).
5. Commutativity: F ∧ G = ¬^{(n−d)(n−e)} (G ∧ F).
In two dimensions, the meet operation is defined only for two lines, and in three dimensions, it is defined only for two lines, or a line and a plane.
One of the most useful things about an oriented projective space is the ability to define relative orientation of various flats with respect to one another. We can answer questions like: does a given point lie to the left or the right of a given line? In the unoriented projective plane P2, this question is meaningless, since P2 is unorientable, and lines have no "sides". In fact, if one removes a line from P2, the remaining set of points is homeomorphic to a disk. On the other hand, T2 is topologically equivalent to a sphere, and lines are great circles that partition the sphere into two connected components, or "sides".
In general, relative orientation is defined for two flats F and G if d + e + 1 = n, that is, if F ∨ G is equal to the positive or negative universe. Relative orientation, denoted F ⊙ G, can be defined either in terms of the join or meet operations, as follows:
Definition 2.11 (Relative Orientation). F ⊙ G = +1, 0, or −1 iff F ∨ G = Υ, 0, or ¬Υ, respectively, iff F ∧ G = Λ, 0, or ¬Λ.

In 2D, relative orientation is defined for points and lines, and in 3D, it is defined for points and planes, or for pairs of lines. The following result [55, p. 66, Theorem 2] will be useful to us:
Proposition 2.4. A flat F of dimension d is uniquely characterized by the sign-valued function σF : F^{n−d−1} → {−1, 0, 1}, defined by σF(G) = F ⊙ G, for any (n − d − 1)-dimensional flat G.
2.1.4 Oriented Projective Transformations
Definition 2.12 (Oriented Projective Transformation). A function M : Tn → Tn is an oriented projective transformation of Tn if it takes positive (n +1)-simplices to positive (n +1)-simplices, and there exists a linear map L : Rn+1 → Rn+1 such that
p(L(x)) = M(p(x)) for all x ∈ Rn+1 \{0} .
Two linear maps L and L′ give rise to the same oriented projective transformation if and only if L = λL′, λ > 0. Oriented projective maps can be extended from maps on points to maps on flats as follows:
M(Λ) = Λ ,
M(¬Λ) = ¬Λ ,
M([X1,...,Xd+1]) = [M(X1),...M(Xd+1)] .
Proposition 2.5. Let M : Tn → Tn be an oriented projective transformation. Then M has the following properties:
1. M(F ∨ G)=M(F ) ∨ M(G) .
23 2. M(F ∧ G)=M(F ) ∧ M(G) .
3. M(¬F )=¬M(F ) .
4. M(F) ⊙ M(G) = F ⊙ G .
Apart from full-rank maps of the space Tn to itself that preserve the relative orientation of simplices, we are also interested in projective maps between oriented spaces of different dimensions. Of particular interest will be the camera projection map taking points of T3 to points of T2.
Definition 2.13 (Generalized Projective Transformation). A function M : Tm → Tn is a generalized projective transformation if there exists a linear map L : Rm+1 → Rn+1 such that
p(L(x)) = M(p(x)) for all x ∈ Rm+1 \{0} .
The null space and the range of M are two flat sets defined as follows:
Null(M)={X ∈ Tm | M(X)=0} ,
Range(M) = {Y ∈ Tn | Y = M(X) for some X ∈ Tm} .
A generalized projective transformation extends to flats in the same way as an oriented projective transformation. Note that if F is a flat that is not disjoint from Null(M), then
M(F )=0. Unlike a true oriented projective map, a generalized map can return an undefined result even if the argument is a well-defined flat.
Proposition 2.6. Let M : Tm → Tn be a generalized projective transformation. M has the following properties:
1. M(F ∨ G)=M(F ) ∨ M(G) .
2. Let U be any flat of Tm disjoint from Null(M), and V = M(U) be the well-defined flat of Tn that is the image of U under M. If F and G are both contained in U, then
M(F ∧U G)=M(F ) ∧V M(G) .
3. M(¬F )=¬M(F ) .
Note that property 4 of oriented projective maps has no analogue in the list above. In fact, the requirement that the orientation of simplices be preserved does not even make sense for a generalized map, which may take a full-rank simplex of the domain into a degenerate simplex of the range.
One interesting class of generalized projective maps are invertible maps from Tn to itself that map positive (n + 1)-simplices onto negative (n + 1)-simplices. Such maps are called orientation-reversing. By contrast, oriented projective transformations are constrained by definition to be orientation-preserving.
2.2 Computing with Flats
In this section, we will introduce the representation for flats used in the rest of the thesis, and present a complete list of oriented formulas for join, meet, and relative orientation.
2.2.1 Oriented Projective Frames
Points will be represented using signed homogeneous coordinates, defined with respect to an oriented projective basis. Intuitively, to define a projective basis it would seem to be sufficient to select a basis a1,...,an+1 of Rn+1 and to take the resulting n + 1 points Ai = p(ai). However, the knowledge of these n + 1 points is insufficient to determine a unique coordinate vector of a new point: even though we can unambiguously go from a vector basis a1,...,an+1 to a set of points A1,...,An+1, we cannot go backwards. After all, each point Ai is equally well represented not only by the basis vector ai, but by any positive multiple λiai. Thus, we simply don't know how to select a representative vector for each point Ai to form a unique linear combination. The solution is to add one extra point to the projective basis.
Definition 2.14 (Oriented Projective Basis). An oriented projective basis for Tn consists of n + 2 points A1,...,An+2 such that the following conditions are met:

1. (A1,...,An+1) is a proper positive simplex. That is, [A1,...,An+1] = Υ.

2. The point An+2 is in the interior of the simplex (A1,...,An+1) (recall Definition 2.7).

3. There exists a basis a1,...,an+1 of Rn+1 such that

Ai = p(ai), i = 1,...,n + 1, and An+2 = p(a1 + ... + an+1) .

(A1,...,An+1) is called the main simplex, and An+2 is called the unit point.
Note that the traditional definition of a basis for an unoriented projective space includes only the third requirement above.
With the additional constraint imposed by the unit point, the ambiguity in selecting a vector basis a1,...,an+1 effectively disappears: the n + 1 independent scale factors λi associated with all ai are now reduced to a single factor λ common to all the vectors.
Proposition 2.7. Let A1,...,An+2 be an oriented projective basis of Tn, and a1,...,an+1 and a′1,...,a′n+1 be two bases of Rn+1 satisfying the conditions of Definition 2.14. Then there exists a scalar λ > 0 such that a′i = λai, i = 1,...,n + 1.
Now, given an oriented projective basis A1,...,An+2 and an underlying vector basis a1,...,an+1, we can easily determine homogeneous coordinates of new points. The vector x = (x1,...,xn+1)^T is the signed homogeneous coordinate vector of a point X if and only if X = p(x1 a1 + ... + xn+1 an+1).
26 2.2.2 Simplex Orientation
In this section, we describe an algebraic criterion that allows us to designate a simplex as positive or negative based on the homogeneous coordinates of its points.
Definition 2.15 (Simplex Orientation Using Determinants). Let (X1,...,Xn+1) be a proper
(n+1)-simplex and x1,...,xn+1 be the homogeneous coordinate vectors of its points in some fixed oriented projective coordinate system. Then the simplex is defined as positive if and only if the determinant of the coordinate vectors, denoted |x1,...,xn+1|, is positive.
We can actually show that the above convention is independent of the particular oriented frame used. That is, if a simplex is designated as positive in one frame, it will also be positive in any other frame that meets the criteria of Definition 2.14.
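In code, Definition 2.15 is a single determinant sign; here is a minimal sketch, assuming the points are given by homogeneous coordinate vectors in some fixed oriented frame.

```python
import numpy as np

def simplex_orientation(vertices):
    """Orientation of an (n+1)-simplex of T^n (Definition 2.15).

    vertices -- sequence of n+1 homogeneous coordinate vectors of length n+1.
    Returns +1 for a positive simplex, -1 for a negative one, 0 if degenerate.
    """
    M = np.column_stack(vertices)        # columns are x_1, ..., x_{n+1}
    return int(np.sign(np.linalg.det(M)))
```

For the canonical simplex (2.1) introduced below, the determinant is that of the identity matrix, so the function returns +1.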
To simplify working with projective coordinates, we can define certain canonical frames.
In Tn, the canonical frame consists of points Ei = p(εi), i = 1,...,n + 2:

ε1 = (1, 0, ..., 0)^T ,
ε2 = (0, 1, ..., 0)^T ,
...                                  (2.1)
εn+1 = (0, 0, ..., 1)^T ,
εn+2 = (1, 1, ..., 1)^T .
We will designate (E1,...,En+1) as the canonical positive simplex of Tn.
For purposes of computation, it is sometimes convenient to represent coordinate vectors in n-dimensional Euclidean space, En, as homogeneous coordinate vectors in Tn. For example, a Euclidean coordinate vector (x1,...,xn)^T becomes a homogeneous vector w (x1,...,xn, 1)^T, where w is some positive scalar. In the Euclidean interpretation, we consider the last entry of each vector to be the homogeneous coordinate. A positive value indicates a finite point, and a value of 0 indicates a point at infinity, or a vector with direction given by the first n entries.
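A small sketch of this embedding, with hypothetical helper names:

```python
import numpy as np

def lift(x_euclidean, w=1.0):
    """Embed a Euclidean point (x1, ..., xn) of E^n into T^n as w*(x1, ..., xn, 1)."""
    assert w > 0, "the scale factor must be positive to stay on the front range"
    return w * np.append(np.asarray(x_euclidean, dtype=float), 1.0)

def classify(x):
    """Interpret the last (homogeneous) coordinate of a T^n coordinate vector."""
    h = x[-1]
    if h > 0:
        return "finite point"
    if h == 0:
        return "point at infinity (direction given by the leading entries)"
    return "antipode of a finite point"
```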
27 The canonical oriented projective frame (2.1) has a convenient Euclidean interpretation.
In two dimensions, we can interpret E1 as the point at infinity along the positive x-axis,
E2 as the point at infinity along the positive y-axis, and E3 as the origin. To visualize the orientation of this triad, we can picture a circular arrow pointing counterclockwise from E1 to E2 to E3.
Similarly, in three dimensions, we can think of E1, E2, and E3 as the points at infinity along the positive x-, y-, and z-axes, respectively, and E4 as the origin. To visualize the orientation of this "tetrahedron", we can imagine curling the fingers of the left hand clockwise from E1 to E2 to E3, with the thumb pointing towards E4³.
2.2.3 Representing General Flats
One common way to represent flats of general dimension is by using Plücker coordinates [55, Chapter 19]. A d-dimensional flat in Tn is represented using a vector of (n+1 choose d+1) homogeneous coordinates. The Plücker coordinates of a point⁴ are simply its signed homogeneous coordinates, as defined in Section 2.2.1.

Let us take the case d = n. The two members of F n are represented by scalars: Υ by positive numbers, and ¬Υ by negative numbers. This system fits rather well with the orientation convention described in the previous section. For instance, given homogeneous coordinate vectors of n + 1 points, we can compute their span simply by taking their determinant (see Definition 2.15). Also, given any two flats F = [X1,...,Xd+1] and G = [Y1,...,Ye+1] such that d + e + 1 = n, we can compute F ∨ G as follows: F ∨ G = Υ, 0, or ¬Υ according as sgn |x1,...,xd+1, y1,...,ye+1| = +1, 0, or −1,
3Ideally, we would like to orient tetrahedra using the right-hand rule. But then, the interpretation of the canonical simplex has to be changed to make E1 the origin, etc. This can be done consistently in all dimensions by considering the homogeneous coordinate to be the first entry of each vector. Unfortunately, in vision and graphics literature it is customary to put the homogeneous coordinate last, and we have to stay faithful to this convention. 4Here and in the future, when we say “point”, we mean “zero-dimensional flat”. Refer to the discussion at the end of Section 2.1.2.
where xi are the coordinates of Xi, and yi are the coordinates of Yi. A comparison of the above formula with the abstract formula for relative orientation (Definition 2.11) makes it immediately obvious how to compute the relative orientation of two flats of complementary dimension given coordinates of their representative simplices.
Next, let us consider hyperplanes, flats of dimension d = n − 1 (lines in T2 and planes in T3). For convenience of computation, we will represent hyperplanes not by their (n + 1)-dimensional Plücker coordinate vectors, but by hyperplane coefficients, which can be obtained from Plücker coordinate vectors by reversing their entries and flipping certain signs. The purpose of this change is to simplify relative orientation formulas by converting them to dot products. The coefficients of a hyperplane H are defined (up to positive scale) by a vector h such that, for any point X with homogeneous coordinates x, we have H ⊙ X = sgn(h^T x).
Finally, lines in T3 will be represented using a vector of six Plücker coordinates, also defined up to positive scale. A Plücker coordinate vector L = (l12, l13, l14, l23, l24, l34)^T of a line L can be computed given coordinates of two points that lie on L, or coefficients of two planes that contain L. Significantly, not every sextuple of numbers represents valid Plücker coordinates of some line in space. For any line L, we must have L ∨ L = 0, and this places a quadratic constraint on the coordinate vector L, expressed in equation (A.18). To simplify the formula for relative orientation of two lines, we can also define a coefficient vector of the line, as shown in (A.17).
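As an illustration, the join of two points and the quadratic consistency test can be written as follows; the index convention lij = xi yj − xj yi is an assumption of this sketch, and the thesis's exact sign conventions are those of Appendix A.

```python
import numpy as np

def pluecker_join(x, y):
    """Pluecker coordinates (l12, l13, l14, l23, l24, l34) of the line through
    two points of T^3 with homogeneous coordinates x, y (lij = xi*yj - xj*yi)."""
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    return np.array([x[i] * y[j] - x[j] * y[i] for i, j in pairs])

def satisfies_line_constraint(L, tol=1e-9):
    """Quadratic (Grassmann-Pluecker) constraint obeyed by every valid line:
    l12*l34 - l13*l24 + l14*l23 = 0 (cf. equation (A.18))."""
    return abs(L[0] * L[5] - L[1] * L[4] + L[2] * L[3]) < tol
```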
In the rest of this document, we will simplify notation by identifying flats with their coordinate (coefficient, Plücker) vectors. For example, we will generally make no distinction between a point X and its coordinate vector x. Lowercase letters will denote flats in T2, and uppercase letters will denote flats in T3. In the next section, we will show formulas for manipulating flats in T2 and T3 using their vector representation. The general method for deriving these formulas may be found in Stolfi [55, Chapter 20]. Even though join, meet, and relative orientation are defined only on flats, we will abuse notation by using the symbols ∨, ∧, and ⊙ to denote operations on vectors. For example, if x and y are the homogeneous coordinates of two 2D points X and Y, we will write x ∨ y to denote the coefficient vector of the line X ∨ Y. This notation is potentially ambiguous: if we do not know whether the 3-vectors x and y are supposed to represent points or lines, we cannot evaluate the statement x ∨ y. If x and y are two points, then x ∨ y = x × y (A.1); however, if x is a point and y is a line, then x ∨ y = x^T y (A.4). We will avoid such problems by always clearly stating which flats the coordinate vectors are supposed to represent.
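For instance, in T2 the basic operations reduce to cross and dot products; here is a minimal sketch, with signs subject to the conventions of Appendix A.

```python
import numpy as np

def join_points(x, y):
    """Coefficient vector of the oriented line through 2D points X and Y (A.1)."""
    return np.cross(x, y)

def meet_lines(l, m):
    """Coordinate vector of the meet of two oriented lines of T^2."""
    return np.cross(l, m)

def relative_orientation(l, x):
    """Relative orientation of a line (coefficients l) and a point (coordinates x):
    +1, 0, or -1, computed as the sign of a dot product."""
    return int(np.sign(l @ np.asarray(x, dtype=float)))
```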
For a complete reference sheet of formulas for computing with flats in 2D and 3D, refer to Appendix A.
2.2.4 Representing Projective Transformations
Oriented projective transformations in Tn correspond to (n + 1) × (n + 1) matrices with positive determinant. We will also refer to oriented projective transformations as orientation-preserving transformations. Orientation-reversing transformations are invertible transformations from Tn to Tn that map positive (n + 1)-simplices to negative (n + 1)-simplices (recall Section 2.1.4). Such transformations correspond to (n + 1) × (n + 1) matrices with negative determinant.
If we know the action of a given transformation on points, we can derive the induced action of the same transformation on higher-dimensional flats. The following result, which will be used in Section 2.5, shows how the coefficient vector of a plane changes under a transformation that acts on points:
Proposition 2.8. Let P = X ∨ Y ∨ Z, and suppose that the points X, Y, and Z undergo an orientation-preserving or orientation-reversing projective transformation M, as X̃ = MX, etc. Then the coefficient vector P transforms as follows:

P̃ = X̃ ∨ Ỹ ∨ Z̃ = |M| M⁻ᵀ P , (2.2)

where |M| denotes the determinant of the matrix M.
Proof. Let W = (w1, w2, w3, w4)ᵀ be the coordinate vector of an arbitrary point, and W̃ = MW = (w̃1, w̃2, w̃3, w̃4)ᵀ be the coordinates of the transformed point. According to Proposition 2.4, a flat is uniquely characterized by the function giving its relative orientation with respect to every other flat of complementary dimension. That is, if two coefficient vectors P̃1 and P̃2 satisfy P̃1 ⊙ W̃ = P̃2 ⊙ W̃ (equivalently, sgn(P̃1 ∨ W̃) = sgn(P̃2 ∨ W̃)) for every point W̃, then P̃1 and P̃2 represent the same oriented plane. In the following, A, B, C, D will denote the columns of the inverse transformation matrix, M⁻¹.

P̃ ∨ W̃ = |X̃, Ỹ, Z̃, W̃| = |MX, MY, MZ, W̃| = |M| |X, Y, Z, M⁻¹W̃|
= |M| |X, Y, Z, w̃1A + w̃2B + w̃3C + w̃4D|
= w̃1 |M| |X, Y, Z, A| + w̃2 |M| |X, Y, Z, B|
+ w̃3 |M| |X, Y, Z, C| + w̃4 |M| |X, Y, Z, D|
= |M| (w̃1 PᵀA + w̃2 PᵀB + w̃3 PᵀC + w̃4 PᵀD)
= |M| Pᵀ M⁻¹ W̃ = (|M| M⁻ᵀ P) ∨ W̃ .

Therefore, P̃ = |M| M⁻ᵀ P .
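The transformation rule (2.2) is easy to check numerically. The sketch below (our own illustration; the helper fixes one concrete sign convention for the coefficient vector of X ∨ Y ∨ Z, namely the cofactor expansion of the determinant |X, Y, Z, ·|) verifies that the transformed coefficients equal |M| M⁻ᵀ P:

    import numpy as np
    rng = np.random.default_rng(0)

    def plane_from_points(X, Y, Z):
        # coefficient vector P with P @ W = |X, Y, Z, W| for every point W
        S = np.column_stack([X, Y, Z])          # 4x3
        return np.array([(-1) ** (i + 1) * np.linalg.det(np.delete(S, i, 0))
                         for i in range(4)])

    X, Y, Z = rng.standard_normal((3, 4))
    M = rng.standard_normal((4, 4))             # generic, det(M) != 0
    P_tilde = plane_from_points(M @ X, M @ Y, M @ Z)
    rule = np.linalg.det(M) * np.linalg.inv(M).T @ plane_from_points(X, Y, Z)
    assert np.allclose(P_tilde, rule)           # equation (2.2)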
Proposition 2.9 (Corollary). Let X = P ∧ Q ∧ R, and suppose that the planes are transformed as P̃ = M⁻ᵀ P, etc. Then X transforms as follows:

X̃ = P̃ ∧ Q̃ ∧ R̃ = |M⁻¹| M X . (2.3)
We will also be interested in generalized projective transformations between spaces of different dimensions, represented by non-square matrices. The main such transformation, camera projection, will be explained in detail in Section 2.3.
Finally, we must mention that matrices of projective transformations, just like signed homogeneous coordinate vectors, are defined up to a positive scale factor. Two matrices M and M′ represent the same transformation if and only if M = λM′, λ > 0.
2.3 Imaging Geometry of a Single Camera
The 3D scene is the oriented space T3, and the image plane is modeled by T2. The imaging process of a perspective camera can be described as a generalized projective transformation from the scene to the image plane. This transformation is computed using a 3 × 4 camera projection matrix P (in the following, A ≐ B denotes equality up to a positive scale factor):

x ≐ P X = ( Pᵀ ; Qᵀ ; Rᵀ ) X . (2.4)

The rows P, Q, R of the matrix P can be geometrically interpreted as coefficients of three projection planes of the camera [14]. Letting x = (x1, x2, x3)ᵀ and using formula (A.14), we can rewrite the projection equation (2.4) as

( x1 ; x2 ; x3 ) ≐ ( P ∨ X ; Q ∨ X ; R ∨ X ) .
If X lies on the plane P, we have P ⊙ X = sgn(P ∨ X) = 0, so x1 = 0. Similarly, if X lies on Q, then x2 = 0, and if X lies on R, then x3 = 0. The third plane, R, is usually accorded special status as the focal plane of the camera. This special status is not inherent in the mathematics of camera projection, but reflects the constraint that any point seen by an actual physical camera must lie in front of its focal plane:

x is a visible point in the image   iff   R ⊙ X = +1 , that is, R ∨ X > 0 , that is, x3 > 0 . (2.5)
A natural model for the image plane is the two-sided plane [55, p. 13]. The front range of the plane consists of points with x3 > 0, and the back range consists of points with x3 < 0.
The two sides are “glued together” by a line at infinity, made up of points with x3 = 0 (see figure, after [55, Figure 3, p. 14]).
The null space of the camera projection transformation P is the point O such that P O = 0. In standard projective geometry, O determines, up to an arbitrary scale factor, the center of the camera. To give a well-defined orientation to O, we define it as the meet of the three projection planes:

O ≐ P ∧ Q ∧ R . (2.6)

For P to represent a valid camera transformation, its range has to be the whole image plane, T2. This means that the coordinate vectors P, Q, and R have to be linearly independent (equivalently, the rank of P has to be three).
Proposition 2.10. Let X, Y, and Z be three points in space with images x ≐ P X, y ≐ P Y, and z ≐ P Z. Then

|x, y, z| ≐ |O, X, Y, Z| . (2.7)
Proof. Let P̃ denote the 4 × 4 matrix formed by appending the row vector Oᵀ to the camera matrix P:

P̃ = ( Pᵀ ; Qᵀ ; Rᵀ ; Oᵀ ) .

It is easy to find the expression for the determinant of P̃ by expanding along the last row and plugging in formula (A.11): |P̃| = −OᵀO. Let S denote the 4 × 4 matrix formed by concatenating the column vectors O, X, Y, and Z:

S = ( O  X  Y  Z ) .

The determinant of the product P̃S can be derived as follows:

|P̃S| = |P̃| |S| = −OᵀO |S| ≐ −|S| .

We can compute the same determinant in a different way, by explicitly multiplying the two matrices and expanding along the first column:

|P̃S| = | PᵀO PᵀX PᵀY PᵀZ ; QᵀO QᵀX QᵀY QᵀZ ; RᵀO RᵀX RᵀY RᵀZ ; OᵀO OᵀX OᵀY OᵀZ |
      = | 0 PᵀX PᵀY PᵀZ ; 0 QᵀX QᵀY QᵀZ ; 0 RᵀX RᵀY RᵀZ ; OᵀO OᵀX OᵀY OᵀZ |
      = −OᵀO | PᵀX PᵀY PᵀZ ; QᵀX QᵀY QᵀZ ; RᵀX RᵀY RᵀZ | ≐ −|x, y, z| .

We have found that |P̃S| ≐ −|S| and |P̃S| ≐ −|x, y, z|. The desired conclusion follows immediately: |x, y, z| ≐ |O, X, Y, Z| .
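The identity |P̃| = −OᵀO also pins down the orientation of O in practice: any null vector of the camera matrix can be rescaled until the appended determinant comes out negative. The following sketch (our own, not the thesis implementation) computes the oriented center this way and then confirms (2.7) on random data:

    import numpy as np
    rng = np.random.default_rng(1)

    def oriented_center(P):
        # unit null vector of the 3x4 camera matrix ...
        O = np.linalg.svd(P)[2][-1]
        # ... rescaled so that det([P; O^T]) = -O.T @ O < 0
        return -np.linalg.det(np.vstack([P, O])) * O

    P = rng.standard_normal((3, 4))
    O = oriented_center(P)
    assert np.allclose(P @ O, 0, atol=1e-10)
    assert np.linalg.det(np.vstack([P, O])) < 0

    X, Y, Z = rng.standard_normal((3, 4))
    lhs = np.linalg.det(np.column_stack([P @ X, P @ Y, P @ Z]))
    rhs = np.linalg.det(np.column_stack([O, X, Y, Z]))
    assert lhs * rhs > 0      # |x, y, z| and |O, X, Y, Z| agree in sign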
Proposition 2.11 (Corollary). Let Π be a plane in space not containing the camera center O. Then the restriction of the camera projection map P to the plane Π is orientation-preserving if and only if Π ∨ O < 0 (that is, the plane Π is facing away from the camera center).
Each pair of projection planes intersects along a projection ray of the camera. The oriented Plücker coordinates of these three rays are given by Q ∧ R, R ∧ P, and P ∧ Q. The next proposition identifies these rays with the first three points of the canonical projective basis (2.1) for the image plane, T2.
Proposition 2.12. Let A, B, and C be points in T3 such that Q ∧ R ≐ O ∨ A, R ∧ P ≐ O ∨ B, and P ∧ Q ≐ O ∨ C. Then we have

P A ≐ (1, 0, 0)ᵀ = ε1 ,
P B ≐ (0, 1, 0)ᵀ = ε2 ,
P C ≐ (0, 0, 1)ᵀ = ε3 .
Figure 2.1 Illustration of Proposition 2.12.
Proof. Point A lies in the intersection of planes Q and R, so we have Q ∨ A = R ∨ A = 0. Analogously, P ∨ B = R ∨ B = 0 and P ∨ C = Q ∨ C = 0. Therefore, P A = (α, 0, 0)ᵀ, P B = (0, β, 0)ᵀ, and P C = (0, 0, γ)ᵀ. Now we just have to show that α = P ∨ A, β = Q ∨ B, and γ = R ∨ C are all positive.

Since the operation of meet is associative, we can rewrite (2.6) as O ≐ P ∧ (Q ∧ R). Let us now follow the definition of meet (Definition 2.10). Let L be a line such that P ≐ L ∨ O. From the statement of the proposition, we already have a point A such that Q ∧ R ≐ O ∨ A. Because O is the result of the meet P ∧ (Q ∧ R), we know that L ∨ O ∨ A ≐ P ∨ A is equal to the positive universe Υ. According to equation (A.14), α = P ∨ A > 0. Geometrically, the point A on the projection ray Q ∧ R is in front of the plane P. By similar reasoning, we get β = Q ∨ B > 0 and γ = R ∨ C > 0.
The three points ε1, ε2, ε3, together with the unit point ε4 = (1, 1, 1)ᵀ, form the canonical projective basis for the oriented image plane, and for the projectively equivalent space of oriented lines through the camera center O. Thus, we have identified the ray Q ∧ R with the coordinate vector ε1, R ∧ P with ε2, and P ∧ Q with ε3. The next proposition [14, p. 183] shows that the visual ray corresponding to the image point (x1, x2, x3)ᵀ = x1ε1 + x2ε2 + x3ε3 can be written as a linear combination of the three basis visual rays with the same coefficients.
Proposition 2.13. Let X be some (unknown) point in T3 that projects to the point x = (x1, x2, x3)ᵀ ≐ P X. Then the viewing ray containing the point X is given by

O ∨ X ≐ x1 Q ∧ R + x2 R ∧ P + x3 P ∧ Q . (2.8)

In matrix form:

O ∨ X ≐ ( Q ∧ R   R ∧ P   P ∧ Q ) x .
Proof. First, we will derive (2.8) as an unoriented expression [14, p. 183], and then we will check that it indeed represents a line oriented from O to X. Consider the three planes

x1Q − x2P ,   x2R − x3Q ,   x3P − x1R .

Each of these planes contains X. Let us take, for example, the first plane:

(x1Q − x2P) ∨ X = x1(Q ∨ X) − x2(P ∨ X) = x1x2 − x2x1 = 0 .

Also, each of the planes contains O. Thus, we may form a line L containing both O and X as the meet of any two of the planes. At least one of x1, x2, or x3 must be nonzero. Let us suppose that x1 ≠ 0 (the arguments for x2 and x3 are very similar).

L ≐ (x3P − x1R) ∧ (x1Q − x2P) = x1² Q ∧ R + x1x2 R ∧ P + x1x3 P ∧ Q .
Since L contains both O and X, we know that either L ≐ O ∨ X or L ≐ X ∨ O. There are two cases to consider, based on the sign of x1. First, suppose x1 > 0. Then we have

L ≐ x1 Q ∧ R + x2 R ∧ P + x3 P ∧ Q ,
P ∧ L ≐ P ∧ (x1 Q ∧ R) ≐ O .

Let us follow the definition of meet to “deconstruct” the statement O ≐ P ∧ L. Let M be some line such that P ≐ M ∨ O. Also, we can write L ≐ O ∨ X̃, such that either X̃ ≐ X or X̃ ≐ −X. By definition of meet, M ∨ O ∨ X̃ ≐ P ∨ X̃ > 0. Since x1 > 0, we also have P ∨ X > 0. Because both P ∨ X̃ > 0 and P ∨ X > 0, we must have X̃ ≐ X. Therefore, L ≐ O ∨ X.

Next, suppose x1 < 0. Then

L ≐ −x1 Q ∧ R − x2 R ∧ P − x3 P ∧ Q ,
P ∧ L ≐ P ∧ (−x1 Q ∧ R) ≐ O .

The argument is very similar to the one above. Because P ∧ L ≐ O and now P ∨ X < 0, the definition of meet forces X̃ ≐ −X, so that L ≐ O ∨ (−X) = −(O ∨ X). Therefore, O ∨ X ≐ −L ≐ x1 Q ∧ R + x2 R ∧ P + x3 P ∧ Q, as claimed.
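The unoriented content of (2.8) is easy to confirm numerically. In the sketch below (our own bookkeeping, not the thesis's notation), the pencil of two planes A ∧ B is stored as the antisymmetric matrix ABᵀ − BAᵀ, for which a point Z lies on the intersection line exactly when the matrix annihilates Z; the linear combination of (2.8) then annihilates both the camera center and the scene point:

    import numpy as np
    rng = np.random.default_rng(2)

    def wedge(A, B):
        # antisymmetric 4x4 encoding of the intersection line of planes A, B:
        # W @ Z == 0 iff A @ Z == 0 and B @ Z == 0 (for independent A, B)
        return np.outer(A, B) - np.outer(B, A)

    cam = rng.standard_normal((3, 4))     # camera with rows P, Q, R
    P, Q, R = cam
    X = rng.standard_normal(4)            # scene point
    x = cam @ X                           # its image (x1, x2, x3)

    # right-hand side of (2.8): x1 Q^R + x2 R^P + x3 P^Q
    W = x[0] * wedge(Q, R) + x[1] * wedge(R, P) + x[2] * wedge(P, Q)
    O = np.linalg.svd(cam)[2][-1]         # camera center (either sign)
    assert np.allclose(W @ O, 0, atol=1e-9) and np.allclose(W @ X, 0, atol=1e-9)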
In many cases, we are interested not only in the projection of 3D points into an image, but in the projection of other flats, such as lines. The following proposition (for the unoriented version, see Faugeras et al. [14, p. 195]) gives the correct oriented formula for the coefficient vector of a line in 2D given by the projection of a line in 3D. This result will be used in
Section 2.4.1 to define the oriented fundamental matrix between two views.
Proposition 2.14. Let L be a line in space, and l be its image under the perspective projection P. Then we may compute the coefficient vector of the 2D line as follows:

l ≐ ( (Q ∧ R) ∨ L ; (R ∧ P) ∨ L ; (P ∧ Q) ∨ L ) . (2.9)
Proof. Let X and Y be any two points such that L ≐ X ∨ Y. Then

l ≐ (P X) ∨ (P Y) = ( PᵀX ; QᵀX ; RᵀX ) ∨ ( PᵀY ; QᵀY ; RᵀY )
= ( (QᵀX)(RᵀY) − (QᵀY)(RᵀX) ; (RᵀX)(PᵀY) − (RᵀY)(PᵀX) ; (PᵀX)(QᵀY) − (PᵀY)(QᵀX) )
= ( Xᵀ(QRᵀ − RQᵀ)Y ; Xᵀ(RPᵀ − PRᵀ)Y ; Xᵀ(PQᵀ − QPᵀ)Y )
= ( (Q ∧ R) ∨ X ∨ Y ; (R ∧ P) ∨ X ∨ Y ; (P ∧ Q) ∨ X ∨ Y )   by (A.10)
= ( (Q ∧ R) ∨ L ; (R ∧ P) ∨ L ; (P ∧ Q) ∨ L ) .
Using our notation for the coefficient vector of a line (A.17), we can also write (2.9) as

l ≐ P∗ L = ( (Q ∧ R)∗ᵀ ; (R ∧ P)∗ᵀ ; (P ∧ Q)∗ᵀ ) L .
The matrix P∗ is called the line projection matrix.
The next result concerns the inverse operation, back-projection of lines. Of course, a line l in the image does not uniquely determine a line in space. Let L be any line that projects to l in the image, and let Π be the plane containing the camera center O and the line L: Π ≐ O ∨ L. Then any other (unknown) line L′ that also projects to l must also lie on Π. The following proposition shows that the coefficient vector Π can be computed using a simple formula. See Faugeras et al. [14, p. 184] for the unoriented version of the same result.
Proposition 2.15. Let l be a line in the image, and L be any line that projects to l according to (2.9). Then the coefficient vector of the plane Π ≐ O ∨ L can be computed as follows:

Π ≐ Pᵀ l . (2.10)
Proof. Let X and Y be two points such that L ≐ X ∨ Y, and Z be any other point in space. Also, let x, y, and z be the projections of these three points in the image. Note that l ≐ x ∨ y. By equation (2.7), we have the following:

|x, y, z| ≐ |O, X, Y, Z|
(x ∨ y) ∨ z ≐ (O ∨ X ∨ Y) ∨ Z
l ∨ z ≐ Π ∨ Z .

Thus, if Z is some arbitrary point in space, it must have the same orientation relative to Π as its projection, z ≐ P Z, relative to l. Let us rewrite the expression l ∨ z:

l ∨ z ≐ l ∨ (P Z) = lᵀ(P Z) = (lᵀP)Z = (Pᵀl)ᵀZ = (Pᵀl) ∨ Z .

Therefore, Π ≐ Pᵀ l .
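As a quick sanity check of Proposition 2.15 (again a sketch of our own, using the 2D identity x ∨ y = x × y from (A.1)), the back-projected plane Pᵀl contains the camera center and every scene point whose image lies on l:

    import numpy as np
    rng = np.random.default_rng(3)

    cam = rng.standard_normal((3, 4))
    X, Y = rng.standard_normal((2, 4))    # two scene points
    l = np.cross(cam @ X, cam @ Y)        # image line through x and y
    Pi = cam.T @ l                        # back-projected plane, eq. (2.10)

    O = np.linalg.svd(cam)[2][-1]         # camera center
    for Z in (X, Y, O):
        assert abs(Pi @ Z) < 1e-9         # the plane contains X, Y, and O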
2.4 Oriented Multi-View Geometry
2.4.1 Fundamental Matrix
Let Pi and Pj be the projection matrices of two cameras:

Pi = ( Piᵀ ; Qiᵀ ; Riᵀ ) ,   Pj = ( Pjᵀ ; Qjᵀ ; Rjᵀ ) .
Suppose that X is a scene point that projects onto image points xi = (xi1, xi2, xi3)ᵀ ≐ PiX and xj = (xj1, xj2, xj3)ᵀ ≐ PjX (refer to Figure 2.2). The following is an oriented adaptation of the derivation of Faugeras et al. [14, p. 264]. Using (2.8), the oriented lines from the respective camera centers Oi and Oj through X are given by

Li = Oi ∨ X ≐ xi1 Qi ∧ Ri + xi2 Ri ∧ Pi + xi3 Pi ∧ Qi , (2.11)
Lj = Oj ∨ X ≐ xj1 Qj ∧ Rj + xj2 Rj ∧ Pj + xj3 Pj ∧ Qj . (2.12)
Since Li and Lj both pass through the point X, we must have Li ∨ Lj = 0. Because of the linearity of the join operator (Proposition A.1), it is easy to see that the expression for Li ∨ Lj is bilinear in (xi1, xi2, xi3)ᵀ and (xj1, xj2, xj3)ᵀ. We can write this expression compactly in matrix form as

Li ∨ Lj ≐ xjᵀ Fij xi ,   where

Fij = ( (Qi∧Ri)∨(Qj∧Rj)  (Ri∧Pi)∨(Qj∧Rj)  (Pi∧Qi)∨(Qj∧Rj) ;
        (Qi∧Ri)∨(Rj∧Pj)  (Ri∧Pi)∨(Rj∧Pj)  (Pi∧Qi)∨(Rj∧Pj) ;
        (Qi∧Ri)∨(Pj∧Qj)  (Ri∧Pi)∨(Pj∧Qj)  (Pi∧Qi)∨(Pj∧Qj) )

    = ( |Qi, Ri, Qj, Rj|  |Ri, Pi, Qj, Rj|  |Pi, Qi, Qj, Rj| ;
        |Qi, Ri, Rj, Pj|  |Ri, Pi, Rj, Pj|  |Pi, Qi, Rj, Pj| ;
        |Qi, Ri, Pj, Qj|  |Ri, Pi, Pj, Qj|  |Pi, Qi, Pj, Qj| ) . (2.13)
To get a geometric interpretation of the fundamental matrix Fij, let us use the result on projection of lines (2.9) to write out the coefficients of the epipolar line lji, which is the projection of Li onto the image plane of the jth view:

lji = ( (Qj ∧ Rj) ∨ Li ; (Rj ∧ Pj) ∨ Li ; (Pj ∧ Qj) ∨ Li ) . (2.14)

After substituting (2.11) for Li and comparing with (2.13), it becomes easy to see that

lji = Fij xi . (2.15)
Figure 2.2 Epipolar geometry.
In this way, we can regard Fij as the matrix that transforms points in the ith view to corresponding epipolar lines in the jth view. The epipolar constraint
xjᵀ Fij xi = 0 (2.16)
can be rewritten as xj ∨ lji = 0, which says that the point xj lies on the epipolar line lji.
This becomes obvious if we write an alternative expression for lji as the projection of the optical ray Li = Oi ∨ X:

lji ≐ (Pj Oi) ∨ (Pj X) = eji ∨ xj . (2.17)
The point eji ≐ Pj Oi is called the epipole.
If we switch the roles of Pi and Pj, we get the epipolar constraint xiᵀ Fji xj = 0. By inspection of (2.13), it is clear that Fji = Fijᵀ. The projection of the optical ray Lj into the image plane of the ith camera is given by lij = Fji xj. Also, the second epipole is given by eij ≐ Pi Oj.
From looking at (2.17), it is clear that lji = 0 when xj ∼ eji. Thus, we can conclude that

ejiᵀ Fij = Fji eji = 0 .

Similarly,

eijᵀ Fji = Fij eij = 0 .

Thus, eij is the null vector of Fij, and eji is the null vector of Fji.
We can explicitly write out the coordinates of the epipoles as follows:

eij ≐ Pi Oj = ( Pi ∨ (Pj ∧ Qj ∧ Rj) ; Qi ∨ (Pj ∧ Qj ∧ Rj) ; Ri ∨ (Pj ∧ Qj ∧ Rj) )
            = ( |Pi, Pj, Qj, Rj| ; |Qi, Pj, Qj, Rj| ; |Ri, Pj, Qj, Rj| ) , (2.18)

eji ≐ Pj Oi = ( Pj ∨ (Pi ∧ Qi ∧ Ri) ; Qj ∨ (Pi ∧ Qi ∧ Ri) ; Rj ∨ (Pi ∧ Qi ∧ Ri) )
            = ( |Pj, Pi, Qi, Ri| ; |Qj, Pi, Qi, Ri| ; |Rj, Pi, Qi, Ri| ) . (2.19)

The last coordinates ei3 and ej3 of eij and eji reflect the orientations of each camera center with respect to the other camera's focal plane:

sgn(ei3) = Ri ⊙ Oj   and   sgn(ej3) = Rj ⊙ Oi .
Previously, we have assumed that all scene points must be in front of the focal planes of each camera that can see them. This constraint, however, does not apply to camera centers.
Whereas for regular image points, we can simply assume that the third coordinate is positive, this assumption does not hold for epipoles. For this reason, it is crucial to compute epipoles using the oriented formulas above.
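Because every entry of (2.13) and (2.18)–(2.19) is a 4 × 4 determinant of projection-plane coefficient vectors, the oriented fundamental matrix and epipoles are straightforward to compute. The following numpy sketch (our own helper names; the conventions are exactly the formulas above) also checks the epipolar constraint (2.16) and the null-vector property:

    import numpy as np
    rng = np.random.default_rng(4)

    def det4(*cols):
        return np.linalg.det(np.column_stack(cols))

    def oriented_F(Ci, Cj):
        # eq. (2.13); Ci, Cj are 3x4 cameras with rows P, Q, R
        ip = [(Ci[1], Ci[2]), (Ci[2], Ci[0]), (Ci[0], Ci[1])]  # Q^R, R^P, P^Q
        jp = [(Cj[1], Cj[2]), (Cj[2], Cj[0]), (Cj[0], Cj[1])]
        return np.array([[det4(a, b, c, d) for (a, b) in ip] for (c, d) in jp])

    Ci, Cj = rng.standard_normal((2, 3, 4))
    Fij = oriented_F(Ci, Cj)

    X = rng.standard_normal(4)                     # any scene point
    xi, xj = Ci @ X, Cj @ X
    assert abs(xj @ Fij @ xi) < 1e-6               # epipolar constraint (2.16)

    eij = np.array([det4(row, *Cj) for row in Ci]) # oriented epipole (2.18)
    assert np.allclose(Fij @ eij, 0, atol=1e-6)    # eij is the null vector of Fij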
In the beginning of the section, we have seen that if two points xi and xj are the projections of the same 3D point X, then the epipolar constraint (2.16) is satisfied. Let us now consider the converse of this statement. For example, if we have xi ≐ PiX and xj ≐ −PjX, the epipolar constraint is still satisfied, but there is no point in space with a well-defined orientation that would project to the two image points.
Figure 2.3 The epipolar consistency criterion (2.20). The oppositely oriented arrows on the left and right sides of the figure indicate the orientation of Oi ∨ Lj and Oj ∨ Li, respectively.
Proposition 2.16. Let xi and xj be two points such that

xjᵀ Fij xi = 0 ,   Fij xi ≠ 0 ,   Fji xj ≠ 0 .

There exists a unique 3D point X such that xi ≐ PiX and xj ≐ PjX if and only if

Oi ∨ Lj ≐ −Oj ∨ Li , (2.20)

where Li and Lj are two viewing rays given by

Li = ( Qi ∧ Ri   Ri ∧ Pi   Pi ∧ Qi ) xi ,
Lj = ( Qj ∧ Rj   Rj ∧ Pj   Pj ∧ Qj ) xj .
Figure 2.3 illustrates (2.20).
Proof. Suppose there does exist a point X such that xi ≐ PiX and xj ≐ PjX. Then

Oi ∨ Lj ≐ Oi ∨ (Oj ∨ X) = −Oj ∨ (Oi ∨ X) ≐ −Oj ∨ Li .
Now, suppose that the converse is true:

Oi ∨ Lj ≐ −Oj ∨ Li .

Because xi and xj satisfy the epipolar constraint, the visual rays given by Li and Lj intersect in space. That is, there exists a point X such that Li ∨ X = 0 and Lj ∨ X = 0. Note that the orientation of X is not defined, since the same constraints are satisfied by −X. Let us choose the orientation such that Li ≐ Oi ∨ X. Then Lj ≐ Oj ∨ X̃, where X̃ ∼ X. Finally, we have:

Oi ∨ Lj ≐ Oi ∨ Oj ∨ X̃ ,
−Oj ∨ Li ≐ Oi ∨ Oj ∨ X .

Since Oi ∨ Lj ≐ −Oj ∨ Li, X̃ ≐ X. By construction of the viewing rays (Proposition 2.13), X projects to xi and xj.
The epipolar consistency criterion (2.20) can be stated in several alternative ways.
• Since the epipolar lines lij = Fji xj and lji = Fij xi are oriented projections of the rays Lj and Li, respectively, we can invoke Proposition 2.15 to write (2.20) as

Piᵀ (Fji xj) ≐ −Pjᵀ (Fij xi) . (2.21)
• By examining the second half of the proof of Proposition 2.16, we can notice that if (2.20) is not satisfied, there exists a point X such that xi ≐ PiX and xj ≐ −PjX. Then Oj ∨ X ≐ −Lj. But the oriented projection of Lj is lij = Fji xj, and the oriented projection of −Lj is eij ∨ xi. Therefore, we must have

Fji xj ≐ −eij ∨ xi

and, by similar reasoning,

Fij xi ≐ −eji ∨ xj .

Thus, the alternative form of (2.20) becomes

Fji xj ≐ eij ∨ xi   or equivalently,   Fij xi ≐ eji ∨ xj . (2.22)
This is the “strong realizability” condition of Werner and Pajdla [60, 61].
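Taking the sign conventions of (2.13), (2.19), and (2.22) exactly as printed, strong realizability can be tested with a single dot product, since eji ∨ xj is the cross product of the two image vectors. A small sketch of our own:

    import numpy as np
    rng = np.random.default_rng(5)

    def det4(*cols):
        return np.linalg.det(np.column_stack(cols))

    Ci, Cj = rng.standard_normal((2, 3, 4))
    ip = [(Ci[1], Ci[2]), (Ci[2], Ci[0]), (Ci[0], Ci[1])]
    jp = [(Cj[1], Cj[2]), (Cj[2], Cj[0]), (Cj[0], Cj[1])]
    Fij = np.array([[det4(a, b, c, d) for (a, b) in ip] for (c, d) in jp])
    eji = np.array([det4(row, *Ci) for row in Cj])   # epipole, eq. (2.19)

    X = rng.standard_normal(4)
    xi, xj = Ci @ X, Cj @ X          # consistently oriented projections
    assert (Fij @ xi) @ np.cross(eji, xj) > 0    # (2.22) holds
    assert (Fij @ xi) @ np.cross(eji, -xj) < 0   # flipping xj violates it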
2.4.2 Oriented Trifocal Tensor
The following derivation is after Faugeras et al. [14, p. 419].
Let Pi, Pj and Pk be three camera matrices, and L be a 3D line that projects onto lines li = (li1, li2, li3)ᵀ, lj = (lj1, lj2, lj3)ᵀ, and lk = (lk1, lk2, lk3)ᵀ in the three images (see Figure 2.4). Suppose that we know the (properly oriented) coefficient vectors li and lj in two of the images.
Figure 2.4 The trifocal tensor (see text).

According to Proposition 2.15, these two lines back-project to the following planes:
Πi = Oi ∨ L ≐ Piᵀ li ,
Πj = Oj ∨ L ≐ Pjᵀ lj .
By intersecting the above planes, we will retrieve the (unoriented) coordinates of the line L:

Πi ∧ Πj = (Oi ∨ L) ∧ (Oj ∨ L) = (Oi ∨ L) ∧ (L ∨ Oj) ≐ L if Oi ∨ L ∨ Oj > 0, and −L if Oi ∨ L ∨ Oj < 0.

The last step follows by definition of meet. To keep orientation consistent, we will assume that Oi ∨ L ∨ Oj ≐ Πi ∨ Oj ≐ Oi ∨ Πj > 0. In practice, we can compute the properly oriented coordinates of Πi or Πj without knowing L, so we can always find the sign of Oi ∨ L ∨ Oj and interchange i and j if necessary. Let us expand the expression L = Πi ∧ Πj:
L ≐ (Piᵀ li) ∧ (Pjᵀ lj) = (Pi li1 + Qi li2 + Ri li3) ∧ (Pj lj1 + Qj lj2 + Rj lj3)
= li1 lj1 Pi ∧ Pj + li1 lj2 Pi ∧ Qj + li1 lj3 Pi ∧ Rj
+ li2 lj1 Qi ∧ Pj + li2 lj2 Qi ∧ Qj + li2 lj3 Qi ∧ Rj
+ li3 lj1 Ri ∧ Pj + li3 lj2 Ri ∧ Qj + li3 lj3 Ri ∧ Rj .
Next, let us compute the projection of L onto the line in the third image, lk, using the line projection equation (2.9):

lk = ( (Qk ∧ Rk) ∨ L ; (Rk ∧ Pk) ∨ L ; (Pk ∧ Qk) ∨ L ) .

Substituting the expansion of L above makes each coordinate a bilinear form in li and lj; for example,

lk1 = liᵀ ( (Qk∧Rk)∨(Pi∧Pj)  (Qk∧Rk)∨(Pi∧Qj)  (Qk∧Rk)∨(Pi∧Rj) ;
            (Qk∧Rk)∨(Qi∧Pj)  (Qk∧Rk)∨(Qi∧Qj)  (Qk∧Rk)∨(Qi∧Rj) ;
            (Qk∧Rk)∨(Ri∧Pj)  (Qk∧Rk)∨(Ri∧Qj)  (Qk∧Rk)∨(Ri∧Rj) ) lj ,

and analogously for lk2 (with Rk ∧ Pk in place of Qk ∧ Rk) and for lk3 (with Pk ∧ Qk). Overall, we can express the coordinates of lk as a bilinear function of li and lj as follows:

lk = ( liᵀ G¹k lj ; liᵀ G²k lj ; liᵀ G³k lj ) ,   where (2.23)

G¹k = ( |Qk, Rk, Pi, Pj|  |Qk, Rk, Pi, Qj|  |Qk, Rk, Pi, Rj| ;
        |Qk, Rk, Qi, Pj|  |Qk, Rk, Qi, Qj|  |Qk, Rk, Qi, Rj| ;
        |Qk, Rk, Ri, Pj|  |Qk, Rk, Ri, Qj|  |Qk, Rk, Ri, Rj| ) , (2.24)

G²k = ( |Rk, Pk, Pi, Pj|  |Rk, Pk, Pi, Qj|  |Rk, Pk, Pi, Rj| ;
        |Rk, Pk, Qi, Pj|  |Rk, Pk, Qi, Qj|  |Rk, Pk, Qi, Rj| ;
        |Rk, Pk, Ri, Pj|  |Rk, Pk, Ri, Qj|  |Rk, Pk, Ri, Rj| ) , (2.25)

G³k = ( |Pk, Qk, Pi, Pj|  |Pk, Qk, Pi, Qj|  |Pk, Qk, Pi, Rj| ;
        |Pk, Qk, Qi, Pj|  |Pk, Qk, Qi, Qj|  |Pk, Qk, Qi, Rj| ;
        |Pk, Qk, Ri, Pj|  |Pk, Qk, Ri, Qj|  |Pk, Qk, Ri, Rj| ) . (2.26)
The three matrices G¹k, G²k, and G³k are called the trifocal matrices. The whole trio is also sometimes known as the trifocal tensor. For short, we will write (2.23) as
lk = Tijk(li, lj) .
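Like the fundamental matrix, the trifocal matrices reduce to 4 × 4 determinants of plane coefficient vectors, so a direct implementation is short. The sketch below (our own; it asserts only the up-to-scale behavior, since the oriented sign additionally requires the consistency conditions discussed later in this section) transfers a line into a third view and compares it against the directly projected line:

    import numpy as np
    rng = np.random.default_rng(6)

    def det4(*cols):
        return np.linalg.det(np.column_stack(cols))

    Ci, Cj, Ck = rng.standard_normal((3, 3, 4))       # three cameras
    kpairs = [(Ck[1], Ck[2]), (Ck[2], Ck[0]), (Ck[0], Ck[1])]
    G = [np.array([[det4(a, b, r, c) for c in Cj] for r in Ci])
         for (a, b) in kpairs]                        # eqs. (2.24)-(2.26)

    X, Y = rng.standard_normal((2, 4))                # a 3D line through X, Y
    li, lj, lk = (np.cross(C @ X, C @ Y) for C in (Ci, Cj, Ck))
    lk_pred = np.array([li @ Gm @ lj for Gm in G])    # eq. (2.23)
    cos = lk_pred @ lk / (np.linalg.norm(lk_pred) * np.linalg.norm(lk))
    assert abs(cos) > 1 - 1e-6                        # same line, up to sign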
We can regard the trifocal tensor Tijk as a function that, given lines li in view i and lj in view j, returns a line lk in view k such that li, lj, and lk are projections of the same 3D line L. Just as the fundamental matrix is a device for transferring points in one view into lines in the second view, the trifocal tensor is a device for transferring lines in two views into a line in a third view. This transfer fails (that is, we have Tijk(li, lj) = 0) in two cases [14, p. 423]:

1. li and lj are epipolar lines with respect to the kth view. Since li passes through the epipole eik and lj passes through the epipole ejk, the planes Piᵀli and Pjᵀlj formed by back-projecting these lines both contain the camera center Ok. The line L formed by the intersection of these planes is actually a visual ray that passes through Ok, and its image in the kth view degenerates to a point.

2. li and lj are corresponding epipolar lines with respect to the ith and jth views. Then the (unoriented) planes Piᵀli and Pjᵀlj coincide, and the 3D line L is undetermined.
At this stage, one important disclaimer must be made. In the unoriented case, given two lines li and lj that do not satisfy either of the above conditions, there always exists a 3D line
L that projects onto these lines. In the oriented case, there may not be a consistent way of assigning orientation to this 3D line. That is, it may only be possible to find a line L that projects onto li in the first image, and onto −lj in the second one.

Figure 2.5 Orientation consistency for two images of the same line (Proposition 2.17). The ith image plane is on the left, and the jth image plane is on the right.
Proposition 2.17. Given two lines li and lj in the ith and jth views, respectively, there exists an oriented 3D line L such that li ≐ Pi∗ L and lj ≐ Pj∗ L if and only if

Oi ∨ Πj ≐ −Oj ∨ Πi , (2.27)

where Πi and Πj are the two back-projected planes

Πi ≐ Piᵀ li   and   Πj ≐ Pjᵀ lj .

An image-based equivalent of condition (2.27) is

eij ∨ li ≐ −eji ∨ lj .
This condition is illustrated in Figure 2.5. It is related to a result due to Werner and
Pajdla [61] on matching constraints for lines in two images.
2.4.3 Oriented Transfer
Suppose we know two points xi and xj in the ith and jth views that are the projections of the same 3D point X. From Proposition 2.16, we know that this is equivalent to the epipolar consistency requirement (2.20). Without explicitly computing X itself, how can we find the properly oriented point xk ≐ Pk X in the third view?
2.4.3.1 Transfer Using Epipolar Geometry
First, we will solve this problem with the help of epipolar geometry, that is, with the knowledge of the fundamental matrices between the three views. We can apply the matrices Fik and Fjk to map the points xi and xj onto their respective epipolar lines in the kth view:

lki = Fik xi   and   lkj = Fjk xj .
But also, from our oriented derivation in Section 2.4.1, we know that

lki ≐ eki ∨ xk   and   lkj ≐ ekj ∨ xk .
Let us take the intersection (meet) of the two epipolar lines:

lki ∧ lkj ≐ (eki ∨ xk) ∧ (ekj ∨ xk) = −(eki ∨ xk) ∧ (xk ∨ ekj)
          ≐ xk if eki ∨ xk ∨ ekj < 0, and −xk if eki ∨ xk ∨ ekj > 0. (2.28)
Now, we have

eki ∨ xk ∨ ekj ≐ lki ∨ ekj = (Fik xi) ∨ ekj
              ≐ −lkj ∨ eki = −(Fjk xj) ∨ eki . (2.29)
In this way, we do not need the knowledge of xk to find the sign of eki ∨ xk ∨ ekj. Putting together (2.28) and (2.29), we get the following formula for computing the properly oriented coordinates of xk:

xk ≐ (Fik xi) ∧ (Fjk xj)   if (Fik xi) ∨ ekj < 0, or equivalently (Fjk xj) ∨ eki > 0 ;
xk ≐ (Fjk xj) ∧ (Fik xi)   if (Fik xi) ∨ ekj > 0, or equivalently (Fjk xj) ∨ eki < 0 . (2.30)
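Here is a compact sketch of epipolar transfer (our own code; it builds the oriented fundamental matrices from (2.13), uses x ∨ y = x × y for 2D joins and meets, and asserts only the up-to-sign result, the branch itself following the sign test of (2.30)):

    import numpy as np
    rng = np.random.default_rng(7)

    def det4(*cols):
        return np.linalg.det(np.column_stack(cols))

    def oriented_F(Ca, Cb):
        ap = [(Ca[1], Ca[2]), (Ca[2], Ca[0]), (Ca[0], Ca[1])]
        bp = [(Cb[1], Cb[2]), (Cb[2], Cb[0]), (Cb[0], Cb[1])]
        return np.array([[det4(p, q, r, s) for (p, q) in ap] for (r, s) in bp])

    Ci, Cj, Ck = rng.standard_normal((3, 3, 4))
    X = rng.standard_normal(4)
    xi, xj, xk = Ci @ X, Cj @ X, Ck @ X

    lki, lkj = oriented_F(Ci, Ck) @ xi, oriented_F(Cj, Ck) @ xj
    ekj = np.array([det4(row, *Cj) for row in Ck])   # epipole Pk Oj
    # branch of (2.30), decided by the sign of (Fik xi) v ekj
    xk_pred = np.cross(lki, lkj) if lki @ ekj < 0 else np.cross(lkj, lki)
    cos = xk_pred @ xk / (np.linalg.norm(xk_pred) * np.linalg.norm(xk))
    assert abs(cos) > 1 - 1e-6     # transferred point matches up to sign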
Epipolar transfer is illustrated in Figure 2.6. It is well known [14, p. 413] that point transfer using fundamental matrices fails in the following cases:
Figure 2.6 Transfer using epipolar geometry (see text).
1. The 3D point X lies on the line defined by the camera centers Oi and Oj. In this case, we have lki ∼ (Pk Oi) ∨ (Pk Oj) ≐ eki ∨ ekj. Similarly, lkj ∼ ekj ∨ eki. Thus, the two (unoriented) epipolar lines lki and lkj coincide.

2. The 3D point X lies on the line defined by Oi and Ok, or Oj and Ok. In the first case, we have xi ∼ eik and Fik xi = 0; in the second case, xj ∼ ejk and Fjk xj = 0.

3. The three camera centers Oi, Oj and Ok are collinear. Then eki ∼ ekj, and the (unoriented) epipolar lines lki ≐ eki ∨ xk and lkj ≐ ekj ∨ xk coincide.

4. The 3D point X is in the trifocal plane containing Oi, Oj and Ok. Then the image points eki, ekj, and xk are collinear, and the lines lki and lkj once again coincide.
In the last three cases above, we actually have enough information to predict the position of xk. Since the visual rays formed by back-projecting xi and xj are distinct, we can recover X as the unique point of their intersection in space, and find xk by projecting X into the third image. The failure of equation (2.30) in these cases is an unsatisfying feature of the method. In practice, the problem may be even more serious: whenever the configuration of the three cameras approaches one of the degenerate cases above, (2.30) becomes numerically unstable.
2.4.3.2 Transfer Using the Trifocal Tensor
Next, let us see how the transfer of points can be accomplished with the help of the trifocal tensor. Let li be a line in the ith view containing the point xi. Also, let lj and l′j be two lines in the jth view such that lj ∧ l′j ≐ xj (see Figure 2.7). Consider the three back-projected planes

Πi = Piᵀ li ,   Πj = Pjᵀ lj ,   Π′j = Pjᵀ l′j .

Let us assume that the pairs (li, lj) and (li, l′j) meet the trifocal consistency constraints of Proposition 2.17:

Πi ∨ Oj ≐ Oi ∨ Πj ≐ Oi ∨ Π′j > 0 . (2.31)

These constraints guarantee the existence of two 3D lines L and L′ such that

L = Πi ∧ Πj   and   L′ = Πi ∧ Π′j .

Moreover, the properly oriented projections of these lines into the kth image are given by

lk = Tijk(li, lj)   and   l′k = Tijk(li, l′j) .

Let yj and y′j be any two points in the jth image such that

lj ≐ yj ∨ xj   and   l′j ≐ xj ∨ y′j .
Figure 2.7 Transfer using the trifocal tensor.
Because we assumed that lj ∧ l′j ≐ xj, we must have

|yj, xj, y′j| > 0 . (2.32)

Let X, Y, and Y′ be points on the plane Πi that project onto xj, yj and y′j (recall that xi and xj are constrained to meet the epipolar consistency criterion (2.20), so X must project onto xi as well). Because lj ≐ yj ∨ xj and l′j ≐ xj ∨ y′j are oriented projections of the lines L and L′, respectively, we can write

L ≐ Y ∨ X   and   L′ ≐ X ∨ Y′ .

By Proposition 2.10, (2.32) implies that

|Oj, Y, X, Y′| > 0 .

By (2.31), we know that Πi ∨ Oj > 0. Since the points Y, X, and Y′ all belong to the plane Πi, we must conclude that

Y ∨ X ∨ Y′ ≐ −Πi .
Now, let us consider the transformation induced on the plane Πi by the third projection matrix Pk. By Proposition 2.10,

Πi ∨ Ok ≐ |Ok, Y, X, Y′| ≐ |yk, xk, y′k| ,

where yk ≐ Pk Y and y′k ≐ Pk Y′. Since lk and l′k are projections of L ≐ Y ∨ X and L′ ≐ X ∨ Y′, respectively, we have

lk ≐ yk ∨ xk   and   l′k ≐ xk ∨ y′k .

If |yk, xk, y′k| > 0, then xk ≐ lk ∧ l′k, and if |yk, xk, y′k| < 0, then xk ≐ l′k ∧ lk. Finally, we have the procedure for finding the properly oriented coordinates of xk:

xk ≐ Tijk(li, lj) ∧ Tijk(li, l′j)   if Πi ∨ Ok > 0 ;
xk ≐ Tijk(li, l′j) ∧ Tijk(li, lj)   if Πi ∨ Ok < 0 . (2.33)
Though it is not necessary for the derivation presented above, it is useful to explicitly write out the coordinates of the point X, defined as the point that projects onto xj in the jth image and lies on the plane Πi = Piᵀ li. Because li passes through the point xi in the ith image, and xi and xj are in exact epipolar correspondence, X also projects to xi. The coordinates of X can be found by taking the meet of the visual ray Mj ≐ Oj ∨ X with the plane Πi:

Mj ∧ Πi ≐ (Oj ∨ X) ∧ Πi = −X ,   since Oj ∨ Πi < 0 by (2.31) .

Plugging in formula (2.8) to find Mj, we end up with the following expression for X:

X = −Mj ∧ Πi = −( Qj ∧ Rj ∧ Πi   Rj ∧ Pj ∧ Πi   Pj ∧ Qj ∧ Πi ) xj . (2.34)
This is one simple example of a formula for reconstructing the 3D coordinates of a point based on its projections in two images.
2.5 Oriented Projective Reconstruction
In our derivations of the fundamental matrix and trifocal tensor in Section 2.4, we have assumed that the projection matrices are known for all cameras in the scene. In practice, fundamental matrices or trifocal tensors are often estimated first, and the camera matrices are derived from them. However, the goal of our research is to reconstruct objects in space given not just two or three, but an arbitrary number of views. Because it is inefficient to independently estimate trifocal tensors for each triple of views in the scene, we would like to have all the camera matrices computed at once. In Section 2.3, we have placed no affine or Euclidean constraints on the camera matrices, only requiring them to have full rank. There exist several methods in the literature [14, 21] for projective reconstruction under this assumption. However, the output of these methods does not respect oriented constraints. In this section, we will discuss the issues involved in upgrading a standard projective reconstruction to an oriented projective reconstruction.
Let us formulate the reconstruction problem more precisely. Suppose we take n images of a scene consisting of m points. Let x_i^j denote the image of the ith point in the jth picture. Note that if x_i^j = (x_i1^j, x_i2^j, x_i3^j)ᵀ is defined, this means that the ith point is visible to the jth camera, and we can assume that x_i3^j > 0 (recall (2.5) in Section 2.3). A projective reconstruction consists of estimated 3D points X1, ..., Xm and camera matrices P1, ..., Pn, such that

x_i^j ∼ Pj Xi . (2.35)

However, we need a reconstruction meeting a stronger assumption. Since each x_i^j is determined up to a positive factor by the constraint x_i3^j > 0, the projected point Pj Xi must also match x_i^j up to a positive factor:

x_i^j ≐ Pj Xi . (2.36)
If the cameras P1, ..., Pn and the 3D points X1, ..., Xm meet the above constraint, they are said to constitute a strong realization or an oriented projective reconstruction of the scene. Fortunately, it is relatively simple to convert a reconstruction satisfying (2.35) to a reconstruction satisfying (2.36). We have the following result, due to Hartley [20], [21, p. 508]:

Proposition 2.18. Suppose there exists a set of points X̂1, ..., X̂m and cameras P̂1, ..., P̂n such that

x_i^j ≐ P̂j X̂i .

Moreover, the third coordinate of each x_i^j is positive. That is, the point X̂i lies in front of the focal plane of camera P̂j if x_i^j is defined. Let P1, ..., Pn; X1, ..., Xm be a projective reconstruction of the same scene, such that (2.35) is satisfied. Then there exist camera matrices P̃1 = ±P1, ..., P̃n = ±Pn and 3D points X̃1 = ±X1, ..., X̃m = ±Xm such that for each x_i^j,

x_i^j ≐ P̃j X̃i .
To compute the sign-adjusted matrices and 3D points, we can use the following procedure. Begin with any point x_i^j. If x_i^j ≐ Pj Xi, leave Pj and Xi alone. If x_i^j ≐ −Pj Xi, multiply Pj by −1. For every other point Xk such that x_k^j is defined, multiply Xk by −1 if x_k^j ≐ −Pj Xk. Next, consider all cameras that can see these points and flip their signs if necessary. In this way, continue propagating the changes until (2.36) is satisfied for all cameras and points.
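A minimal implementation of this propagation (our own sketch; dictionaries and breadth-first search are just one convenient realization) treats cameras and points as nodes of a visibility graph and fixes the sign of each node once, one connected component at a time; Proposition 2.18 guarantees that the propagation never meets a contradiction:

    import numpy as np
    from collections import deque

    def orient_reconstruction(cams, pts, obs):
        # cams: list of 3x4 arrays; pts: list of 4-vectors;
        # obs: dict (i, j) -> observed homogeneous image point x_i^j.
        # Returns +/-1 multipliers making x_i^j match cams[j] @ pts[i]
        # up to a positive factor wherever (i, j) is observed.
        rel = {(i, j): np.sign(v @ (cams[j] @ pts[i]))
               for (i, j), v in obs.items()}
        adj = {}                                 # bipartite visibility graph
        for (i, j) in rel:
            adj.setdefault(("pt", i), []).append(("cam", j))
            adj.setdefault(("cam", j), []).append(("pt", i))
        sign = {}
        for start in adj:                        # BFS per connected component
            if start in sign:
                continue
            sign[start] = 1.0
            q = deque([start])
            while q:
                u = q.popleft()
                for w in adj[u]:
                    i, j = (u[1], w[1]) if u[0] == "pt" else (w[1], u[1])
                    if w not in sign:            # make the factor positive
                        sign[w] = sign[u] * rel[(i, j)]
                        q.append(w)
        s_cam = [sign.get(("cam", j), 1.0) for j in range(len(cams))]
        s_pt = [sign.get(("pt", i), 1.0) for i in range(len(pts))]
        return s_cam, s_pt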
So far, we have described how to upgrade a projective reconstruction to an oriented projective reconstruction, where each camera matrix Pj is determined up to a positive scale factor. Notice, however, that the upgrade is not unique. If P1, ..., Pn; X1, ..., Xm is an oriented reconstruction, we could multiply all the cameras and all the points by −1, and obtain another oriented reconstruction, since the overall sign of each PjXi would remain unchanged. This observation brings us to the subject of the ambiguity inherent in any reconstruction.
Let us assume that P1, ..., Pn; X1, ..., Xm meets (2.36), and consider what happens when we respectively transform the cameras and the points as

P̃j = Pj M⁻¹   and   X̃i = M Xi , (2.37)

where M is a non-singular 4 × 4 matrix. This is called the projective ambiguity of the scene:

P̃j X̃i = (Pj M⁻¹)(M Xi) = Pj Xi ≐ x_i^j . (2.38)
Based on the knowledge of the 2D points x_i^j alone, we cannot tell whether the “real” reconstruction should be P1, ..., Pn; X1, ..., Xm or P̃1, ..., P̃n; X̃1, ..., X̃m. The degrees of freedom of the unknown transformation M encapsulate our ignorance about the scene based on the available 2D data. Note that we have not constrained the sign of the determinant of M: it may be either an orientation-preserving or an orientation-reversing transformation. If M is orientation-reversing, it simultaneously flips the orientations of the camera focal planes and the image points, so their relative orientation remains the same. It is also possible to show that an arbitrary projective transformation does not affect the relative orientation of camera centers and scene planes. However, as the following argument will demonstrate, the relative orientation of one camera's center with respect to another camera's focal plane does depend on the sign of the determinant of M. Consider the transformation of a single camera Pj:

P̃j = ( P̃jᵀ ; Q̃jᵀ ; R̃jᵀ ) = Pj M⁻¹ = ( Pjᵀ M⁻¹ ; Qjᵀ M⁻¹ ; Rjᵀ M⁻¹ ) .

By Proposition 2.9, the center Oj of the camera transforms as

Õj = P̃j ∧ Q̃j ∧ R̃j = (M⁻ᵀPj) ∧ (M⁻ᵀQj) ∧ (M⁻ᵀRj) = |M⁻¹| M Oj .
Consider some other camera in the scene, say Pk. Let us also transform this camera as P̃k = Pk M⁻¹, and compute the relative orientation of its focal plane R̃k and the center Õj of the first camera: R̃k ⊙ Õj = sgn(R̃k ∨ Õj), with

R̃k ∨ Õj = (M⁻ᵀRk) ∨ (|M⁻¹| M Oj) = (M⁻ᵀRk)ᵀ (|M⁻¹| M Oj)
= |M⁻¹| Rkᵀ M⁻¹ M Oj = |M⁻¹| Rkᵀ Oj
= |M⁻¹| (Rk ∨ Oj) .
If |M −1| > 0, the relative orientation of the camera centers and focal planes does not change. If we know whether the center of the jth camera is actually visible to the kth camera (this is equivalent to knowing the proper orientation of the epipole ekj), we can constrain M to have either positive or negative determinant, to enforce the correct visibility relationship between the cameras.
For the purposes of this thesis, we have not needed to estimate properly oriented camera matrices based on multi-view point matches, since metric camera calibration was available for every data set demonstrated in Chapter 5. However, in the future it will be necessary to further study the process of oriented reconstruction, and to implement and test this process on real-world data.
CHAPTER 3
Projective Differential Geometry
Projective differential geometry (PDG) deals with the properties of curves and surfaces that remain invariant under projective transformations. Many familiar constructions of Euclidean differential geometry are absent here, including surface normals, Gaussian curvature, and the Gauss map. We are interested only in the simplest projective differential invariants that require partial derivatives up to second order. The most important such invariant is local shape, discussed in Section 3.2.6.
As a branch of mathematics, PDG seems to have seen most of its activity before the 1950s. The bulk of our presentation in this chapter follows the 1932 textbook of Lane [28], which seems to be the most recent elementary English-language source. Currently, PDG is all but unused in the field of computer vision. We tend to agree with Koenderink [25], who states that PDG has many applications in computer vision, and that its neglected status is due largely to the unavailability of accessible introductory texts. We are interested in reviving PDG because it offers the right framework for the problem of reconstructing smooth curves and surfaces based on projective information alone. Over the past decade, vision researchers have been acquiring a deeper understanding of projective reconstruction techniques for points, lines, and planes [14, 21]. In our opinion, the addition of tools applicable to the reconstruction of more complex geometric entities would greatly enrich the subject of multi-view geometry.
3.1 Curves
We will deal with smooth curves in Pn for n = 2, 3.
Definition 3.1 (Curve). A parametrized smooth curve in Pn is a smooth mapping x : I → Pn, where I is an open interval of the real line R. That is, the curve point corresponding to any parameter value t ∈ I can be written in homogeneous coordinates as

x(t) = ( x1(t), ..., xn+1(t) )ᵀ

such that the coordinate functions xi(t) are infinitely continuously differentiable.

A point x(t) is called regular if the derivative x′(t) = ( x1′(t), ..., xn+1′(t) )ᵀ exists and is independent of x(t), that is, x′(t) ≠ 0 and x′(t) is not a scalar multiple of x(t). The trace of the curve is the set Γ ⊂ Pn that is the image of the domain I under the mapping x. In the subsequent sections, when we refer to a curve as Γ, we mean that Γ is the trace of a differentiable map, as defined above. Beginning in Section 3.3.1, we will also be interested in closed curves. A closed curve can be defined as a differentiable map from the unit circle to Pn. Another definition, after Do Carmo [12, p. 30], is as follows:
Definition 3.2 (Closed Curve). A curve defined on a closed interval I = [a, b] is the restriction of a curve defined on an open interval that contains [a, b]. A closed curve is a mapping x : I → Pn defined on the interval I = [a, b] such that all the derivatives of x agree at x(a) and x(b):

x^(i)(a) = x^(i)(b) ,   i = 0, ..., ∞ .
3.1.1 Differential Equations of Curves
Following Lane [28], we begin the study of curves in projective space by showing that a curve contained in a k-dimensional flat of Pn must satisfy a differential equation of order k + 1. Incidentally, this characteristic differential equation is the same for two curves related by a projective transformation, so it can serve as a kind of projective differential invariant.
Definition 3.3 (Immersion). A curve Γ is said to be immersed in a linear subspace of dimension k (that is, a k-dimensional flat) if it is contained in a linear subspace of dimension k, but not in any subspace of lower dimension.
Proposition 3.1. A curve Γ is immersed in a linear subspace of dimension k if and only if the coordinates x(t) of a variable point on the curve satisfy a linear homogeneous differential equation of order k + 1, but not of lower order [28, p. 5].
Proof. Suppose that Γ is immersed in a k-dimensional subspace defined by k + 1 points a1, ..., ak+1. We will denote the coordinates of the ith point as ai = (ai,1, ..., ai,n+1)ᵀ. Then we may express a point x on the curve as a linear combination of k + 1 independent scalar functions ξi associated with each ai:

x(t) = Σ_{i=1}^{k+1} ξi(t) ai , (3.1)

xj(t) = Σ_{i=1}^{k+1} ξi(t) aij   for j = 1, ..., n + 1 . (3.2)

Equation (3.2) states that the function xj(t) is linearly dependent on the functions ξ1(t), ..., ξk+1(t). This means that we can find k + 2 constants c0, ..., ck+1 such that the following equation holds:

c0 xj + c1 ξ1 + ... + ck+1 ξk+1 = 0 .

Differentiating k + 1 times, we get the following system:

c0 xj + c1 ξ1 + ... + ck+1 ξk+1 = 0
c0 xj′ + c1 ξ1′ + ... + ck+1 ξk+1′ = 0
. . .
c0 xj^(k+1) + c1 ξ1^(k+1) + ... + ck+1 ξk+1^(k+1) = 0 . (3.3)

There exists a non-trivial solution for c0, ..., ck+1 if and only if the matrix of the system is singular:

| xj        ξ1        ...  ξk+1 ;
  xj′       ξ1′       ...  ξk+1′ ;
  ...
  xj^(k+1)  ξ1^(k+1)  ...  ξk+1^(k+1) | = 0 . (3.4)

Expanding the above determinant along the first column, and dividing out by the coefficient of xj^(k+1) (since we assumed that Γ cannot be immersed in a space of dimension less than k, we know that none of the minor determinants are zero), we get a differential equation of the form

xj^(k+1) + φ1 xj^(k) + ... + φk+1 xj = 0 ,   j = 1, ..., n + 1 ,   or

x^(k+1) + φ1 x^(k) + ... + φk+1 x = 0 , (3.5)

where φ1, ..., φk+1 are scalar functions of t. It is clear that (3.5) is an order-(k + 1) linear homogeneous differential equation.
For example, a curve that reduces to a single point satisfies the equation

x′ + φ x = 0 . (3.6)

We can see this by realizing that the solution is of the form x = λa, where λ is a scalar function of t and a is a fixed point. A line, which can be written as x = λ1 a + λ2 b, satisfies the equation

x″ + φ1 x′ + φ2 x = 0 . (3.7)

Finally, the equation of a plane curve is

x‴ + φ1 x″ + φ2 x′ + φ3 x = 0 . (3.8)
Obviously, in P2, every curve satisfies (3.8). The general solution to equation (3.5) has the form (3.1): it is a linear combination of the functions ξ1, ..., ξk+1 weighted by arbitrary constants a1, ..., ak+1. Any two sets of k + 1 independent points a1, ..., ak+1 and ã1, ..., ãk+1 can be related by a projective transformation of the form

ãi = M ai , (3.9)

where M is some (n + 1) × (n + 1) non-singular matrix. Thus, the differential equation (3.5) is the same for any two curves Γ and Γ̃ whose points x(t) and x̃(t) are related by a projective transformation x̃(t) = M x(t). Note that the exact form of (3.5) is not preserved under certain transformations that do not change the curve, such as multiplying x(t) by a scalar function μ(t) or reparametrizing the curve as x(t(s)). However, this is not a problem for us since we will not use the differential equations themselves as curve invariants.
3.1.2 Osculating Spaces
Equation (3.5) says that x^(k+1) lies in the span of the independent points x, x′, x″, ..., x^(k). These k + 1 points are useful for defining a series of subspaces that locally characterize Γ at x [28, p. 11]:

Definition 3.4 (Osculating Space). At a regular point x of a curve Γ immersed in n-dimensional space, the osculating linear space of order k is the flat spanned by x, x′, x″, ..., x^(k).
We will use the join operator to write the order-k osculating subspace as x ∨ x′ ∨ ... ∨ x^(k). The most important osculating space is of order 1, the tangent line x ∨ x′. The tangent line can be defined as the limit of a line connecting x with a “nearby” point that approaches x along Γ. In this framework, we can give a geometric interpretation to x′ as the derivative point that lies on the tangent line and is defined by a process of convergence¹:

x′(t) = lim_{δt→0} [ x(t + δt) − x(t) ] / δt . (3.10)
Consider what happens when x is multiplied by a scalar function μ(t), a transformation that does not change the image of the curve in Pn. Taking the derivative, we get

(μx)′ = μ′x + μx′ . (3.11)

¹Strictly speaking, we need to define convergence in Pn, but this can be easily done [50, p. 72].

As it turns out, we can scale x to place the derivative point (μx)′ anywhere on the tangent to the curve Γ at x, though it cannot coincide with x itself unless Γ is zero-dimensional (this fact is expressed by equation (3.6)).
Proposition 3.2. The curve tangent is a projective differential invariant. That is, the tangent is not affected by scaling of the homogeneous coordinates of x by a scalar function, or by reparametrization of the curve. If a projective transformation x̃(t) = M x(t) is applied to the curve, the tangent transforms in a natural way: the tangent line at the transformed point x̃ is given by (Mx) ∨ (Mx)′.
Proof. First, consider the effect of multiplying the curve by a scalar function μ(t): x̃(t) = μ(t) x(t). As can be seen from (3.11), the new point x̃′ = (μx)′ is still in the span of x and x′, so the tangent line remains invariant to scaling of homogeneous coordinates. If we reparametrize the curve as x̃(s) = x(t(s)), where t(s) is a differentiable function representing the parameter change, we can use the chain rule to get

x̃′ = (dt/ds) x′ .

Thus, the linear span of x and x′ is not changed by reparametrization. Finally, the fact that the tangent line at x̃ is given by (Mx) ∨ (Mx)′ is obvious:

x̃ ∨ x̃′ = (Mx) ∨ (Mx)′ = (Mx) ∨ (Mx′)   by linearity of differentiation.
The order-2 osculating space x ∨ x′ ∨ x″, or the osculating plane, is the limit of the plane through x and two points that independently approach x. Looking back at equation (3.7), we can see that for a curve that is contained in a one-dimensional subspace of P3, x″ is actually linearly dependent on x and x′, so a proper osculating plane does not exist, degenerating instead to a line. More generally, (3.5) states that for a curve contained in a k-dimensional subspace, x^(k+1) is dependent on x, x′, ..., x^(k). That is, x^(k+1) lies in the osculating subspace of order k, so the osculating subspace of order k + 1 is degenerate.

Type                                          Dimensions of osculating spaces
Regular point                                 0, 1, 2
Inflection point                              0, 1, 1
Flat point                                    0, 1, 1, 1
Handle point (stationary osculating plane)    0, 1, 2, 2
Cusp                                          0, 0

Table 3.1 Projectively invariant classification of curve points.
Using the same techniques as in the proof of Proposition 3.2, we can establish that projective osculating subspaces of any order are projective differential invariants [50, p. 72].
Taking advantage of this result, we can obtain a projectively invariant classification of curve points based on the sequence of dimensions of their osculating spaces of order 0, 1, etc. Table 3.1 shows the characteristic sequences for a few useful types [50, p. 73].
In particular, an inflection point has a degenerate osculating plane, which means that the point x″ is in the span of the points x and x′. This is precisely the condition expressed in (3.7). Even though we derived (3.7) using the global assumption that all points on the curve lie on a fixed line, the equation can also be interpreted locally. Namely, (3.7) holds at a particular point x if the curve instantaneously becomes one-dimensional at x. In the same way, it is easy to see that a cusp locally satisfies the equation (3.6) of a 0-dimensional curve, and a handle point locally satisfies the equation (3.8) for a plane curve.
3.1.3 Order of Contact
Possibly the most important invariant of projective differential geometry is order of contact², which describes how “intimately” curves interact with other curves and lines in space.
²In fact, since the definition of order of contact relies only on the differential properties of the curves, and not on the linear structure of space, order of contact is invariant not only under projective transformations, but under a more general class of diffeomorphisms.
Definition 3.5 (Order of Contact). Two curves Γ1 and Γ2, parametrized by x1(s) and x2(t) respectively, have contact of order k at a point x = x1(s0) = x2(t0) if there exists a parameter transformation t = t(s) with t0 = t(s0) such that

x1^(i)(s0) = x2^(i)(t(s0)) ,   i = 0, ..., k .
The order of contact of two curves is the maximum k such that the curves have contact of order k there [50, p. 75].
Intuitively, two curves have order of contact k at a certain point x if they intersect each other at k+1 “consecutive” points. By visualizing contact in this way, one can easily become convinced of the following well-known facts.
Proposition 3.3. A tangent line to a curve Γ at a point x has order of contact ≥ 1 with Γ at x. If x is an inflection point, then the tangent has order of contact ≥ 2 with Γ at x.
3.2 Surfaces
In our work, we will use the word surface to denote a two-dimensional manifold immersed in P3, each of whose points can locally be expressed in parametric form using smooth coordinate functions.
Definition 3.6 (Surface). A set Σ ⊂ P3 is a surface if each point in Σ is contained in an open coordinate neighborhood V ⊂ P3, and there exists a map x : U → V ∩ Σ, where U is an open subset of R2, such that x is smooth (has continuous partial derivatives of all orders) and has a continuous inverse map x⁻¹ : V ∩ Σ → U.
Using homogeneous coordinates, we can write any point in Σ as

x(u, v) = ( x1(u, v), x2(u, v), x3(u, v), x4(u, v) )ᵀ ,

where each xi(u, v) is a smooth coordinate function. Next, let us define the notion of a tangent plane [28, p. 35]:
Definition 3.7 (Tangent Plane). The tangent plane at a point of a surface is the plane containing the tangent lines at the point of all curves on the surface through the point.

Note that this definition makes no reference to the concept of a normal to the surface, which does not exist in projective space. We can easily verify that the tangent plane is determined by x ∨ xu ∨ xv, where subscripts denote partial differentiation with respect to u and v. Let Γ be a curve on the surface Σ passing through the point x. The equation of Γ is x(t) = x(u(t), v(t)). The tangent to Γ at x is spanned by x and

x′ = xu u′ + xv v′ . (3.12)

Thus, we can see that the tangent to any surface curve Γ at x lies in the subspace spanned by x, xu, and xv (see Figure 3.1). A point x on Σ is called regular if x, xu, and xv are independent, so they indeed span a plane. A surface that contains exclusively regular points is called smooth or regular³.
Figure 3.1 The surface tangent plane.
³We can modify Definition 3.6 to exclude the possibility of non-regular (singular) points: simply add the condition that the inverse map x⁻¹ : V ∩ Σ → U must also be smooth.
3.2.1 Order of Contact of Surfaces
In this section, we state the definitions of order of contact for two surfaces, and for a surface and a curve. In Section 3.2.6, we will use these notions to classify the local shape of a surface in the neighborhood of a point by the number of tangent lines having order-2 contact with the surface at that point.
Definition 3.8 (Contact of Two Surfaces). Two surfaces Σ1 and Σ2, parametrized by x1(s, t) and x2(u, v) in the neighborhood of a common point x = x1(s0, t0) = x2(u0, v0), have contact of order k at x if there exists a locally regular parameter transformation u = u(s, t), v = v(s, t) such that all derivatives of x1(s, t) and x2(u(s, t), v(s, t)) are equal up to order k [50, p. 79].
Definition 3.9 (Contact of a Curve and a Surface). A curve Γ and a surface Σ, which have a common point x, have contact of order k if there exists a curve Γ′ ⊂ Σ passing through x such that Γ and Γ′ have contact of order k [12, p. 171].
For example, a tangent line (as well as the tangent plane itself) has order ≥ 1 contact with the surface, and a curve has order ≥ 2 contact with its osculating plane.
3.2.2 Developable Surfaces
In the rest of this document, we will make extensive use of the properties of a developable, a special kind of surface made up of a one-parameter family of lines. For our purposes, a developable is a surface that locally (that is, in the neighborhood of each of its points) can be described as either a tangent developable or a cone, as defined below.
Definition 3.10 (Tangent Developable). A tangent developable is a surface that contains all the tangent lines of a curve y(u). The equation of a tangent developable can be written as

x(u, v) = y(u) + v y′(u) . (3.13)

The tangent lines to y are called the generators of the tangent developable, and the curve y itself is called the edge of regression. The edge of regression can be regarded as the locus of intersections of “consecutive” straight lines contained on the surface.
Definition 3.11 (Cone). A cone is a surface that has the parametric form
x(u, v) = a + v y(u) , (3.14)

where y(u) is some curve and a is a constant. The point a is called the apex of the cone. Note that a cone and a cylinder are projectively equivalent (we may regard a cylinder as a cone with the apex at infinity).
Developable surfaces can also be defined as envelopes of a one-parameter family of planes. The following proposition deals with a property of developable surfaces that follows directly from this definition.
Proposition 3.4. The tangent plane of a developable surface remains the same for any point along a single generator.
Proof. Let us first compute the tangent plane of a tangent developable (3.13):

x = y + v y′ ,
xu = y′ + v y″ ,
xv = y′ ,
x ∨ xu ∨ xv ∼ y ∨ y′ ∨ y″ .

The tangent plane at x(u, v) is spanned by y, y′, and y″, and there is no dependence on v. Note that for values of u where the curve y has an inflection point, the tangent plane of the developable is not defined.
We can verify the same property for the equation of a cone (3.14):

x = a + v y ,
xu = v y′ ,
xv = y ,
x ∨ xu ∨ xv ∼ a ∨ y ∨ y′ .

Note that the tangent plane of a cone is not defined when v = 0.
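Proposition 3.4 can also be checked numerically, for instance on the tangent developable of the twisted cubic (our own example; the plane-from-points helper fixes one sign convention, so the comparison allows an overall sign):

    import numpy as np

    y   = lambda u: np.array([1.0, u, u**2, u**3])    # twisted cubic
    yp  = lambda u: np.array([0.0, 1.0, 2*u, 3*u**2])
    ypp = lambda u: np.array([0.0, 0.0, 2.0, 6*u])

    def tangent_plane(u, v):
        x, xu, xv = y(u) + v*yp(u), yp(u) + v*ypp(u), yp(u)
        S = np.column_stack([x, xu, xv])
        h = np.array([(-1)**(i+1) * np.linalg.det(np.delete(S, i, 0))
                      for i in range(4)])
        return h / np.linalg.norm(h)

    h1, h2 = tangent_plane(0.5, 1.0), tangent_plane(0.5, 3.0)
    assert np.allclose(h1, h2) or np.allclose(h1, -h2)  # constant along a generator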
3.2.3 Conjugate Nets
Our goal in this section is to derive a formula (3.22) relating pairs of special conjugate directions on a surface tangent plane. By a direction on the tangent plane to the surface Σ at a point x, we will mean any point on the line spanned by the first-order partial derivatives xu and xv. The bilinear form appearing in (3.22) is actually the projective version of the famous second fundamental form of Euclidean differential geometry. We will introduce the notion of conjugacy through a geometric definition of a conjugate net.
Definition 3.12 (Net of Curves). Two one-parameter families of curves on the surface Σ are said to form a net if exactly one curve of each family passes through each point of Σ, and the tangents of the two curves passing through the same point are always distinct [28, p. 34].
Definition 3.13 (Conjugate Net). A net is conjugate if the tangents of the curves of one family of the net, constructed at the points of each fixed curve of the other family, form a developable surface. The tangents to the two curves passing through any point are said to be conjugate or to lie in conjugate directions [28, p. 122].
It is known that a surface immersed in P3 can support infinitely many conjugate nets.
In fact, one of the families of a conjugate net can be assigned arbitrarily [28, p. 127].
Proposition 3.5. Consider the neighborhood of a point x = x(s0, t0) on the surface Σ, and let the two families of curves of a conjugate net be given by

x(s, t0) = x( u(s, t0), v(s, t0) )   and   x(s0, t) = x( u(s0, t), v(s0, t) ) . (3.15)

(For the first family, t is treated as a constant, and for the second family, s is treated as a constant.) Then the tangent directions xs = xu us + xv vs and xt = xu ut + xv vt of the two curves passing through the point x = x(s0, t0) satisfy the following curvilinear differential equation:

l us ut + m (us vt + ut vs) + n vs vt = 0 , (3.16)

where l = |x, xu, xv, xuu|, m = |x, xu, xv, xuv|, and n = |x, xu, xv, xvv|.
Figure 3.2 Illustration for the proof of Proposition 3.5.
Proof. Let us construct two “consecutive” tangents in the t-direction at x(t0, s0) and x(t0, s0 + δs). The two tangent lines are given by

x(t0, s0) ∨ xt(t0, s0)   and   x(t0, s0 + δs) ∨ xt(t0, s0 + δs) . (3.17)
(See Figure 3.2.) By Definition 3.13, the one-parameter family of lines defined by x(t0, s) ∨ xt(t0, s) as s varies is supposed to sweep out a developable surface. That is, as δs approaches 0, we expect the two consecutive tangents x(t0, s0) ∨ xt(t0, s0) and x(t0, s0 + δs) ∨ xt(t0, s0 + δs) to intersect in a point lying on the edge of regression (or apex) of the developable. Let us find the limit of the two lines defined by x(t0, s0) ∨ x(t0, s0 + δs) and xt(t0, s0) ∨ xt(t0, s0 + δs):

lim_{δs→0} [ x(t0, s0) ∨ x(t0, s0 + δs) ]
∼ lim_{δs→0} [ x(t0, s0) ∨ ( x(t0, s0 + δs) − x(t0, s0) ) / δs ]
= x(t0, s0) ∨ xs(t0, s0) , (3.18)

lim_{δs→0} [ xt(t0, s0) ∨ xt(t0, s0 + δs) ]
∼ lim_{δs→0} [ xt(t0, s0) ∨ ( xt(t0, s0 + δs) − xt(t0, s0) ) / δs ]
= xt(t0, s0) ∨ xst(t0, s0) . (3.19)
If the two lines defined by (3.17) intersect in the limit, the two lines (3.18) and (3.19) must lie in the same plane. This constraint can be written as

|x, xs, xt, xst| = 0 . (3.20)
Let us write the expression for the derivative point xst using the chain rule:

xst = (xuu ut + xuv vt) us + (xvu ut + xvv vt) vs
    = xuu us ut + xuv (us vt + vs ut) + xvv vs vt . (3.21)

(Here we have omitted the terms xu ust + xv vst; they lie in the span of xs and xt and therefore cannot affect the determinant in (3.20).)
Now we can expand (3.20) by plugging in the expression (3.21) for xst:

|x, xs, xt, xst|
= |x, xu us + xv vs, xu ut + xv vt, xuu us ut + xuv (us vt + vs ut) + xvv vs vt|
= (us vt − ut vs) |x, xu, xv, xuu us ut + xuv (us vt + vs ut) + xvv vs vt| .

Since the tangents of the net are distinct, the factor us vt − ut vs is nonzero and can be divided out, leaving

us ut |x, xu, xv, xuu| + (us vt + vs ut) |x, xu, xv, xuv| + vs vt |x, xu, xv, xvv| .

Setting the above expression to zero and rewriting, we get the desired result (3.16). In matrix form,

( us  vs ) ( l  m ; m  n ) ( ut ; vt ) = 0 , (3.22)

where l = |x, xu, xv, xuu|, m = |x, xu, xv, xuv|, and n = |x, xu, xv, xvv|.
If the conjugate curves (3.15) are actually parametric curves, then the condition (3.20) becomes simply m = 0. From this observation, we easily obtain the following result [28, p. 122]:

Proposition 3.6 (Corollary). If the parametric curves on the surface Σ form a conjugate net, then x satisfies the following differential equation:

xuv = c x + a xu + b xv ,

where a, b, and c are scalar functions.
3.2.4 Asymptotic Directions
In the previous section, we introduced the concept of conjugate directions. In this section, we will consider asymptotic directions that are conjugate to themselves. But first, we will give a purely geometric definition of a curve whose tangent at every point lies in an asymptotic direction.
Definition 3.14 (Asymptotic Curve). A curve Γ on a surface Σ is called asymptotic if at each point of the curve, the tangent plane to Σ coincides with the osculating plane to Γ [28, p. 35].
Proposition 3.7. Let the curve defined by $x = x\bigl(u(t), v(t)\bigr)$ be an asymptotic curve. Then at each point $x$ of the curve, the following differential equation is satisfied:

$l\,(u')^2 + 2m\,u'v' + n\,(v')^2 = 0$, (3.23)

where $l$, $m$, and $n$ are defined as in (3.16), and primes denote differentiation with respect to $t$.
Proof. The points $x$, $x'$, and $x''$ that span the osculating plane must also lie in the tangent plane $x \vee x_u \vee x_v$. We have

$x' = x_u u' + x_v v'$,
$x'' = (x_{uu} u' + x_{uv} v')\,u' + x_u u'' + (x_{vu} u' + x_{vv} v')\,v' + x_v v''$
$\quad = x_{uu} (u')^2 + 2 x_{uv} u' v' + x_{vv} (v')^2 + x_u u'' + x_v v''$.

Clearly, $x$ and $x'$ already lie in the tangent plane. The remaining point $x''$ must be dependent on $x$, $x_u$, and $x_v$, so the following expression must vanish:

$|x, x_u, x_v, x''| = |x,\; x_u,\; x_v,\; x_{uu}(u')^2 + 2x_{uv}u'v' + x_{vv}(v')^2 + x_u u'' + x_v v''|$
$= (u')^2\, |x, x_u, x_v, x_{uu}| + 2u'v'\, |x, x_u, x_v, x_{uv}| + (v')^2\, |x, x_u, x_v, x_{vv}|$
$= l\,(u')^2 + 2m\,u'v' + n\,(v')^2$.
In matrix form, we get lm u u v =0. (3.24) mn v
By the statement of Proposition 3.7, (3.24) holds at every point of an asymptotic curve on a surface. However, by examining the proof, one ascertains that (3.24) also applies locally at a point $x$ of any curve Γ lying on the surface Σ if the osculating plane to Γ at $x$ coincides with the tangent plane to Σ, or if the point $x''$ lies in the span of $x$ and $x'$, in which case Γ has an inflection and the osculating plane is not defined. This observation motivates the following definition.
Definition 3.15 (Asymptotic Directions). Let $\alpha x_u + \beta x_v$ be a direction in the tangent plane to surface Σ at point $x$. This direction is asymptotic for Σ at $x$ if the tangent line $t = x \vee (\alpha x_u + \beta x_v)$ has contact of order ≥ 2 with Σ at $x$ ($t$ can be referred to as an asymptotic tangent).
Proposition 3.8. If $t = x \vee (\alpha x_u + \beta x_v)$ is an asymptotic tangent to Σ at $x$, then

$l\alpha^2 + 2m\alpha\beta + n\beta^2 = 0$, (3.25)

where $l = |x, x_u, x_v, x_{uu}|$, $m = |x, x_u, x_v, x_{uv}|$, and $n = |x, x_u, x_v, x_{vv}|$.
Proof. By Definition 3.9, if $t$ has order 2 contact with Σ at $x$, Σ must contain a curve Γ passing through $x$ and having order 2 contact with $t$. This can only happen if $t$ is the tangent of Γ at $x$, and $x$ is also an inflection point of Γ. Let Γ be parametrized as $x\bigl(u(t), v(t)\bigr)$. If Γ has an inflection at $x$, this means that the second derivative point $x''$ is contained in $x \vee x' \sim t$. Since $t$ lies in the tangent plane $x \vee x_u \vee x_v$, we can also write $|x, x_u, x_v, x''| = 0$. By the proof of Proposition 3.7, this is equivalent to (3.25).
Incidentally, by comparing (3.25) with (3.22), we can easily see why asymptotic directions are sometimes referred to as self-conjugate.
Proposition 3.9. At any point of a developable surface, there is a single asymptotic direction, which is the direction of the generator at that point.
Proof. Let us compute the coefficients $l$, $m$, and $n$ of the equation (3.23) for a tangent developable surface (3.13):

$x = y + v y'$,  $x_{uu} = y'' + v y'''$,
$x_u = y' + v y''$,  $x_{uv} = y''$,
$x_v = y'$,  $x_{vv} = 0$;

$l = |x, x_u, x_v, x_{uu}| = |y + v y',\; y' + v y'',\; y',\; y'' + v y'''| = -v^2\, |y, y', y'', y'''|$,
$m = |x, x_u, x_v, x_{uv}| = |y + v y',\; y' + v y'',\; y',\; y''| = 0$,
$n = |x, x_u, x_v, x_{vv}| = 0$.
Now, let us repeat the same procedure for a cone (3.14):

$x = a + v y$,  $x_{uu} = v y''$,
$x_u = v y'$,  $x_{uv} = y'$,
$x_v = y$,  $x_{vv} = 0$;

$l = |x, x_u, x_v, x_{uu}| = |a + v y,\; v y',\; y,\; v y''| = -v^2\, |a, y, y', y''|$,
$m = |x, x_u, x_v, x_{uv}| = |a + v y,\; v y',\; y,\; y'| = 0$,
$n = |x, x_u, x_v, x_{vv}| = 0$.
In both cases, the equation (3.23) reduces to $l\,(u')^2 = 0$, which has the single solution $u' = 0$. Thus, a developable surface has a single asymptotic direction, which is the direction of the generator. A developable surface does not contain an asymptotic net; its single family of asymptotic curves is precisely the set of its generators.
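The same computation can be mechanized. The sketch below (sympy; the twisted cubic is a hypothetical choice of generating curve $y(u)$, made only for illustration) reproduces the coefficients for a tangent developable and confirms that only the generator direction is asymptotic.

```python
# Sympy sketch of Proposition 3.9 on the tangent developable of the
# twisted cubic (an assumed example): x(u, v) = y(u) + v*y'(u).
import sympy as sp

u, v = sp.symbols('u v')

y = sp.Matrix([u, u**2, u**3, 1])  # twisted cubic, homogeneous coordinates
x = y + v*y.diff(u)                # tangent developable surface

xu, xv = x.diff(u), x.diff(v)
xuu, xuv, xvv = xu.diff(u), xu.diff(v), xv.diff(v)

det = lambda *cols: sp.Matrix.hstack(*cols).det()
l = sp.simplify(det(x, xu, xv, xuu))
m = sp.simplify(det(x, xu, xv, xuv))
n = sp.simplify(det(x, xu, xv, xvv))

# m = n = 0 and l = 12*v**2 != 0, so (3.23) forces u' = 0: the only
# asymptotic direction is the generator direction x_v.
print(l, m, n)
```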
3.2.5 Alternative Definitions of Conjugacy
In this section, we look at two alternative definitions of conjugate directions, and show how the fundamental equation (3.22) may be derived from each of them. The first alternative definition makes use of the notion of harmonic position.
Definition 3.16 (Harmonic Position). Let $x_1, x_2, x_3, x_4$ be four collinear points. The two pairs $(x_1, x_2)$ and $(x_3, x_4)$ are said to be in harmonic position if $\mathrm{cr}(x_1, x_2; x_3, x_4) = -1$, where $\mathrm{cr}$ denotes the cross-ratio [53, p. 48].
Definition 3.17 (Conjugacy and Harmonic Position). At a point of a surface in ordinary space (P3), two tangents are said to be conjugate, or to lie in conjugate directions, if they separate the asymptotic tangents harmonically. A net of curves on such a surface is said to be a conjugate net if the two tangents of the curves of the net at each point of the surface are conjugate tangents.
We can give an alternative proof of a variant of Proposition 3.5 using the definition above.
Proposition 3.10. Let $x_1 = \alpha_1 x_u + \beta_1 x_v$ and $x_2 = \alpha_2 x_u + \beta_2 x_v$ be two asymptotic directions at the point $x$ on a surface Σ. Also let $x_3 = \alpha_3 x_u + \beta_3 x_v$ and $x_4 = \alpha_4 x_u + \beta_4 x_v$ be two conjugate directions. By the above definition, we can write

$\mathrm{cr}(x_1, x_2; x_3, x_4) = -1$.

Then the conjugate directions satisfy

$l\alpha_3\alpha_4 + m\,(\alpha_3\beta_4 + \beta_3\alpha_4) + n\beta_3\beta_4 = 0$,

where $l = |x, x_u, x_v, x_{uu}|$, $m = |x, x_u, x_v, x_{uv}|$, and $n = |x, x_u, x_v, x_{vv}|$.
Proof. The points $x_1, \dots, x_4$ all lie on the line $x_u \vee x_v$. To compute $\mathrm{cr}(x_1, x_2; x_3, x_4)$, we need a projective coordinate system for $x_u \vee x_v$. Let us assign coordinates $\binom{1}{0}$ and $\binom{0}{1}$ to $x_u$ and $x_v$, respectively, and make $\binom{1}{1}$ the unit point. Then the coordinate vector of each of the four points $x_i$ becomes $\binom{\alpha_i}{\beta_i}$. To simplify the calculations, we will represent each point by its projective parameter $\theta_i = \alpha_i / \beta_i$. It is easy to show (see [53, p. 48]) that $\mathrm{cr}(\theta_1, \theta_2; \theta_3, \theta_4) = -1$ if and only if

$(\theta_1 + \theta_2)(\theta_3 + \theta_4) = 2\,(\theta_1\theta_2 + \theta_3\theta_4)$. (3.26)

Because $\theta_1$ and $\theta_2$ are asymptotic directions, they each satisfy the equation

$l\theta_i^2 + 2m\theta_i + n = 0$

(this is the same equation as (3.24), divided by $(v')^2$, or equivalently (3.25) divided by $\beta_i^2$). By two well-known identities derived from the quadratic formula, the sum and product of the roots of this equation are

$\theta_1 + \theta_2 = -\dfrac{2m}{l}$ and $\theta_1\theta_2 = \dfrac{n}{l}$.

Plugging back into (3.26), we get

$-\dfrac{2m}{l}\,(\theta_3 + \theta_4) = \dfrac{2n}{l} + 2\theta_3\theta_4$,

$l\theta_3\theta_4 + m(\theta_3 + \theta_4) + n = 0$,

$\begin{pmatrix} \theta_3 & 1 \end{pmatrix} \begin{pmatrix} l & m \\ m & n \end{pmatrix} \begin{pmatrix} \theta_4 \\ 1 \end{pmatrix} = 0$,

yielding an expression equivalent to (3.22).
Yet another way to define conjugacy is via a construction involving an involution with respect to an osculating quadric [50, p. 81]. An involution is a projective transformation that is its own inverse, but is not the identity. An osculating quadric of Σ at point x is any quadric that has contact of order 2 with the surface (an osculating quadric is not unique).
The standard way to represent a quadric surface Ω is by a $4 \times 4$ symmetric matrix $Q$ such that $y^T Q y = 0$ for all points $y$ belonging to Ω. $Q$ can be regarded as a projective transformation that maps points in space to the coefficient vectors of their polar planes with respect to Ω. For a point on Ω, the polar plane is the tangent plane: for example, $Qy$ is the coefficient vector of the tangent plane to Ω at $y$. The intersection of Ω with the plane defined by $Qy$ is a conic containing all points $z$ such that the line $y \vee z$ is tangent to Ω at $z$. For this reason, this conic is known as the rim or contour generator of Ω with respect to $y$ [21, p. 190]. This gives us enough background information to introduce the second alternative definition of conjugate surface tangents.
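A small numerical sketch illustrates the polar-plane map $y \mapsto Qy$ and the rim; the sphere and the chosen points below are hypothetical examples, not data from the thesis.

```python
# Numpy sketch of the polar-plane map for a quadric Q (an assumed example:
# the unit sphere x^2 + y^2 + z^2 - w^2 = 0 in homogeneous coordinates).
import numpy as np

Q = np.diag([1.0, 1.0, 1.0, -1.0])

# For a point on the quadric, the polar plane is the tangent plane.
y = np.array([1.0, 0.0, 0.0, 1.0])     # point on the sphere
print(Q @ y)                           # [ 1.  0.  0. -1.]: the plane x = 1

# For a point off the quadric, the polar plane cuts the sphere in the rim:
# the circle of points z whose tangent lines pass through c.
c = np.array([2.0, 0.0, 0.0, 1.0])     # hypothetical viewpoint
print(Q @ c)                           # [ 2.  0.  0. -1.]: the plane x = 1/2
```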
Definition 3.18 (Conjugacy as Involution of Tangents of an Osculating Quadric). Let $x_1 = \alpha_1 x_u + \beta_1 x_v$ and $x_2 = \alpha_2 x_u + \beta_2 x_v$ be two directions on the tangent plane $\Pi = x \vee x_u \vee x_v$. Then $x_1$ is conjugate to $x_2$ if

$x \vee x_2 \sim \Pi \wedge (Q x_1)$, (3.27)

where $Q$ is the matrix of an osculating quadric to Σ at $x$.
Let us discuss in more detail the geometric construction underlying the definition (refer to Figure 3.3). The plane $\Pi^* = Q x_1$ is the polar plane of the point $x_1$. The intersection of $\Pi^*$ with the osculating quadric Ω is the rim of Ω with respect to $x_1$. Since the line $x \vee x_1$ is tangent to Σ and also to Ω, it is clear that $\Pi^*$ contains the point $x$. The line of intersection between the tangent plane Π and the polar plane $\Pi^*$, denoted by $\Pi \wedge \Pi^*$, is conjugate to the tangent line $x \vee x_1$. Note that $\Pi \wedge \Pi^*$ is also the tangent to the rim of Ω with respect to $x_1$.
Once again, we can give an alternative proof of Proposition 3.5 using the above construction.
Figure 3.3 Conjugacy as involution of tangents of an osculating quadric.
Proposition 3.11. Let $x_1 = \alpha_1 x_u + \beta_1 x_v$ and $x_2 = \alpha_2 x_u + \beta_2 x_v$ be two conjugate directions in the tangent plane $\Pi = x \vee x_u \vee x_v$. By Definition 3.18, $x_1$ and $x_2$ satisfy (3.27). Then we have

$l\alpha_1\alpha_2 + m\,(\alpha_1\beta_2 + \beta_1\alpha_2) + n\beta_1\beta_2 = 0$,

where $l = |x, x_u, x_v, x_{uu}|$, $m = |x, x_u, x_v, x_{uv}|$, and $n = |x, x_u, x_v, x_{vv}|$.
Proof. The point $x_2$ lies on the plane $\Pi^* = Q x_1$, so we must have $x_2^T Q x_1 = 0$:

$x_2^T Q x_1 = (\alpha_2 x_u + \beta_2 x_v)^T Q\, (\alpha_1 x_u + \beta_1 x_v)$
$= \alpha_1\alpha_2\, x_u^T Q x_u + (\alpha_1\beta_2 + \beta_1\alpha_2)\, x_u^T Q x_v + \beta_1\beta_2\, x_v^T Q x_v$
$= \begin{pmatrix} \alpha_1 & \beta_1 \end{pmatrix} \begin{pmatrix} x_u^T Q x_u & x_u^T Q x_v \\ x_u^T Q x_v & x_v^T Q x_v \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix}$.

To evaluate the entries of this matrix, recall that $Qx$ is the coefficient vector of the tangent plane to Ω, and hence to Σ, at $x$:

$Qx \sim x \vee x_u \vee x_v$,
$Qx_u = (Qx)_u \sim (x \vee x_u \vee x_v)_u = (x \vee x_u)_u \vee x_v + (x \vee x_u) \vee x_{uv}$
$= (x_u \vee x_u + x \vee x_{uu}) \vee x_v + (x \vee x_u) \vee x_{uv} = x \vee x_{uu} \vee x_v + x \vee x_u \vee x_{uv}$,

and therefore

$x_u^T Q x_u \sim |x_u, x, x_{uu}, x_v| + |x_u, x, x_u, x_{uv}| = |x, x_u, x_v, x_{uu}|$.

By an analogous process, we derive

$x_u^T Q x_v \sim |x, x_u, x_v, x_{uv}|$ and $x_v^T Q x_v \sim |x, x_u, x_v, x_{vv}|$.

(Note that the constant of proportionality concealed by ∼ is the same for $x_u^T Q x_u$, $x_u^T Q x_v$, and $x_v^T Q x_v$.) The result is once again identical to (3.22):

$\begin{pmatrix} \alpha_1 & \beta_1 \end{pmatrix} \begin{pmatrix} l & m \\ m & n \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} = 0$. (3.28)
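The proportionalities used in this proof can be checked directly on a surface that is its own osculating quadric. In the sketch below (sympy; taking the paraboloid itself as the osculating quadric is an assumption made for illustration, since any quadric with order-2 contact would do), the three inner products come out proportional to $(l, m, n)$ with a single common constant.

```python
# Sympy check of the identities in Proposition 3.11.  The paraboloid
# X^2 + Y^2 - Z*W = 0 serves as its own osculating quadric -- an
# illustrative assumption.
import sympy as sp

u, v = sp.symbols('u v')

x = sp.Matrix([u, v, u**2 + v**2, 1])          # point on the paraboloid
Q = sp.Matrix([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, -sp.Rational(1, 2)],
               [0, 0, -sp.Rational(1, 2), 0]])
assert sp.simplify((x.T*Q*x)[0]) == 0          # x lies on the quadric

xu, xv = x.diff(u), x.diff(v)
xuu, xuv, xvv = xu.diff(u), xu.diff(v), xv.diff(v)

det = lambda *cols: sp.Matrix.hstack(*cols).det()
l, m, n = det(x, xu, xv, xuu), det(x, xu, xv, xuv), det(x, xu, xv, xvv)

# (x_u^T Q x_u, x_u^T Q x_v, x_v^T Q x_v) = (1, 0, 1) is proportional to
# (l, m, n) = (-2, 0, -2) with the single constant -1/2, as the proof says.
print((xu.T*Q*xu)[0], (xu.T*Q*xv)[0], (xv.T*Q*xv)[0], l, m, n)
```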
3.2.6 Local Shape
In Section 3.2.4, we have seen that for a developable surface, the quadratic equation