Multi-View Reconstruction Using Photo-Consistency and Exact Silhouette Constraints: a Maximum-Flow Formulation
Total Page:16
File Type:pdf, Size:1020Kb
Multi-view Reconstruction using Photo-consistency and Exact Silhouette Constraints: A Maximum-Flow Formulation Sudipta N. Sinha Marc Pollefeys Department of Computer Science, University of North Carolina at Chapel Hill, USA. Abstract when seen from different views must produce pixels with similar colors; this is the color consistency or the photo- This paper describes a novel approach for reconstructing a consistency constraint. In the ill-posed reconstruction prob- closed continuous surface of an object from multiple cali- lem, different scenes can be consistent with the same set of brated color images and silhouettes. Any accurate recon- color images. Theoritically the union of all photo-consistent struction must satisfy (1) photo-consistency and (2) silhou- scenes, the photo hull [12] is a unique reconstruction, but it ette consistency constraints. Most existing techniques treat is sensitive to the sampling rate of 3D voxels, the range of these cues identically in optimization frameworks where sil- textures in the scene and image noise. The reconstructed houette constraints are traded off against photo-consistency shape when re-projected must coincide with the respective and smoothness priors. Our approach strictly enforces sil- silhouettes; this is the silhouette consistency constraint.The houette constraints, while optimizing photo-consistency and exact visual hull’s silhouettes [1, 2] should be considered, smoothness in a global graph-cut framework. We transform when calibration or segmentation errors are present. the reconstruction problem into computing max-flow / min- SFS methods compute the visual hull [1, 14, 2, 5]; the cut in a geometric graph, where any cut corresponds to a maximal shape consistent with a set of silhouettes. It can be surface satisfying exact silhouette constraints (its silhou- computed by intersecting visual cones obtained by back- ettes should exactly coincide with those of the visual hull); a projecting silhouettes from calibrated viewpoints. Exact minimum cut is the most photo-consistent surface amongst polyhedral representations [5] of the visual hull as well them. Our graph-cut formulation is based on the rim mesh, as volumetric ones [15] are common. Polyhedral visual (the combinatorial arrangement of rims or contour genera- hulls [5] are watertight and efficient to compute, although tors from many views) which can be computed directly from they coarsely approximate the actual shape when only a few the silhouettes. Unlike other methods, our approach en- views are available. In fact visual hulls cannot recover sur- forces silhouette constraints without introducing a bias near face concavities as these never appear on the silhouettes. the visual hull boundary and also recovers the rim curves. Volumetric methods like Space Carving [12], Generalised Results are presented for synthetic and real datasets. Voxel Coloring (GVC) [15] can reconstruct complex ob- jects or scenes based on photo-consistency by carving away 1. Introduction voxels, producing a reconstruction where all inconsistent Two well-known categories of multi-view 3D reconstruc- voxels have been removed. Photo-consistency is also used tion algorithms are - (1) Shape from Silhouette(SFS) [1, 2, by [3, 10] to formulate multi-view stereo as a max-flow 5] : techniques that compute a coarse shape of an object problem and by [8, 9] to build energy functions, which are from its silhouettes; (2) Shape from Photo-consistency [12, minimized by graph-cuts. 15, 18] : volumetric methods which recover the geome- Some recent approaches combine silhouette constraints try of complete scenes using the photo-consistency con- and photo-consistency; [13] combines them into a single straint [12]. Multi-view stereo [17] also relies on photo- cost function; [6] uses them in a level-set based variational consistency to recover dense correspondence across views approach; [3] uses the visual hull for initialization and graph and compute scene depth. 3D reconstruction is an opti- cuts for optimization, [4, 7] describe iterative mesh defor- mization problem, solved by both local [12, 15], as well as mation using texture and silhouette forces. These methods global methods like dynamic programming [17], variational do not guarantee exact silhouette consistency; in fact they techniques [6, 7] and graph-cut optimization [3, 8, 9]. could introduce a bias in the reconstruction near the visual In this paper we present a global optimization approach hull boundary. We enforce silhouette constraints strictly, for surface reconstruction, by imposing constraints present and obtain the most photo-consistent solution amongst the in both color images and silhouettes. A true scene point, ones which satisfy silhouette constraints exactly. We start 1 by computing the exact visual hull [5] from silhouettes and from a camera form a ruled surface called a cone-strip while then recover the rim mesh [2] from it. The rim mesh tells us the boundaries of cone strips are called intersection curves; how the actual surface touches the visual hull (we assume these lie outside the actual surface in general. The cone- that the actual surface is inside it). This is used to build a strips are made up of multiple visual cone patches each of geometric graph; a graph cut on it yields by construction, which has a rim segment, the part of the rim between suc- one of the many possible surfaces which exactly satisfy the cessive frontier points. Figure 1 illustrates these definitions silhouette constraints. Graph edges are now assigned costs while [2, 5] provide details. from a photo-consistency measure and smoothness prior. Computing the minimum cut on this graph, minimizes an energy function that produces an optimal photo-consistent smooth surface along with the location of the rim curves on the surface. Sec. 2 describes the theory of visual hulls and the rim mesh; Sec. 3 explains our max-flow formulation of the re- construction problem and the rim mesh construction, while Sec. 5 and 6 discusses results and conclusions respectively. 2. Theory X Z Y Figure 1: 2-view visual hull: Rims are dashed curves; Frontier Point Rim they intersect at frontier points; intersection curves are solid curves. Cone-strips and visual cone faces are also shown. Figure 2: Top : 3-view case. Rims are dashed curves and intersecton curves are solid lines. X, Y and Z are frontier points. Below : The complete rim mesh. Every rim segment 2.1. Visual Hulls and the Rim Mesh occupies a single visual cone patch. The visual hull is the maximal shape that projects consis- tently into a set of silhouettes, and is obtained by intersect- The idea of epipolar nets [13] was refined into a for- ing visual cones from the corresponding calibrated view- mal representation called the rim mesh by [2]. The rim points. Visual rays from a camera which grazes the true mesh is a combinatorial arrangement of rims on the surface surface tangentially give rise to a smooth continous curve (see Figure 2). Frontier points constitute its vertices and on the true surface called the rim [2, 16] or the contour gen- the rim segments between successive frontier points, form erator and its projection in the image is the apparent con- its edges. Each rim mesh edge thus corresponds to a visual tour. Rims from different cameras intersect on the surface cone patch. Patches on the surface bounded by different rim at points called frontier points, which project to the respec- segments form faces in the rim mesh. The rim mesh itself tive apparent contours. The projected frontier points satisfy contains only information about the arrangement of rims the 2-view epipolar tangency constraint [2] (these are the and their connectivity, not their geometric shape. The im- points of tangencies of tangents from the epipole to the sil- age based algorithm [2] computes the rim mesh only from houette which are also corresponding epipolar lines). The silhouettes under some simplifying assumptions; it does not back-projection of points on the apparent contour produces deal with occluded rims or objects with non-zero genus. An viewing rays, each of which contributes a view edge to the embedding of the rim mesh on the surface partitions it into visual hull polyhedron. At least one point on the view edge patches where every patch is purely inside the visual hull must touch the surface at a point on the rim. The view edges but touches it along its boundary (along different rim seg- 2 v ments). This patch separates the object’s interior from a set 1 v4 p4 p To Source To Sink of intersection curves, which lie outside the object in gen- 1 p3 eral. This property will be useful in our method. p2 v2 ( a ) v3 2.2. The maximum flow problem v1 Many global minimization and combinatorial optimization Even Level p1 problems have been solved by formulating them as graph- v p2 cut problems on network flows [20]. Given a flow network 2 v3 G(V,E) [20] with a source s and a sink t, a graph cut parti- v4 tions V into S and T such that sSand tT. the maximum p flow problem computes a flow with the maximum value. 3 p The max-flow min-cut theorem [20] shows that computing v2 2 v the maximum flow is equivalent to finding the mininum cost 3 v cut between s and t (the s-t min-cut). While s-t cuts can be 1 v4 p4 p computed efficiently, the more general multiway-cut prob- 3 Odd Level lem (partitioning graphs into 3 or more subsets) is NP-Hard. p3 In computer vision, [3, 8, 10] have used graph-cuts on geo- v3 metric graphs while [11] has solved stereo, segmentation, v1 v4 p4 multi-view reconstruction using energy minimization for- p1 To Source To Sink mulations via graph-cuts. v2 ( b ) ( c ) 3. Our Graph Cut Formulation 2 = 3.1.