The Project ∗

Peter K. Allen, Alejandro Troccoli, Benjamin Smith, Ioannis Stamos† and Stephen Murray‡

Department of Computer Science; Department of Art History and Archaeology‡, New York, NY 10027
Department of Computer Science, Hunter College, CUNY†

Abstract

Preserving cultural heritage and historic sites is an important problem. These sites are subject to erosion and vandalism, and as long-lived artifacts they have gone through many phases of construction, damage, and repair. It is important to keep an accurate record of these sites as they currently are, using 3-D model building technology, so that preservationists can track changes, foresee structural problems, and allow a wider audience to "virtually" see and tour these sites. Due to the complexity of these sites, building 3-D models is time consuming and difficult, usually involving much manual effort. This paper discusses new methods that can reduce the time to build a model using automatic methods. Examples of these methods are shown in reconstructing a model of the Cathedral of Saint-Pierre in Beauvais, France.

1 Introduction

Preserving cultural heritage and historic sites is an important problem. The advent of new digital 3D scanning devices has provided new means to preserve these sites digitally, and to preserve the historic record by building geometric and photorealistic 3D models. A recent international conference (Virtual and Augmented Architecture '01 [6]) has highlighted the need for simple and automatic methods to create rich models of historic environments.

A number of other projects have addressed this and similar problems, including [17, 8, 5, 4, 3, 11]. This paper discusses new methods we have developed to recover complete geometric and photometric models of large sites and to automate this process. In particular, we discuss new methods for data abstraction and compression through segmentation, 3D to 3D registration (both coarse and fine), and 2D to 3D texture mapping of the models with imagery.

The testbed for our methods is the Cathedral of Saint-Pierre in Beauvais [9], which is unique, still in use, and a prime example of high Gothic architecture (fig. 1). A team of architectural historians, computer scientists, and engineers has begun to study the fragile structure of the tallest medieval cathedral in France. The thirteenth-century Gothic cathedral at Beauvais collapsed three times and continues to pose problems for its long-term survival. Our group is creating a highly accurate three-dimensional model based on laser scans of the interior and exterior of the building. This information will be used to examine the weaknesses in the building and propose remedies; to visualize how the building looked when it was first built; and to serve as the basis for a new collaborative way of teaching about historic sites both in the classroom and on the Internet. The building is included on the World Monuments Fund's Most Endangered List.

Although the cathedral survived the heavy incendiary bombing that destroyed much of Beauvais during World War II, the structure is as dangerous as it is glorious, being at risk from flaws in its original design, compounded by differential settlement and by stresses placed on its flying buttresses from gale force winds off the English Channel. The winds cause the buttresses to oscillate and already weakened roof timbers to shift. Between the 1950s and 1980s numerous critical iron ties were removed from the buttresses in a damaging experiment. A temporary tie-and-brace system installed in the 1990s may have made the cathedral too rigid, increasing rather than decreasing stresses upon it. Although the cathedral has been intensively studied, there continues to be a lack of consensus on how to conserve the essential visual and structural integrity of this Gothic wonder. With its five-aisled choir intersected by a towered transept and its great height (keystone 152.5 feet above the pavement), Beauvais Cathedral, commissioned in 1225 by Bishop Milon de Nanteuil, provides an extreme expression of the Gothic enterprise.

∗ This work was supported in part by NSF grant IIS-01-21239 and the Samuel Kress Foundation.
Figure 1: Cathedral of Saint-Pierre

2 Data Acquisition

The surveying process involved capturing range and intensity data. On-site work started in June 2001 using a Cyrax 2400 scanner. The whole interior and 1/3 of the exterior were surveyed, but due to technical problems the scanning process was aborted and resumed in June 2002, this time with a new Cyrax 2500 scanner. Over 200 range images were acquired, 120 interior and 100 exterior scans, most of them sampled at 1000 by 1000 points. Intensity images were captured by a 5 megapixel digital camera, which was freely placed.

The scanner uses time-of-flight laser technology to measure the distance to points on an object. Data from the scanner comprises point clouds, each point comprising four values: the coordinates (x, y, z) and a value representing the amplitude of the laser light reflected back to the scanner. The amplitude depends on the reflectance of the material and the surface orientation.

3 Registration of Range Scans

In order to acquire data describing an entire structure, such as the Beauvais Cathedral, multiple scans must be taken from different locations with different scanner orientations. To reconstruct the original structure, the different scans must be registered together correctly. When correctly registered, each point cloud is aligned relative to the other clouds at the position and orientation corresponding to the position occupied by the physical surface it represents on the actual structure. Although the point clouds may be registered manually, doing so is very time consuming and error-prone. Manually visualizing millions of small points and matching them is quite imprecise, and becomes more difficult as the number of scans increases. When possible, it is common practice to use specially designed targets/fiducials to help during the registration phase. In our case, however, it was almost impossible to place targets higher than 2.5 meters above the ground, requiring us to develop an automatic registration method.

Our registration method is a three-step process. The first step is an automatic pairwise registration between two overlapping scans. The pairwise registration matches 3-D line segments extracted from overlapping range scans to compute the correct transformation. The second step is a global registration step that tries to align all the scans together using overlapping pairs [15]. The third step is a multi-image simultaneous ICP algorithm [2] that does the final fine registration of the entire data set.

3.1 Segmenting the Scans

Our previously developed range segmentation algorithm [13, 14] automatically extracts planar regions and linear features at the areas of intersection of neighboring planar structures. Thus, a 3-D range scan is converted to a set of bounded planes and a set of finite lines. The extracted 3-D intersection lines are very accurate because their orientation and length depend on all the points of both planes that intersect.

The pairwise registration algorithm efficiently computes the best rigid transformation (R, T) between a pair of overlapping scans S_1 and S_2. This transformation has an associated grade g(R, T) that equals the total number of line matches after the transformation is applied. Note that the grade is zero if there is no overlap between the scans.

In a typical scanning session tens or hundreds of range scans need to be registered. Calculating all possible pairwise registrations is impractical because it leads to a combinatorial explosion. In our system, the user provides a list of overlapping pairs of scans¹. All pairwise transformations are computed. Then, one of the scans is chosen to be the anchor scan S_a. Finally, all other scans S are registered with respect to the anchor S_a. In the final step, we have the ability to reject paths of pairwise transformations that contain registrations of lower confidence.

¹ Note that the list does not have to be a complete list of all possible overlaps.

In more detail, the rigid transformations (R_i, T_i) and their associated grades g(R_i, T_i) are computed between each pair of overlapping scans. In this manner a weighted undirected graph is constructed. The nodes of the graph are the individual scans, and the edges are the transformations between scans. Finally, the grades g(R_i, T_i) are the weights associated with each edge. More than one path of pairwise transformations can exist between a scan S and the anchor S_a. Our system uses a Dijkstra-type algorithm to compute the most robust transformation path from S to S_a. If p_1 and p_2 are two different paths from S to S_a, then p_1 is more robust than p_2 if the cheapest edge on p_1 has a larger weight than the cheapest edge on p_2. This is the case because the cheapest edge on a path corresponds to the pairwise transformation of lowest confidence (the smaller the weight, the smaller the overlap between scans). In this manner, our algorithm utilizes all possible paths of pairwise registrations between S and S_a in order to find the path of maximum confidence. This strategy can reject weak overlaps between scans that could affect the quality of the global registration between scans. Fig. 2 shows a partial registration of 14 segmented scans, with scanner coordinate axes shown as well.

3.2 Simultaneous Registration of Multiple Range Images

Once the range images are registered using the automatic method above, a refinement of the basic ICP algorithm that registers multiple range images simultaneously is used to provide the final registration. This method is an extension of the method proposed by Nishino and Ikeuchi [10]. Their work extends the basic pairwise ICP algorithm to simultaneous registration of multiple range images. An error function is designed to be minimized globally against all range images. The error function utilizes an M-estimator that robustly rejects outliers and can be minimized efficiently through the conjugate gradient search framework. An additional attribute, the laser reflectance of the range data, is introduced into the point-to-point distance error metric to make the search for corresponding points more robust. To speed up the registration process, a k-d tree structure is employed so that the search time for the matched point is considerably reduced.

One major drawback of the point-to-point approach is that, if only the point-to-point geometric distance is used, it cannot find the best match point in the sliding direction. Additional information to suggest better matches is required. For this purpose, the laser reflectance strength value (referred to as RSV) is used (see fig. 3a). To find the best match point for a model point, multiple closest points in a k-d tree are searched. Then the RSV distance (to the model point) of each of them is evaluated to get the closest point.

Once correspondences are made, least-squares is typically used to find the correct transformation matrices for the data points. However, in the presence of outliers, least-squares can be unstable. Accordingly, an M-estimator is used to weight the data points. The procedure is iterative, using a conjugate-gradient search to find the minimum.

Figure 3b shows the reduction of error vs. iteration for a known test data set. The data set of Beauvais Cathedral contains over 100 scans, and registering these scans at full resolution requires significant computational resources and time; therefore, the scans are sub-sampled down to 1/25 of their original resolution. Using this method the error metric is reduced from 0.27 to 0.13. Figure 4 shows the results of applying the algorithm to 2 scans that have been coarsely registered. The column, which is misaligned initially, is correctly aligned after the procedure.

The resulting model is very large, made up of data from all the scans, and visualizing the entire model can be difficult. Fig. 5 shows the resulting model from a number of views². A 3-D video fly-through animation of the model is available at the author's website [16]. For these models, 120 scans were registered on the inside of the cathedral, and 47 on the outside.

² Images are best viewed in color, as many details are easily seen, and are not as clear in grayscale.

4 Texture mapping

The range data allowed us to build a geometrically correct model. For photorealistic results we mapped intensity images to the polygon meshes. The input to the texture mapping stage is a point cloud, a mesh corresponding to the point cloud, and a 2D image. A user manually selects a set of corresponding points from the point cloud and the 2D image, which are used to compute a projection matrix P that transforms world coordinates to image coordinates. The method for computing P is described in [7]. Let (X_i, x_i) be a pair of 3D and 2D homogeneous point correspondences, with X_i and x_i of the form (X_i, Y_i, Z_i, W_i) and (x_i, y_i, w_i) respectively. Each pair provides the following two equations,

\[
\begin{bmatrix}
\mathbf{0}^T & -w_i \mathbf{X}_i^T & y_i \mathbf{X}_i^T \\
w_i \mathbf{X}_i^T & \mathbf{0}^T & -x_i \mathbf{X}_i^T
\end{bmatrix}
\begin{pmatrix} P^1 \\ P^2 \\ P^3 \end{pmatrix} = \mathbf{0},
\]

where each P^i is a row of P (written as a column 4-vector). By stacking up the equations derived from a set of n pairs, a 2n x 12 matrix A is obtained.
The solution vector p of the set of equations Ap = 0 contains the entries of the matrix P. At least 6 point correspondences are required to obtain a unique solution. In practice, an overdetermined system is used, which we solve using the singular value decomposition (SVD) of matrix A. Prior to solving the system of equations, both 3D and 2D points are normalized to improve the numerical accuracy of the solution [7]. This normalization consists of a translation and a scaling step. Both 2D and 3D points are translated so that their centroid is at the origin. Both 2D and 3D points are then scaled so that their RMS (root-mean-squared) distance to the center is √2 and √3 respectively.

Once the projection matrix P is obtained, an occlusion function V(P, T_i) → {0, 1} is computed, where each T_i is a mesh triangle. The function evaluates to 1 when T_i is visible from the camera described by P and 0 otherwise. At this point the mesh is textured by calculating the texture coordinates of every visible triangle.

The matrix P can also be computed from 3D and 2D line correspondences, or from a mixture of points and lines. We are currently working on computing P using line correspondences so that we can later make this process automatic, following the ideas of the range-to-range registration described before. Figure 5 (last row) shows a textured mesh of the cathedral's cloisters, entrance and exterior.
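The projection-matrix estimation described in this section can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names `normalize` and `estimate_projection` are hypothetical, the homogeneous weights W_i and w_i are assumed to be 1, and the camera and points in the usage lines are synthetic values invented only to exercise the code.

```python
import numpy as np

def normalize(points, target_rms):
    """Translate points so their centroid is at the origin and scale them so
    their RMS distance to the origin equals target_rms. Returns the normalized
    points and the homogeneous similarity transform that was applied."""
    pts = np.asarray(points, dtype=float)
    d = pts.shape[1]
    centroid = pts.mean(axis=0)
    rms = np.sqrt(((pts - centroid) ** 2).sum(axis=1).mean())
    s = target_rms / rms
    T = np.eye(d + 1)
    T[:d, :d] *= s
    T[:d, d] = -s * centroid
    return (pts - centroid) * s, T

def estimate_projection(X, x):
    """Estimate a 3x4 projection matrix (up to scale) from n >= 6 pairs of
    3D points X (n x 3) and 2D points x (n x 2)."""
    Xn, U = normalize(X, np.sqrt(3.0))   # 3D points: RMS distance sqrt(3)
    xn, T = normalize(x, np.sqrt(2.0))   # 2D points: RMS distance sqrt(2)
    n = len(Xn)
    A = np.zeros((2 * n, 12))
    for i in range(n):
        Xh = np.append(Xn[i], 1.0)       # homogeneous 3D point, W_i = 1 (assumption)
        xi, yi, wi = xn[i][0], xn[i][1], 1.0
        # Two equations per correspondence, in the order given in the text.
        A[2 * i, 4:8] = -wi * Xh
        A[2 * i, 8:12] = yi * Xh
        A[2 * i + 1, 0:4] = wi * Xh
        A[2 * i + 1, 8:12] = -xi * Xh
    # The right singular vector of the smallest singular value minimizes
    # |A p| subject to |p| = 1.
    _, _, Vt = np.linalg.svd(A)
    Pn = Vt[-1].reshape(3, 4)
    # Undo the normalization: T x ~ Pn U X, so x ~ T^-1 Pn U X.
    return np.linalg.inv(T) @ Pn @ U

# Synthetic check with an invented camera (illustrative values only).
rng = np.random.default_rng(0)
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, -5.0],
                   [0.0, 0.0, 1.0, 4.0]])
X = rng.uniform(-1.0, 1.0, size=(8, 3))
xh = (P_true @ np.c_[X, np.ones(len(X))].T).T
x = xh[:, :2] / xh[:, 2:]
P = estimate_projection(X, x)
proj = (P @ np.c_[X, np.ones(len(X))].T).T
print("max reprojection error:", np.abs(proj[:, :2] / proj[:, 2:] - x).max())
```

The recovered P equals the true matrix only up to a scale factor, so the check compares reprojected image points rather than matrix entries. With noisy, manually selected correspondences the reprojection error would not vanish, which is one reason an overdetermined system is preferred.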

5 Conclusions and Future Work

We have developed methods that can address the problem of building more and more complex 3D models of large sites. The methods we are developing can automate the registration process and significantly reduce the manual effort associated with registering large data sets. Due to the complexity of these sites, building 3-D models is time consuming and difficult, usually involving much manual effort.

Texture mapping can also be automated in a similar way, using distinguishable features on 3D and 2D images to create a texture map. We are developing automatic feature selection algorithms to create these maps. In addition, texture mapping is currently done with single 2D images. We are working to incorporate multiple overlapping images to completely fill in a texture map.

We are also implementing our own sensor planning algorithms to reduce the number of scans [12]. We have mounted the scanner on a mobile platform we have built [1]. We can then link our sensor planning algorithms to our mobile robot path planning algorithms, and automatically acquire scans of large sites.

Figure 2: Top: Segmented regions from point cloud. Bottom: 14 segmented, automatically registered scans.

Figure 3: Top: Correspondences using Euclidean distance. Middle: Correspondences using RSV value. Bottom: Error convergence of ICP algorithm.

Figure 4: Before and after fine registration. Note column misalignment has been corrected.

Figure 5: Row 1: Exterior model, 47 registered scans. Row 2-3: Interior model (viewed from outside and inside), 120 registered scans. Row 4: Texture mapped views of the model.

References

[1] P. K. Allen, I. Stamos, A. Gueorguiev, E. Gold, and P. Blaer. Avenue: Automated site modeling in urban environments. In 3rd Int. Conference on Digital Imaging and Modeling, Quebec City, pages 357–364, May 2001.
[2] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Transactions on PAMI, 14(2), Feb. 1992.
[3] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and rendering architecture from photographs: A hybrid geometry-based and image-based approach. In SIGGRAPH, 1996.
[4] P. Dias, V. Sequeira, J. M. Goncalves, and F. Vaz. Combining intensity and range images for 3d architectural modelling. In International Symposium on Virtual and Augmented Architecture. Springer-Verlag, 2001.
[5] S. F. El-Hakim, P. Boulanger, F. Blais, and J.-A. Beraldin. A system for indoor 3-D mapping and virtual environments. In Videometrics V, July 1997.
[6] R. Fisher, K. Dawson-Howe, and C. O'Sullivan. International Symposium on Virtual and Augmented Architecture, Trinity College Dublin. Springer-Verlag, 2001.
[7] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.
[8] M. Levoy, K. Pulli, B. Curless, S. Rusinkiewicz, D. Koller, L. Pereira, M. Ginzton, S. Anderson, J. Davis, J. Ginsberg, J. Shade, and D. Fulk. The Digital Michelangelo Project: 3D scanning of large statues. In SIGGRAPH, 2000.
[9] S. Murray. Beauvais Cathedral, architecture of transcendence. Princeton University Press, 1989.
[10] K. Nishino and K. Ikeuchi. Robust simultaneous registration of multiple range images. In The 5th Asian Conference on Computer Vision, pages 454–461, January 2002.
[11] K. Nuyts, J.-P. Kruth, B. Lauwers, H. N. M. Pollefeys, L. Qiongyan, J. Schouteden, P. Smars, K. V. Balen, L. V. Gool, and M. Vergauwen. Vision on conservation. In B. Fischer, K. Dawson-Howe, and C. O'Sullivan, editors, International Symposium on Virtual and Augmented Architecture (VAA 01), pages 125–132. Springer, June 2001.
[12] M. K. Reed and P. K. Allen. Constraint based sensor planning. IEEE Trans. on PAMI, 22(12):1460–1467, 2000.
[13] I. Stamos and P. K. Allen. 3-D model construction using range and image data. In Computer Vision & Pattern Recognition Conf. (CVPR), pages 531–536, June 2000.
[14] I. Stamos and P. K. Allen. Geometry and texture recovery of scenes of large scale. Computer Vision and Image Understanding (CVIU), 88(2):94–118, 2002.
[15] I. Stamos and M. Leordeanu. Automated feature-based range registration of urban scenes of large scale. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2003.
[16] Scanning and modeling the cathedral of saint pierre, beauvais, france. http://www.cs.columbia.edu/~allen/BEAUVAIS.
[17] The pietà project. http://www.research.ibm.com/pieta.