Inverse Procedural Modelling and Applications
Total Page:16
File Type:pdf, Size:1020Kb
DISS. ETH NO. 22224 Inverse Procedural Modelling and Applications A dissertation submitted to ETH ZURICH for the degree of Doctor of Sciences (Dr. sc. ETH Zurich)¨ presented by Julien Weissenberg MSc. Advanced Computing, Imperial College London Ingenieur´ diplomˆ e´ ENSIIE, Evry, France born July 11, 1986 citizen of France accepted on the recommendation of Prof. Dr. Luc Van Gool, examiner Prof. Dr. Nikos Paragios, co-examiner 2014 TO MY FAMILY. Abstract Architects, urban planners, thefilm and gaming industry as well as map makers are all very much interested in detailed, semantic city models. While generating a synthetic city has become much easier thanks to procedural modelling, automatically modelling an existing city remains a challenge. With procedural modelling, a building is described as a procedure, wherein an initial shape is successively refined according to a formal grammar set of rules. In this work, we present a set of methods to automatically instantiate and generate such sets of rules, from pictures. First, we present the applications and challenges behind city modelling. Shape grammars have greatly contributed to making city modelling easier. From there, a large spectrum of applications became possible, such as generating realistic large-scale virtual cities or fine-grained urban planning simulations. We introduce the shape grammar framework and explain how it is used to represent buildings. Nevertheless, when it comes to modelling an existing city, much of the work remains a tedious, manual task. Thefirst steps to an automatic pipeline include facade extraction and style detection. Next, we distinguish between two cases: thefirst, where a pre-defined set of rules is available and can be used to support detections of architectural elements; the second, where the set of rules is automatically inferred from a facade layout. In thefirst case, the set of rules is used as a powerful tool to support 3D reconstruction. In more detail, the goal is tofind the parameters of a grammar for a specific building structure. The rules are used to constraint the search and infer hard-to-detect parameters from initial detections. We demonstrate the use of rules for reconstruction of Doric style temples. We show how the initial Structure-from-Motion reconstruction and detections can be leveraged by using rules that make it possible to automatically restore the ruins. However, it is the exception that a pre-defined set of rules is available. In fact, architects systematically transgress the rules they establish. Furthermore, we must account for the wide spectrum of architectural styles and the very large number of buildings to be recon- structed. Therefore, automation of large-scale city reconstruction implies inferring these sets of rules automatically. We present a method to automatically infer such set of rules. In addition to reconstruction, we show how to use the inferred rules for compression, comparison and virtual facade synthesis. More than tools to reconstruct cities, this work tells about the logic of architecture. Fi- nally, it is a unique opportunity to analyze the relations between language and meaning. Resum´ e´ Les architectes, tout comme l’industrie du cinema´ et du jeu video,´ ou encore les car- tographes et urbanistes, ont besoin de moeles` de villes semantiques´ et detaill´ es.´ Si la modelisation´ procedurale´ a rendu la synthese` d’une ville virtuelle bien plus facile, la modelisation´ d’une ville existante reste un de´fi. En modelisation´ procedurale,´ un batimentˆ est de´fini par une procedure,´ qui affine progressivement un bloc de depart.´ Nous traitons dans cette these` d’un ensemble de methodes´ pour automatiquement determiner´ les regles` et leurs parametres` partir de photos. Tout d’abord, nous presentons´ les applications et de´fis lies´ la modelisation´ urbaine. Les grammaires formelles ont ouvert la voie a` la synthese` de megalopoles´ virtuelles ou encore de simulations urbaines precises.´ Nous presentons´ les grammaires de formes et expliquons comment les utiliser pour decrire´ un batiment.ˆ Neanmoins,´ lorsqu’il s’agit de repesenter´ des villes existantes, le travail reste en grande partie manuel. La detection´ du style et l’extraction des fac¸ades constituent les premieres` etapes´ vers une automatisation. Ensuite, il faut distinguer deux cas. Dans le premier, un ensemble de regles` est pre-d´ e´fini et sert ameliorer´ les detections´ des el´ ements´ architecturaux. Dans le second, les regles` sont determin´ ees´ automatiquement a` partir d’images de fac¸ades. Dans le premier cas, les regles` sont un outil puissant pour la reconstruction 3D. Plus precisement,´ le but est de trouver les parametres` d’une grammaire pour une structure de batimentˆ donnee.´ Les regles` permettent de restreindre l’optimisiation et ainsi de determiner´ des parametres` difficiles a` detecter´ a` partir des observations. Une etude´ de cas sur les temples doriques montre comment les regles` permettent de tirer partie d’une reconstruction 3D et des detections´ pour restaurer automatiquement les ruines. Neanmoins,´ la disponibilite´ de regles` releve` de l’exception. En fait, les architectes trans- gressent systematiquement´ les regles` qu’ils etablissent.´ De plus, il faut tenir compte du panel de styles et du grand nombre de batimentsˆ a` reconstruire. Par consequent,´ l’auto- matisation pour la reconstruction a` grande echelle´ passe par la creation´ automatique de telles regles.` Nous presentons´ une methode´ pour ce faire. De plus, nous montrons l’utilite´ de ces regles` afin de compresser, comparer et synthetiser´ des fac¸ades. Au dela` de la creation´ d’outils pour la reconstruction de villes, cette these` porte sur la logique memeˆ de l’architecture. Enfin, il s’agit d’une opportunite´ d’analyser les relations entre sens et langage. Acknowledgements Thank youfirst of all to Professor Luc Van Gool for offering me the opportunity of doing research at the Computer Vision Lab. His genuine passion and great knowledge about the topic made the PhD a delight. He also gave me the unvaluable freedom to explore my own research interests. Also, I would like to warmly thank Hayko Riemenschneider for making my contribution stronger, for his valuable guidance and human qualities. No others deserve more thank than everyone at CVL and from 3DCOFORM and VarCity projects, from Zurich to Leuven I will keep very joyful memories. Keeping in mind the importance of encouragements, I would especially like to thank my office collegues for making C108 the funniest office, of my friends and family for always being there and being who they are. Starting with the Computer Vision community I would like tofinish with the architecture one. Thank you very much to everyone at Information Architecture whose knowledge was very helpful. Contents List of Figures xiv List of Tables xx 1 City Modelling: applications, methods and challenges 1 1.1 Architecture and Urbanism . ......1 1.2 Digital fabrication . ...........3 1.3 Maps . .................4 1.4 Film and video games . .........6 1.5 Cultural heritage . ...........7 1.6 Challenges in City Modelling . ......8 1.7 Contributions . .............9 1.8 Overview . ............... 10 2 A language for architecture 11 2.1 3D data acquisition . .......... 11 2.1.1 LiDAR vs. Structure-from-Motion . 11 2.2 Airborne vs. Ground imagery . ...... 13 2.2.1 Picture acquisition . ...... 13 2.3 From visual realism to abstraction . ... 13 2.4 A language for architecture . ....... 14 2.4.1 Architectural theory . ...... 14 2.4.2 Computational design . .... 16 2.4.3 Shape grammars . ....... 17 2.4.4 L-systems . .......... 19 2.4.5 Split grammars . ........ 20 2.5 CGA shape grammars . ........ 21 2.5.1 Properties . .......... 22 2.5.2 Syntax . ............ 22 2.5.3 Shape . ............ 22 2.5.4 Operations . .......... 22 2.5.5 Derivation tree . ......... 23 CONTENTS xi 2.5.6 CGA example . ......... 23 2.6 GML . ................. 24 3 Approach 25 3.1 Pre-processing steps . ........ 25 3.1.1 Syle detection . ......... 25 3.1.2 Facade extraction . ....... 25 3.2 Top-down vs. bottom-up . ....... 25 3.3 Pipeline . ................ 27 4 Style recognition 29 4.1 Introduction . ............. 29 4.1.1 Related work . ......... 30 4.2 System overview . ........... 31 4.3 Scene classification . ....... 31 4.3.1 Scene classes . ......... 32 4.3.2 Feature extraction and classification . 32 4.3.3 Results . ............ 33 4.4 Image rectification . ........ 33 4.5 Facade splitting . ............ 35 4.5.1 Line segment detection and grouping . 35 4.5.2 Vertical line sweeping . ..... 36 4.5.3 Results . ............ 36 4.6 Style classification . ........ 37 4.6.1 NBNN algorithm . ........ 38 4.6.2 Results . ............ 38 4.7 Conclusion and future work . ...... 39 5 Facade segmentation 41 5.1 Introduction . ............. 41 5.2 Related Work . ............. 43 5.3 Datasets description . ......... 44 5.4 Bottom Layer: Recursive Neural Network for Semantic Segmentation 45 5.5 Middle Layer: Introducing Objects Through Detectors . 48 5.6 Top Layer: Weak Architectural Principles . 50 5.6.1 Parameters . .......... 52 5.7 Results . ................ 52 5.8 Conclusion and future work . ...... 54 6 Learning Where To Classify In Multi-View Semantic Segmentation 55 6.1 Introduction . ............. 55 6.2 Related Work . ............. 57 6.3 3D Surface and Semantic Classification . 60 CONTENTS xii 6.3.1 Multi-View Surface Reconstruction . 60 6.3.2 Heavy vs. Light Features for Semantic Labelling . 60 6.3.3 Multi-View Optimization For 3D Surface Labelling . 61 6.4 Multi-view Observation Importance . 63 6.4.1 Ranking Observations by Importance . 63 6.4.2 Reducing View Redundancy and Scene Coverage . 64 6.5 Experimental Evaluation . ....... 65 6.5.1 Single Discriminative Views - Zero Redundancy . 67 6.5.2 Reduction of Scene Coverage . 69 6.6 Conclusions . ............. 69 7 Rules for reconstructions 71 7.1 Introduction . ............. 71 7.2 Related work . ............. 73 7.3 Main system components . ....... 74 7.3.1 Grammar Interpreter . ..... 75 7.3.2 Asset detector . ......... 78 7.3.3 3D reconstruction module . ... 78 7.3.4 Vision module .