UNIVERSITY OF MELBOURNE

Melbourne School of Engineering

Procedural 3D Reconstruction and Quality Evaluation of Indoor Models

by

Ha Thi Thu Tran

A thesis submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy

in the Department of Infrastructure Engineering

September 2019

Declaration of Authorship

This is to certify that

1. the thesis comprises only my original work towards the PhD;

2. due acknowledgement has been made in the text to all other material used;

3. the thesis is less than 100,000 words in length, exclusive of tables, maps,

bibliographies and appendices.

Signed: Ha Thi Thu Tran

Date: September 10, 2019


Abstract

Building Information Modelling (BIM) plays an important role in the digital transformation of the construction sector and built environments. BIM promises to achieve better-quality infrastructure and to shorten the duration of construction projects, and also has additional value in the global infrastructure market, as it potentially provides more efficiency in collaboration, transparency and information management, and greater intelligence in the decision-making process during the whole lifetime of buildings. In addition, up-to-date 3D building models serve as a versatile data source for various applications such as energy simulation, navigation, location-based services, and emergency response. Today, 3D building models are available only for newly designed or recently constructed buildings. A large proportion of existing buildings have been in existence for many decades. Moreover, 3D models are often not updated to reflect changes in an existing building during the different stages of its lifecycle. Automated methods for efficient and reliable generation of 3D building models have the potential to expand the application domains to existing buildings. Although recent lidar scanning and photogrammetry techniques allow efficient capturing of the as-is condition of the built environment, the development of automated processes for the production of accurate, correct, and complete 3D models from the data remains a challenge. In addition, quantitative measurement of the quality of the 3D reconstructed models is essential, as it enables the assessment of the faithfulness of the models in representing the physical built environment.

This thesis aims to develop a novel approach to automated reconstruction of indoor models and to provide a comprehensive method for quantitative evaluation of the quality and change detection of indoor models. The thesis contains three main contributions. First, a shape grammar approach for procedural modelling of indoor environments containing Manhattan world designs from lidar data is proposed. The hypothesis of this research is that understanding and translating the principles of indoor architectural design into a modelling algorithm will provide the capability to overcome the challenges and assist the reconstruction of a 3D semantic-rich model. Second, a procedural method for automated reconstruction of generic


indoor models (i.e., Manhattan and non-Manhattan world buildings) using a stochastic approach is developed. The approach combines a shape grammar with a data-driven approach, which facilitates the automated application of grammar rules in the production process and enhances its robustness to incomplete and inaccurate input. Third, a comprehensive method for quality evaluation, comparison, and change detection of 3D indoor models is proposed. The evaluation method facilitates a quantitative assessment of the geometric quality of indoor models in terms of three quality aspects: completeness, correctness, and accuracy. The change detection method enables identification of redundant elements in existing 3D models and of missing elements in indoor environments. A series of experiments was carried out to evaluate the performance of the proposed methods on synthetic and real datasets, and the results show the capability of the methods for the reconstruction of complex indoor environments with high accuracy, completeness, and correctness.


Abbreviations

BIM Building Information Modelling

2D Two Dimensional

3D Three Dimensional

MW Manhattan World

Non-MW Non-Manhattan World

IFC Industry Foundation Classes

GIS Geographic Information System

RGB-D Red Green Blue - Depth

OGC Open Geospatial Consortium

CityGML City Geography Markup Language

IndoorGML Indoor Geography Markup Language

LOD Level of Detail

B-rep Boundary representation

CSG Constructive Solid Geometry

TLS Terrestrial laser scanner

MLS Mobile laser scanning

CGA Computer Generated Architecture

SfM Structure from Motion

SLAM Simultaneous Localisation and Mapping


L-system Lindenmayer systems

rjMCMC reversible jump Markov Chain Monte Carlo


Publications

List of peer-reviewed publications:

Refereed Journal Publications

Tran, H., Khoshelham, K. and Kealy, A., 2019. Geometric comparison and quality evaluation of 3D models of indoor environments. ISPRS Journal of Photogrammetry and Remote Sensing, 149, pp.29-39.

Tran, H., Khoshelham, K., Kealy, A. and Díaz-Vilariño, L., 2018. Shape Grammar Approach to 3D Modelling of Indoor Environments Using Point Clouds. Journal of Computing in Civil Engineering, 33(1), p.04018055.

Submitted Journal Publications

Tran, H. and Khoshelham, K., 2019. Procedural reconstruction of 3D indoor models from lidar data using reversible jump Markov Chain Monte Carlo. ISPRS Journal of Photogrammetry and Remote Sensing.

Peer-reviewed Conference Publications

Tran, H. and Khoshelham, K., 2019. A stochastic approach to automated reconstruction of 3D Models of interior spaces from point clouds. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Best paper award – Indoor3D workshop at ISPRS Geospatial Week 2019).

Tran, H., Khoshelham, K., Kealy, A. and Díaz-Vilariño, L., 2017. Extracting topological relations between indoor spaces from point clouds. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 4, p.401.

Tran, H. and Khoshelham, K., 2019. Building change detection through comparison of a lidar scan with a building information model. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences (ISPRS Geospatial Week 2019).


Khoshelham, K., Tran, H., Acharya, D., 2019, Indoor mapping eyewear: Geometric evaluation of spatial mapping capability of Hololens. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences (ISPRS Geospatial Week 2019).

Khoshelham, K., Tran, H., Díaz-Vilariño, L., Peter, M., Kang, Z. and Acharya, D., 2018. An evaluation framework for benchmark indoor modelling methods. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences, 42(4).


Acknowledgements

I would first like to express my grateful appreciation of my principal supervisor, Dr. Kourosh Khoshelham. I thank him deeply for the excellent supervision, great support and encouragement that he provided throughout the years of my doctoral studies. I would like to thank him not only for helping me to gain skills and enrich my knowledge of the field, but also for constructively challenging my initial ideas whilst at the same time allowing me the freedom to shape those ideas. His relentless guidance, patience, and kindness have inspired me to develop a deeper interest in and stronger motivation for doing research and other academic activities. It has truly been a great experience and I feel very fortunate to have worked under his supervision.

I would also like to thank my co-supervisor Professor Allison Kealy and the chair of my advisory committee Professor Stephan Winter for their support, suggestions, and encouragement. Special thanks go to Professor Stephan Winter for helping me to find my principal supervisor. Without his help and support, I might not have had the chance to do my doctoral studies at the University of Melbourne.

I would like to extend my acknowledgement to Professor Tuan Ngo, who gave me the opportunity to join an industrial project, which provided me with a better understanding and enabled me to see the potential of applying my knowledge in the construction sector. I also thank my mentors, Dr. Noel Faux, Dr. Behzad Bozorgtabar and Dr. Suman Sedai, for their guidance, support and sharing of industry experience during my three-month internship at IBM Research – Australia.

I also want to extend my deep gratitude to all my lab mates, Milad Ramezani, Fuqiang Gu, Debaditya Acharya, Yan Li (IE), Yan Li (EE), Hao Chen and Hanxian He, not only for fruitful discussion and exchange of useful information, but also for their encouragement and cheerful moments.


I feel very lucky to have friends who care about me and are always there for me when I need them. I would like to thank my close friends Dr. Nguyen Xuan Thu, Mr. John Drennan, Dr. Thu Phan, Mr. Sengor Kusturica, Dr. Ken-Ho Le Khoa, Dr. Kim Quy, and Dr. Kim Anh Thi Dang. Their friendship has been a great support during my studies, and they mean a lot to me in my life. I especially thank Dr. Thu Phan, who encouraged me to pursue my doctoral studies and provided great support during my application. I also particularly thank Mr. John Drennan for his great help with my writing and for proofreading my thesis.

Finally, I want to express my love for and appreciation of my parents, Mr. Tran Van Dung and Mrs. Tran Thi Vy, and my sister, Dr. Tran Thi Thanh Huyen. Their unconditional love and support have always been the greatest inspiration and motivation in my life. I sincerely dedicate this thesis to them.


Contents

Chapter 1 ...... 1

Introduction ...... 1

1.1 Problem statement and motivation ...... 2

1.2 Scope and research objectives ...... 6

1.3 Contributions ...... 8

1.4 Thesis structure ...... 9

Chapter 2 ...... 11

Literature review ...... 11

2.1 3D Data acquisition ...... 12

2.2 Automated reconstruction of 3D indoor models...... 16

2.2.1 Model representation in indoor modelling ...... 16

2.2.2 Reconstruction approaches ...... 18

2.3 Quality evaluation of 3D indoor models...... 25

2.4 Procedural modelling ...... 26

2.4.1 Formal grammar ...... 26

2.4.2 L-systems ...... 28

2.4.3 Shape grammar ...... 30

2.5 Summary ...... 31

Chapter 3 ...... 33

A shape grammar approach for modelling Manhattan world environments ...... 33

3.1 Overview of the approach ...... 34

3.2 The shape grammar ...... 35

3.2.1 Starting shape ...... 37

3.2.2 Grammar rules ...... 37

3.3 Procedural production ...... 42

3.4 Demonstration of model reconstruction ...... 44

3.5 Summary ...... 45


Chapter 4 ...... 46

Extension of the shape grammar for modelling of generic indoor environments ...... 46

4.1 Overview of the approach ...... 47

4.2 Space partitioning ...... 48

4.3 The indoor shape grammar ...... 52

4.3.1 Shapes ...... 52

4.3.2 Grammar rules ...... 53

4.4 Procedural model generation using rjMCMC ...... 58

4.4.1 Model probability ...... 60

4.4.2 Model transition...... 63

4.5 Demonstration of model reconstruction ...... 65

4.6 Summary ...... 66

Chapter 5 ...... 68

Geometric quality evaluation and change detection of 3D indoor models ...... 68

5.1 Geometric quality aspects of indoor models...... 69

5.2 Quantitative evaluation and comparison of indoor models ...... 71

5.2.1 Challenges in the comparison of indoor models ...... 71

5.2.2 Quality evaluation metrics ...... 73

5.2.3 Demonstration of quality evaluation of indoor models ...... 79

5.3 Building change detection ...... 82

5.3.1 A method for comparison of a 3D indoor model and a point cloud ...... 82

5.3.2 Demonstration of building change detection ...... 85

5.4 Summary ...... 88

Chapter 6 ...... 89

Experiments and results ...... 89

6.1 Datasets ...... 89

6.2 Evaluation of the shape grammar for MW buildings ...... 95

6.2.1 Qualitative evaluation ...... 96

6.2.2 Quantitative evaluation ...... 105


6.2.3 Compatibility with geometric data exchange standards ...... 111

6.3 Evaluation of the shape grammar for non-MW buildings ...... 112

6.3.1 Qualitative evaluation ...... 113

6.3.2 Quantitative evaluation ...... 117

6.4 Evaluation of the method for building change detection ...... 120

6.5 Summary ...... 124

Chapter 7 ...... 126

Conclusion and future work ...... 126

7.1 Contributions ...... 126

7.2 Limitations ...... 128

7.3 Future work ...... 129

Bibliography ...... 130


List of figures

Figure 2.1 Camera-based systems...... 13

Figure 2.2 RGB-D systems...... 14

Figure 2.3 Laser scanning systems...... 15

Figure 2.4 Model representation of indoor environment...... 17

Figure 2.5 Local data-driven approach for indoor modelling...... 20

Figure 2.6 A procedure-based approach for indoor modelling ...... 24

Figure 2.7 Generating a quadratic Koch island...... 29

Figure 2.8 Application of the Palladian shape grammar for generation of the layout of the Villa Malcontenta...... 31

Figure 3.1 Overview of the method ...... 35

Figure 3.2 Representation of a unit cube ...... 37

Figure 3.3 Location and size of shapes derived from the histograms of point coordinates ...... 38

Figure 3.4 Classification rule ...... 39

Figure 3.5 Merging non-terminal connected spaces...... 41

Figure 3.6 The production procedure...... 42

Figure 3.7 An example of the production procedure ...... 43

Figure 3.8 Modelling results of an indoor environment...... 44

Figure 4.1 Overview of the proposed approach ...... 47

Figure 4.2 Point clustering ...... 49

Figure 4.3 Local surface extraction ...... 50

Figure 4.4 Globally refined surfaces of the hexagon building ...... 51

Figure 4.5 3D cell decomposition ...... 51


Figure 4.6 Geometric representation of a 3D shape and its scope ...... 52

Figure 4.7 Reversible merging and splitting...... 54

Figure 4.8 Reversible classification-declassification processes...... 55

Figure 4.9 An example of topological relations between two non-terminal navigable spaces N1 and N2 ...... 57

Figure 4.10 An example of containment relation ...... 57

Figure 4.11 Production procedure...... 58

Figure 4.12 Examples of transitions between two models in the model space of an interior space...... 64

Figure 4.13 Reconstruction results for the synthetic hexagon building...... 66

Figure 5.1 An example of low geometric accuracy of the source (right) with respect to the reference (left) due to the discrepancy between the location of a wall (dark blue) in the models...... 69

Figure 5.2 An example of the incomplete source (right) with respect to the reference (left) due to a missing wall in the source...... 70

Figure 5.3 An example of the source model (right) that has low correctness in comparison with the reference model (left) due to the inclusion of an additional wall...... 70

Figure 5.4 Different interpretations of unobservable elements in 3D indoor models...... 72

Figure 5.5 Example of lack of one-to-one correspondence between the source and the reference elements ...... 72

Figure 5.6 An example of an inaccurate wall (dark blue) in the source (right) with respect to that in the reference (left)...... 73

Figure 5.7 An example of relevant source and reference surfaces...... 75

Figure 5.8 An example of sampled points of the reference model ...... 77

Figure 5.9 The synthetic dataset ...... 79

Figure 5.10 Quantitative evaluation of the source models in the synthetic dataset in terms of completeness (a), correctness (b), and accuracy (c)...... 81


Figure 5.11 An example of surface coverage...... 84

Figure 5.12 The synthetic dataset...... 85

Figure 5.13 Comparison results for the synthetic dataset...... 86

Figure 5.14 Comparison results of the synthetic dataset...... 87

Figure 6.1 The point clouds of the datasets...... 93

Figure 6.2 The ground truth models of the datasets...... 94

Figure 6.3 The reconstructed models ...... 97

Figure 6.4 Point clouds colorized according to the signed point-model distances ...... 99

Figure 6.5 Prediction of spaces that are not captured in the point cloud ...... 101

Figure 6.6 The grid coordinates of the cuboids...... 102

Figure 6.7 Topology graphs reconstructed for Office ...... 103

Figure 6.8 Connectivity relations (black solid lines) between the interior spaces of Office...... 103

Figure 6.9 Topology graphs and connectivity relations between the interior spaces reconstructed for House-1 (left column) and House-2 (right column)...... 104

Figure 6.10 Path planning (Room 1 → Corridor → Room 5) by applying a shortest path algorithm to the connectivity graph of Office...... 105

Figure 6.11 The reference models (left) with surfaces marked as either visible (light grey and yellow) or interpreted (dark grey) and the visible surfaces of the source models...... 106

Figure 6.12 Quantitative evaluation of the source models reconstructed from the ISPRS benchmark dataset...... 108

Figure 6.13 Localization of completeness and correctness errors of the UoM models...... 110

Figure 6.14 Sampled point cloud of the reference walls UoM colorized according to point-surface distances in accuracy measures...... 111


Figure 6.15 IFC data model of Office-1 (middle) with its hierarchical model tree (left) and the geometry information (right) of a selected wall (green) (viewed by Solibri Model Viewer)...... 111

Figure 6.16 Reconstructed models, including the 3D cell decomposition (left) and the final models (right) ...... 114

Figure 6.17 Topological graphs between interior spaces of the Museum...... 116

Figure 6.18 The connectivity relations (black solid line) between final spaces (i.e., rooms, corridors) of the Museum...... 117

Figure 6.19 Quality evaluation of the final models in terms of completeness (a), correctness (b) and accuracy (c)...... 118

Figure 6.20 Localization of completeness and correctness errors of the Museum...... 120

Figure 6.21 The ISPRS Benchmark dataset – TUB1...... 121

Figure 6.22 Comparison results of the TUB1 dataset ...... 122

Figure 6.23 Comparison results of the TUB1 of the ISPRS benchmark dataset ...... 123


List of tables

Table 6.1 Overview of the datasets ...... 92

Table 6.2 Parameter settings used in the experiments...... 96

Table 6.3 Statistics of signed point-model distances...... 100

Table 6.4 Accuracy of semantic reconstruction...... 100

Table 6.5 Parameter settings used in the experiments...... 112


Chapter 1 Introduction

Building Information Modelling (BIM) is a digital representation of construction and asset operations and is widely recognized as key to the digital transformation of the construction sector and built environments (EU BIM Task Group, 2017). Up-to-date 3D building models potentially increase collaboration, transparency and efficiency of information management, and improve the decision-making process during the whole lifetime of buildings (Hossain et al., 2018). However, at present these models are available only for newly designed or recently constructed buildings (EU BIM Task Group, 2017). A process to automatically produce up-to-date 3D building models from survey data (e.g., lidar data, imagery data) will have significant advantages in terms of time and cost efficiency and will enable the availability of 3D models for both new and existing built environments. While several approaches for automated generation of 3D building models have been developed, the current solutions are still far from practical, and automated 3D semantic modelling of buildings remains a challenge (Pătrăucean et al. 2015). Coupled with the development of a method for automated reconstruction of 3D models, measurement of the quality of the models is essential, as such measurement is useful for determining the faithfulness of the models in representing physical built environments as well as for facilitating comparison between reconstruction methods, thus enabling progress tracking of the research domain.

This introductory chapter highlights the importance of automated reconstruction of 3D semantic-rich models of indoor environments and the need for a comprehensive framework for quality evaluation of indoor models. Open


challenges are identified, followed by the main objectives and an overview of the contributions of this research.

1.1 Problem statement and motivation

First introduced in Architecture, Construction and Engineering (ACE) in 1992 (Nederveen and Tolman, 1992), BIM has now been recognized as a global trend and widely adopted across multiple disciplines, including the construction sector and built environments. The adoption of BIM promises not only large cost savings, but also additional value in the global infrastructure market (Eastman et al., 2011). Up-to-date 3D building models bring more intelligence and greater efficiency to decision-making processes and collaboration among stakeholders through each stage of the lifecycle of a built environment (EU BIM Task Group, 2017; Office of Projects Victoria, 2019). 3D building models accommodate efficient solutions, regarding both technologies and processes, for updating, maintaining, and distributing building information, which consequently can help to maximise the use of the information as well as minimise gaps and delays during a construction project. 3D building models also have great environmental benefits: with more efficient information management, the utilisation of construction materials can be improved, leading to a large reduction in waste going to landfill. 3D building models can be utilised as a versatile data source for energy analysis and for energy-sustainable solutions to problems related to the built environment (O'Donnell et al., 2019), which has been identified as one of the largest sources of gas emissions and energy consumption (Office of Projects Victoria, 2019). Moreover, 3D building models potentially provide a better understanding of the geometry and semantics of built environments, which can be greatly beneficial for various spatial analysis applications, such as pathfinding, augmented reality, location-based services, and emergency response (Tang et al. 2010).

3D building models are currently available only for new buildings at the design stage or for recently constructed buildings (EU BIM Task Group, 2017). This circumstance limits the development of BIM-based applications. On the one hand, there is usually an inconsistency between the as-is condition of a constructed building


and its as-designed 3D models, as the updating of 3D building models to reflect changes in the environment is often neglected during construction. On the other hand, many buildings have existed for a long time, and their 3D building models are mostly not available. For example, in Australia such buildings account for more than 90 percent of the nation's total building stock (Australian Building Codes Board, 2016). In the US, about 72 percent of the floor stock, equivalent to 46 billion square feet, has existed for many decades (Building Efficient Initiatives, 2010).

Spatial data acquisition techniques, such as range-based and image-based techniques, enable contactless capture of the as-is condition of built environments (Khoshelham, 2018). Among them, photogrammetry and laser scanning are widely used to effectively generate 3D representations of a built environment. In comparison with photogrammetry, laser scanning is more expensive, but it can produce data with higher accuracy (millimetres to a few centimetres) (Javadnejad et al. 2017; Lehtola et al., 2017). However, this data is unstructured and often has a very complex representation of geometry, describing a building as a set of millions of points or small triangular surfaces. Additionally, the lack of semantic information in the data makes it unsuitable for direct use in many practical applications. A conversion from the data to 3D building models, known as scan-to-BIM, is a potential solution to enable the availability of the models for both newly designed and existing buildings, thus providing impetus for the development of BIM-based applications.

Several geometric data exchange standards have been developed to standardize the data structures for a digital description of architecture and built assets. The main aim of these standards is to enable the sharing and exchange of building information among BIM-based systems. In these standards, the building information is generally organised in a hierarchical structure. However, there are inherent differences among them, such as in the representation of geometry (e.g., surface-based or volumetric representation), the intended users and types of applications, and the level of detail of the elements in a building model. Among the various components of a built environment, the focus of this research is the 3D description of building interiors. The OGC City Geography Markup Language


(CityGML) standard uses surface-based navigable and non-navigable spaces to describe a building interior at Level-of-Detail 4 (LoD-4) (Kolbe et al. 2005). In Industry Foundation Classes (IFC), an indoor model is defined as a hierarchy of specialised classes containing building elements (i.e. beam, chimney, door, wall, etc.) in volumetric or surface-based representations, spaces, semantic information and topological relations (Liebich 2009). Meanwhile, the OGC Indoor Geography Markup Language (IndoorGML) emphasises the description of indoor spatial information, and its focus is more on navigable spaces (e.g. rooms, corridors), where objects can navigate, and the relations between the spaces (Lee et al. 2014). Both surface-based and volumetric solid representations are allowed in this standard. In general, across the different standards for geometric data exchange, three components, namely building elements, navigable spaces and their topological relations (i.e., adjacency, connectivity, and containment), are the principal components of a 3D model of a building interior. Meanwhile, in comparison with surface-based representation, volumetric representation enables the description of an indoor model with a small set of parameters and is more suitable for manufacturing and other complex spatial analyses.

Manual or semi-automated processes for generating 3D models of indoor environments from laser scanning data have been implemented in commercial software (e.g., Edgewise Building) (Autodesk, 2019; ClearEdge3D, 2019). This commercial software often supports exporting the models into BIM file formats. However, these semi-automated or manual processes usually require a considerable amount of manual interaction, which is tedious, time-consuming, error-prone, and heavily dependent on the modeller's intuition (Brilakis et al. 2010; Volk et al. 2014). Meanwhile, automated reconstruction of 3D indoor models from various data sources has been the subject of extensive research for decades. Until now, approaches to automatic or semi-automatic reconstruction of 3D indoor models by interpreting survey data (e.g., point clouds, imagery, etc.) are still far from mature enough for practical applications. These applications usually require high geometric accuracy, rich semantic information, and compatibility with geometric data exchange standards (e.g., the IFC standard), which allow integration and interchange between BIM-based systems (Runne et al., 2001; Volk et al.,


2014). While there has been some success in automated reconstruction of facades and exterior of buildings (Musialski et al., 2013), indoor modelling has proved more challenging. These challenges lie in the complexity of interior geometry (Grussenmeyer et al. 2016), and incompleteness of data (Mura et al. 2016) due to the high amount of clutter representing non-structural elements such as furniture (Bassier et al. 2016).

Procedural modelling has been a powerful technique for the generation of geometry using rewriting systems (Ebert et al., 1998; Prusinkiewicz and Lindenmayer 1991). The technique produces a complex geometry by recursive replacement of simple primitives using a set of grammar rules. These grammar-based systems have been widely applied in multiple disciplines, from biology to urban reconstruction and architectural modelling. In biology, L-systems were introduced by the biologist Aristid Lindenmayer as the earliest attempt at generating geometry using procedural modelling (Lindenmayer, 1968). These systems allow parallel application of production rules, which is more suitable for describing the growth of plants and organic architecture than the architecture of buildings. The concept of shape grammar was later introduced by Stiny (1972), focusing on the generation of complex architecture from simple 2D and 3D shapes. Within the field of urban reconstruction, shape grammar-based systems have been used quite successfully for 3D modelling of architecture and structures. The technique exploits the regularity and arrangement of building elements both for the synthesis of architectural designs and for the reconstruction of existing architectural structures (Mitchell, 1990; Parish and Müller, 2001; Marvie et al., 2005; Dehbi et al., 2016). The Computer Generated Architecture (CGA) shape grammar was proposed for efficiently synthesizing highly detailed 3D façade models of complex buildings (Wonka et al., 2003; Müller et al., 2006). This grammar-based approach facilitates efficient generation of highly realistic 3D city models and was integrated into the commercial software Esri CityEngine (Esri, 2019a). The integration of a shape grammar with a data-driven interpretation is often used to facilitate the generation of 3D models of existing building façades (Dang et al., 2015; Dehbi et al., 2016).
Existing shape grammars for modelling outdoor environments and building façades cannot be


directly applied to the generation of 3D models of indoor environments, which usually contain a high level of geometric complexity, while the data capturing the environment often suffers from inherent noise and a high level of incompleteness.
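The parallel rewriting principle underlying these grammar-based systems can be illustrated with a minimal string L-system sketch (a toy example with hypothetical function names, not a grammar used in this thesis): each production step replaces every symbol of the current string simultaneously, so that a complex structure emerges from a simple starting shape (the axiom).

```python
# Minimal L-system sketch (illustrative only): parallel rewriting of a
# string of symbols according to a dictionary of production rules.

def rewrite(axiom, rules, iterations):
    """Apply all production rules in parallel for a number of iterations."""
    s = axiom
    for _ in range(iterations):
        # every symbol is replaced simultaneously (parallel rewriting);
        # symbols without a rule are copied unchanged (terminal symbols)
        s = "".join(rules.get(symbol, symbol) for symbol in s)
    return s

# Lindenmayer's original algae model: A -> AB, B -> A
algae_rules = {"A": "AB", "B": "A"}
print(rewrite("A", algae_rules, 4))  # prints "ABAABABA"
```

Shape grammars generalise this idea from strings of symbols to labelled 2D and 3D shapes, with rules that replace a shape by a configuration of shapes.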

Quality evaluation of 3D indoor models is essential to the development of indoor modelling. A quality evaluation system enables the measurement of the faithfulness of an indoor model in representing the physical structure of an indoor environment. This information is useful for deciding the applicability of an indoor model as well as of the modelling approach. Quality evaluation of indoor models is also needed for comparing the performance of reconstruction approaches; this comparison is useful for tracking the progress of the field. In addition to evaluating the quality of an indoor model in representing a physical environment, localizing the modelling errors, or the differences between the model and the environment, is necessary for many practical purposes as well as for identifying the typical limitations and errors of each modelling approach. Despite extensive research on automated indoor modelling, quality evaluation of indoor models has attracted little interest and a comprehensive evaluation system is not available.
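To make the notion of quantitative quality measures concrete, the following is a simplified point-based sketch (the function names, the nearest-neighbour formulation, and the distance threshold `tau` are illustrative assumptions, not the evaluation method developed in this thesis): completeness is taken as the fraction of reference points matched by the source within a distance threshold, and correctness as the converse.

```python
# Simplified point-based quality measures (illustrative sketch only).
import math

def nearest_distance(point, cloud):
    """Euclidean distance from a point to its nearest neighbour in a cloud."""
    return min(math.dist(point, q) for q in cloud)

def completeness(reference, source, tau):
    """Fraction of reference points matched by the source within threshold tau."""
    matched = sum(1 for p in reference if nearest_distance(p, source) <= tau)
    return matched / len(reference)

def correctness(reference, source, tau):
    """Fraction of source points that correspond to the reference within tau."""
    matched = sum(1 for p in source if nearest_distance(p, reference) <= tau)
    return matched / len(source)

# toy example: the source misses one of three reference points
ref = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
src = [(0.01, 0.0), (1.02, 0.0)]
print(completeness(ref, src, 0.05))  # 2 of 3 reference points matched
print(correctness(ref, src, 0.05))   # prints 1.0
```

Accuracy could similarly be summarised from the distances of the matched points, for example as their mean or root-mean-square value.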

1.2 Scope and research objectives

This research has two main aims: (1) the development of a shape grammar-based approach for the automated reconstruction of 3D models of indoor environments, and (2) quality evaluation and change detection of indoor models. This study focuses on the modelling of existing indoor environments from laser scanning data (i.e., point clouds) due to the high accuracy of the data in capturing the as-is condition of the environment. The evaluation of the geometric quality of 3D indoor models reconstructed by various modelling approaches will also be investigated.

Automated reconstruction of a 3D indoor model: An indoor environment generally comprises a large variety of architectural elements, which may be arranged in various ways and governed by different architectural principles, such as Manhattan and non-Manhattan world designs. While a Manhattan building is built in grid-like structures with regularly arranged elements, a non-Manhattan building contains elements with arbitrary orientations (Coughlan and Yuille, 1999). An indoor environment often contains furniture and moving objects, which can obscure the building structures from the measuring sensors. These situations can cause incompleteness and inaccuracy in the data. Reconstruction of a 3D model of an indoor environment containing various architectural designs (i.e., Manhattan and non-Manhattan) from incomplete and inaccurate data is a challenge, because the complexity of the indoor geometry and the deficiencies of the data can result in inaccurate models with missing elements. In this research, the integration of a shape grammar and a data-driven process for procedural indoor modelling will be explored. The hypothesis is that understanding and translating the principles of indoor architectural design into a modelling algorithm will provide the capability to overcome these challenges and assist the reconstruction of a 3D semantic-rich model. The common challenges in using procedural modelling for architecture and building structures lie in the requirement of a set of grammar rules, which are often manually defined and rely on expert knowledge, and of a production procedure indicating the order of rule applications. These two elements should be defined so as to enable the grammar to produce meaningful 3D models and to model various indoor architectural designs. The objectives of this study regarding the application of shape grammar in automated reconstruction of a 3D indoor model are as follows:

▪ Procedural 3D modelling for both Manhattan and non-Manhattan world designs of indoor environments.
▪ Procedural modelling with automated application of the grammar rules during a reconstruction process.
▪ An indoor modelling approach which is robust to the inherent noise and incompleteness of the data.
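To make the Manhattan-world distinction above concrete, the following minimal sketch (illustrative only; the function name and tolerance are not part of the proposed method) tests whether a set of wall orientations is consistent with a single pair of orthogonal dominant directions:

```python
def is_manhattan(wall_azimuths_deg, tol_deg=5.0):
    """Return True if every wall azimuth (in degrees) is parallel or
    orthogonal to the first wall's direction, within tol_deg -- the
    grid-like regularity of a Manhattan-world building."""
    if not wall_azimuths_deg:
        return True
    ref = wall_azimuths_deg[0]
    for az in wall_azimuths_deg:
        # Fold the angular difference into [0, 45]: 0 means the wall is
        # parallel or orthogonal to the reference direction.
        d = abs(az - ref) % 90.0
        d = min(d, 90.0 - d)
        if d > tol_deg:
            return False
    return True
```

A non-Manhattan design, e.g. a wall at 45° to the others, fails this test and requires a modelling approach that supports arbitrary orientations.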

Quality evaluation and change detection of indoor models: Generally, there is no unique way to represent a 3D model of an indoor environment. In fact, in different standards of geometric data exchange, an indoor environment can be represented as a surface-based model or a volumetric model. A surface-based model represents the environment by a set of surfaces, while a volumetric model represents it as a set of parametric volumes. Comparison between these two representations of the same indoor environment can be a challenge. Moreover, in different modelling approaches, a 3D indoor model can be reconstructed from different types of data, each representing the as-is condition of the physical environment. As a result, a comparison between 3D models of an indoor environment and its available survey data can be biased: it is likely to reveal the quality of the data rather than the quality of the indoor models. Meanwhile, 3D models of an indoor environment are often not available for change detection of a building at different stages of the building's lifecycle. Recognising these challenges in indoor modelling, this study aims to achieve the following objectives in order to provide insight into the quality of an indoor model:

▪ A comprehensive quality evaluation method, which is applicable to both surface-based and volumetric models regardless of the types of input data.
▪ Methods for localization of the modelling errors as well as for detecting the changes between the model and the physical environment.

1.3 Contributions

To achieve the above objectives, in this research, an approach for procedural 3D reconstruction of indoor models from laser scanning data is developed, and a comprehensive framework for quantitative evaluation of the geometric quality of indoor models is proposed. A method for localising modelling errors as well as for detecting the differences between an indoor model and the physical environment is also implemented. In summary, the main contributions are as follows:

Automated reconstruction of 3D indoor models:

▪ A shape grammar approach, the first in the literature, capable of generating 3D models of indoor environments containing not only the structural elements but also the navigable spaces and the topological relations between them. The models are represented in a hierarchical structure, compatible with geometric data exchange standards.
▪ A shape grammar applicable to both Manhattan and non-Manhattan world designs of indoor environments.


▪ An integration of a shape grammar and a data-driven process to enhance the robustness of the procedural modelling of indoor environments, as well as to facilitate the automated application of grammar rules.
▪ An approach to infer and predict interior structures from missing and incomplete data, which enhances the robustness of the indoor modelling method to the inherent noise and incompleteness of the data.

Quality evaluation and change detection of indoor models:

▪ A comprehensive framework, also the first in the literature, taking into account all quality aspects (i.e., completeness, correctness, and accuracy) of the geometry of 3D indoor models.
▪ An approach for automated evaluation and comparison of the geometric quality of 3D models of indoor environments based on comparison with a ground truth reference model.
▪ Quality metrics for a quantitative description of the quality of an indoor model.
▪ Methods for localization of the modelling errors and building change detection based on the comparison between two 3D indoor models and between a 3D indoor model and a point cloud.
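As a rough, element-level illustration of the three quality aspects named above, the sketch below computes completeness, correctness, and accuracy from counts of matched elements and their geometric deviations. The counting scheme is a deliberate simplification; the actual metrics are developed in Chapter 5.

```python
def quality_metrics(n_reference, n_reconstructed, n_matched, deviations):
    """Toy element-level quality measures:
    completeness -- fraction of reference elements present in the model;
    correctness  -- fraction of model elements present in the reference;
    accuracy     -- mean geometric deviation of matched elements (metres).
    """
    completeness = n_matched / n_reference if n_reference else 1.0
    correctness = n_matched / n_reconstructed if n_reconstructed else 1.0
    accuracy = sum(deviations) / len(deviations) if deviations else 0.0
    return completeness, correctness, accuracy
```

A model can score high on one aspect and low on another, e.g. a model with few, very precisely placed walls is accurate but incomplete, which is why all three aspects must be reported together.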

1.4 Thesis structure

The thesis is organized into seven chapters, including this introductory chapter, as follows:

Chapter 2 presents a literature review of works related to indoor modelling, including 3D data acquisition, methods for automated reconstruction of indoor environments, and state-of-the-art quality evaluation of 3D indoor models. The background of procedural modelling and shape grammars, which are the main techniques used in this study, is reviewed and discussed.

Chapter 3 details a shape grammar-based approach for modelling building interiors with Manhattan world designs. The modelling approach starts with the placement of shapes into the indoor environment, guided by the potential positions of building surfaces estimated from the point distribution, followed by a classification to generate the semantic information and the establishment of topological relations between the indoor elements of a 3D model. The process is governed by a shape grammar. In this chapter, the components of the grammar, consisting of a starting shape, a set of grammar rules, and a production procedure, are defined.

Chapter 4 describes a procedural reconstruction of indoor environments based on the combination of a shape grammar and a data-driven process using a stochastic algorithm. A more robust algorithm for multi-scale extraction of potential building surfaces is introduced. The extended shape grammar, which is applicable to both Manhattan and non-Manhattan world designs, is also presented. A mechanism to integrate the shape grammar and a data-driven process using a stochastic approach, i.e., a variant of reversible jump Markov Chain Monte Carlo (rjMCMC), is proposed. The proposed approach aims for procedural reconstruction of indoor environments from incomplete and inaccurate lidar point clouds.

Chapter 5 presents a method for automatic evaluation of the geometric quality of 3D models of indoor environments based on comparison with a ground truth reference model. The quality aspects of indoor models and the challenges in measuring the quality are identified. Quality metrics, which potentially enable quantitative evaluation and geometric comparison of indoor models, are proposed. The evaluation and geometric comparison of an indoor model with a reference facilitate not only quantitative evaluation of the geometric quality of the model but also detection and temporal analysis of building changes. A method for building change detection based on the comparison of a 3D model and a point cloud is also presented.

Chapter 6 provides an evaluation of the proposed approaches for automated reconstruction of 3D indoor models and of the approaches for quality evaluation and change detection of indoor models. A series of experiments and results on both synthetic and real data sets are presented and discussed.

Chapter 7 concludes this thesis and suggests future research directions.


Chapter 2 Literature review

Indoor modelling has been of great interest in photogrammetry, computer vision, and computer graphics for decades (Pătrăucean et al., 2015; Volk et al., 2014). Generally, the modelling of an indoor environment involves three steps: (1) the measurement of the geometry and appearance of the environment via a data capturing instrument, (2) the conversion from the data to a 3D semantic-rich model via a modelling process, and (3) the quality evaluation of the reconstructed model.

Photogrammetry and laser scanning are two common techniques used in capturing the as-is condition of a building environment. Each of the techniques has its own pros and cons. However, due to the architectural features and inherent characteristics of an indoor environment, lidar data is favoured over photogrammetric data in indoor modelling. Among many possible approaches, automated reconstruction of 3D indoor models from lidar data will be the main interest in this study. These modelling approaches can be distinguished by several aspects, such as the representation of the output model (e.g., surface-based versus volumetric representation), the modelling strategies (e.g., procedure-based versus data-driven approaches), and the data features used (e.g., global versus local features). The demand for automated, cost-effective and time-efficient indoor modelling methods leads to a need for performance evaluation of these methods. However, very little effort has been spent on quality evaluation of indoor models: the evaluation is usually based only on visual inspection, and there is no common criterion for comparison between indoor models.

This chapter provides an overview of data acquisition techniques for capturing indoor environments, followed by a review of existing approaches for automated reconstruction of indoor models from lidar data. A review of the existing methods for quality evaluation of indoor models is also provided. Finally, for completeness, the background on procedural modelling, which is the main technique applied in this study, will be summarized.

2.1 3D Data acquisition

Traditional survey techniques (e.g., tapes, callipers) require manual contact measurement of physical objects and structures, which is time-consuming, laborious, and error-prone. State-of-the-art spatial data acquisition techniques (i.e., range-based and image-based techniques) enable non-contact and effective measurement of physical objects and structures. They allow us to capture not only spatial and reflectivity information, but also the colours of physical objects in general and building interiors in particular. In general, acquisition techniques for digital 3D representation of building interiors can be sub-divided into three categories based on the type of sensor: (1) camera-based systems, (2) RGB-Depth (RGB-D) sensors, and (3) laser scanning systems.

A camera-based system captures and merges still images to generate 3D representations of the interior of buildings (Furukawa et al., 2009; Lehtola et al., 2016) using Structure from Motion (SfM) (Furukawa and Ponce, 2010) and/or dense matching techniques (Hirschmuller, 2008). In general, these techniques need a large number of images in order to produce data with high coverage of an indoor environment, which is usually characterized by geometric complexity and a high level of clutter (Pătrăucean et al., 2015). A certain overlap across several images is needed in order to guarantee the success of the merging process, which is based on matching feature points extracted from them (Lehtola et al., 2017). Therefore, this solution is likely to suffer from shadows, changes in illumination conditions, and the presence of poorly textured surfaces (Khoshelham, 2018; Becker et al., 2018), which are common features of indoor environments. In other words, this technique does not always guarantee successful capture of a 3D representation of indoor environments (Furukawa et al., 2009) and is more suitable for mapping outdoor environments. While the recent development of photogrammetric equipment (e.g., cameras) facilitates a cost-effective photogrammetric solution for capturing indoor environments, the data processing is still an off-line process often involving human interaction, which is error-prone and time-consuming (Volk et al., 2014). The accuracy of the generated data is moderate, at centimetre level (Khoshelham, 2018).


In practice, a camera-based system can be either a static system, e.g. a tripod-mounted camera, or a mobile system, such as an Unmanned Aerial Vehicle (UAV) with a camera. Figure 2.1 shows examples of the different types of camera-based systems, including a tripod-mounted Sony A6000 camera (Sony, 2019) and a UAV-borne camera, the Microdrone Md4 1000 (Microdrones, 2019).

Figure 2.1 Camera-based systems: (a) tripod-mounted camera (Sony, 2019); (b) a UAV-borne camera (Microdrones, 2019).

RGB-depth sensors, such as Microsoft's Kinect sensor (Microsoft, 2019a), SwissRanger (MESA Imaging, 2019), and PMD (PMD, 2019), have also been used in 3D indoor mapping systems (Breuer et al., 2007; Lichti et al., 2008; Lindner et al., 2010). These techniques enable the generation of 3D representations of indoor environments as coloured point clouds based on the integration of depth information and colour images (Khoshelham and Elberink, 2012). This integration facilitates real-time indoor mapping and has advantages in handling poorly textured structures. In general, these devices provide data with a moderate level of accuracy (a few centimetres), and they often have inherent limitations in range (≤ 10 m) (Khoshelham and Elberink, 2012; Lehtola et al., 2017). Similar to camera-based systems, RGB-D sensors are nowadays integrated on both static and mobile platforms (Khoshelham et al., 2019). A mobile sensor (e.g., a handheld sensor) provides flexibility by allowing the user to move around an environment during the capturing process. The data processing is based on Simultaneous Localisation and Mapping (SLAM) algorithms to continuously estimate the location of the sensor and create a map of the environment (Santos et al., 2016; Khoshelham et al., 2013). Consequently, it enables direct generation of


dense point clouds across different areas on a large scale. However, the technique produces data which is inherently less accurate than static sensor data. Figure 2.2 shows examples of a static RGB-D sensor-based system, i.e., Matterport (Matterport, 2019), and a mobile head-mounted mapping system, i.e., Microsoft HoloLens (Microsoft, 2019b).

Figure 2.2 RGB-D systems: (a) Matterport; (b) Microsoft's HoloLens.

The laser scanning technique generates a 3D representation of a building interior based on the measurement of the distances from nearby building surfaces to the sensor in various directions (Pesci et al., 2011). The technique allows the user to directly generate a 3D representation of an indoor environment as a point cloud, which may contain millions of points, with a high level of accuracy ranging from millimetres to a few centimetres (Khoshelham, 2018). Among laser scanning techniques, a terrestrial laser scanner (TLS) can provide the most precise data (millimetre-level accuracy) (Pesci et al., 2011). The scanner is mounted on a stationary tripod during a scanning process. Due to the presence of clutter in indoor environments, building surfaces are usually obstructed from the sensors; consequently, the data captured at one station may incompletely represent the environment. It is often necessary to capture multiple scans at various locations to generate 3D data with high coverage, so the data capturing process is usually time-consuming and laborious. Thanks to the development of automated or semi-automated registration solutions, multiple scans can be efficiently merged to provide better completeness of the point cloud. Meanwhile, SLAM-based laser scanning provides flexibility in producing complete data of a complex building on a large scale (Bosse et al., 2012; Zhang and Singh, 2014; Liu et al., 2010). In practice, there is a variety of mobile laser scanning (MLS) systems


developed for efficiently capturing various indoor environments (Xiao and Furukawa, 2012; Liu et al., 2010), such as pushcart systems, e.g., the NavVis 3D Mapping Trolley (NavVis, 2019), handheld scanners, e.g., the Zeb Revo RT (GeoSLAM, 2019), and backpack solutions, e.g., the Pegasus Backpack (Leica, 2019a). Nowadays, laser scanners are often integrated with a digital camera, which is used to capture images of the environment. The image data can be fused with the laser scanning data to generate a coloured point cloud of an indoor environment. Figure 2.3 shows examples of laser scanning systems, including the Faro Focus TLS (Faro, 2019), Zeb Revo RT (GeoSLAM, 2019) and NavVis 3D MLS (NavVis, 2019) systems.

Figure 2.3 Laser scanning systems: (a) Faro Focus sensor; (b) Zeb Revo RT; (c) NavVis 3D Trolley.
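The distance-and-direction measurement principle described above reduces, per returned pulse, to a spherical-to-Cartesian conversion. The sketch below shows only this generic relation; real scanners apply additional per-instrument calibration.

```python
import math

def range_to_point(distance, horizontal_angle, vertical_angle):
    """Convert one laser range measurement (angles in radians) to a 3D
    point in the scanner's local frame. A full scan repeats this over a
    dense grid of directions, yielding the point cloud."""
    x = distance * math.cos(vertical_angle) * math.cos(horizontal_angle)
    y = distance * math.cos(vertical_angle) * math.sin(horizontal_angle)
    z = distance * math.sin(vertical_angle)
    return (x, y, z)
```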

In general, the laser scanning solution is more expensive in comparison with camera-based systems and RGB-D solutions. However, it produces data with higher accuracy (millimetres to a few centimetres) and higher resolution (millimetres), as well as a longer range capacity (several hundred metres) (Lehtola et al., 2017; Tang et al., 2010). In practice, the performance of automated reconstruction of indoor models heavily depends on the quality of the input data (Tang et al., 2010). Therefore, despite its high cost, the laser scanning technique is preferred and widely applied in indoor modelling.


2.2 Automated reconstruction of 3D indoor models

Automated reconstruction of indoor models has mainly focused on the generation of three types of information: semantics, geometry, and topological relations. Semantic information indicates the type of indoor elements (e.g., building elements, navigable spaces). Topological relations show the spatial relations between elements (e.g., adjacency, connectivity, containment). The geometric information describes the shape of indoor elements, which can be represented in different ways (e.g., surface-based and volumetric representations). This section discusses automated reconstruction of indoor models from lidar data with respect to model representation, modelling strategies (e.g., data-driven and procedure-based strategies), and data features (e.g., global and local features) used in reconstruction algorithms.
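The three information types can be pictured with a minimal, hypothetical data structure; the class below is purely illustrative and follows no particular exchange standard.

```python
from dataclasses import dataclass, field

@dataclass
class IndoorElement:
    """One indoor element carrying semantics (a type label), geometry
    (reduced to a bounding box here), and topology (ids of adjacent
    elements)."""
    id: str
    semantic_type: str  # e.g. "wall", "room", "door"
    bbox: tuple         # (xmin, ymin, zmin, xmax, ymax, zmax)
    adjacent: set = field(default_factory=set)

def connect(a, b):
    """Record a symmetric adjacency (topological) relation."""
    a.adjacent.add(b.id)
    b.adjacent.add(a.id)
```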

2.2.1 Model representation in indoor modelling

Surface-based representation and volumetric representation are two common descriptions of an indoor model, which are supported by different standards (e.g., IFC, CityGML). Indeed, standards for geometric data exchange, such as the Industry Foundation Classes (IFC), define various representations (e.g., volumetric and surface-based) of indoor elements (Liebich et al., 2009), whereas Geographic Information System (GIS) standards, such as the OGC City Geography Markup Language (CityGML), define a surface-based representation (Kolbe et al., 2005; Bacharach, 2008). The surface-based representation describes an indoor model as an arrangement of a set of building surfaces (Tang et al., 2010; Pătrăucean et al., 2015). Meanwhile, volumetric representations describe an indoor model based on Constructive Solid Geometry (CSG) or on boundary representation (B-rep) (Liebich et al., 2009). In CSG, a complex shape is presented as a combination of simple primitives using Boolean operations (e.g., union and intersection). Figure 2.4 shows an example of a volumetric model in the IFC standard and a surface-based model in CityGML of a simple building (Nagel et al., 2009).


Figure 2.4 Model representation of an indoor environment: (a) volumetric representation; (b) surface-based representation (Nagel et al., 2009).
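The CSG idea of combining primitives with Boolean operations can be sketched via point-membership classification on a toy 2D CSG tree (a didactic stand-in for a geometry kernel; the node encoding is invented for this example):

```python
def in_box(p, box):
    """Axis-aligned box membership; box = (xmin, ymin, xmax, ymax)."""
    x, y = p
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]

def in_csg(p, node):
    """Evaluate a tiny CSG tree at point p. A node is either
    ("box", box) or (op, left, right) with op in
    {"union", "intersection", "difference"}."""
    if node[0] == "box":
        return in_box(p, node[1])
    op, left, right = node
    a, b = in_csg(p, left), in_csg(p, right)
    if op == "union":
        return a or b
    if op == "intersection":
        return a and b
    return a and not b  # difference: in left but not in right
```

For instance, an L-shaped footprint is the union of two overlapping boxes, and a room with an interior column is the difference of two boxes, mirroring how complex building volumes are assembled from simple primitives.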

In the literature, most existing approaches reconstruct surface-based models from point clouds by extracting planar surfaces of building elements (Sanchez and Zakhor, 2012; Xiong et al., 2013; Previtali et al., 2014; Díaz-Vilariño et al., 2015). These methods mainly focus on the reconstruction of the geometry of building elements such as walls, ceilings, and floors (Budroni and Boehm, 2010; Adan and Huber, 2011; Thomson and Boehm, 2015). This geometric information can be further interpreted to generate semantic information. These interpretations are usually based on specific assumptions, such as the location of walls at room boundaries (Valero et al., 2012; Hong et al., 2015), or a condition on the minimum height of ceilings (Macher et al., 2017). In practice, the surface-based representation has flexibility in representing free-form indoor elements (e.g., walls, ceilings, and floors). However, such models are not suitable for many complex spatial analyses, such as energy simulation and manufacturing purposes, which usually require volume information (Pătrăucean et al., 2015). Additionally, surface-based models often do not include interior spaces and their topological relations, which are main components of 3D models and a prerequisite for a variety of applications such as navigation and emergency response (Kolbe et al., 2005; Liebich, 2009; Lee et al., 2014).

Some recent approaches reconstruct 3D volumetric models of the indoor environment, which is described as a combination of volumetric entities (Volk et al., 2014; Pătrăucean et al., 2015). In general, an indoor environment is first sub-divided into sub-spaces (e.g., cells, segments). These sub-volumes are then combined using Boolean operations (e.g., union, intersection) under certain constraints to successively generate the final volumetric elements. Meanwhile,

the subdivision is usually based on the geometric information of building surfaces. For example, Jenke et al. (2009) proposed a statistical method for reconstructing volumetric indoor models as a set of cuboid shapes, each constructed from at least five detected wall surfaces. Xiao and Furukawa (2014) generated 3D volumetric indoor models by stacking horizontal volumes, which are segmented from the 3D point cloud based on the locations of horizontal structures (e.g., ceilings and floors). In Oesau et al. (2014), a 3D model of an indoor space is formed as a union of the navigable cells matching predefined conditions; the cells are generated by decomposing the point cloud based on the arrangement of building surfaces (e.g., walls, ceilings, and floors). Meanwhile, Mura et al. (2014) reconstructed 3D geometric models as the union of volumetric rooms extruded from a 2D cell complex generated by the intersection of line segments of projected wall candidates in a 2D plane. Later, Mura et al. (2016) generated volumetric interior spaces as a combination of a set of 3D cells decomposed from an arrangement of permanent building structures.

In general, a volumetric representation and a surface-based representation are not always interconvertible. The conversion from a volumetric model to its surface-based representation is straightforward, as a volume is inherently bounded by a set of surfaces. The inverse process, however, requires the interpretation of individual volumes from the surfaces in the surface-based model, which heavily depends on the model's complexity and topological consistency. In comparison with a surface-based model, a volumetric model is closer to human intuition in describing a building interior. In many cases, a volumetric model can be described by a small set of parameters, and is therefore more suitable for complex spatial analyses such as location-based services, energy analysis, and manufacturing.

2.2.2 Reconstruction approaches

Recent approaches for the reconstruction of as-is 3D models of indoor environments can be classified into two main categories: (1) data-driven


approaches and (2) procedure-based approaches. This classification is based on the data sources and the prior knowledge used in the reconstruction process.

2.2.2.1 Data-driven approaches

A data-driven approach generates 3D models of building interiors based only on direct interpretation of the input data (e.g., a point cloud). These approaches can be further sub-divided into two classes based on the data characteristics used in the modelling algorithms: (1) local methods and (2) global methods.

A local approach often considers only local data characteristics (point density, orientation, dimension, etc.) within a local neighbourhood (e.g., plane segments, cells) (Adan and Huber, 2011; Díaz-Vilariño et al., 2015; Hong et al., 2015). For example, in Sanchez and Zakhor (2012), surface-based building models (e.g., walls, ceilings, floors) are reconstructed as an arrangement of planar primitives; the planar surfaces are extracted using local plane fitting from separate clusters of points which have similar normals. Similarly, in order to model a Manhattan indoor environment, Xiong et al. (2013) first decompose a point cloud into a set of 3D grids (voxels), and the geometries of building surfaces are estimated locally from the voxelized data. The reconstruction of the building geometry in one part of the building (e.g., a room or segment) does not influence the reconstruction at other parts of the building and vice versa. The semantic information is further interpreted based on the reconstructed geometry and the contextual relationships within the local neighbourhood. For example, in Xiong et al. (2013), the semantic information of building elements (e.g., walls, ceilings, floors) is derived based on the intrinsic features (e.g., dimension, orientation) of each individual surface and its contextual relationships (e.g., parallelism, orthogonality) with nearby neighbours. A similar strategy can be found in Macher et al. (2017) and Nikoohemat et al. (2018), where the authors favoured the combination of geometric features of planar surfaces and their contextual constraints (e.g., distances, parallelism), together with their adjacency relationships, in the reconstruction of the geometry and semantics of 3D indoor models. Figure 2.5 illustrates the generation of a volumetric model using a local


approach proposed in Macher et al. (2017). The volumetric model is interpreted from surfaces, which are extracted within room segments, and their contextual constraints.

Figure 2.5 Local data-driven approach for indoor modelling: (a) room segmentation; (b) surface extraction within a room segment; (c) final volumetric model (Macher et al., 2017).
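The local plane-fitting step described above can be illustrated with a simplified least-squares fit of z = ax + by + c to a point cluster. Real pipelines fit general planes, typically with PCA or RANSAC and numerical libraries; this standard-library sketch solves the 3x3 normal equations with Cramer's rule.

```python
def fit_plane_z(points):
    """Least-squares fit of z = a*x + b*y + c to a cluster of (x, y, z)
    points. Returns (a, b, c). Degenerate (vertical) planes are outside
    the scope of this simplified parameterization."""
    sx = sy = sz = sxx = syy = sxy = sxz = syz = 0.0
    n = float(len(points))
    for x, y, z in points:
        sx += x; sy += y; sz += z
        sxx += x * x; syy += y * y; sxy += x * y
        sxz += x * z; syz += y * z
    # Normal equations: A @ [a, b, c]^T = v
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    v = [sxz, syz, sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    coeffs = []
    for col in range(3):  # Cramer's rule, one unknown per column swap
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = v[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)
```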

A global approach, on the other hand, enforces coherence among element characteristics in a more holistic way. It facilitates the integration of the global plausibility of the reconstructed model given the data (i.e., a point cloud), which lies beyond the relations and constraints within a local neighbourhood (Previtali et al., 2018; Ochmann et al., 2019). For example, in Oesau et al. (2014), the reconstruction of a surface-based model is formulated as a global optimization which maximizes the fitness of the model's surfaces to the input point cloud. Similarly, Mura et al. (2016) reconstructed volumetric indoor models by solving a multi-label optimization which maximizes the visibility overlaps from different viewpoints in the indoor spaces and the point coverage of vertical surfaces. Ochmann et al. (2016) maximized the coverage of points which are orthogonally


projected onto the floor plan in order to distinguish between exterior and interior building elements. In Ochmann et al. (2019), the authors optimized the total volume of navigable spaces (i.e., rooms, corridors) based on the supporting points of their bounding surfaces and the probability of the surfaces' visibility from locations inside each space. Generally, in a global approach, the reconstruction of building elements in one building part can be influenced by the data quality and the reconstruction of the other building parts. This can help complete the reconstruction of building parts which are captured with low-quality data.
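The flavour of these global objectives can be sketched as a point-coverage score evaluated over a whole candidate wall layout and maximized across hypotheses. Walls are reduced to axis-aligned 2D planes purely for illustration; the function names and the 5 cm threshold are invented for this example.

```python
def point_coverage(points, walls, eps=0.05):
    """Global fitness of a candidate wall layout: the fraction of 2D
    points lying within eps of at least one wall. A wall is encoded as
    ("x", c) or ("y", c), meaning the line x = c or y = c."""
    covered = 0
    for x, y in points:
        for axis, c in walls:
            v = x if axis == "x" else y
            if abs(v - c) <= eps:
                covered += 1
                break
    return covered / len(points)

def best_layout(points, candidate_layouts, eps=0.05):
    """Model selection: pick the layout maximizing global coverage."""
    return max(candidate_layouts,
               key=lambda w: point_coverage(points, w, eps))
```

Because the score is computed over the whole model at once, a layout that explains sparsely captured walls can still win, which is exactly the complementation effect the paragraph above describes.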

In general, data-driven approaches depend heavily on data quality. A local approach is suitable for the reconstruction of indoor environments which are well observable and captured with high-quality data. A global approach is less susceptible to noise and occlusion and more robust to clutter, due to the integration of the global coherence of the model with the given input. However, this integration sometimes negatively influences the reconstruction of well-captured building sections. Intuitively, the local data characteristics and the global plausibility of the model are not contradictory; instead, they are complementary. Therefore, the combination of local data characteristics and the model's global plausibility can potentially enhance the robustness and coherence of the reconstruction of 3D indoor models from an input point cloud.

2.2.2.2 Procedure-based approaches

Procedure-based approaches exploit the regularity and repetition of spaces and architectural design principles in the reconstruction process. These approaches generate complex geometries by recursively replacing simple primitives (e.g., cuboids, cylinders) according to a set of rewriting rules. The background information and mathematical principles behind procedural modelling will be detailed later in this chapter (Section 2.4, Procedural modelling).
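A minimal example of such recursive rewriting: a single split rule replaces one cuboid footprint with two, applied until a stopping criterion holds. Both the rule and the minimum-width criterion below are invented for illustration and are not a grammar from the literature.

```python
def split_x(box, at):
    """Rewriting rule: replace one footprint (xmin, ymin, xmax, ymax)
    with two footprints split at x = at."""
    xmin, ymin, xmax, ymax = box
    return [(xmin, ymin, at, ymax), (at, ymin, xmax, ymax)]

def derive(boxes, min_width=2.0):
    """Toy production procedure: recursively apply the split rule at the
    midpoint until no footprint is wider than min_width."""
    out = []
    for b in boxes:
        width = b[2] - b[0]
        if width > min_width:
            out.extend(derive(split_x(b, b[0] + width / 2.0), min_width))
        else:
            out.append(b)
    return out
```

Starting from one bounding footprint, the derivation yields a set of room-sized cells, which is the basic mechanism behind the split grammars discussed below.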

Within the field of urban modelling, procedure-based approaches, based on procedural modelling in general and shape grammars in particular, are powerful tools


for synthesizing and modelling architectural designs and structures (Mitchell, 1990; Parish and Müller, 2001; Marvie et al., 2005). Parish and Müller (2001) modelled streets and building façades by applying L-systems (Prusinkiewicz and Lindenmayer, 1991); the method transferred knowledge of the growth and arrangement of plant branches to the modelling of city streets. Wonka et al. (2003) and Müller et al. (2006) adopted the shape grammar concept (Stiny and Gips, 1972) for efficiently synthesizing highly detailed 3D façade models of complex buildings. A shape in the reconstructed model can have arbitrary geometry, generated by assembling simple shapes (e.g., cuboids, cylinders) using Boolean operations (e.g., union, intersection). For example, the Petronas Towers can be synthesized by a combination of cubes and cylinders (Müller et al., 2006). The combination must be governed by a set of grammar rules and a production procedure, manually predefined by an expert. Shape grammar-based approaches have also been applied to reconstruct models of existing environments. Several researchers successfully proposed procedural reconstruction of building façades based on the integration of a shape grammar with a data-driven process from imagery and/or lidar data (Dick et al., 2004; Ripperda and Brenner, 2009; Becker, 2009). Additionally, several researchers further investigated solutions for automated learning of the grammar rules and their order of application from a large dataset of real building models (Dang et al., 2015; Dehbi et al., 2016). These works can potentially reduce human intervention in the definition of the grammars and consequently facilitate the modelling of buildings with various architectural structures and designs.
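The L-system mechanism referenced above is parallel string rewriting: at each derivation step, every symbol that has a production rule is replaced simultaneously. A minimal sketch, using Lindenmayer's classic algae grammar as the example:

```python
def lsystem(axiom, rules, iterations):
    """Apply the rewriting rules to every symbol in parallel,
    `iterations` times; symbols without a rule are copied unchanged."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s
```

In graphical applications the derived string is then interpreted geometrically (e.g., by turtle graphics), which is how L-system derivations become branching street or façade structures.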
Despite the successful application of procedure-based approaches in modelling building façades and outdoor environments, the grammars cannot be directly applied to indoor environments, due to the inherent differences between them.

In indoor modelling, there are only a few procedure-based approaches for the generation of 3D indoor models. Gröger and Plümer (2010) were the first to apply a split grammar to synthesize 3D models of Manhattan-world buildings. The reconstruction procedure starts with the bounding box of the environment, followed by a splitting process to subdivide the space into individual sub-spaces and their bounding surfaces. The containment relations between the wall surfaces and the doors and windows are also established. This method can generate parametric models containing volumetric navigable spaces and wall surfaces; however, it requires predefining the set of grammar rules for modelling each individual indoor environment. Becker et al. (2015) combined an L-grammar and a split grammar for modelling rooms and hallways of a building interior from a point cloud and its Level-of-Detail 3 (LOD3) model. Ikehata et al. (2015) reconstructed Manhattan-world indoor environments as structured models based on a structure grammar. The structured model consists of navigable spaces, building elements, and their topological relations. While navigable spaces are modelled as volumetric elements bounded by building surfaces, the building elements are reconstructed in a surface-based representation. Khoshelham and Díaz-Vilariño (2014) proposed a shape grammar to reconstruct interior spaces of Manhattan-world buildings from point clouds. The grammar contains three simple rules for the generation of 3D navigable spaces (i.e., rooms, corridors), and the order of application of the rules must be predefined by an expert. The reconstruction of a 3D model of an indoor space from a point cloud by this procedure-based approach is illustrated in Figure 2.6. Generally, the procedure-based approaches have great potential for the reconstruction of topologically correct as-is 3D indoor models, which have a hierarchical structure and are therefore compatible with geometric data exchange standards (i.e., IFC, IndoorGML). However, these recent approaches still require a predefined order for the application of grammar rules, which restricts them to simple designs only, such as L-shaped (Becker et al., 2015) and Manhattan-world buildings.

Figure 2.6 A procedure-based approach for indoor modelling: (a) an input point cloud; (b) cuboid placement in the point cloud; (c) volumetric indoor spaces generated by iteratively connecting and merging cuboids; (d) the final surface-based model (Khoshelham and Díaz-Vilariño, 2014).

In summary, most existing methods, including data-driven and procedure-based approaches, are limited by the Manhattan-world assumption. Methods for indoor modelling mainly focus on the reconstruction of building elements (e.g., walls, ceilings, floors) and lack the ability to generate navigable spaces and the topological relations between them, which are also principal components of an indoor model. In comparison with data-driven approaches, procedure-based approaches are less sensitive to erroneous and incomplete data. Their advantage lies in the translation of knowledge and principles of architectural design into the form of a grammar, which ensures the topological correctness of the reconstructed elements and the plausibility of the whole model. However, they require a set of grammar rules and their application order, which currently are manually defined in indoor modelling.

2.3 Quality evaluation of 3D indoor models

In the literature, quality evaluation of indoor models usually relies on visual inspection (Xiao and Furukawa, 2014; Becker et al., 2015). This approach is subjective, time-consuming, and requires expert knowledge. Other methods using quantitative measures focus mainly on one quality aspect, e.g., accuracy (Valero et al., 2012; Xiong et al., 2013; Hong et al., 2015; Mura et al., 2016; Macher et al., 2017), while ignoring other important aspects such as completeness and correctness of the reconstructed elements. For instance, Thomson and Boehm (2015) measure the geometric deviation between two models based on the distances between centroids of wall surfaces, absolute differences of wall areas, and their angular deviations. This measurement is more suitable for evaluating the quality of rectangular structures, which is not always the case with building elements (e.g., beams, curved walls). Additionally, the proposed quality metrics cannot faithfully evaluate the quality of elements when their surfaces are deformed. On the other hand, Oesau et al. (2014) and Díaz-Vilariño et al. (2015) measure the quality of the reconstructed models based on the deviation between the models and the input data. These comparisons do not reveal the quality of the models, as different approaches use various data sources and the quality of the data also varies. In other words, these evaluations are not based on common and comprehensive quality criteria (Previtali et al., 2014; Díaz-Vilariño et al., 2015; Thomson and Boehm, 2015). As a result, these evaluation methods provide a biased view of the quality of the reconstructed models, and their results cannot be compared.

In comparison with quality evaluation of indoor models, more comprehensive evaluation methods have been developed for building exterior models and building façades (Rutzinger et al., 2009; Rottensteiner et al., 2014; Awrangjeb and Fraser, 2014). However, the evaluation criteria for an exterior model cannot be applied to an indoor model, due to the inherent differences between these models. While building façades are typically represented by a small number of surfaces, indoor environments generally comprise a large variety of architectural elements, which may be represented in different ways (e.g., surface-based and volumetric representations) and may include occluded surfaces in a volumetric model. Therefore, further investigation of a comprehensive approach for quality evaluation of indoor models is essential.

2.4 Procedural modelling

Procedural modelling is the generation of geometry using rewriting systems. While conventional modelling often focuses on the final shape of an object, procedural modelling focuses on the arrangement and combination of simple primitives in order to generate complex entities. The technique produces complex geometry by recursively replacing simple primitives through the iterative application of syntactic rules. This allows the definition of parametric 3D models and therefore potentially provides an effective solution for generating a large number of 3D objects and structures with a wide range of variations. To date, procedural modelling has been widely applied in various application domains, such as urban planning (Esri 2019b; Kimberly et al., 2009), city modelling (Biljecki et al., 2015), and entertainment (e.g., video games, movies) (SpeedTree 2019; Smelik et al., 2014). This section briefly reviews the background and historical development of procedural modelling, starting from the concept of a formal grammar, designed to describe the mathematical structure of natural language, to the shape grammar used in modelling geometry and architecture, which are the main techniques used in this research.

2.4.1 Formal grammar

Formal grammar was first formalised and developed by Chomsky (Chomsky, 1952), originally to describe a mathematical model of natural language. The grammar is considered a language generator, applied to produce grammatical sentences based on a set of rewriting rules encoding the theory of the language. Later, the concept of formal grammar drove the investigation of programming languages and stimulated the development of natural language processing, which enables us to exploit the advantages of computers and computing algorithms in interpreting human language.

A formal grammar G is a four-tuple G = {V_N, V_T, P, S} (Fu, 1981), in which:

▪ V_N and V_T are the non-terminal and terminal symbols respectively, which constitute the vocabulary V of G. The non-terminal and terminal symbols are the intermediate and final vocabularies of the grammar respectively. There is no overlap between these two sets: V_N ∩ V_T = ∅.

▪ P is a finite set of production rules of the form α → β, which convert a string α into a string β on the condition that the string α contains at least one non-terminal symbol.

▪ S is the starting symbol, which is the starting configuration of the language.

The language L(G) generated by the grammar G is defined as:

L(G) = {x | x ∈ V_T* such that S ⇒* x}

where V_T* denotes the set of all finite-length strings of terminal symbols V_T of the grammar, and S ⇒* x denotes a sequence of conversions from the starting symbol S to the string x. Similarly to the ambiguity in natural language, where one sentence can be interpreted with different meanings, in a formal grammar a string x ∈ L(G) can have more than one derivation. Additionally, not all possible derivations are meaningful. Therefore, in order to avoid ambiguity and generate meaningful languages, restrictions are applied to the production rules to guide the derivation of a language. Chomsky divided formal grammars into four classes according to the restrictions on the grammar rules. This classification is known as the Chomsky hierarchy (Fu, 1981).

▪ Type 0 (unrestricted) grammar: There is no restriction on the grammar rules. The rules are of the form α → β, transforming an arbitrary non-empty string into another. Consequently, a generated sentence can be highly ambiguous, and its meaning undecidable.

▪ Type 1 (context-sensitive) grammar: The grammar rules are restricted to the form γ1 A γ2 → γ1 β γ2, indicating that the non-terminal symbol A is replaced by the string β in the context γ1, γ2, where A ∈ V_N and γ1, γ2, β ∈ V*, with β non-empty, so that the left-hand side is never longer than the right-hand side (|A| ≤ |β|).


▪ Type 2 (context-free) grammar: The productions of the grammar are of the form A → β, which replace a non-terminal symbol A ∈ V_N by a non-empty string β ∈ V*. The grammar is not general enough to describe natural languages; however, it can be used to define programming languages, in which the productions are often independent of the context in which a non-terminal symbol A appears.

▪ Type 3 (regular) grammar: The productions of the grammar are restricted to the form A → aB or A → b, where A and B are non-terminal symbols (A, B ∈ V_N) and a, b ∈ V_T. Similarly to the type 2 grammar, the type 3 grammar is too restricted to define a natural language, yet it is useful in analysing the structure of a computer program and in input matching.
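The derivation of strings from a grammar can be sketched in a few lines of code. The toy grammar below is a hypothetical context-free (type 2) example, not one from the literature reviewed here; each rule rewrites a single non-terminal into a string over V = V_N ∪ V_T, and derivation stops when only terminal symbols remain.

```python
import random

# Toy type-2 grammar: RULES (the set P) maps each non-terminal to its
# alternative right-hand sides; V_N is the set of keys, V_T the rest.
RULES = {
    "S": [["N", "V"]],             # sentence -> noun verb
    "N": [["robot"], ["door"]],    # noun alternatives
    "V": [["opens"], ["closes"]],  # verb alternatives
}
NON_TERMINALS = set(RULES)

def derive(symbols, rng):
    """Repeatedly rewrite the leftmost non-terminal until only terminal
    symbols remain, yielding one string x in L(G)."""
    while any(s in NON_TERMINALS for s in symbols):
        i = next(k for k, s in enumerate(symbols) if s in NON_TERMINALS)
        replacement = rng.choice(RULES[symbols[i]])
        symbols = symbols[:i] + replacement + symbols[i + 1:]
    return " ".join(symbols)

sentence = derive(["S"], random.Random(0))
print(sentence)  # one grammatical sentence of L(G)
```

Restricting every left-hand side to a single non-terminal is precisely what makes this grammar type 2; a type 1 grammar would additionally carry the context γ1, γ2 through each rewrite.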

2.4.2 L-systems

Lindenmayer systems, or L-systems, were introduced by the biologist Aristid Lindenmayer (Lindenmayer, 1968), who adopted the rewriting concept of a formal grammar to describe plant development syntactically. L-systems are considered the earliest effort to use the concept of a formal grammar in the procedural modelling of geometry. The systems have become a powerful tool for plant modelling, which focuses not only on geometric modelling, but also maintains the coherence and correctness of plant topology, such as the neighbourhood relations between cells and plant modules. The main feature that distinguishes the Chomsky formal grammars from L-systems is the production procedure. While sequential productions are applied in the former, the latter exploits parallel rewriting, which is suitable for simulating the growth of plants and organic architectures. The formal definition of an L-system is similar to that of a Chomsky grammar. It is defined as an ordered triplet G = {V, P, S}, in which V is the set of vocabularies of the system. There is no distinction between non-terminal and terminal symbols in this grammar. The application of L-systems in geometric modelling has been developed for many decades. The first attempts were proposed by Frijters and Lindenmayer (1974) to establish the topology of plant branches. Later, Prusinkiewicz (1986) successfully applied an L-system based on a LOGO-style turtle interpretation to model the geometry of fractals and plants. The idea behind the modelling of these geometries lies in the definition of turtle geometry, in which a turtle is defined by a triplet (x, y, α), where (x, y) are the Cartesian coordinates of the current location of the turtle, and the angle α is the turtle's heading orientation. There are four possible operations: F for moving forward a step and drawing a line, f for moving forward without drawing, + for rotating left, and – for rotating right. The operations are associated with two parameters: the step size (the length of a moving step) for the operations F and f, and the angle increment (the rotation angle) for the operations + and –. Figure 2.7 shows the geometry of the quadratic Koch island generated from the following L-system at derivation steps s = 0 to 3, with the angle increment being 90 degrees (Prusinkiewicz and Lindenmayer 1991):

▪ Starting symbol ω: F − F − F − F

▪ Grammar rule p: F → F − F + F + FF − F − F + F

Figure 2.7 Generating a quadratic Koch island at derivation steps s = 0 to 3, shown in panels (a)–(d) (Prusinkiewicz and Lindenmayer 1991).
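The parallel rewriting and turtle interpretation described above can be sketched as follows. The axiom, rule, and 90-degree angle increment are those of the quadratic Koch island; the step size and function names are illustrative.

```python
import math

AXIOM = "F-F-F-F"               # starting symbol omega: a square
RULE = {"F": "F-F+F+FF-F-F+F"}  # production p of the Koch island
ANGLE = 90.0                    # angle increment in degrees

def derive(axiom, rules, steps):
    """Parallel rewriting: every symbol is replaced simultaneously in
    each derivation step, unlike sequential Chomsky productions."""
    s = axiom
    for _ in range(steps):
        s = "".join(rules.get(c, c) for c in s)
    return s

def turtle_path(commands, step=1.0, angle=ANGLE):
    """LOGO-style turtle interpretation with state (x, y, alpha)."""
    x, y, alpha = 0.0, 0.0, 0.0
    points = [(x, y)]
    for c in commands:
        if c == "F":            # move forward and draw a line
            x += step * math.cos(math.radians(alpha))
            y += step * math.sin(math.radians(alpha))
            points.append((x, y))
        elif c == "+":          # rotate left
            alpha += angle
        elif c == "-":          # rotate right
            alpha -= angle
    return points

# The s = 1 island: a closed polyline starting and ending at the origin.
path = turtle_path(derive(AXIOM, RULE, 1))
print(len(path))  # 33 vertices (each of the 4 F's rewrites to 8 F's)
```

Because the curve is closed, the last vertex of the path coincides with the first at every derivation step.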

In practice, the application of L-systems as a versatile tool is not limited to modelling plants and biological processes. Within urban planning, L-systems were successfully applied to modelling streets (Parish and Müller, 2001). More recently, in indoor modelling, there have been a few attempts to generate 3D indoor models using an L-system-based interpretation (Becker et al., 2015). However, these approaches are usually restricted to certain assumptions about the buildings' architecture (i.e., Manhattan or L-shaped buildings).


2.4.3 Shape grammar

The shape grammar was introduced by Stiny (1980) to model architecture syntactically. The essential difference between a Chomsky formal grammar and a shape grammar lies in the vocabularies constituting a language. The formal grammar is based on the combination of symbols, whereas the shape grammar models geometry by applying grammar rules to a set of shapes, where a shape is defined as a limited configuration of straight lines and points in a Cartesian coordinate system (Stiny, 1980). A shape associated with a set of labels is called a labelled shape. Similarly to a formal grammar, a shape grammar consists of four components G = {V, L, P, S}:

▪ V is the finite set of shapes, including non-terminal and terminal shapes.

▪ L is the set of labels associated with the shapes.

▪ P is a finite set of production rules of the form α → β, which convert a labelled shape α into another labelled shape β.

▪ S is the starting shape.

The production of a new shape from an existing one is based on a set of operations on shapes, which facilitate the generation of the various shapes of the grammar. The operations include the Boolean operations (i.e., shape union, shape intersection, and shape difference) and shape transformations (i.e., translation, rotation, reflection, scaling, or finite compositions thereof). Figure 2.8 illustrates the generation of the interior layout of the Villa Malcontenta using the Palladian shape grammar (Stiny and Mitchell, 1978), including a starting shape, i.e., a grid of rectangular spaces, and a Boolean operation, i.e., shape union (Khoshelham and Díaz-Vilariño, 2014). Similarly to a formal grammar, controlling the production is necessary in order to ensure the generation of meaningful architecture and to avoid ambiguity during the modelling process. Therefore, a production procedure which guides the application of the rules is needed for the interpretation of shape grammars. This procedure is usually defined manually.
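The Boolean operations underlying such a derivation can be illustrated on a discretised grid. The sketch below represents each rectangular space as a set of grid cells, an illustrative simplification rather than the labelled-shape formalism itself; the shape operations then reduce to Python set operations.

```python
# Toy Palladian-style grid: each rectangular space is a set of (i, j)
# grid cells, so Boolean shape operations become set operations.
def rect(i0, j0, i1, j1):
    """Cells of an axis-aligned rectangular space on the grid."""
    return {(i, j) for i in range(i0, i1) for j in range(j0, j1)}

room_a = rect(0, 0, 2, 3)        # a 2 x 3 space
room_b = rect(1, 0, 4, 1)        # an overlapping 3 x 1 space

union = room_a | room_b          # shape union (a merged, larger space)
intersection = room_a & room_b   # shape intersection
difference = room_a - room_b     # shape difference

print(len(union), len(intersection), len(difference))  # 8 1 5
```

Merging grid rectangles by shape union is exactly how the Palladian grammar forms larger spaces of different shapes from the starting grid.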


Figure 2.8 Application of the Palladian shape grammar for generation of the layout of the Villa Malcontenta: (a) a starting shape – a grid of rectangular spaces; (b-c) intermediate layouts generated by application of the grammar; (d) the final layout of the Villa Malcontenta.

The shape grammar has been applied successfully in urban modelling (Mitchell, 1990; Parish and Müller, 2001; Marvie et al., 2005). However, the application of shape grammar in indoor modelling has not yet been fully explored.

2.5 Summary

This chapter has discussed relevant aspects of indoor modelling, encompassing data acquisition techniques, state-of-the-art automated reconstruction approaches, quality evaluation of indoor models, and the background of procedural modelling, which is the main technique used in this study. Lidar scanning is the preferred technique for indoor modelling, as it can produce highly accurate point clouds. Data-driven approaches offer flexibility in modelling building interiors with different architectures, as the processes are based on data capturing the as-is condition of environments. However, these approaches depend heavily on the quality of the input. Procedure-based approaches, such as shape grammars, have been successfully applied for procedural modelling of architecture and building façades. These approaches translate knowledge of architectural design into modelling algorithms. Nevertheless, this knowledge is encoded in a set of grammar rules and a production procedure, which are often defined to be applicable only to a specific architectural structure. Therefore, procedural reconstruction of indoor models based on the integration of a procedure-based approach (e.g., a shape grammar) and a data-driven process can take advantage of both knowledge of architectural principles and features extracted from the input data (e.g., a point cloud). This approach can potentially improve the robustness of modelling to incomplete and inaccurate data, as well as enhance flexibility in generating various indoor architectural structures (e.g., Manhattan-world and non-Manhattan-world designs). In the following chapters, approaches for procedural reconstruction of indoor models are introduced. These approaches aim at the generation of 3D semantic-rich models with high completeness, correctness, and accuracy. Comprehensive methods for quantitative evaluation as well as change detection of 3D indoor models are also developed.


Chapter 3 A shape grammar approach for modelling Manhattan world environments

Shape grammar is a powerful tool for describing architectural designs because of its ability to produce a complex model from simple primitive elements and a set of grammar rules (Stiny 2008). The technique integrates knowledge of indoor architecture and principal designs into a modelling algorithm. Within the field of urban modelling, grammar-based approaches have been applied successfully to modelling building façades. However, façade grammars cannot be applied to describe indoor environments, because of the inherent differences between the two environments. Building façades are typically represented by a small number of surfaces, and the data capturing these structures often contain a tolerable level of occlusion and incompleteness. Meanwhile, an indoor environment generally comprises a large variety of architectural elements, which may be represented in different ways, and a high level of clutter (i.e., furniture, moving objects), which can negatively influence the quality of the captured data.

In practice, most indoor environments have a Palladian design (Palladio 1997), characterized by a regular arrangement of rectangular spaces in many different creative configurations. Such a regular configuration of building elements is sometimes referred to as a Manhattan world (Coughlan and Yuille 1999). In the field of planning and design, Stiny and Mitchell (1978) developed a shape grammar to describe Palladian architectural designs. The Palladian grammar generates an indoor design by partitioning the indoor environment into a grid of rectangular spaces and then merging the rectangular spaces to form larger spaces of different shapes.

In this chapter, a new shape grammar approach based on a context-sensitive grammar (Fu, 1981) is proposed for the reconstruction of Manhattan-world indoor environments from point clouds. The proposed approach automatically generates not only building elements and navigable spaces, but also the topological relations between them. The indoor models are reconstructed in a hierarchical structure compatible with geometric data exchange standards (e.g., the IFC standard).

3.1 Overview of the approach

Figure 3.1 shows the overview of the proposed approach. The shape grammar generates a 3D parametric model by repeatedly placing cuboids into spaces enclosed by points, classifying them into building elements and spaces, and merging the spaces sequentially to form the final navigable and non-navigable spaces. Ultimately, it allows the reconstruction of 3D models of complex environments from simple cuboid shapes. Topological relations between cuboid shapes, including adjacency, connectivity, and containment, are reconstructed in the final model. The placement, classification, and merging processes as well as topology reconstruction are governed by the grammar rules. Intuitively, a grammar rule transforms antecedent shape(s) into subsequent one(s). The form of the grammar rule will be detailed in the next section. The parameters and conditions for invoking the grammar rules are learned from the input point cloud and the current state of the model. This strategy facilitates not only automated application of grammar rules, but also derivation of cuboid shapes from knowledge of existing elements in the current models, which can compensate for missing data points.


Figure 3.1 Overview of the method: (a) a point cloud of the indoor environment as input; (b) cuboid shapes placed into spaces enclosed by points; (c) the final model with classified and merged cuboids – the brown, blue and grey cuboids represent walls, interior spaces and exterior spaces respectively; (d) topological relations between navigable spaces extracted from the cuboids.

In order to facilitate the application of grammar rules, the point cloud is first rotated such that the main walls are vertical and parallel to the x and y axes of the point cloud. This is done by computing the local normal vectors, and finding clusters in the 3D space of the normals, which show the dominant directions of the surfaces in the point cloud. A k-means clustering method is applied to find the three cluster centres, which are then used to estimate the rotation parameters that align the point cloud (Khoshelham and Díaz-Vilariño 2014). The peaks of the histogram of the z coordinates of the points indicate the potential location of the floors and ceilings, which is used to partition the point cloud into sub-clouds corresponding to building storeys (Khoshelham and Díaz-Vilariño 2014). The shape grammar is then applied to each sub-cloud separately to reconstruct a complete indoor model.
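The normal-clustering step can be sketched as follows, assuming per-point normals have already been estimated from local neighbourhoods. The plain k-means with farthest-point initialisation below is an illustrative stand-in for whichever clustering variant the cited implementation uses; the cluster centres approximate the dominant surface directions used to align the cloud with the axes.

```python
import numpy as np

def dominant_directions(normals, k=3, iters=20):
    """Plain k-means on unit normal vectors. Farthest-point
    initialisation keeps the initial centres well spread."""
    centres = normals[:1].copy()
    for _ in range(k - 1):
        d = np.linalg.norm(
            normals[:, None, :] - centres[None, :, :], axis=2).min(axis=1)
        centres = np.vstack([centres, normals[d.argmax()]])
    for _ in range(iters):
        labels = np.linalg.norm(
            normals[:, None, :] - centres[None, :, :], axis=2).argmin(axis=1)
        for j in range(k):
            members = normals[labels == j]
            if len(members):
                c = members.mean(axis=0)
                centres[j] = c / np.linalg.norm(c)  # keep centres on the sphere
    return centres

# Synthetic demo: noisy normals scattered around the x, y and z axes,
# mimicking the three dominant surface directions of a Manhattan-world room.
rng = np.random.default_rng(1)
axes = np.eye(3)
normals = np.repeat(axes, 100, axis=0) + 0.05 * rng.standard_normal((300, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

centres = dominant_directions(normals)
print(np.round(np.abs(centres @ axes.T).max(axis=0), 2))  # each close to 1.0
```

The three recovered centres can then be orthonormalised into the rotation that makes the main walls parallel to the x and y axes; the storey partitioning by z-histogram peaks follows the same histogram analysis as the placement rule in Section 3.2.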

3.2 The shape grammar

The proposed shape grammar is based on a context-sensitive grammar (Fu 1981) defined by a 4-tuple, G = {I, N, T, R}, where:

▪ I is the starting shape;

▪ N denotes non-terminal shapes, which represent intermediate spaces and building elements;

▪ T denotes terminal shapes, which represent final spaces (rooms, corridors, external spaces) and building elements (walls, slabs);


▪ R is the finite set of production rules. The rules have the general form A → B : cond, where A ∈ N, B ∈ V, and V is the vocabulary of the grammar defined as N ∪ T. The condition cond is a logical expression, which has to be true for invoking the rule. The rule can be read as: a shape A is replaced by a shape B on the condition cond.

To facilitate compatibility with standards for geometric data exchange, each shape generated by the grammar rules is stored as an object with a unique shape identifier id and a set of attributes (Müller et al. 2006). The attributes are the geometric information, semantic information and topological relations. The geometric information includes the parametric representation as well as boundary representation of the shapes. The semantic information for both terminal and non-terminal shapes includes the type of the spaces and building elements, which can take one of the three labels: internal space, external space, and wall. The topological relations for non-terminal shapes and terminal shapes are of three types: containment, adjacency, and connectivity (Tran et al. 2017). The containment relation indicates whether a shape is contained in another. Each shape has containment relation with itself. The adjacency and connectivity relations are established at two stages in the modelling process to allow path planning in different levels of granularity. Before the initial non-terminal shapes are merged to form the terminal ones, two shapes are ruled adjacent, if they share a common face, and connected, if there is no physical surface between them. This allows path planning between subspaces of the terminal navigable spaces; e.g., to go from one end of a corridor to another. After the initial non-terminal shapes are merged, the adjacency and connectivity relations are defined only for shapes of the type internal space (i.e., rooms and corridors) when they share a common wall and the wall contains a door. These latter relations allow path planning between the terminal navigable spaces; e.g., to go from one room to another.
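A shape object of this kind can be sketched as a small data class. The field names below are illustrative rather than taken from an actual implementation, and the boundary representation is omitted for brevity; the post-init step encodes the convention that each shape has a containment relation with itself.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

@dataclass
class Shape:
    """Illustrative shape object: unique id plus geometric, semantic
    and topological attributes (field names are hypothetical)."""
    id: int
    origin: Tuple[float, float, float]        # O: centre of local frame
    scale: Tuple[float, float, float]         # S = (sx, sy, sz)
    type: Optional[str] = None                # internal/external space, wall
    containment: Set[int] = field(default_factory=set)
    adjacency: Set[int] = field(default_factory=set)
    connectivity: Set[int] = field(default_factory=set)

    def __post_init__(self):
        self.containment.add(self.id)         # each shape contains itself

wall = Shape(3, (2.0, 0.0, 1.3), (0.2, 4.0, 2.6), type="wall")
room = Shape(7, (0.0, 0.0, 1.3), (4.0, 4.0, 2.6), type="internal space")
wall.adjacency.add(room.id)
room.adjacency.add(wall.id)
print(room.containment, room.adjacency)
```

Storing relations as sets of shape ids keeps the object close to the id-referenced attribute scheme of Müller et al. (2006) that the text describes.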


3.2.1 Starting shape

A unit cube is selected as the starting shape in the shape grammar. Each non-terminal shape is initially generated by applying a transformation (scaling and translation) to the unit cube. Two representations are used for the unit cube. The parametric representation {O, S}, where O is the origin of the local coordinate system located at the centre of the unit cube, and the scale S = {s_x, s_y, s_z} represents the dimensions along the x, y, z axes, which are all equal to 1. The boundary representation {V, F} is defined by a set of 8 vertices V and the 6 bounding unit square faces F, as shown in Figure 3.2. The starting shape has no semantic information and no topological relations.


Figure 3.2 Representation of a unit cube: (a) Parametric representation, (b) Boundary representation.
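The two unit-cube representations can be generated directly; the vertex ordering and face indexing below are illustrative choices.

```python
import itertools

# Parametric representation {O, S}: origin at the centre, unit scale.
O = (0.0, 0.0, 0.0)
S = (1.0, 1.0, 1.0)

# Boundary representation {V, F}.
# V: the 8 vertices, each coordinate at +/- half the corresponding scale.
V = [tuple(o + s * d / 2 for o, s, d in zip(O, S, signs))
     for signs in itertools.product((-1, 1), repeat=3)]

# F: the 6 faces, each listed as the indices of its 4 coplanar vertices.
F = [[i for i, v in enumerate(V) if v[axis] == side * S[axis] / 2]
     for axis in range(3) for side in (-1, 1)]

print(len(V), len(F))  # 8 6
```

Applying the placement rule's transformation H then amounts to rescaling S and shifting O, with the vertex coordinates in V derived from the same parameters.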

3.2.2 Grammar rules

Six grammar rules are defined to produce an indoor model from the starting shape.

Placement rule

R1_place: I → N : cond(H ≠ 0)

The placement rule transforms the starting shape I into a non-terminal shape N by a transformation H, and places it into the model. The transformation H is a compact parametric representation of shape N, and consists of three scale parameters S = {s_x, s_y, s_z}, corresponding to the size of N, and a translation vector T = {t_x, t_y, t_z}, corresponding to the location of N. The transformation H and its parameters are obtained from pairs of adjacent peaks in the histograms of point coordinates. More specifically, any pair of adjacent peaks with a distance larger than a certain threshold is used to evaluate H. If no such peaks can be found, H is set to a zero matrix and the rule cannot be applied. The height (s_z) and the vertical location (t_z) of shape N are derived from the locations of two consecutive peaks on the histogram of z coordinates. The horizontal location (t_x, t_y) and dimensions (s_x, s_y) are determined from the locations of consecutive peaks on the histograms of x and y coordinates. Figure 3.3 shows how the cuboids are placed according to the peaks of the histograms of point coordinates. The boundary representation of N is partly inherited from I (the bounding faces) and partly derived from the parameters in H (the coordinates of the vertices). This boundary representation {V, F} of N facilitates conversion to the Wavefront OBJ format (Murray and VanRyper 1996), which enables further incorporation in geometric data exchange standards (e.g., IFC) (Macher et al. 2017). Each placed shape is assigned a unique shape identifier id and a triple of grid coordinates (i, j, k) according to the location of the corresponding peaks in the histograms of the x, y, z point coordinates.

Figure 3.3 Location and size of shapes derived from the histograms of point coordinates: (a) histogram of z coordinates; (b) histograms of x and y coordinates; (c) top view of the cuboid shapes placed iteratively by applying the placement rule.
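The derivation of cuboid extents from histogram peaks can be sketched as below, under simplifying assumptions: peaks are taken as local maxima above a count threshold, and each pair of consecutive peaks further apart than a minimum gap yields one translation/scale pair along that axis. The bin size and thresholds are illustrative, not the thesis's values.

```python
import numpy as np

def histogram_peaks(coords, bin_size=0.1, min_count=50):
    """Bin one coordinate axis and return the centres of the bins that
    are local maxima with at least min_count points."""
    counts, edges = np.histogram(
        coords, bins=np.arange(coords.min(), coords.max() + bin_size, bin_size))
    centres = (edges[:-1] + edges[1:]) / 2
    is_peak = (counts >= min_count) \
        & (counts >= np.roll(counts, 1)) & (counts >= np.roll(counts, -1))
    return centres[is_peak]

def extents_from_peaks(peaks, min_gap=0.5):
    """Each pair of consecutive peaks further apart than min_gap gives
    a (translation t, scale s) pair for one cuboid along this axis."""
    return [((a + b) / 2, b - a)
            for a, b in zip(peaks[:-1], peaks[1:]) if b - a > min_gap]

# Synthetic demo: dense slabs of points at z = 0 and z = 3 (floor and
# ceiling) with sparse clutter in between.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0.0, 0.02, 2000),
                    rng.normal(3.0, 0.02, 2000),
                    rng.uniform(0.0, 3.0, 200)])
extents = extents_from_peaks(histogram_peaks(z))
print(extents)  # roughly [(1.5, 3.0)]: one storey, t_z = 1.5, s_z = 3.0
```

Running the same analysis on the x and y coordinates yields the horizontal locations and dimensions, completing the parameters of the transformation H.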

Classification rule

R2_class: A → A[type] : cond

The classification rule assigns a non-terminal shape A, whose type attribute is empty (∅), a certain type on the condition cond. The point-on-face index is defined to determine the type of a shape A:


I_PoF = n σ² / A_face    (3.1)

where n is the number of points that fall within a small buffer around the selected faces of the shape, A_face is the total area of the faces, and σ is the average point spacing within the point cloud. The point-on-face index indicates whether there are points on one or more faces of A. Using the point-on-face index together with adjacency relations and prior knowledge about the maximum thickness of the walls (max wall thickness), the type attribute (internal space, external space, or wall) of shape A is set as follows. If A has points on its top face (I_PoF.Top ≥ τ), or if the top face is adjacent to a shape of type internal space, then A is classified as internal space. If A has points on its side faces (I_PoF.Sides ≥ τ) and its horizontal dimensions (s_x, s_y) are not larger than the maximum wall thickness, then A is classified as wall. The threshold τ is determined empirically. In practice, A is not required to be fully covered either on the ceiling (I_PoF.Top < 1) to be classified as an internal space, or on the sides (I_PoF.Sides < 1) to be a wall. The combination of the point-on-face index and the geometry of A and its neighbouring shapes enhances the robustness of the classification to missing input data. If A satisfies the conditions of neither internal space nor wall, it is classified as external space. Figure 3.4 illustrates the classification of the placed cuboids into internal space, external space and wall by repeatedly invoking the classification rule.

Figure 3.4 Classification rule: (a) top view of the cuboid shapes placed into the point cloud; (b) the cuboid shapes classified as internal space (blue), external space (grey) and wall (brown).
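The classification rule can be sketched as below. The threshold τ and maximum wall thickness are illustrative values, and the wall test is simplified to the thinner horizontal dimension of the cuboid (a wall cuboid placed between histogram peaks is thin along only one axis).

```python
TAU = 0.5                   # illustrative coverage threshold tau
MAX_WALL_THICKNESS = 0.4    # illustrative maximum wall thickness (m)

def point_on_face_index(n_points, face_area, sigma):
    """Eq. (3.1): I_PoF = n * sigma^2 / A_face, an approximate fraction
    of the face area covered by points of average spacing sigma."""
    return n_points * sigma**2 / face_area

def classify(i_top, i_sides, sx, sy):
    """Assign internal space / wall / external space from the indices."""
    if i_top >= TAU:
        return "internal space"
    if i_sides >= TAU and min(sx, sy) <= MAX_WALL_THICKNESS:
        return "wall"
    return "external space"

sigma = 0.05  # 5 cm average point spacing

# A room-sized cuboid whose 4 m x 3 m ceiling carries 3200 points ...
i_top = point_on_face_index(3200, 4.0 * 3.0, sigma)
print(classify(i_top, 0.0, 4.0, 3.0))    # internal space

# ... and a thin cuboid with densely covered side faces.
i_sides = point_on_face_index(9000, 2 * 3.0 * 2.6, sigma)
print(classify(0.0, i_sides, 0.2, 3.0))  # wall
```

An uncovered cuboid that matches neither test falls through to external space, mirroring the default case in the text.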

Additionally, the presence of clutter or furniture in the point cloud might result in additional cuboids as a consequence of additional histogram peaks. These cuboids might be misclassified due to unreasonable values of the point-on-face index. For example, a row of bookshelves placed next to a wall may result in additional cuboids between the shelves and the actual building surfaces due to an additional vertical histogram peak. These cuboids might be automatically classified as walls instead of internal spaces. To correct such classification errors, the classification rule can be invoked manually to reassign the type attribute of a shape A. This enables the user to correct possible misclassification of non-terminal shapes. The final classification result is carried into the next modelling steps.

Once all non-terminal shapes are classified, those with type attributes set as wall and external space are treated as terminal and are not processed further.

Adjacency rule

R_adj^3: {A_1, A_2} → {A_1[adj_1], A_2[adj_2]} : cond

The adjacency rule establishes the adjacency relation between two non-terminal shapes A_1 and A_2. The condition cond depends on whether the application of the adjacency rule is before or after the merge rule. When applied before the merge rule, two non-terminal shapes are ruled adjacent on the condition that they share a common face. If A_1 and A_2 are produced by an application of the merge rule, the adjacency relation between them is established on the condition that they are of the type internal space (rooms, corridors) and there exists a common wall between them. The adjacency attribute adj of each shape contains a list of the ids of all shapes adjacent to it.

Connectivity rule

R_conn^4: {A_1, A_2} → {A_1[conn_1], A_2[conn_2]} : cond

The connectivity rule establishes the connectivity relation between two non-terminal shapes A_1 and A_2. The condition cond depends on whether the connectivity rule is applied before or after the merge rule. Before the merge rule, two non-terminals are connected on the condition that they are adjacent spaces and there is no point on the common face between them (I_PoF < τ). After the merge rule, two non-terminals are connected on the condition that they are adjacent internal spaces, and that the common wall between them contains a door. Note that in the current grammar doors are not modelled automatically; they are inserted manually by updating the containment attribute of the corresponding walls using the containment rule. The connectivity attribute conn of each shape contains a list of the ids of all shapes connected to it. The subsequent shape of a connectivity rule can be a non-terminal or a terminal shape, depending on whether or not the connectivity attribute of the antecedent can be further updated.

Merge rule

R_merge^5: {A_1, A_2} → B : cond

The merge rule replaces two non-terminal shapes A_1 and A_2 with their union B. The condition for merging is that A_1 and A_2 are both of the type internal space and are connected. The non-terminal spaces A_1 and A_2 are not discarded, as they can be used for querying topological relations and path planning at a finer granularity, e.g., between sub-spaces of a room. Figure 3.5 illustrates the application of the merge rule.

Figure 3.5 Merging non-terminal connected spaces: (a) top view of the cuboid shapes classified as interior spaces, walls and exteriors; (b) merging of a pair of non-terminal connected spaces; (c) the model with the final merged cuboids.

Containment rule

R_cont^6: {A_1, A_2} → {A_1[cont], A_2[cont]} : cond

The containment rule establishes whether a non-terminal shape contains or is contained in another non-terminal. The condition for containment is that A_1 and A_2 are the antecedent and subsequent (or vice versa) of a previously applied merge rule. The containment rule can also be invoked manually, for example to specify that a wall contains a door.


3.3 Procedural production

The grammar rules are applied iteratively and in a specific order, as shown in Figure 3.6. The production procedure allows reconstruction of a 3D model of an indoor environment using knowledge not only from the input point cloud, but also from the current state of the model. As a consequence, the information from the current state of the model makes the production procedure robust against incompleteness of the input data. Each grammar rule is automatically applied once for each non-terminal cuboid shape. The rules can also be manually invoked in two cases: correction of classification errors using the classification rule, and insertion of doors into walls using the containment rule for establishing connectivity relations between interior spaces (i.e., rooms, corridors).

Figure 3.6 The production procedure: I → R_place^1 → R_adj^3 → R_class^2 → R_conn^4 → R_merge^5 → R_cont^6 → R_adj^3 → R_conn^4, producing terminal walls, terminal external spaces, and terminal internal spaces. The arrows on top of the boxes indicate iteration.

The production procedure starts by placing cuboid shapes into navigable and non-navigable spaces. This is performed by repeatedly applying the placement rule with parameters, consisting of scale S and translation T, derived from the peaks of the histograms of the x, y, and z coordinates of the points as described in the "Grammar Rules" section. After every application of the placement rule, the adjacency rule is invoked to establish the adjacency relation between the placed cuboid and the previously placed cuboids. The placement rule and the adjacency rule are invoked once for each cuboid, and the rules are iteratively applied until no more cuboids can be placed in the model (i.e., the condition of the placement rule can no longer be satisfied).
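The peak-based derivation of placement parameters can be sketched as follows. This is an illustrative sketch only: the bin size, the peak threshold, and the simple local-maximum test are assumptions, and in practice consecutive peak pairs along each axis bound the candidate cuboids.

```python
import numpy as np

def axis_peaks(coords, bin_size=0.1, min_frac=0.2):
    """Detect dominant planes along one axis as peaks of the coordinate
    histogram. A bin is a peak if it is a local maximum holding at least
    min_frac of the largest bin count (thresholds are illustrative)."""
    lo, hi = coords.min(), coords.max()
    hist, edges = np.histogram(coords, bins=np.arange(lo, hi + bin_size, bin_size))
    thresh = min_frac * hist.max()
    peaks = []
    for i in range(len(hist)):
        left = hist[i - 1] if i > 0 else -1
        right = hist[i + 1] if i < len(hist) - 1 else -1
        if hist[i] >= thresh and hist[i] > left and hist[i] >= right:
            peaks.append(0.5 * (edges[i] + edges[i + 1]))   # bin centre
    return peaks

def cuboids_from_peaks(xp, yp, zp):
    """Consecutive peak pairs along x, y and z give the translation T and
    scale S of each candidate cuboid (the place rule parameters)."""
    cuboids = []
    for x0, x1 in zip(xp, xp[1:]):
        for y0, y1 in zip(yp, yp[1:]):
            for z0, z1 in zip(zp, zp[1:]):
                cuboids.append(((x0, y0, z0), (x1 - x0, y1 - y0, z1 - z0)))
    return cuboids
```

Two dominant wall planes along an axis thus yield one cuboid slab between them; denser peak sets yield a grid of candidate cuboids for the placement rule to test.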

Once all the cuboids are placed and their adjacency relations are established, the classification rule is applied to assign a specific type (internal space, external space, or wall) to each cuboid. This application order ensures that the type attribute of each shape is established based on data features, geometric information and its neighbours. The classification rule is automatically applied once for each shape, and iteratively invoked until all non-terminal shapes have a type. Once the automated classification is done, the procedure allows user interaction to reassign the type of any misclassified shapes. The produced shapes of the types external space and wall are marked as terminal, and are not processed further.

The procedure continues with the application of the connectivity rule to establish the connectivity relation between the non-terminal navigable shapes. The connectivity relation is automatically established once between each pair. The connectivity rule is applied iteratively until all pairs of adjacent non-terminals are checked for connectivity. The connectivity relations and the type attribute of each shape guide the application of the merge rule. The merge rule is applied to all non-terminal shapes that are classified as internal space and are connected. A non-terminal shape appears as an antecedent in a merge rule only once. After every application of the merge rule, the containment rule is applied to establish the containment relation, followed by the adjacency rule to establish the adjacency relation between the produced shapes (via a common wall). The sequence of the merge rule, containment rule, and adjacency rule is applied iteratively until there are no more non-terminal internal spaces that can be merged. After the automatic application of the containment rule, the user can again invoke it manually to indicate if a wall contains one or more doors. Once all internal spaces are merged and their containment and adjacency relations are updated, the connectivity rule is applied iteratively until all pairs of adjacent internal spaces are checked for connectivity (via a door in their common wall). All shapes produced by this connectivity rule are terminal internal spaces.
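The control flow of this procedure, where each rule is exhausted before the next phase starts, can be captured by a small driver. This is a generic sketch of the iteration pattern, not the thesis code; the model representation and the toy `place` phase are purely illustrative.

```python
def run_production(model, phases):
    """Drive the production procedure: apply each phase's rule repeatedly
    until it can no longer fire, preserving the fixed ordering of phases."""
    for rule in phases:
        while rule(model):      # a rule returns True if it changed the model
            pass
    return model

# toy phase: "place" candidate cuboids until the placement condition fails
def place(model):
    if not model["candidates"]:
        return False
    model["shapes"].append(model["candidates"].pop())
    return True
```

In the actual procedure the phase list would be the ordered rules of Figure 3.6, each returning whether its guarding condition could still be satisfied.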

Figure 3.7 illustrates a simple example of the production procedure for the indoor environment shown in Figure 3.1.

Figure 3.7 An example of the production procedure, showing the top view of the intermediate results (top) corresponding to the application of grammar rules (bottom). R_i × k denotes k applications of grammar rule R_i.


3.4 Demonstration of model reconstruction

This section illustrates the application of the shape grammar for modelling a Manhattan world environment. Figure 3.8 shows the final model and the reconstructed graphs of topological relations of the building interior shown in Figure 3.1. A thorough evaluation of the proposed shape grammar on a variety of datasets will be presented in Chapter 6.

Figure 3.8 Modelling results of an indoor environment: (a) input point cloud; (b) the reconstructed model; (c, d) adjacency relations and connectivity relations at the granularity of non-terminal spaces; (e) connectivity relations between terminal spaces (i.e., rooms and corridors).


As shown in Figure 3.8b, the final model contains the navigable spaces (i.e., rooms, corridors) shown as blue shapes, the building elements (e.g., walls) shown as brown shapes, and the exterior spaces shown as grey shapes. The topological relations between indoor elements are reconstructed at different granularities: both non-terminal spaces (Figure 3.8c and d) and terminal spaces (i.e., rooms, corridors) (Figure 3.8e). Each shape is assigned a unique ID. The blue circles represent the containment relations between the final navigable spaces (i.e., the rooms and the corridor) and their subspaces (yellow circles). The connectivity relations between navigable spaces and the adjacency relations between the navigable spaces and their adjacent walls (brown circles) are represented as solid black lines and dashed brown lines respectively. The walls containing doors, which are modelled manually by interactively invoking the containment rule, are indicated as green circles.

3.5 Summary

In this chapter, a shape grammar approach for modelling indoor environments with a Manhattan-world structure has been presented. The shape grammar contains rules for the production of both building elements and navigable spaces from point cloud data. It also provides an effective method for the extraction of topological relations such as containment, adjacency and connectivity of the spaces, which are essential for path planning and navigation. However, the current grammar is limited to modelling indoor environments with Manhattan-world designs, and the production procedure specifying the order of application of the grammar rules is manually defined. In the next chapter, an extension of the shape grammar applicable to both Manhattan and non-Manhattan-world indoor environments is investigated, and an approach for automated application of the grammar rules is introduced.


Chapter 4 Extension of the shape grammar for modelling of generic indoor environments

The combination of a shape grammar and a data-driven process aims to improve flexibility in modelling different structures of indoor environments as well as to enhance the robustness of the reconstruction to incomplete and inaccurate input data. The shape grammar proposed in the previous chapter is limited to the most dominant indoor architecture, i.e., Manhattan-world designs. Meanwhile, non-Manhattan-world structures are also an inevitable part of indoor environments. Moreover, although the conditions of rule applications are automatically learnt during the modelling process, the order of the productions of the Manhattan grammar is manually predefined. The approach also makes use of only the local intrinsic features of the data (e.g., the point-on-face index) within a local neighbourhood (i.e., cells).

Within the field of urban reconstruction, stochastic processes using rjMCMC have been widely used to guide the application of grammar rules in the derivation of building façades (Dick et al., 2004; Ripperda and Brenner, 2009). The technique has also been applied to integrate local data features with global model constraints over the entire input data when reconstructing 3D objects and structures, for example in modelling rail tracks (Elberink et al., 2013; Elberink and Khoshelham, 2015), in river and network extraction (Schmidt et al., 2017), and in the generation of 2D floorplans (Merrell et al., 2010).

In this chapter, a procedural reconstruction of indoor models from point clouds using a stochastic algorithm, i.e., a variant of rjMCMC, is proposed. The approach is based on the combination of a shape grammar and a data-driven process, which can overcome the limitations of each individual strategy. The extended shape grammar aims to model both Manhattan and non-Manhattan-world buildings. The integration of the shape grammar with the data-driven process is realized through a stochastic process using a variant of rjMCMC. This solution not only facilitates automated application of the grammar rules, but also enables integration of local data properties with the global plausibility of the model with respect to the entire data. Consequently, the technique enhances the robustness of the reconstruction approach to the inherent noise and incompleteness of the data.

4.1 Overview of the approach

As shown in Figure 4.1, the proposed approach contains two main steps: space decomposition and procedural reconstruction. In the first step, the potential planar surfaces of building elements (e.g., wall surfaces) are extracted from clustered points of the point cloud, and the surfaces are then used to decompose the indoor environment into a set of 3D cells. Intuitively, building surfaces usually separate different interior elements (e.g., spaces and walls). Thus, different types of building elements (e.g., spaces, walls) are likely to be represented as separate cells. The point cloud and the 3D cells are then used in the reconstruction process.

Figure 4.1 Overview of the proposed approach

In the procedural reconstruction, a 3D layout of the building interior is derived based on a combination of a shape grammar, hereafter referred to as the indoor grammar, and a stochastic process, i.e., the rjMCMC-based algorithm. The indoor grammar is an extension of the Manhattan grammar proposed in Chapter 3, made applicable to both Manhattan and non-Manhattan designs. In addition to the iterative application of placement, classification, and merging processes (Tran et al., 2018), we further include a splitting process and a re-classification of non-terminal shapes in the derivation of the model. This facilitates reversible production, which provides the ability to correct modelling errors.

The modelling procedure starts with a reconstruction of navigable spaces (i.e., rooms, corridors). The shape grammar allows reconstruction of the model in a hierarchical structure using not only the geometric information, but also knowledge about the structural arrangement, semantic information, and topological relations, owing to the inherent capability of a shape grammar to maintain topological correctness between the elements. The rjMCMC-based algorithm guides the automated application of the grammar rules, proposing the next model among several matching possibilities from the currently existing elements of the model, hereafter referred to as the current state. The reconstructed model is further evolved by the production of structural elements (e.g., walls, ceilings, and floors), which are derived based on their geometry and their topological relations with the navigable spaces. Finally, the remaining cells within the scope (i.e., bounding box) of the environment, which are neither navigable spaces nor structural elements, are labelled as exterior spaces.

4.2 Space partitioning

The input point cloud is first clustered into three groups: horizontal points, vertical points, and others. A dual process, consisting of local-scale extraction and global refinement, is then applied for the extraction of surfaces of horizontal and vertical structures separately. The cell decomposition of an indoor environment is created as an arrangement of the extracted structures. The remainder of this section explains the three steps in detail.

Point clustering: The point cloud is clustered into three groups based on point normals: horizontal points, representing the horizontal structures (e.g., ceilings and floors); vertical points, representing the vertical structures (e.g., walls); and others, belonging to noise and clutter. The goal is to eliminate irrelevant points (e.g., noise, clutter) from both the extraction of building surfaces and the reconstruction of the final model.


Given that point clouds captured by various lidar sensors normally have an upright orientation, a point is classified as horizontal if its normal n_p is parallel to the vertical direction n_v, and as vertical if n_p lies in the horizontal plane (i.e., is perpendicular to n_v), up to a small angular deviation θ (e.g., θ ≤ 10 degrees). The points that belong to neither the horizontal nor the vertical group are classified as others, and these points are excluded from further processing. Figure 4.2 shows an example of point clustering, which demonstrates the exclusion of a considerable amount of clutter from the points representing the building structures (i.e., horizontal and vertical points) of a hexagonal building.

Figure 4.2 Point clustering: (a) an input point cloud of a hexagonal building; (b) vertical points; (c) horizontal points; (d) other points (e.g., clutter and noise).
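The clustering step can be sketched as a simple angular test on point normals. This is an illustrative sketch assuming an upright-oriented point cloud, as stated in the text; the function name and label strings are not from the thesis.

```python
import numpy as np

def cluster_by_normal(normals, theta_deg=10.0):
    """Cluster points by normal direction: a normal parallel to the vertical
    axis marks a horizontal structure (floor, ceiling); a normal lying in the
    horizontal plane marks a vertical structure (wall); the rest are 'other'
    (clutter, noise). theta_deg corresponds to the deviation theta in the text."""
    up = np.array([0.0, 0.0, 1.0])
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    ang = np.degrees(np.arccos(np.clip(np.abs(unit @ up), 0.0, 1.0)))
    labels = np.full(len(normals), "other", dtype=object)
    labels[ang <= theta_deg] = "horizontal"        # normal ~ vertical axis
    labels[ang >= 90.0 - theta_deg] = "vertical"   # normal ~ horizontal plane
    return labels
```

The absolute value of the dot product makes the test independent of normal orientation (inward vs. outward facing).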

Surface extraction: A RANSAC-based plane-fitting algorithm (Schnabel, 2007) is used for generating reasonable structure surfaces of the building interior. In general, the algorithm iteratively estimates the best-fitting planar surfaces from an input point cloud, based on the orthogonal distance d(p_i, π_j) from each data point p_i to its corresponding surface plane π_j. The point-to-plane distance is defined as follows:

d(p_i, π_j) = π_j^T p_i    (4.1)

where p_i = (x, y, z, 1)^T is the homogeneous representation of a 3D point and the surface plane π_j = (n^T, −ρ)^T is defined by a pair of parameters, consisting of a unit normal n = (n_x, n_y, n_z)^T and the orthogonal distance ρ from the origin (Khoshelham, 2015, 2016).
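Equation (4.1) reduces to a dot product in homogeneous coordinates, and the same distance drives the RANSAC inlier test. A minimal sketch (the inlier tolerance `tol` is an assumed value, not from the thesis):

```python
import numpy as np

def point_to_plane_distance(p, n, rho):
    """Eq. (4.1): d(p, pi) = pi^T p_h, with pi = (n^T, -rho)^T and the
    homogeneous point p_h = (x, y, z, 1)^T."""
    pi = np.append(np.asarray(n, float), -rho)
    p_h = np.append(np.asarray(p, float), 1.0)
    return float(pi @ p_h)

def count_inliers(points, n, rho, tol=0.02):
    """RANSAC scoring sketch: count points within tol of the candidate plane."""
    d = np.asarray(points) @ np.asarray(n) - rho
    return int(np.sum(np.abs(d) < tol))
```

For the plane z = 2 (n = (0, 0, 1), ρ = 2), the point (0, 0, 5) has signed distance 3.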

In this step, we extract horizontal structures and vertical structures from the horizontal and vertical clusters respectively, to lessen mutual influences between them. However, the process may still produce undesired surfaces, with arbitrary orientations and a small number of supporting points, when applied to large buildings and heterogeneous point clouds. To tackle this problem, we propose a multi-scale surface extraction, consisting of a local extraction step and a global refinement step.

Local extraction: The building surfaces are initially extracted at a local scale, i.e., within a voxelized structure of the horizontal and vertical data. The voxel size is empirically defined and can be set to the dimensions of an ordinary room (e.g., 3 m to 5 m). This reduces the involvement of irrelevant elements in the extraction of a building surface. A valid vertical or horizontal building surface is oriented vertically or horizontally respectively, and contains a certain number of supporting points. Figure 4.3 illustrates the local-scale surface extraction for the hexagonal building.

Figure 4.3 Local surface extraction: (a) voxelized structure of the input point cloud; (b) extracted vertical surfaces; (c) extracted horizontal surfaces.
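The voxel partition underlying the local extraction can be sketched as a grouping of point indices by voxel key; each group would then be handed to the plane-fitting step independently. A minimal sketch (function name and default voxel size are illustrative):

```python
import numpy as np
from collections import defaultdict

def voxelize(points, voxel_size=4.0):
    """Group point indices by integer voxel key so that each voxel (sized
    roughly like a room, e.g. 3-5 m, per the text) can be processed
    independently during local surface extraction."""
    groups = defaultdict(list)
    keys = np.floor(np.asarray(points) / voxel_size).astype(int)
    for i, key in enumerate(map(tuple, keys)):
        groups[key].append(i)
    return dict(groups)
```

Running RANSAC per voxel keeps a wall surface in one room from being fitted jointly with a parallel wall elsewhere in the building.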

Global refinement: The initial surface segments extracted locally are iteratively refined in order to generate an optimal set of surfaces for the building interior. Two local surfaces (s_1, s_2) that are parallel with each other up to a small threshold angle (angle(s_1, s_2) < θ) and lie within a threshold distance of each other (distance(s_1, s_2) < d) are merged and replaced by a new surface. In general, the thresholds do not have a significant impact on the extraction result; it is recommended that the angle θ be less than 10 degrees and that the threshold distance d be no larger than the nominal accuracy of the data points. The angle between two surfaces, angle(s_1, s_2), is computed using the dot product of their normals (n(s_1) ∙ n(s_2)), and their distance, distance(s_1, s_2), is defined as the minimum point-to-plane distance (see Eq. 4.1) from the supporting points of one surface to the other. The merged surface is fitted to the combined set of supporting points of its two component surfaces. The refinement is repeated until no more surfaces can be merged. Figure 4.4 shows the final extracted surfaces of the hexagonal building after the refinement.

Figure 4.4 Globally refined surfaces of the hexagonal building: (a) vertical surfaces; (b) horizontal surfaces.

Cell decomposition: An arrangement of the extracted horizontal and vertical surfaces decomposes the building interior into a set of 3D cells. The intersections between all vertical surfaces and a horizontal structure, such as the floor surface of the building, are first computed. These intersection lines are then used to partition the horizontal structure into a set of 2D cells using the 2D arrangement (Agarwal and Sharir, 2000) of the Computational Geometry Algorithms Library (CGAL, 2019). The 2D cells are then extruded vertically and intersected with the other horizontal surfaces (e.g., the ceiling) to generate the 3D cell decomposition of the building interior. Figure 4.5 illustrates the cell decomposition of the hexagonal building from its vertical and horizontal surfaces.

Figure 4.5 3D cell decomposition: (a) horizontal and vertical surfaces; (b) 2D cell decomposition of a horizontal structure (i.e., the floor); (c) 3D cell decomposition.


4.3 The indoor shape grammar

The indoor shape grammar is an extension of the shape grammar proposed in Chapter 3 to model both Manhattan and non-Manhattan-world indoor environments. The extension includes a new shape definition, extended grammar rules, and the integration of a stochastic process into the production procedure using an rjMCMC-based algorithm.

4.3.1 Shapes

In general, a shape grammar contains a finite set of shapes, including a starting shape, non-terminal shapes and terminal shapes (Stiny, 2008). In the indoor shape grammar, a shape is a polyhedron: it is assigned a unique identifier id and consists of four elements, i.e., a scope (a bounding box), geometry (geometric attributes), semantics (semantic attributes), and topological relations (topological attributes). Similar to Müller et al. (2006), the scope is defined as the oriented bounding box of a shape, represented by a corner point P, three orthogonal vectors X, Y, Z indicating its orientation, and a scale vector S = {s_x, s_y, s_z} representing the dimensions along the three orthogonal directions. The shape geometry is described with a boundary representation {V, F}. Figure 4.6 shows an example of a shape with its scope (blue bounding box) and geometry representation (cyan bounding surfaces).

Figure 4.6. Geometric representation of a 3D shape and its scope

Similar to the definition in Chapter 3, the semantic attribute indicates the type of a shape, which can take one of three values {interior space, exterior, wall}, and the topological attributes indicate the shape's adjacency, connectivity, and containment relations. The starting shape is a unit scope with the point P located at the origin of the local coordinate system and the unit scale S = {1, 1, 1} representing the dimensions along the x-, y- and z-axes. The unit scope has empty geometry, no semantic information and no topological relations.

4.3.2 Grammar rules

The grammar rules define operations on shapes, which transform antecedent shapes into subsequent shapes. We propose eight grammar rules categorized into three types: geometric transformation, semantic conversion, and topological relation establishment rules. The grammar rules are defined in the following form:

Rule: Antecedents : cond → Subsequents : prob

The antecedent shapes are non-terminal shapes, while the subsequent shapes can be either non-terminal or terminal shapes. The rule is applied on the condition cond and is selected with probability prob, which is automatically learned during the derivation of the 3D models.
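The guarded-rule form above can be sketched as a small data structure pairing a condition with a transformation. This is an illustrative skeleton under assumed names (`GrammarRule`, `try_apply`), not the thesis implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GrammarRule:
    """Rule: Antecedents : cond -> Subsequents : prob (names illustrative)."""
    name: str
    cond: Callable[[list], bool]    # guard on the antecedent shapes
    apply: Callable[[list], list]   # produces the subsequent shapes
    prob: float = 1.0               # application probability, learned in derivation

def try_apply(rule, antecedents):
    """Fire the rule only if its guarding condition holds."""
    if not rule.cond(antecedents):
        return None
    return rule.apply(antecedents)
```

A toy merge rule, for instance, could guard on "exactly two antecedents" and return their union as the single subsequent shape.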

Geometric transformation rules: These rules operate on the shapes' geometries, and therefore transform the geometry of a model. Applying these rules can also vary the number of shapes constituting a model. The place and merge rules are inherited, with modifications, from the shape grammar proposed in Chapter 3, while the split rule is new and is defined to allow reversibility of the reconstruction.

(1) Place rule:

R_place^1: I : cond(H ≠ 0) → N : prob

The place rule R_place^1 places a non-terminal shape N into the scope of the indoor environment by transforming the starting shape I (the unit scope) using a set of transformation parameters H. The transformation H consists of the scope information of the shape N and its geometry representation. The scope is determined by a translation vector T = {t_x, t_y, t_z} and a rotation R = {r_x, r_y, r_z}, indicating the location of the shape bounding box, coupled with a scale vector S = {s_x, s_y, s_z} encoding its dimensions. The shape geometry is represented with a set of vertices and bounding faces {V, F}, which are obtained from the 3D cell decomposition of the building.

(2) Merge rule:

R_merge^2: {N_1, N_2} : cond → B : prob

The merge rule replaces two non-terminal shapes N_1 and N_2 with their union B on the condition that they are both of the type interior navigable space and are in a connectivity relation.

(3) Split rule:

R_split^3: B : cond → {N_1, N_2} : prob

The split rule is the reciprocal of the merge rule. The rule splits a shape B and replaces it with its two component shapes N_1 and N_2.

Figure 4.7 shows an example of a reversible process by application of the merge and split rules.

Figure 4.7 Reversible merging and splitting. Merging: from (a) to (b), the two adjacent spaces N_1 and N_2 are merged to form the space B. Splitting: from (b) to (a), the shape B is split into its two component shapes N_1 and N_2.

Semantic conversion rules: These rules operate on the semantic information of non-terminal shapes without changing their geometries or the number of shapes in a model. The semantic conversion rules comprise two reciprocal rules: the classification rule and the declassification rule.

(1) Classification rule:

R_class^4: N[type = ∅] : cond → N[type] : prob

The classification rule assigns a certain type (i.e., wall, interior space, or exterior) to a non-terminal shape N that has no semantic information (i.e., its type is empty, ∅). The condition cond and probability prob of the rule are automatically learned through the stochastic process using rjMCMC during the reconstruction of interior spaces. Meanwhile, the labelling of structural elements and exteriors is inferred from their geometries and topological relations. For example, a wall should be in an adjacency relation with interior spaces, and the vertex-to-surface distances (Khoshelham, 2016) between its vertices and the adjacent surface should not be larger than the wall thickness of the building, in accordance with predefined knowledge of the building architecture.

(2) Declassification rule:

R_declass^5: N[type] : cond → N[type = ∅] : prob

The declassification rule is the reverse process of the classification rule. The rule nulls the type attribute of a non-terminal shape N. Similar to the classification rule, the condition cond and the probability prob can be automatically learned during the reconstruction of interior spaces.

Figure 4.8 illustrates a reversible process involving classification and declassification of an interior space.

Figure 4.8 Reversible classification-declassification processes. Classification: from (a) to (b), by labelling a shape as a navigable space (green). Declassification: from (b) to (a), by nulling the label of the navigable space.

Topological relation rules: The topological rules are defined to establish the three topological relations between shapes: adjacency, connectivity, and containment. The conditions for invoking the rules are similar to those in Chapter 3. However, in this chapter, the rules are assigned a probability prob for their application, which can be learned during the reconstruction process.

(1) Adjacency rule

R_adj^6: {N_1, N_2} : cond → {N_1[adj_1], N_2[adj_2]} : prob

The adjacency rule establishes the adjacency relation between two non-terminal shapes N_1 and N_2. If the relation is established before the reconstruction of structural elements, the two non-terminal shapes are determined to be adjacent if they share a common face. After the modelling of structural elements, two final interior navigable spaces (i.e., rooms, corridors) are in an adjacency relation if there is a common wall between them.

(2) Connectivity rule

R_conn^7: {N_1, N_2} : cond → {N_1[conn_1], N_2[conn_2]} : prob

The connectivity rule establishes the connectivity relation between two non-terminal shapes N_1 and N_2. Before the reconstruction of structural elements, two interior spaces are connected if they are adjacent and there is no actual surface between them. After the modelling of structural elements, two final spaces (i.e., rooms, corridors) are in a connectivity relation if there is a common wall between them and the wall contains a door. As in Chapter 3, doors are manually inserted into the model by updating the containment attribute of the walls.

Figure 4.9 shows that both adjacency and connectivity relations are established between the shapes 푁1 and 푁2 as these two interior spaces share a common face and there is no actual surface between them.


Figure 4.9 An example of topological relations between two non-terminal navigable spaces N_1 and N_2 (a), which are in an adjacency relation (b) and in a connectivity relation (c), as there is no actual surface between them.

(3) Containment rule

R_cont^8: {N_1, N_2} : cond → {N_1[cont_1], N_2[cont_2]} : prob

The containment attributes of two non-terminal shapes are updated via the application of the containment rule. If the shapes 푁1 and 푁2 are antecedent and subsequent (or vice versa) of a previously applied merge rule, they are in a containment relation. In addition, the containment rule can be invoked to establish the containment relation between a door and a wall.

Figure 4.10 shows an example of a containment relation where a shape B contains two component shapes 푁1 and 푁2.

Figure 4.10 An example of a containment relation (c) between the two navigable spaces N_1 and N_2 (a) and their merged shape B (b).


4.4 Procedural model generation using rjMCMC

Figure 4.11 shows the production procedure for the generation of a 3D indoor model. An indoor model M is a configuration of a finite set of shapes. Initially, non-terminal shapes N are placed into the scope of the indoor environment by iterative application of the place rule R_place^1. At this stage, every non-terminal shape N, transformed from the starting shape I (the unit scope), is represented with a scope and geometric attributes encoding its boundary representation {V, F}, but has no semantic information and no topological relations. The probability prob of the rule application is initially set to 1 if the shape's geometry {V, F} exactly matches a cell's geometry created by the 3D cell decomposition. If no such 3D cell can be found, the rule cannot be applied (prob = 0). Once a shape is placed in the scope of the building interior, its adjacency relations with its neighbours, which share a common face, are automatically established.

Figure 4.11 Production procedure. The arrow on the top of a rule indicates iteration.

The reconstruction of the building interior then proceeds with a search for an optimal configuration of interior spaces (i.e., rooms, corridors) in the model M. In this step, we aim at maximizing the fitness of the model to the input point cloud D, while maintaining the model's plausibility and simplicity. In other words, we search for the most probable model given the data, p(M|D), based on a stochastic approach using rjMCMC sampling with the Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970). The rjMCMC-based algorithm simulates random walks from the current model to the next proposed configuration, with the capacity to cope with a varying number of shapes in each model configuration. The transition from a current state M_t to the following configuration M_{t+1} is decided by the model probability p(M|D), a transition probability J(M_{t+1}|M_t), and an acceptance probability α, as defined in Equation (4.2). The model probability and the transition kernel are detailed in the following sections.


4.4 Procedural model generation using rjMCMC

\alpha = \min\left\{1,\; \frac{p(M_{t+1}\mid D)\, J(M_{t+1}\mid M_t)}{p(M_t\mid D)\, J(M_t\mid M_{t+1})}\right\}    (4.2)

A proposed transition is accepted if the acceptance probability 훼 is larger than a random number 푈 ∈ [훽, 1], where 훽 ∈ [0, 1] is the convergence factor. This triggers the application of grammar rules whose guarding conditions are satisfied, in order to generate a new configuration. The acceptance probability 훼 is assigned as the application probability of the triggered rules. The introduction of the convergence factor 훽 provides the flexibility to search for the optimal model either in a sub-space of models (훽 > 0) (Tran and Khoshelham, 2019a) or, by default, in the whole model space (훽 = 0) (Hastings, 1970). In general, the grammar rules are applied in a stochastic order in the reconstruction of interior spaces. However, in order to maintain the global plausibility of the model and its conformity to architectural principles, the process must follow certain procedural priorities. For example, once a new interior space is added to the model, its topological relations (i.e., adjacency, connectivity) must be established immediately. Likewise, when two connected navigable spaces are merged to form a new space, the containment rule must be applied next to establish the containment relation between the new merged space and its components. Once the reconstruction of interior spaces is finalized, the geometry transformation rules are no longer applied.
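The acceptance test described above can be sketched as follows. This is a minimal illustration, not the thesis implementation; the function and parameter names are ours, and the model probabilities and kernel values are assumed to be available as plain numbers:

```python
import random

def accept_transition(p_current, p_proposed, j_forward, j_backward, beta=0.0):
    """Metropolis-Hastings acceptance test for one rjMCMC step (Eq. 4.2).

    p_current / p_proposed : unnormalized model probabilities p(M_t|D), p(M_{t+1}|D)
    j_forward / j_backward : transition kernels J(M_{t+1}|M_t), J(M_t|M_{t+1})
    beta : convergence factor in [0, 1]; beta > 0 restricts the search to a
           sub-space of models, beta = 0 is the standard sampler.
    """
    if p_current * j_backward == 0:
        # Degenerate current state: always move away from it.
        alpha = 1.0
    else:
        alpha = min(1.0, (p_proposed * j_forward) / (p_current * j_backward))
    u = random.uniform(beta, 1.0)   # random number U in [beta, 1]
    return alpha > u
```

With 훽 = 0 this reduces to the standard Metropolis-Hastings test; raising 훽 towards 1 makes the sampler increasingly greedy, accepting only high-probability proposals.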

The reconstructed interior spaces (i.e., rooms, corridors) then serve as the basis for the generation of structural elements. The semantic conversion rules are applied to assign semantic information (e.g., walls, exteriors) to non-navigable spaces which have no semantic information, followed by the application of topological relation rules to establish the topological relations among the reconstructed shapes (i.e., final interior spaces, walls) of the model. At this stage, the probability of a rule application is set to 1 (푝푟표푏 = 1) if its guarding conditions are satisfied; otherwise, it is set to 0 (푝푟표푏 = 0). Once all the shapes are labelled and no more topological relations can be established, all the reconstructed interior spaces, structural elements, and exteriors are terminal shapes.



4.4.1 Model probability

The model probability 푝(푀|퐷) indicates the probability of a model configuration 푀 given the input point cloud 퐷. The probability is defined to encode the prior knowledge of building architecture, and to measure the model fitness to the input and the model plausibility. According to Bayes’ rule, the model probability 푝(푀|퐷) is proportional to the product of the prior 푃(푀) and the likelihood 푃(퐷|푀), that is: 푃(푀|퐷) ∝ 푃(푀)푃(퐷|푀).

4.4.1.1 Prior probability

The prior 푃(푀) of a model 푀 is defined as a joint probability distribution of the prior of the shapes 푀(푖) comprising the model 푀, in this case, interior spaces:

P(M) = \prod_{i=1}^{n} P(M(i))    (4.3)

where 푛 is the number of interior spaces of the model 푀, and 푃(푀(푖)) is the prior of the shape 푀(푖).

Without any observed data, the prior of an interior space is derived from the geometric information {푉, 퐹} and the constraints on knowledge of building architectures. We first define the width of a shape as the smaller horizontal dimension of its scope (i.e., the bounding box).

Intuitively, an interior space should have a large width in comparison with a structural element (i.e., a wall). We therefore set the prior of an interior space to 푃(푀(푖)) = 1 if the shape has a width larger than the maximum wall thickness; otherwise, the prior is set to 푃(푀(푖)) = 0. In the special case where the model 푀 has no interior space, the prior of the model is set to 푃(푀) = 1.
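As an illustration, the prior of equation (4.3) with the width constraint above can be sketched as follows. This is a hypothetical snippet: the scope is assumed to be an axis-aligned bounding box given by two corner points, and the 0.3 m default wall thickness is an arbitrary example value, not a figure from the thesis:

```python
def shape_width(scope):
    """Width of a shape: the smaller horizontal dimension of its bounding box."""
    (xmin, ymin, _), (xmax, ymax, _) = scope
    return min(xmax - xmin, ymax - ymin)

def model_prior(interior_spaces, max_wall_thickness=0.3):
    """Prior P(M) of Eq. 4.3: product of per-space priors.

    Each interior space gets prior 1 if it is wider than the maximum wall
    thickness, otherwise 0; a model with no interior space has prior 1.
    """
    prior = 1.0
    for scope in interior_spaces:
        prior *= 1.0 if shape_width(scope) > max_wall_thickness else 0.0
    return prior
```

A single wall-thin shape labelled as an interior space drives the whole product, and hence the model probability, to zero.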

4.4.1.2 Likelihood

The likelihood encodes the model fitness as well as its conformity to the input data. The model fitness is measured both at the local scale for each individual shape (the so-called local likelihood) and at the global scale for the whole model (the so-called global likelihood). In other words, the approach integrates both the local characteristics of the data and the global plausibility of the model in the reconstruction of


the model. We therefore define the likelihood of the model 푃(퐷|푀) as a joint distribution of the local likelihood 푃퐿(퐷|푀) and the global likelihood 푃퐺(퐷|푀):

푃(퐷|푀) = 푃퐿(퐷|푀) 푃퐺(퐷|푀) (4.4)

Local likelihood: The local likelihood 푃퐿(퐷|푀) is computed from the local characteristics of the data points associated with each individual shape of the model. The local likelihood of a model is a joint distribution of the likelihood of each interior shape given its enclosed data points. Intuitively, an interior shape is likely to contain points on its top surface (i.e., the ceiling) (Khoshelham and Díaz-Vilariño, 2014). We therefore define the likelihood of each interior shape as the coverage of its top surface (Tran and Khoshelham, 2019b), which is computed as the ratio between the alpha-shape area covered by the shape's data points (Edelsbrunner and Mücke, 1994) and the area of the top surface. The local likelihood of the model is therefore formulated as:

P_L(D\mid M) = \prod_{i=1}^{n} \frac{\mathrm{Cov}|M(i).\mathrm{top}|}{|M(i).\mathrm{top}|}    (4.5)

where 푛 is the number of interior spaces in the model 푀, | ∙ | denotes the area of a surface of a shape, and 퐶표푣| ∙ | denotes the area of the surface which is covered by points.

The local likelihood 푃퐿(퐷|푀) ranges from 0, indicating that the proposed model has at least one interior space without any supporting points, to 1, indicating that all the navigable spaces are totally covered by the input point cloud.
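A minimal sketch of the local likelihood of equation (4.5), assuming the covered and total top-surface areas of each interior space have already been computed (e.g., via an alpha-shape); names and the input representation are ours:

```python
def local_likelihood(spaces):
    """Local likelihood P_L(D|M) of Eq. 4.5.

    `spaces` is a list of (covered_top_area, top_area) pairs: the alpha-shape
    area of the points on each space's ceiling, and the total area of that
    top surface. The product of the coverage ratios gives the likelihood.
    """
    p = 1.0
    for covered, total in spaces:
        p *= covered / total
    return p
```

As in the text, a single interior space without any supporting points makes the local likelihood 0, while fully covered ceilings yield 1.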

Global likelihood: The global likelihood 푃퐺(퐷|푀) measures the fitness of the whole model 푀 as well as its general plausibility with respect to the entirety of the input data 퐷. We define the global likelihood as the combination of three data terms: (1) the horizontal fitness 푃ℎ표푟, measuring how well the horizontal structures fit the horizontal points ℎ푝표푖푛푡푠; (2) the vertical fitness 푃푣푒푟푡, measuring the fitness of the vertical structures to the vertical points 푣푝표푖푛푡푠; and (3) the model plausibility 푃푝푙푎푢푠, indicating the reliability and conformity of the model with respect to the data (i.e., both horizontal and vertical points). Depending on the intrinsic characteristics of the data, such as the level of incompleteness, noise, or clutter,


the contribution of each data term to the likelihood varies. The global likelihood, therefore, is formulated as follows:

푃퐺(퐷|푀) = 푘1푃ℎ표푟 + 푘2푃푣푒푟푡 + 푘3푃푝푙푎푢푠 (4.6)

where 푘1, 푘2, 푘3 are the normalization factors, which are used to weight the contribution of each term to the global likelihood. The normalization factors must satisfy the condition 푘1 + 푘2 + 푘3 = 1.

Horizontal fitness: The horizontal fitness 푃ℎ표푟 is computed as the normalized horizontal coverage of the model 푀. The horizontal coverage of a model is the area of the top surfaces of its interior spaces (i.e., rooms, corridors) covered by the horizontal points ℎ푝표푖푛푡푠. The normalization is the ratio between the model's horizontal coverage and the horizontal areas of the building which are covered by the horizontal points ℎ푝표푖푛푡푠. In other words, the more horizontal surfaces are covered by the horizontal points, the higher the horizontal fitness of the model 푀 is. We formulate the horizontal fitness 푃ℎ표푟 of a model 푀 as in equation (4.7):

P_{hor} = \frac{\sum_{i=1}^{n} \mathrm{Cov}|M(i).\mathrm{top}|}{\mathrm{Cov}_p|hpoints|}    (4.7)

where 퐶표푣푝| ∙ | denotes the alpha-shape area covered by points.

Vertical fitness: Akin to the horizontal fitness, the vertical fitness 푃푣푒푟푡 is measured based on the vertical coverage of the model 푀. The coverage is computed as the area of vertical surfaces (i.e., wall surfaces) covered by the vertical points 푣푝표푖푛푡푠, summed over all interior spaces of the model. We formulate the vertical fitness as the normalization of the vertical coverage of the model, defined as the proportion of the model's vertical coverage to the vertical coverage of the building. The vertical fitness of the model 푀 is formulated as equation (4.8).

P_{vert} = \frac{\sum_{i=1}^{n} \mathrm{Cov}|M(i).\mathrm{vertical}|}{\mathrm{Cov}_p|vpoints|}    (4.8)

Model plausibility: The general plausibility 푃푝푙푎푢푠 of the model with respect to the input data is measured based on the coverage areas of both horizontal and vertical


points which fall inside the interior spaces (i.e., rooms, corridors) and therefore do not represent the vertical or horizontal structures. The more vertical and horizontal areas covered by these inside points, the lower the model plausibility is. We formulate the model plausibility as follows:

P_{plaus} = 1 - \frac{\sum_{i=1}^{n} \mathrm{Cov}_p|M(i).\mathrm{inPoints}|}{\mathrm{Cov}_p|hpoints| + \mathrm{Cov}_p|vpoints|}    (4.9)

where 푀(푖).푖푛푃표푖푛푡푠 denotes the vertical and horizontal points which fall inside an interior space 푀(푖).

The value of 푃푝푙푎푢푠 may be affected in environments with a high level of clutter. In these cases, the contribution of the model plausibility 푃푝푙푎푢푠 to the global likelihood should be small, and it can be adjusted by reducing the value of its normalization factor 푘3 in Eq. (4.6).
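Putting the three data terms together, the global likelihood of equation (4.6) can be sketched as follows. This is an illustrative snippet with assumed coverage-area inputs; the default weights are arbitrary example values, not those used in the thesis:

```python
def global_likelihood(hor_cov, hor_total, vert_cov, vert_total,
                      inside_cov, k=(0.4, 0.4, 0.2)):
    """Global likelihood P_G(D|M) of Eq. 4.6 from the three data terms.

    hor_cov / hor_total   : model's horizontal coverage vs. the building's (Eq. 4.7)
    vert_cov / vert_total : same for vertical surfaces (Eq. 4.8)
    inside_cov            : coverage of points falling inside interior spaces (Eq. 4.9)
    k : normalization factors (k1, k2, k3); they must sum to 1, and k3 should
        be reduced in heavily cluttered scenes.
    """
    k1, k2, k3 = k
    assert abs(k1 + k2 + k3 - 1.0) < 1e-9, "normalization factors must sum to 1"
    p_hor = hor_cov / hor_total
    p_vert = vert_cov / vert_total
    p_plaus = 1.0 - inside_cov / (hor_total + vert_total)
    return k1 * p_hor + k2 * p_vert + k3 * p_plaus
```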

4.4.2 Model transition

The transition kernel 퐽(푀푡+1|푀푡) represents the probability of the change from the current model 푀푡 to the next proposed model 푀푡+1. The variation in the number of shapes of the reconstructed models during the modelling process is integrated in the transition kernel (Green, 1995; Ripperda and Brenner, 2009). Specifically, in the reconstruction of interior spaces (i.e., rooms, corridors), there are four possible changes in the configuration of a model (Tran and Khoshelham, 2019a), defined as follows:

▪ adding: a shape which has no semantic information and is not in an adjacency relationship with any navigable space is classified as a navigable space;
▪ removing: a navigable space which is not adjacent to any navigable space is changed to a shape with empty semantics;
▪ adding - merging: a shape which has no semantic information is classified as a navigable space, and is then merged with its adjacent navigable spaces to form a new navigable space;



▪ splitting - removing: this is the reverse of the adding - merging transition. A navigable space which was formed by merging two or more navigable spaces is split into its components, and the semantic attribute of the navigable space that was added before the merging is nulled.

The transitions of adding a new interior space or removing an existing one are realized by application of the classification or declassification rules, while splitting an interior space into two components or merging two spaces into one are realized by invoking the split rule and the merge rule, respectively. These changes cause the variation in the number of shapes configuring the model. Figure 4.12 gives examples of the transitions between two 3D models of an interior space.
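The applicability conditions of the four transitions can be sketched as a simple check on a shape's state. This is a simplified, hypothetical stand-in for the grammar's shape attributes; the field names are ours:

```python
def applicable_moves(shape):
    """Which of the four rjMCMC transitions may be proposed for a shape.

    `shape` is a minimal dict with:
      'semantics'          : None (no semantic information) or 'navigable'
      'adjacent_navigable' : whether it is adjacent to a navigable space
      'merged_from'        : list of components if it was formed by merging
    """
    moves = []
    if shape["semantics"] is None:
        if not shape["adjacent_navigable"]:
            moves.append("adding")              # classify as navigable space
        else:
            moves.append("adding-merging")      # classify, then merge with neighbours
    elif shape["semantics"] == "navigable":
        if shape["merged_from"]:
            moves.append("splitting-removing")  # undo a previous merge
        elif not shape["adjacent_navigable"]:
            moves.append("removing")            # null the semantics again
    return moves
```

Each move has a reverse counterpart in the list, which is what makes the reversible-jump sampler's detailed balance attainable.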


Figure 4.12 Examples of transitions between two models in the model space of an interior space. Adding: from (a) to (b) by adding a navigable shape (dark green). Removing: from (b) to (a) by removing a navigable shape (dark green). Adding - merging: from (c) to (d) by adding a navigable shape and merging it with the adjacent space. Splitting - removing: from (d) to (c) by splitting a merged navigable space and nulling the semantics of one component.



In addition to maximizing the fitness of the model with respect to the data, which is encoded via the model probability (section 4.4.1), our goal is to reconstruct a compact model with low complexity. The concept of minimum description length (Rissanen, 1978) is therefore adopted to define the transition kernel. The description length of the model is defined as the number of its interior spaces. In other words, a model of a building interior with a smaller number of interior spaces (i.e., rooms, corridors) is favoured as more likely than a complex one with many small spaces. This strategy ensures that the final model is formed as the union of the final interior spaces rather than separate components.

We formulate the complexity of a model as:

C(M) = \log_2 n    (4.10)

where 푛 is the number of interior spaces in the model 푀. As a special case, 퐶(푀) = 1 when 푛 = 1.

The transition kernel 퐽(푀푡+1|푀푡) is defined as follows:

J(M_{t+1}\mid M_t) = \mu\,\frac{C(M_t)}{C(M_{t+1})}    (4.11)

where 휇 is a smoothing coefficient. To set the value of 휇 we consider the following three cases: case 1, the current model 푀푡 contains the same number of interior spaces as the proposed model 푀푡+1; case 2, 푀푡 contains fewer spaces than 푀푡+1; and case 3, 푀푡 contains more spaces than 푀푡+1.
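The model complexity of equation (4.10) and the transition kernel of equation (4.11) can be sketched as follows. This is an illustrative snippet; here 휇 is left as a plain parameter rather than being set per case:

```python
import math

def complexity(n_spaces):
    """Description length C(M) of Eq. 4.10: log2 of the number of interior
    spaces, with C(M) = 1 for a single-space model (since log2(1) = 0)."""
    return 1.0 if n_spaces == 1 else math.log2(n_spaces)

def transition_kernel(n_current, n_proposed, mu=1.0):
    """Transition kernel J(M_{t+1}|M_t) of Eq. 4.11.

    Proposals with fewer interior spaces than the current model get a kernel
    value above mu, so simpler configurations are favoured in the sampler.
    """
    return mu * complexity(n_current) / complexity(n_proposed)
```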

4.5 Demonstration of model reconstruction

This section shows an example of model reconstruction using the indoor shape grammar on a simple non-Manhattan World building. Note that detailed experiments and results on both synthetic and real datasets will be reported in Chapter 6. Figure 4.13 shows the results of surface extraction, space partitioning, cell classification, and the final model of the synthetic hexagon building shown in Figure 4.1.




Figure 4.13 Reconstruction results for the synthetic hexagon building: (a) the point cloud, (b) vertical structures, i.e., walls, (c) horizontal structures, i.e., the ceiling and the floor, (d) 3D cell decomposition, (e) the classification of cells into navigable spaces (green) and non-navigable spaces (light pink), (f) the final model.

The potential surfaces of the floor and the ceiling, and the potential wall surfaces are first extracted as shown in Figure 4.13b-c. Figure 4.13d shows the cell decomposition of the building space, which comprises 89 shapes in total. As can be seen from the figure, the main hall of the building space is partitioned into 24 individual shapes. In the reconstruction process, these 3D shapes are classified as navigable spaces (green) and non-navigable spaces (light pink) as shown in Figure 4.13e. Those navigable spaces which are adjacent to each other will iteratively be merged together to form the final unified navigable space, i.e., the large hall. The non-navigable spaces are further classified into building elements (red) or exterior spaces. The final model, containing both navigable spaces (i.e., rooms and corridors) and building elements (i.e., walls) is shown in Figure 4.13(f). The exterior spaces are excluded from visualization.

4.6 Summary

In this chapter, a new method for procedural reconstruction of 3D indoor models from point clouds is presented. The method is based on the combination of a shape


grammar and a data-driven process using the rjMCMC-based algorithm. This integration facilitates the automated application of grammar rules. Consequently, it provides flexibility in modelling different building architectures, including Manhattan and non-Manhattan World designs. Additionally, taking advantage of both the local characteristics of the data and the model plausibility with respect to the entire input data, as well as taking into account the current state of the model, enhances the robustness of the method to incompleteness and inaccuracy of the data.


Chapter 5 Geometric quality evaluation and change detection of 3D indoor models

According to geometric data exchange standards (e.g., CityGML, IFC, and IndoorGML), an interior model consists of three main components: geometric elements (e.g., walls, floors, ceilings, and slabs), semantic attributes (e.g., type, function, and material) and topological relations (e.g., adjacency, connectivity, and containment) between indoor elements (Kolbe et al., 2005; Liebich, 2009; Lee et al., 2014). While manual inspection may be more appropriate for semantic attributes and topological relations, an automated quantitative evaluation is preferred for geometric elements, as it allows comparison of different indoor models in an unbiased and objective manner.

In practice, various data sources (e.g., trajectories, images, and lidar data) can be used for indoor modelling. A common approach to the evaluation of indoor models is to compare the model with the input data (Valero et al., 2012; Macher et al., 2017; Tran et al., 2018). However, such an evaluation is influenced by the data quality, and the result will likely reveal the quality of the data instead of the quality of the model. To achieve independence from data quality, an evaluation of the geometric quality of a 3D model (hereafter referred to as the source) based on comparison with a ground truth model (hereafter referred to as the reference) is proposed. The ground truth is a true representation of the indoor environment created manually by an expert. The comparison of an indoor model with a reference facilitates not only quantitative evaluation of the geometric quality of the model but also detection of the differences between the two models.

The identification of differences between the two models can be useful for temporal analysis of building changes. However, in practice, the models are usually available


only at the design stage (i.e., as-designed 3D models), but not at other stages (e.g., the construction and maintenance stages) during a building's lifecycle. In order to identify discrepancies between an indoor environment and its existing 3D model, a method based on a comparison between the as-designed 3D model and a point cloud of the environment is also proposed in this chapter.

5.1 Geometric quality aspects of indoor models

Intuitively, an ideal source contains all the elements of the reference without any additions. All the elements in the source are expected to have the same location, dimensions and orientation as those in the reference. However, many realistic models are far from this ideal. We assert that the following three aspects are necessary and sufficient to evaluate the geometric quality of a 3D indoor model as compared to a reference: accuracy, completeness, and correctness.

Accuracy: The accuracy aspect expresses how geometrically close the source model is to the reference. Discrepancies between the location and the orientation of each source element and the corresponding element in the reference determine the accuracy of the source. For example, in Figure 5.1 the discrepancy between the locations of the wall (dark blue) in the source (right) and that in the reference (left) contributes to the low accuracy of the source.

Figure 5.1 An example of low geometric accuracy of the source (right) with respect to the reference (left) due to the discrepancy between the location of a wall (dark blue) in the models.

Completeness: The completeness aspect indicates to what extent the geometric elements of the reference are present in the source. It provides information regarding complete or partial reconstruction of the indoor environment in the source. A


complete source model is one that contains all the geometric elements that are present in the reference. The absence of any reference element from the source reduces its completeness. Figure 5.2 shows a source model that is incomplete, as it is missing a wall with respect to the reference.

Figure 5.2 An example of the incomplete source (right) with respect to the reference (left) due to a missing wall in the source.

Correctness: The correctness aspect expresses to what extent the elements in the source model are present in the reference. A correct model is one that contains only those elements that are present in the reference. Any additional elements in the source, which do not exist in the reference, reduce the correctness of the source model. In Figure 5.3 an additional wall in the source with respect to the reference results in low correctness for the source model.

Figure 5.3 An example of a source model (right) that has low correctness in comparison with the reference model (left) due to the inclusion of an additional wall.



5.2 Quantitative evaluation and comparison of indoor models

5.2.1 Challenges in the comparison of indoor models

An intuitive approach to measuring the accuracy, completeness, and correctness of a source model is to perform an element-by-element comparison of the source with the reference model. While this approach might seem intuitive and straightforward, it faces several important challenges: (1) the existence of multiple standards for the representation of 3D indoor models; (2) the presence of surfaces in volumetric models which are reconstructed based on interpretation due to the lack of data; (3) the lack of one-to-one correspondence between the source and the reference elements; and (4) the mutual interdependence between the quality aspects. The rest of this section discusses these challenges in detail.

Multiple standards for the representation of 3D indoor models: The existence of both surface-based (i.e., CityGML) and volumetric representations (i.e., IFC) necessitates an evaluation method which can be applied to the two types of indoor models. While a surface-based model is constituted by visible surfaces of geometric elements (Kolbe et al., 2005; Bacharach, 2008), in a volumetric model these elements are represented as volumetric solids, which can be represented by a small set of parameters (Liebich et al., 2009). A direct comparison of models represented in different ways (i.e., surface-based and volumetric representations) is sometimes infeasible. Meanwhile, the conversion from one representation to the other is not straightforward and heavily depends on the complexity of the models and their topological consistency.

Interpreted surfaces in volumetric models: In a volumetric model, each volumetric element is bounded by surfaces of which only some are observable. For example, a wall might be observed only from one side, and is therefore only partially present in the data. In the modelling phase, the thickness of the wall will be left to the interpretation of the modeller or to heuristics used in modelling algorithms. In particular, the intersection of volumetric walls is unobservable, which can lead to different interpretations and compositions of the wall elements. Unfortunately, there is no guarantee that the interpretations by different modelling algorithms and human experts will be consistent and accurate. As a consequence, the comparison between volumetric source and reference models will be influenced by these inconsistent


interpretations. Figure 5.4 shows two different interpretations of the source and the reference from observable wall surfaces.


Figure 5.4 Different interpretations of unobservable elements in 3D indoor models: (a) observed wall surfaces; (b) source model with thick interpreted outer walls; (c) reference model with thin interpreted outer walls.

Lack of one-to-one correspondence between the models: Another difficulty with the comparison lies in the lack of one-to-one correspondence between the elements in the source and those in the reference. A straightforward approach to comparing two models is to seek, for each element in one model, a corresponding element in the other. However, an element in the source might correspond to more than one reference element and vice versa. Figure 5.5 shows an example, where the walls of a room are modelled with different composition of wall elements in the source and in the reference.

Figure 5.5 Example of lack of one-to-one correspondence between the source and the reference elements: (a) source model with four wall elements; (b) reference model with eight wall elements.



Mutual interdependence between the quality aspects: The comparison of indoor models based on one single measure for each of the quality aspects can be biased because of their interdependence. On the one hand, the completeness and correctness of a source model might be influenced by the low accuracy of its elements with respect to the reference. On the other hand, additional elements and missing elements in the source in comparison with the reference influence the accuracy of the source model. For example, in Figure 5.6, because of the low accuracy of the wall (dark blue) in the source with respect to that in the reference, the wall might be interpreted as missing from the correct location, contributing to low completeness, and being redundant in another location, causing low correctness of the source.

Figure 5.6 An example of an inaccurate wall (dark blue) in the source (right) with respect to that in the reference (left).

5.2.2 Quality evaluation metrics

We propose a method for quantitative evaluation of the geometric quality of 3D indoor models, which can overcome the challenges mentioned in the previous section; i.e., it is applicable to both surface-based and volumetric models; it accounts for interpreted surfaces in volumetric models; it is independent of one-to-one correspondence between the source and the reference elements; and it diminishes the interdependence between the quality aspects, in order to provide an objective comparison of indoor models.

Intuitively, the comparison between two elements, which semantically represent different types of building elements (e.g., wall vs door) or belong to different spatial structure elements (e.g., building storey, building part) is likely to provide


meaningless results and to bias the evaluation. Our method evaluates the quality of the source based on a semantic-based comparison with the reference. This way, only the elements in the source and the reference which contain identical semantic information (e.g., walls, ceilings/floors, windows/doors) and belong to the same spatial structure elements are compared.

To overcome the challenge of interpreted surfaces in volumetric models, we define an attribute for each surface in the reference model to specify whether it is visible (1) or interpreted (0). The visible surfaces are captured in the input data, while the interpreted ones, which are unobservable from the data, are heuristically interpreted from the visible surfaces. For models generated from point clouds, this can also be done automatically by verifying the presence of points on each surface, e.g., using the point-on-surface index (see Eq. 3.1). The interpreted surfaces are then excluded from the evaluation process to avoid the influence of inconsistent and uncertain interpretations. The quality evaluation of surface-based and volumetric source models with respect to the reference is, therefore, based on a comparison of the source surfaces with only the visible surfaces in the reference.

Completeness and correctness are measured based on the intersection of the source and reference elements. To relieve the burden of searching for the correspondence between the elements of the two models, a buffer is created around each visible reference surface, and those source elements which intersect with the buffers will be taken into account in the computation. This makes the computation efficient and independent of one-to-one correspondence between the source and reference elements.

To eliminate the bias due to the interdependence between the quality aspects, the completeness and correctness will be measured at various levels of accuracy, while the accuracy of the source is measured at different levels of completeness and correctness with respect to the reference. As a consequence, it will be possible to reveal the quality of indoor models in terms of the independent quality aspects.

To measure the geometric quality of the source model in terms of completeness, correctness, and accuracy, we propose the following metrics:



Completeness metric: Completeness is measured as the proportion of the reference model R that overlaps with the source model S. The overlap of the two models is computed based on the area of intersection between a reference surface 푅푗 and all surfaces 푆푖 in the source S, summed over all surfaces in the reference. The area of the intersection is measured as the union of the overlap areas between the surface 푅푗 in the reference and the orthogonal projection 풫(푆푖) of each surface 푆푖 in the source S onto the reference surface. To avoid the influence of irrelevant surfaces and to reduce the cost of computation, the source surfaces must fall inside a buffer 푏(푅푗) created around a visible surface 푅푗 in the reference, and only surfaces that are parallel up to a predefined threshold angle 휃 are used in each instance of intersection computation. In general, the threshold does not have a significant impact on the evaluation result, and it is recommended that it be no larger than 10 degrees. A source surface 푆푖 satisfies the condition of falling inside a buffer 푏(푅푗) if there exists an intersection area between them. Figure 5.7 shows an example of a reference surface 푅푗 (orange) and a relevant source surface 푆푖 (blue), which are parallel to each other; the source surface intersects with the buffer 푏(푅푗) (light grey) created around the reference surface. The intersection area is the overlap area (bounded by the red dashed line) between the source surface 푆푖 and the polygon (yellow) where the plane of the source surface crosses the buffer.

Figure 5.7 An example of relevant source and reference surfaces: (a) a reference surface Rj; (b) the buffer 푏(푅푗) created around Rj; (c) the source surface 푆푖, which intersects with the buffer 푏(푅푗); the overlap area is bounded by the red dashed line.

The completeness metric is defined as a function of buffer size b created around each visible reference surface 푅푗:



M_{Comp}(S, R, b) = \frac{\sum_{j=1}^{m} \left| \bigcup_{i=1}^{n} \left( \mathcal{P}(S_i) \cap b(R_j) \right) \right|}{\sum_{j=1}^{m} |R_j|}    (5.1)

where 푏(∙) denotes the buffer with size b created around a visible reference surface 푅푗, and n, m are the numbers of surfaces in the source S and in the reference R respectively. The operator |∙| denotes the area of a surface, and 풫(∙) denotes the orthogonal projection of a source surface onto its relevant reference surface.

The completeness is computed by applying a Boolean union to the areas of intersection between each reference surface and its relevant surfaces in the source. This facilitates the elimination of multiple identical overlap areas between different source surfaces and a reference surface 푅푗. Likewise, to avoid multiple identical intersections between a source element and different reference surfaces, the buffer size b should be chosen in an acceptable range, recommended to be no larger than half the distance between the two closest parallel reference surfaces (e.g., the thickness of walls). Consequently, each area of the reference surfaces and each area of the source surfaces contributes to the computation of overlap areas between the source and reference models no more than once. The completeness metric ranges from 0 to 1, with a higher value indicating that a larger proportion of the reference elements is present in the source.

Correctness metric: Correctness is measured as the proportion of the source model S which overlaps with the reference model R. Akin to completeness, the overlap of the source and reference models is computed as the union of the areas of intersection between each surface 푅푗 in the reference and the orthogonal projections 풫(푆푖) of its relevant source surfaces 푆푖 onto the reference surface, summed over all surfaces in the reference. The correctness metric is defined as a function of the buffer size b created around each visible reference surface 푅푗 as follows:

\[
M_{Corr}(S, R, b) = \frac{\sum_{j=1}^{m} \left| \bigcup_{i=1}^{n} \left( \mathcal{P}(S_i) \cap b(R_j) \right) \right|}{\sum_{i=1}^{n} \left| S_i \right|} \tag{5.2}
\]


The application of the Boolean union operation to the intersections between each reference surface R_j and the orthogonal projections P(S_i) of its relevant source surfaces S_i ensures the robustness of the correctness metric to duplicate reconstructed elements in source models. A suitable value of the buffer size b, set to be not larger than the minimum distance between two parallel reference surfaces (e.g., the thickness of walls), eliminates multiple identical intersections between a source element and different reference surfaces in the computation of correctness. The correctness metric ranges from 0, implying an incorrect reconstruction of the entire source, to 1, indicating that the source contains no redundant elements as compared to the reference.
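The robustness to duplicates can be demonstrated with the same 1D reduction (again an illustrative sketch, not the thesis code): the Boolean union keeps the numerator of Eq. (5.2) fixed while the duplicated source area inflates its denominator, so duplicated walls are penalized.

```python
def interval_union_length(intervals):
    """Length of the union of a list of (lo, hi) intervals."""
    total, cur_lo, cur_hi = 0.0, None, None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

def correctness(source, reference, b):
    """Eq. (5.2) reduced to 1D: the numerator is the same buffered union
    as in the completeness, but it is normalized by the total source area,
    so duplicated source elements lower the score."""
    overlap = 0.0
    src_total = sum(hi - lo for _, lo, hi in source)
    for r_off, r_lo, r_hi in reference:
        pieces = [(max(s_lo, r_lo), min(s_hi, r_hi))
                  for s_off, s_lo, s_hi in source
                  if abs(s_off - r_off) <= b and max(s_lo, r_lo) < min(s_hi, r_hi)]
        overlap += interval_union_length(pieces)
    return overlap / src_total

reference = [(0.00, 0.0, 4.0)]
print(correctness([(0.02, 0.0, 4.0)], reference, b=0.05))                    # 1.0
print(correctness([(0.02, 0.0, 4.0), (0.02, 0.0, 4.0)], reference, b=0.05))  # duplicate wall -> 0.5
```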

Accuracy metric: Accuracy is defined based on the Euclidean distance between the dense points representing the reference model and the closest surfaces in the source model. In the accuracy measurement, the reference is represented as a set of points obtained by uniform space sampling on the surfaces of the 3D reference model. Figure 5.8 shows an example of a reference model and its sampled points.

Figure 5.8 An example of sampled points of the reference model: (a) 3D reference model; (b) the sampled points.

We use the orthogonal distance from each point $p_i$ of the sampled reference to the closest corresponding surface $\pi_j$ of the source, on the condition that the orthogonal projection of the point on the surface falls within the surface boundary (Oude Elberink et al., 2013; Oude Elberink and Khoshelham, 2015). In comparison with a point-point distance (Rusinkiewicz and Levoy, 2001), the point-surface


distance mitigates the influence of the point density of the sampled reference in the accuracy computation. However, these distances are still influenced by the correctness and completeness of the source. Thus, similarly to Lehtola et al. (2017), we define the accuracy as a function of a cut-off distance r, which is the largest acceptable distance between a reference point and the closest source surface. Point-surface distances larger than the cut-off value r are excluded from the computation. To further mitigate the influence of outliers, we use the median absolute distance as the accuracy metric:

\[
M_{Acc}(S, R, r) = \operatorname{Med}\left( \left| \pi_j^{T} p_i \right| \right), \quad \left| \pi_j^{T} p_i \right| \le r \tag{5.3}
\]

where $\pi_j^{T} p_i$ is the orthogonal distance between the point $p_i$ and the surface plane $\pi_j$, both represented by homogeneous coordinates (see Section 4.2), and $|\cdot|$ denotes the absolute value.
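A minimal NumPy sketch of Eq. (5.3), assuming planes with unit normals in homogeneous form; the in-boundary test of the full method is omitted here for brevity, and all names are illustrative:

```python
import numpy as np

def accuracy(points, planes, r):
    """Eq. (5.3) sketch: planes are rows (a, b, c, d) with unit normals, so
    |pi^T p| with homogeneous p = (x, y, z, 1) is the orthogonal distance.
    The in-boundary check of the full method is omitted for brevity."""
    hom = np.hstack([points, np.ones((len(points), 1))])   # (N, 4)
    dists = np.abs(hom @ planes.T)                         # (N, M)
    nearest = dists.min(axis=1)                            # closest source surface
    kept = nearest[nearest <= r]                           # apply cut-off r
    return np.median(kept) if kept.size else float('nan')

# Sampled reference points near a source plane z = 0, plus one outlier.
plane_z0 = np.array([[0.0, 0.0, 1.0, 0.0]])
pts = np.array([[0.1, 0.2, 0.01],
                [0.4, 0.5, 0.02],
                [0.7, 0.8, 0.03],
                [0.5, 0.5, 5.00]])      # beyond the cut-off, excluded
print(accuracy(pts, plane_z0, r=0.1))   # median of {0.01, 0.02, 0.03} = 0.02
```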

The computation of the completeness and correctness on the basis of each instance of intersection between the source and reference surfaces, coupled with the measurement of accuracy based on each instance of point-surface distance, makes it possible to measure the quality of different geometric elements (e.g., walls, ceilings, and floors) separately. This facilitates the localization of missing and redundant surfaces of the source with respect to the reference. Reference surfaces that have no intersection with the source surfaces are most likely absent in the source, and, likewise, source surfaces that have no intersection with the visible reference surfaces are most likely redundant. The measurement of point-to-plane distances in the accuracy measure also provides insight into the location of deviations between the source and different parts of each element in the reference.

In the comparison of the source to the reference, the completeness and correctness are measured over a range of increasing buffer sizes b, and the accuracy is measured over the same range of increasing cut-off values r. This ensures the consistency of the elements that contribute to the computation of the quality metrics.


5.2.3 Demonstration of quality evaluation of indoor models

This section demonstrates the application of the proposed approach for quantitative evaluation and geometric comparison of 3D indoor models on a synthetic dataset, which contains three cases illustrating inaccuracy, incompleteness, and incorrectness of the sources with respect to the reference. Quantitative evaluation and comparison of the geometric quality of indoor models reconstructed by indoor modelling approaches will be presented later in Chapter 6.

Figure 5.9 shows the reference and the three synthetic models corresponding to inaccurate, incomplete, and incorrect sources respectively. The reference consists of building elements such as visible wall surfaces (grey) and floors (yellow), composing a single room and a corridor. Its dimensions are 6 m in width, 9 m in length, and 2.5 m in height. The inaccurate source (Inacc) contains all of the elements in the reference without any additions; however, the locations of the outer walls are all shifted outwards by 5 cm. The incomplete source (Incomp) is missing a room, while the incorrect source (Incorr) contains an additional room that is not present in the reference. We added random noise to the dimensions (+/-5 cm) and the locations (+/-1 mm to +/-5 mm) of the building elements (e.g., walls and floors) of the Incomp and Incorr sources.


Figure 5.9 The synthetic dataset: (a) the reference; (b) the inaccurate source (Inacc); (c) the incomplete source (Incomp); (d) the incorrect source (Incorr).


Figure 5.10 shows the quantitative measures computed for the walls of source models in the synthetic dataset in terms of the three quality aspects: completeness, correctness, and accuracy.

As can be seen in Figures 5.10 (a) and 5.10 (b), the completeness and correctness of the walls in the Inacc model are both close to 0.38 when the buffer size is smaller than 5 cm. This shows the influence of the inaccuracy of the model on the completeness and correctness: since all of the outer walls in the Inacc source are shifted outwards by 5 cm, these elements do not fall within the buffer created around each reference surface to compute the intersection. With a buffer size larger than 5 cm, however, both the completeness and the correctness approach 1.0, indicating that the walls of the Inacc source model represent all the wall elements of the reference without any additions. As shown in Figure 5.10 (c), the accuracy metric for the walls of the Inacc source increases from 0 cm to about 5 cm, correctly reflecting the actual shift introduced to the elements of this model as compared to the reference.

The completeness of the wall elements of the Incomp model varies from 0.31 for smaller buffer sizes to the maximum value of 0.62 for larger buffer sizes, indicating that this model represents the reference only partially: the missing room lowers the completeness by about 38%. The correctness metric for the walls in Incomp increases to close to 1, which means there are no additional walls in this model with respect to the reference. The accuracy metric varies from 0.3 cm to 4.7 cm because of the random noise added to the dimensions and locations of the wall elements in Incomp.

In the case of the Incorr model, the completeness increases from 0.31 to 1, indicating that all the wall elements in the reference are present in Incorr. However, the correctness ranges from 0.23 to 0.72 because of the presence of an additional room, which takes up about 28% of the wall surface areas in the source. Similar to Incomp, the accuracy metric for Incorr increases from 0.5cm to 4.5cm.


Figure 5.10 Quantitative evaluation of the source models in the synthetic dataset in terms of completeness (a), correctness (b), and accuracy (c).


5.3 Building change detection

The comparison between two 3D models of a building (i.e., source and reference) can facilitate change detection through the identification of redundant and missing elements in one model with respect to the other. In practice, however, 3D models of a building are usually created only at the design stage (i.e., as-designed models), and a 3D model representing the as-is state of the building is often not available. Meanwhile, a point cloud representing the as-is condition of a building can be effectively captured using state-of-the-art spatial data acquisition techniques (e.g., photogrammetry and laser scanning). The comparison between a 3D model of a building and a point cloud representing its as-is state is therefore a more effective approach to detect changes in the building and update the model. This section presents a method for automated building change detection through a comparison between a 3D model (e.g., an as-designed model) and a point cloud of an indoor environment. The approach is based on point classification and surface coverage to identify discrepancies between the model and the point cloud.

5.3.1 A method for comparison of a 3D indoor model and a point cloud

The comparison between a 3D model M and a point cloud P of a building interior is performed through a dual process comprising point classification, which identifies whether each point in the point cloud represents an element of the 3D model, and surface coverage measurement, which identifies whether a surface of the 3D model is covered by points in the point cloud.

To facilitate the comparison, the 3D model M and the point cloud P are first registered into a common coordinate system. A coarse registration can be done manually with available software, e.g. CloudCompare (CloudCompare Development Team, 2019), by picking at least three corresponding points. In practice, the model M and the point cloud P should be accurately registered, as any misregistration error may be wrongly interpreted as a change in the environment. Therefore, this coarse registration step is followed by a fine registration step, which is typically based on the Iterative Closest Point (ICP) algorithm: minimizing the point-to-point distances (Besl and McKay, 1992) between the point cloud P and points sampled on surfaces of the model M, or by minimizing point-plane distances


(Chen and Medioni, 1992), or using plane-to-plane based metrics (Forstner and Khoshelham, 2017).
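As an illustration of the fine registration step, the sketch below implements a single linearized point-to-plane update in the style of Chen and Medioni (1992), assuming known correspondences and a small rotation; it is a minimal stand-in written for this explanation, not the registration code used in this work.

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane update: minimize the sum of
    ((R @ p + t - q) . n)^2 with R ~ I + [w]_x for small rotations.
    Rows of A are [p x n, n]; the right-hand side is (q - p) . n."""
    A = np.hstack([np.cross(src, normals), normals])    # (N, 6)
    b = np.einsum('ij,ij->i', dst - src, normals)       # (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]                                 # rotation vector, translation

rng = np.random.default_rng(0)
src = rng.uniform(-1, 1, (50, 3))                       # sampled model points
normals = rng.normal(size=(50, 3))                      # hypothetical surface normals
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
dst = src + np.array([0.1, 0.0, 0.0])                   # cloud shifted 10 cm along x
w, t = point_to_plane_step(src, dst, normals)
print(np.round(t, 3))                                   # recovers ~[0.1, 0, 0]
```

In a full ICP loop this step would be iterated, re-establishing closest-point correspondences and composing the incremental transforms until convergence.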

Intuitively, the points of the point cloud P representing a surface in the 3D model M are likely to be close to that surface. Likewise, a surface of the model M which exists in the building interior is mostly covered by points. Therefore, to compare a 3D model of a building interior with a point cloud, we propose a method consisting of two main steps:

Point classification: the point classification process classifies each point p_i of the point cloud P into one of two types: existing (0) and new (1). Existing points (type = 0) represent surfaces in the model M, while new points (type = 1) belong to elements of the building interior that are new in comparison to the model M. The classification is based on the point-surface distance, i.e., the orthogonal distance from each point p_i to the closest corresponding surface π_j of the model M (Khoshelham, 2015, 2016). A point is classified as type = 0 if the orthogonal distance is less than a cut-off threshold r and the orthogonal projection of the point on the surface falls within the surface boundary (Oude Elberink et al., 2013; Oude Elberink and Khoshelham, 2015). Otherwise, if the point-to-surface distance is larger than the cut-off threshold r or the orthogonal projection does not fall within any surface boundary of the 3D model M, the point is labelled as type = 1. The threshold r is set according to the noise of the point cloud P and the error of the registration between the point cloud P and the model M. We formulate the classification of a point p_i in the point cloud P as:

\[
type(p_i) =
\begin{cases}
0 & \text{if } \exists \pi_j : \left| \pi_j^{T} p_i \right| \le r \text{ and } In(p_i, \pi_j) \\
1 & \text{otherwise}
\end{cases}
\tag{5.4}
\]

where $\left| \pi_j^{T} p_i \right|$ is the absolute orthogonal distance between the point $p_i$ and the surface $\pi_j$ of the model M, and $In(\cdot)$ denotes the test of whether the orthogonal projection of the point falls within the boundary of the corresponding surface.
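Eq. (5.4) can be sketched as follows; the names, the single hypothetical floor surface, and the axis-aligned boundary test are illustrative assumptions for this example, not the thesis implementation:

```python
import numpy as np

def project_to_plane(p, pi):
    """Foot of the perpendicular from p onto the plane pi = (n, d), |n| = 1."""
    n, d = pi[:3], pi[3]
    return p - (n @ p + d) * n

def classify_point(p, surfaces, r):
    """Eq. (5.4) sketch: returns 0 (existing) if some model surface lies within
    the cut-off r AND the projection falls inside its boundary, else 1 (new).
    Each surface is (pi, inside) with inside() testing the boundary."""
    for pi, inside in surfaces:
        dist = abs(pi @ np.append(p, 1.0))        # |pi^T p| with homogeneous p
        if dist <= r and inside(project_to_plane(p, pi)):
            return 0
    return 1

# A single floor surface z = 0 bounded by the unit square (hypothetical model).
floor = (np.array([0.0, 0.0, 1.0, 0.0]),
         lambda q: 0.0 <= q[0] <= 1.0 and 0.0 <= q[1] <= 1.0)
print(classify_point(np.array([0.5, 0.5, 0.0005]), [floor], r=0.001))  # 0: existing
print(classify_point(np.array([0.5, 0.5, 0.5000]), [floor], r=0.001))  # 1: new element
print(classify_point(np.array([2.0, 2.0, 0.0000]), [floor], r=0.001))  # 1: outside boundary
```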

Using the point classification method, any new element of a building interior which is missing in the 3D model will be identified in the point cloud as a cluster of points of type 1. Meanwhile, the existing elements of the building which are recorded in the model will contain only points of type 0. The computation of the point-to-surface


distance at this step is not only useful for labelling the points, but also provides information about the location deviation between each point in the point cloud P and its corresponding element in the 3D model.

Surface coverage measurement: The coverage M_cov of a surface π_j in the model M is measured as the proportion of the surface area covered by the point cloud P. Only the existing points are taken into account in the computation of the coverage of the corresponding surface. The points are first orthogonally projected onto the corresponding surface in order to construct a 2D alpha-shape α, which can be derived from the Delaunay triangulation of the projected points on the condition that the circumradius of each triangle face is smaller than an alpha radius r_α (Edelsbrunner, 1992; Edelsbrunner and Mücke, 1994). The coverage M_cov of a surface π_j is the ratio of the area of the alpha-shape α to the area of the surface:

\[
M_{cov}(\pi_j) = \frac{area(\alpha)}{area(\pi_j)} \tag{5.5}
\]

where $\alpha$ denotes the alpha-shape reconstructed from the orthogonal projections of the existing points on the corresponding surface and $area(\cdot)$ denotes the area of a surface.
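A hedged sketch of Eq. (5.5): since a true 2D alpha-shape requires a Delaunay triangulation with a circumradius filter, the stand-in below approximates the covered area by grid occupancy of the projected points. The function name, the grid approximation, and the toy surface are assumptions of this example, not the thesis method.

```python
import numpy as np

def coverage_grid(pts_2d, width, height, cell):
    """Sketch of Eq. (5.5) with a grid-occupancy approximation standing in
    for the alpha-shape area: projected points are binned into square cells,
    and the covered area is (number of occupied cells) x cell^2.
    (The thesis uses a true 2D alpha-shape; this stand-in is simpler.)"""
    ix = np.floor(pts_2d[:, 0] / cell).astype(int)
    iy = np.floor(pts_2d[:, 1] / cell).astype(int)
    nx, ny = int(np.ceil(width / cell)), int(np.ceil(height / cell))
    ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    occupied = len(set(zip(ix[ok].tolist(), iy[ok].tolist())))
    return occupied * cell * cell / (width * height)

# Points densely covering the left half of a 2 m x 1 m surface.
xs, ys = np.meshgrid(np.arange(0.05, 1.0, 0.1), np.arange(0.05, 1.0, 0.1))
pts = np.column_stack([xs.ravel(), ys.ravel()])
print(coverage_grid(pts, width=2.0, height=1.0, cell=0.1))  # ~0.5
```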

Figure 5.11 shows an example of the coverage measurement of a surface (orange) which is partially covered by a point cloud. The alpha-shape α (green) is reconstructed with an alpha radius r_α < 0.2 m from the projections of the point cloud on the surface. The coverage of the surface is M_cov = 0.56, indicating that 56% of the surface area is captured in the point cloud.


Figure 5.11 An example of surface coverage: (a) a surface (orange) and its corresponding point cloud; (b) a 2D alpha-shape (green), constructed from the projection of the point cloud on the surface, covering 56% of the surface (M_cov = 0.56).


The coverage M_cov(π_j) of a surface ranges from 0 to 1. Using the surface coverage measure, any redundant surface in the 3D model M which is not present in the point cloud P will be identified by a low coverage (M_cov(π_j) ≈ 0), as there are no existing points (type = 0) corresponding to the surface. Surfaces with higher coverage are most likely present in the real environment.

5.3.2 Demonstration of building change detection

A synthetic dataset, comprising a 3D model and a point cloud of a synthetic building, is used to demonstrate the application of the comparison between a 3D building model and a point cloud for building change detection. We assume that the 3D model represents a previous state of the building, which consists of a room and a corridor, while the point cloud captures the current state of the building, with two new walls and one removed wall in comparison with the existing 3D model. The point cloud has no noise, as it is created by random space sampling on the surfaces of the model; therefore, the cut-off distance can be set close to zero (here r ≈ 1 mm). The synthetic dataset is shown in Figure 5.12.


Figure 5.12 The synthetic dataset: (a) the 3D model; (b) the point cloud.

Figure 5.13 shows the result of point classification based on the point-surface distance between the point cloud and the 3D model of the synthetic dataset. Changes are detected in the building in comparison with the 3D building model as there are clusters of points in the point cloud with large point-surface distances. As can be seen in Figure 5.13(a), the points belonging to the new elements of the building have a large distance from the existing 3D model, up to 2.5m, while the remaining points of the point cloud have very small distances (close to 0) from the corresponding


surfaces in the model. The points with a distance smaller than r = 1 mm are classified as existing points (blue), representing existing elements of the 3D model, as shown in Figure 5.13(b). Points classified as new points (yellow), which have larger point-surface distances, represent the new walls that are not present in the existing 3D model.

Figure 5.13 Comparison results for the synthetic dataset: (a) the point cloud colorized according to point-surface distances; (b) the point cloud with the result of point classification.

The coverage M_cov of each surface of the 3D model is computed based on the result of the point classification. Figure 5.14(a) shows the surfaces colorized according to the coverage, which ranges from M_cov ≈ 0, indicating that there is no point in the point cloud representing the surface, to M_cov ≈ 1, indicating that the surface is fully covered by points. The redundant and existing surfaces of the model in comparison with the point cloud are then derived from the coverage, as shown in Figure 5.14(b).


The redundant surfaces of the 3D model, which do not exist in the real environment as represented by the point cloud, are identified as those surfaces with a coverage smaller than a user-defined threshold (here M_cov ≤ 0.3), while the coverage of every correct surface which exists in the point cloud is larger than the threshold (M_cov > 0.3).

Figure 5.14 Comparison results of the synthetic dataset: (a) the 3D model with surfaces colorized according to the coverage M_cov; (b) the 3D model with the locations of redundant elements (blue) and existing surfaces (yellow).

The updating of the existing 3D model to represent the current state of the building can be guided by the locations where changes are detected. This facilitates efficient generation of an as-is 3D model of the indoor environment.


5.4 Summary

In this chapter, a method for automated evaluation of the geometric quality of 3D indoor models based on comparison with a ground truth, and an approach to building change detection through comparison between a 3D model and a point cloud of an indoor environment, were presented. The former enables quantitative measurement of the geometric quality in terms of completeness, correctness, and accuracy. The method is applicable to both surface-based and volumetric models, does not require a one-to-one correspondence between the elements, and is not influenced by the presence of interpreted surfaces in volumetric models. The proposed method aims not only at geometric quality evaluation of 3D indoor models, but also at localization of errors, i.e., missing and redundant elements in the 3D models. This ability enables the method to detect changes and analyse temporal variations of a building through a comparison between 3D models of its interior created at different times. The latter aims to identify the discrepancies between an existing indoor model and the physical environment when 3D models of the indoor environment are not available at different stages of its lifecycle. Given current practice, this strategy is more suitable for building change detection than the comparison of two 3D indoor models. The approach can detect the discrepancies and locate the new elements of the building, as represented by the point cloud, which are missing in the 3D model, as well as redundant elements in the 3D model which do not exist in the point cloud. In the next chapter (Chapter 6), we demonstrate the application of the proposed evaluation methods to quantitatively measure the completeness, correctness, and accuracy of 3D indoor models reconstructed using the proposed shape grammars (Chapters 3 and 4).
The application of the change detection method in a real indoor environment based on comparison of its 3D model and the point cloud capturing the as-is condition of the environment will also be demonstrated.


Chapter 6 Experiments and results

This chapter presents the experiments and evaluation results for the approaches described in the previous chapters (3, 4, and 5). A series of experiments on both synthetic and real datasets, including the ISPRS Benchmark dataset on indoor modelling (Khoshelham et al., 2017), were carried out to evaluate the performance of each approach. The quality of 3D models reconstructed by the proposed approaches for automated reconstruction of indoor environments (described in Chapters 3 and 4) will be quantitatively evaluated and compared using the proposed evaluation approach described in Chapter 5. The results presented in this chapter have been partially published in Tran et al. (2018), Tran et al. (2019), and Tran and Khoshelham (2019b).

6.1 Datasets

Thirteen datasets representing indoor environments of various complexity were used to evaluate the proposed approaches for procedural reconstruction and quality evaluation of 3D indoor models. Table 6.1 provides an overview of the datasets. Each dataset contains a point cloud representing either a Manhattan world (MW) or a non-Manhattan world (non-MW) building. For some of the datasets, a ground truth model was also available; the ground truths, representing the true models of the indoor environments, were created manually by an expert. Among the 13 datasets, there are 3 synthetic point clouds (namely, Floor-1, Floor-2, SYN) simulating different scenarios of indoor environments, and 5 real point clouds


(namely, House-1, House-2, Office-1, Office-2, Museum), representing residential and public buildings, which were collected for this research, as well as the ISPRS Benchmark dataset on indoor modelling (Khoshelham et al., 2017), which is a public dataset. The point clouds in these datasets were captured by different sensors, including TLS and MLS sensors. Most of the environments contain moderate or high levels of clutter and occluded surfaces. The general characteristics of the indoor environments represented in the datasets are described in the following paragraphs.

Floor-1 and Floor-2: The two synthetic point clouds simulate a two-storey Manhattan world building. Each floor consists of seven rooms connected to a long corridor. A random error of 5 cm is added to the point clouds. The datasets do not contain ground truth models of the environments.

House-1, House-2, and Office-1: These three datasets contain point clouds representing real indoor environments with MW designs, comprising two residential buildings and an office respectively. The point clouds were acquired by a Faro Focus TLS (Faro, 2019) and contain a moderate level of clutter and occluded surfaces. While House-1 contains eight separate rooms, House-2 has four rooms with ceiling structures of different heights. Office-1 has the typical structure of an office building, consisting of five separate rooms connected to a long corridor. In these datasets, ground truths are not available.

SYN, Office-2, Museum: SYN simulates a simple residential house with a non-MW design. The building consists of a common room, 3 private rooms, and an open indoor space with no clutter. The point cloud has a random error of 5 mm and its average point spacing is 5 cm. The Office-2 and Museum datasets represent a research lab's office and an art museum at the University of Melbourne respectively. These non-MW buildings were captured by a Zeb Revo RT handheld laser scanner (GeoSLAM, 2019) with a nominal accuracy of 3 cm. The real buildings contain a


high level of clutter, which occludes the building surfaces from the scanner. Each dataset has a ground truth 3D model, which is used for quality evaluation of the reconstructed model.

ISPRS Benchmark datasets: The ISPRS indoor modelling benchmark dataset (Khoshelham et al., 2017) contains five point clouds with ground truth models of indoor environments (namely, TUB1, TUB2, Fire Brigade, Uvigo, UoM) of various complexity, captured by different sensors, including a Leica TLS laser scanner (Leica, 2019b), the Zeb-Revo (GeoSLAM, 2019), the Zeb-1 sensor (Bosse et al., 2012), the UVigo backpack (Filgueira et al., 2016), and the Viametris iMS3D system (Viametris, 2019). The environments are typically large and complex public buildings. The point clouds of TUB1 and TUB2 were captured by the Viametris iMS3D system and the Zeb-Revo respectively. They both have nine rooms and a long corridor with a low level of clutter. The Fire Brigade building also contains nine rooms, bounded by curtain walls. This environment has a high level of clutter and was captured by a Leica TLS. Uvigo, containing a large hall and a long entrance, was captured by the UVigo backpack system. The mobile handheld sensor Zeb-1 was used for capturing the point cloud of the UoM building, which comprises seven rooms connected to a long corridor. Both UoM and Uvigo have a moderate level of clutter.


Table 6.1 Overview of the datasets

| Dataset | Sensor | Type | Number of points | Level of clutter | MW/non-MW | Ground truth |
|---|---|---|---|---|---|---|
| Floor-1 | - | Synthetic | 337,095 | None | MW | N |
| Floor-2 | - | Synthetic | 303,286 | None | MW | N |
| House-1 | Faro Focus TLS | Real | 3,642,155 | Moderate | MW | N |
| House-2 | Faro Focus TLS | Real | 2,642,840 | Moderate | MW | N |
| Office-1 | Faro Focus TLS | Real | 1,500,938 | Moderate | MW | N |
| TUB1 | Viametris iMS3D system | Real | 356,959 | Low | MW | Y (ISPRS) |
| TUB2 | Zeb-Revo | Real | 795,721 | Low | MW | Y (ISPRS) |
| Fire Brigade | Leica TLS | Real | 759,871 | High | MW | Y (ISPRS) |
| Uvigo | UVigo Backpack | Real | 699,132 | Moderate | MW | Y (ISPRS) |
| UoM | Zeb-1 | Real | 499,910 | Moderate | MW | Y (ISPRS) |
| SYN | - | Synthetic | 917,918 | None | non-MW | Y |
| Office-2 | Zeb Revo RT | Real | 988,290 | High | non-MW | Y |
| Museum | Zeb Revo RT | Real | 1,689,022 | High | non-MW | Y |

Figure 6.1 shows the point clouds of the datasets, while the available ground truth models of the ISPRS Benchmark dataset and the non-Manhattan world datasets are shown in Figure 6.2.


Figure 6.1 The point clouds of the datasets: (a) Floor-1; (b) Floor-2; (c) House-1; (d) House-2; (e) Office-1; (f) TUB1; (g) TUB2; (h) Fire Brigade; (i) Uvigo; (j) UoM; (k) SYN; (l) Office-2; (m) Museum.


Figure 6.2 The ground truth models of the datasets: (a) TUB1; (b) TUB2; (c) Fire Brigade; (d) Uvigo; (e) UoM; (f) SYN; (g) Office-2; (h) Museum.


6.2 Evaluation of the shape grammar for MW buildings

The shape grammar proposed in Chapter 3 for modelling Manhattan world buildings was implemented in the Matlab environment on a personal computer (Intel Core i7-7500U CPU, 2.7 GHz, 16.0 GB RAM). Several experiments were carried out using the point clouds in the MW datasets, including the two synthetic point clouds (i.e., Floor-1, Floor-2), the three real point clouds captured by ourselves (i.e., House-1, House-2, Office-1), and the five ISPRS benchmark datasets (i.e., TUB1, TUB2, Fire Brigade, Uvigo, and UoM). The experiments were designed to evaluate the performance of the shape grammar in modelling large and complex building interiors captured by different types of laser sensors and containing various levels of noise, clutter, and occluded surfaces.

Table 6.2 lists the parameters used in the grammar rules and the corresponding values that were experimentally found suitable for each dataset. The bin size is used in the placement rule for the computation of the histograms. In general, this value does not significantly impact the quality of the reconstructed model, and a value not larger than half of the maximum wall thickness is recommended. The maximum wall thickness is determined based on prior knowledge of the indoor environments. The shape buffer, which is the size of a buffer around a shape, should be larger than the noise of the data, but small enough (i.e., smaller than the maximum wall thickness) to differentiate between horizontal structures and between vertical structures. The maximum wall thickness, the shape buffer, and the average point spacing of the point cloud are used in the computation of the point-on-face index to determine the type of the shapes in the classification rule.
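The role of the histogram bin size in the placement rule can be illustrated with the following sketch (a simplification written for this explanation, not the thesis implementation): one coordinate of the point cloud is histogrammed with the chosen bin size, and bins holding a large share of the points indicate candidate wall positions perpendicular to that axis. The function name and the min_frac threshold are assumptions of this example.

```python
import numpy as np

def wall_candidates(coords, bin_size, min_frac=0.05):
    """Histogram a single coordinate of the point cloud; bins holding at
    least min_frac of all points suggest planar structures (e.g., walls)
    perpendicular to that axis, as in a histogram-based placement rule."""
    lo, hi = coords.min(), coords.max()
    nbins = max(1, int(np.ceil((hi - lo) / bin_size)))
    counts, edges = np.histogram(coords, bins=nbins, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[counts >= min_frac * coords.size]

# Two simulated walls at x = 0 m and x = 5 m plus uniform clutter.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 0.01, 1000),
                    rng.normal(5.0, 0.01, 1000),
                    rng.uniform(0.0, 5.0, 200)])
cands = wall_candidates(x, bin_size=0.1)  # bin size below half an assumed 0.25 m max wall thickness
print(cands)                              # candidate wall positions, near 0 and 5
```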


Table 6.2 Parameter settings used in the experiments.

| Dataset | Histogram bin size (m) | Max wall thickness (m) | Shape buffer (m) |
|---|---|---|---|
| Floor-1 | 0.05 | 0.5 | 0.05 |
| Floor-2 | 0.05 | 0.5 | 0.05 |
| House-1 | 0.03 | 0.25 | 0.2 |
| House-2 | 0.03 | 0.125 | 0.1 |
| Office-1 | 0.03 | 0.25 | 0.08 |
| TUB1 | 0.03 | 0.35 | 0.1 |
| TUB2 | 0.03 | 0.35 | 0.1 |
| Fire Brigade | 0.03 | 0.35 | 0.1 |
| UVigo | 0.03 | 0.35 | 0.1 |
| UoM | 0.03 | 0.35 | 0.1 |

6.2.1 Qualitative evaluation

6.2.1.1 3D reconstructed models

Figure 6.3 shows the models of the MW indoor environments reconstructed from the point clouds by applying the shape grammar. In addition to the terminal navigable spaces and walls, which are shown in the figure, the models also include external spaces and non-terminal shapes, which are excluded from the visualization.


Figure 6.3 The reconstructed models: (a) Floor-1; (b) Floor-2; (c) House-1; (d) House-2; (e) Office; (f) TUB1; (g) TUB2; (h) Fire Brigade; (i) Uvigo; (j) UoM. Blue shapes represent navigable spaces, and brown shapes represent walls.


Visual examination of the reconstructed models indicates that all navigable and non-navigable spaces (e.g., walls) are correctly reconstructed (external spaces are excluded from the visualization). Manual interaction was needed in a few places to correct reconstruction errors. These were mainly limited to correcting classification labels by interactively invoking the classification rule, which was done by simply clicking on a misclassified element and updating its type attribute.

To gain further insight into the quality of the reconstructed models, the models for which a ground truth is available (i.e., TUB1, TUB2, Fire Brigade, Uvigo, and UoM) will be quantitatively evaluated in terms of completeness, correctness, and accuracy using the proposed quality evaluation and geometric comparison approach (Chapter 5), while the models for which ground truths are not available (i.e., Floor-1, Floor-2, House-1, House-2, Office-1) will be further evaluated qualitatively, mainly based on visual inspection of their geometric reconstruction, semantic reconstruction, and establishment of topological relations, in the following subsections.

6.2.1.2 Geometric reconstruction

To facilitate the qualitative inspection of the geometric accuracy of the reconstructed models (i.e., Floor-1, Floor-2, House-1, House-2, Office-1), signed orthogonal distances between individual points and their corresponding model faces were computed following the definition of the point-to-plane distances in Chapter 4. Figure 6.4 shows the input point clouds colorized according to the point-model distances. In general, the reconstructed elements are mostly within 5-10 cm of their corresponding points. Larger distances can be seen on an open door in House-1 and on extruded windows in House-2, indicating that these elements are not reconstructed.


Figure 6.4 Point clouds colorized according to the signed point-model distances: (a) Floor-1; (b) Floor-2; (c) House-1; (d) House-2; (e) Office.

Table 6.3 presents statistics of the point-model distances. The median distances are smaller than 1 cm in all datasets, indicating that the model faces are reconstructed through the points. The median absolute distances also agree closely with the expected precision of the point clouds.


Table 6.3 Statistics of signed point-model distances.

Statistic                        Floor-1   Floor-2   House-1   House-2   Office
Median distance (cm)                -0.7      -0.3       0.3      -0.3     -0.7
Median absolute distance (cm)        3.3       3.1       1.3       1.6      1.3
Percentage in [-10 cm, 10 cm]       92.5      94        96.4      92       99.5

6.2.1.3 Semantic reconstruction

The performance of the proposed method was further evaluated in terms of semantic information: the reconstructed elements that were automatically assigned an incorrect type by the classification rule were manually inspected and analysed. Table 6.4 summarises the accuracy of the automatically assigned labels. The classification errors are mostly caused by varying point density, data noise, and incompleteness of the point clouds, which result in an unrealistic point-on-face index. The misclassified shapes, 9 out of 415, were corrected interactively by invoking the classification rule and reassigning the correct labels, simply by clicking a misclassified element and updating the label.
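To make the role of the point-on-face index concrete, the toy sketch below labels a cuboid from the fraction of each face supported by data points. The face names, thresholds, and the width test are hypothetical simplifications for illustration, not the classification rule used in the thesis:

```python
def classify_cuboid(support, width, max_wall_thickness=0.3, min_support=0.5):
    """Toy classification of a cuboid by point-on-face support.

    `support` maps a face name to its point-on-face index in [0, 1],
    i.e. the fraction of the face area covered by nearby data points.
    All names and thresholds here are illustrative assumptions.
    """
    thin = width <= max_wall_thickness
    if thin and support.get("side_a", 0) >= min_support and support.get("side_b", 0) >= min_support:
        return "wall"  # a thin cuboid with points on both large faces
    if support.get("floor", 0) >= min_support and support.get("ceiling", 0) >= min_support:
        return "interior"  # floor and ceiling supported by points
    # No points on ceiling/walls: treated as exterior space (cf. Section 6.2.1.3).
    return "exterior"

print(classify_cuboid({"floor": 0.9, "ceiling": 0.8}, width=4.0))  # interior
print(classify_cuboid({"side_a": 0.9, "side_b": 0.7}, width=0.2))  # wall
print(classify_cuboid({"floor": 0.9, "ceiling": 0.0}, width=4.0))  # exterior
```

The sketch also suggests why missing or noisy data lead to misclassification: gaps in the point cloud depress the point-on-face index, pushing an interior space below the support threshold.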

Table 6.4 Accuracy of semantic reconstruction.

Labels                                   Floor-1   Floor-2   House-1   House-2   Office
Internal space (assigned/true labels)      94/94     28/28     28/28     44/48    10/12
Wall (assigned/true labels)                49/49     21/21     18/18     14/17    18/18
External space (assigned/true labels)        0/0       0/0       9/9     47/47    26/26
Total (assigned/true labels)             143/143     49/49     55/55   105/112    54/56
Accuracy                                    100%      100%      100%       94%      96%

A useful property of the shape grammar is the prediction of spaces that are not captured in the point cloud. For example, if a room is not scanned, it is still likely that a cuboid will be placed based on data points capturing the other parts of the indoor environment, but since there are no points on the ceiling and walls, the cuboid will be classified as an exterior space. Using the interactive classification, these predicted interior spaces can easily be added to the model by simply reassigning the labels. Figure 6.5 illustrates an example, where a room in the office building is not scanned, yet it is modelled by cuboids initially classified as exterior space. Figure 6.5(b) shows that the room can easily be added to the model by interactive classification of the corresponding cuboids into an interior space and a wall, as highlighted in the red circle.


Figure 6.5 Prediction of spaces that are not captured in the point cloud: (a) the grammar places cuboids but misclassifies them as exterior space; (b) two cuboids are interactively reclassified as an interior space and a wall to add the room to the model as highlighted in the red circle.

6.2.1.4 Reconstruction of topological relations

The proposed shape grammar enables the automated establishment of topological relations (i.e., adjacency, connectivity, and containment) between elements of the reconstructed models. In this section, the reconstruction of the topological relations of several real datasets (i.e., Office-1, House-1, and House-2) is presented. A qualitative evaluation based on visual examination indicated that the topological relations are correctly reconstructed.


Figure 6.6 shows the reconstructed model of Office with the grid coordinates (i, j, k) of the cuboids, which are used for establishing the topological relations. The doors were inserted manually to connect the rooms with the long corridor, by simply clicking the corresponding wall element and updating the containment attribute. Room 1 and Room 2 share a common wall, which also contains a door. Note that a unique id, consistent with the grid coordinates, is generated for each cuboid. New shapes produced by the merge rule are also assigned a unique id sequentially.
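Deriving adjacency from the grid coordinates can be sketched as follows. This is a simplification that assumes one cuboid per grid cell, so two cuboids are adjacent (face-sharing) when their coordinates differ by 1 along exactly one axis; the coordinates below are illustrative:

```python
from itertools import combinations

def grid_adjacencies(cells):
    """Adjacency pairs for cuboids indexed by grid coordinates (i, j, k).

    Two cuboids are taken as adjacent when their grid coordinates differ
    by 1 along exactly one axis (a one-cuboid-per-cell simplification).
    """
    edges = []
    for a, b in combinations(sorted(cells), 2):
        diffs = [abs(x - y) for x, y in zip(a, b)]
        if sorted(diffs) == [0, 0, 1]:
            edges.append((a, b))
    return edges

cells = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 2, 0)]
print(grid_adjacencies(cells))
# (0,0,0) shares a face with (1,0,0) and with (0,1,0); (2,2,0) is isolated.
```

When cuboids are merged into larger spaces, the adjacency of a merged shape can be obtained as the union of the adjacencies of its subspaces, which is one reason the containment relations are maintained alongside the grid ids.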

Figure 6.6 The grid coordinates of the cuboids: (a) the i-j view; (b) the j-k view.

Figure 6.7 shows the reconstructed topology graphs for Office. The blue circles represent the containment relations between the final interior spaces (the rooms and the corridor) and their subspaces (cuboids before merging) shown by yellow circles. The shape ids 1, 65, 28, 54, 56 and 68 correspond to rooms 1, 2, 3, 4, 5 and the corridor in Figure 6.6. The adjacency and connectivity relations are represented by red dashed lines and black solid lines respectively. Figure 6.8 shows the connectivity relations between the interior spaces (rooms, corridors) of Office.



Figure 6.7 Topology graphs reconstructed for Office: (a) adjacency and containment relations; (b) connectivity and containment relations.

Figure 6.8 Connectivity relations (black solid lines) between the interior spaces of Office.

Figure 6.9 shows the reconstructed topology graphs and the connectivity relations between the interior spaces for House-1 and House-2.



Figure 6.9 Topology graphs and connectivity relations between the interior spaces reconstructed for House-1 (left column) and House-2 (right column): (a-b) adjacency and containment relations; (c-d) connectivity and containment relations; (e-f) connectivity relations between the interior spaces.


To demonstrate the potential of the connectivity relations for path planning, a shortest path algorithm (Dijkstra's algorithm) was implemented using the connectivity graphs. Figure 6.10 shows the resulting path between a start point S and a destination point D in Office. Using the containment relations, the computed path can be extracted at a subspace granularity or at a terminal space granularity. At subspace granularity the path is obtained as a sequence of cuboids: 1 → 2 → 3 → 10 → 17 → 24 → 31 → 38 → 39 → 40 → 41 → 42 → 49 → 56. At terminal space granularity, the path is obtained as a sequence of interior spaces: Room 1 → Corridor → Room 5.
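The two path granularities can be reproduced with a textbook Dijkstra implementation over the connectivity graph. The graph, edge costs, and containment map below are small illustrative stand-ins, not the Office data:

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path over an undirected connectivity graph.

    `graph` maps a node to {neighbour: edge_cost}; returns the node
    sequence from start to goal, or None if the goal is unreachable.
    """
    dist, prev = {start: 0.0}, {}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(queue, (nd, v))
    if goal not in dist:
        return None
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# A toy connectivity graph of cuboid ids (unit edge costs) and a
# hypothetical containment map from cuboids to terminal spaces.
graph = {1: {2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1, 10: 1}, 10: {3: 1, 17: 1}, 17: {10: 1}}
containment = {1: "Room 1", 2: "Room 1", 3: "Corridor", 10: "Corridor", 17: "Room 5"}

sub_path = dijkstra(graph, 1, 17)
terminal_path = [containment[c] for c in sub_path]
terminal_path = [s for i, s in enumerate(terminal_path) if i == 0 or s != terminal_path[i - 1]]
print(sub_path)       # [1, 2, 3, 10, 17]
print(terminal_path)  # ['Room 1', 'Corridor', 'Room 5']
```

Mapping the cuboid path through the containment relations and collapsing consecutive duplicates yields the terminal-space path, mirroring the two granularities described above.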

Figure 6.10 Path planning (Room 1 → Corridor → Room 5) by applying a shortest path algorithm to the connectivity graph of Office.

6.2.2 Quantitative evaluation

The models of MW building interiors reconstructed from the point clouds of the ISPRS Benchmark datasets (i.e., TUB1, TUB2, Fire Brigade, Uvigo, and UoM), for which ground truth models were available, were quantitatively evaluated in terms of three quality aspects (i.e., completeness, correctness, and accuracy) based on a comparison between the models (i.e., sources) and their corresponding ground truth 3D models (i.e., references). The comparison is between the visible surfaces of the reconstructed models (i.e., sources) and the visible surfaces of the corresponding references. The annotation of visible surfaces in the volumetric reconstructed models, which contain both visible surfaces and interpreted surfaces, can be done automatically using the point-on-face indices (see Chapter 3). Figure 6.11 shows the visible surfaces of the reconstructed volumetric models and the corresponding references with surfaces marked as either visible (light grey and yellow) or interpreted (dark grey) for the ISPRS Benchmark datasets.


Figure 6.11 The reference models (left) with surfaces marked as either visible (light grey and yellow) or interpreted (dark grey) and the visible surfaces of the source models (right): (a) TUB1; (b) TUB2; (c) Fire Brigade; (d) Uvigo; (e) UoM. The ceilings are removed from the visualization.


The quality of the sources was measured for the buffer size b and the cut-off distance r in the range [1 cm, 10 cm]. The threshold angle θ was set to 10 degrees. The reference models were sampled with a mean point density of 10 points per square centimetre for the measurement of accuracy.

Figure 6.12 shows the evaluation results for the wall surfaces of the five source models (i.e., TUB1, TUB2, Fire Brigade, Uvigo, and UoM) in terms of the three proposed quality metrics: completeness, correctness, and accuracy. Similar to the tendency seen in the evaluation results of the synthetic dataset in Chapter 5, the completeness and correctness of the walls in the source models tend to increase with increasing buffer size, and the accuracy tends to decrease with larger cut-off values. These variations, however, provide insight into the geometric quality of the models in terms of the three proposed quality aspects independently.
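The interplay between the buffer size b, the cut-off distance r, and the three metrics can be illustrated with a point-sampled toy example. This is a deliberate simplification of the surface-based metrics of Chapter 5: both reference and source are reduced to sampled 2D points, and the synthetic "walls" below are made up:

```python
import numpy as np

def nearest_dists(a, b):
    """For each point in `a`, the distance to the nearest point in `b`."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.min(np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2), axis=1)

def quality_metrics(ref_pts, src_pts, buffer_b, cutoff_r):
    """Point-sampled simplification of completeness, correctness, accuracy."""
    d_ref = nearest_dists(ref_pts, src_pts)   # reference -> source
    d_src = nearest_dists(src_pts, ref_pts)   # source -> reference
    completeness = float(np.mean(d_ref <= buffer_b))   # reference covered by source
    correctness = float(np.mean(d_src <= buffer_b))    # source supported by reference
    inliers = d_ref[d_ref <= cutoff_r]                 # accuracy only within cut-off
    accuracy = float(np.median(inliers)) if inliers.size else float("nan")
    return completeness, correctness, accuracy

# Reference wall sampled along a line; source wall offset by 2 cm and
# missing one half (values in metres).
ref = [[x, 0.0] for x in np.linspace(0.0, 1.0, 11)]
src = [[x, 0.02] for x in np.linspace(0.0, 0.5, 6)]
comp, corr, acc = quality_metrics(ref, src, buffer_b=0.05, cutoff_r=0.10)
print(comp, corr, acc)
```

Increasing b admits more matches, raising completeness and correctness, while increasing r admits less accurate matches into the accuracy statistic, which is exactly the trade-off visible in the quality curves of Figure 6.12.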

The graphs in Figures 6.12(a) and 6.12(b) show that the walls in the four models TUB1, TUB2, Fire Brigade, and UoM achieve relatively low completeness and correctness (MComp < 0.39, MCorr < 0.39) at small buffer sizes (e.g., b = 1 cm), which correspond to a high accuracy level (MAcc ≅ 0.5 cm) at a small cut-off distance (e.g., r = 1 cm) as shown in Figure 6.12(c). This reveals that only a small proportion of the source walls are reconstructed with high accuracy. The completeness and correctness of these source walls increase gradually for higher buffer sizes b, but the accuracy is lower at higher values of the cut-off distance r, as shown in Figure 6.12(c), since the computations also take into account the contribution of reconstructed elements that deviate in location from the reference. The walls of these models reach a completeness of about 90% (MComp ≅ 0.9 at b = 10 cm) with lower accuracy (MAcc > 1 cm at r = 10 cm). It can be verified by visual inspection that the majority of the reference elements are reconstructed in the source models. The walls of the three models TUB1, TUB2, and Fire Brigade achieve a moderate correctness of less than 80%. This is a consequence of the wall surfaces of the source models containing surface areas that correspond to surfaces of doors and windows in the references; these door and window surfaces are not included in the comparison with the wall surfaces of the sources.



Figure 6.12 Quantitative evaluation of the source models reconstructed from the ISPRS benchmark dataset: (a) Completeness; (b) Correctness; (c) Accuracy.


An examination of the quality curves reveals differences in the geometric quality of the source models. For example, in the case of the TUB2 model, the walls achieve a maximum completeness of 91% and a maximum correctness of 75% (b = 10 cm), which correspond to an accuracy of 2.1 cm (r = 10 cm). The walls in the UoM model reach a similar completeness of 92% and a better correctness of about 90% (b = 5 cm), with a higher accuracy (MAcc ≅ 1 cm, r = 5 cm). This indicates that the UoM model is reconstructed with better quality than TUB2. The completeness and correctness of the wall elements of all the models increase significantly at larger buffer sizes, except for the UVigo model. The UVigo model has a relatively high completeness (MComp ≅ 0.80), a moderate correctness (MCorr ≅ 0.72), and a high accuracy (MAcc ≅ 0.5 cm) even at small buffer sizes and relatively large cut-off distances. This indicates that the lower completeness values at smaller buffer sizes for some models are due to the low accuracy of the models. These results demonstrate that, instead of analysing the quality of each model with a single measure, which can be biased and does not provide a holistic understanding, analysing the variations of the quality measures over a range of buffer sizes and cut-off distances provides better insight into the geometric quality of the models.

Localization of modelling errors: The semantic-based comparison between the elements in the sources and the corresponding references facilitates the identification of missing and redundant elements by means of the completeness and correctness metrics. This makes the proposed evaluation approach applicable to change detection and the analysis of temporal variations of buildings. Figure 6.13(a) illustrates the localization of completeness errors, where the missing wall elements in the source model UoM are identified and marked (red colour) in the corresponding reference. The correctness errors, which indicate the additional wall elements in comparison with the reference, are identified in the surface-based source model UoM as shown in Figure 6.13(b). These additional elements represent the upper parts of the walls of the corridor and two small rooms, which were reconstructed as separate elements in the source model. Considering the reference UoM as a 3D model of the building captured in the past and the source as its current state, the missing and redundant walls illustrate changes between different stages of the building.



Figure 6.13 Localization of completeness and correctness errors of the UoM models: (a) missing walls (red colour) are located in the reference; (b) additional walls (red colour) are identified in the surface-based source.

The distances between the wall elements of the surface-based source UoM and the points sampled from the reference walls in the accuracy measure are colorized in Figure 6.14. The colorized map not only demonstrates that the missing elements (dark brown) have large distances to the source wall elements, exceeding the range of the cut-off distance r [1 cm, 10 cm], but also shows which parts of each wall in the reference are reconstructed with higher quality in the source.


Figure 6.14 Sampled point cloud of the reference walls of UoM colorized according to the point-surface distances in the accuracy measure.

6.2.3 Compatibility with geometric data exchange standards

The reconstructed models consist of navigable spaces and building elements in a hierarchical structure enabling their compatibility with standards for geometric data exchange (e.g., IFC). The navigable spaces and building elements are first converted into Wavefront obj format (Murray and VanRyper 1996) by interpreting the boundary representation {V, F}. The reconstructed building elements and navigable spaces in the obj format are then converted to corresponding elements of an IFC data model using the open-source software FreeCAD. The IFC model can be opened in different BIM viewers and design tools that facilitate further complex spatial analyses and information exchange. Figure 6.15 shows the IFC model of Office-1 along with its structure hierarchy and geometric information.
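The first conversion step, writing a boundary representation {V, F} to the Wavefront obj format, can be sketched as follows. This is a minimal writer for illustration only: it ignores normals, materials, and the element hierarchy, and the element name is a hypothetical placeholder:

```python
def write_obj(path, vertices, faces, name="element"):
    """Write a boundary representation {V, F} as a Wavefront OBJ object.

    `vertices` is a list of (x, y, z) tuples; `faces` lists 0-based
    vertex indices per face (OBJ indices are 1-based, hence the +1).
    """
    with open(path, "w") as f:
        f.write(f"o {name}\n")
        for x, y, z in vertices:
            f.write(f"v {x} {y} {z}\n")
        for face in faces:
            f.write("f " + " ".join(str(i + 1) for i in face) + "\n")

# A unit-square floor face as a minimal example.
V = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
F = [(0, 1, 2, 3)]
write_obj("floor.obj", V, F, name="floor")
print(open("floor.obj").read())
```

Each reconstructed element written as a named `o` object in this way can then be imported into FreeCAD and mapped to the corresponding IFC entity, as described above.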

Figure 6.15 IFC data model of Office-1 (middle) with its hierarchical model tree (left) and the geometry information (right) of a selected wall (green) (viewed by Solibri Model Viewer).


6.3 Evaluation of the shape grammar for non-MW buildings

The extended shape grammar approach for modelling non-MW building interiors was implemented in MATLAB and C++ on a personal computer (i7-7500U CPU, 2.7 GHz, with 16.0 GB memory). The CGAL C++ library was used in the cell decomposition process. Experiments were carried out using a synthetic point cloud and two real point clouds (i.e., SYN, Office-2, Museum) to evaluate the feasibility of our approach in the reconstruction of indoor environments which contain both Manhattan and non-Manhattan structures.

Table 6.5 lists the parameters used in the reconstruction process. The max wall thickness is defined based on prior knowledge of the indoor environments. Meanwhile, appropriate values for the convergence factor, normalization and smoothing coefficients (see Section 4.4) are found experimentally in relation to the quality of the input data (i.e., the level of noise, the completeness of data, and the level of clutter).

Table 6.5 Parameter settings used in the experiments.

Dataset   Max wall        Convergence   Normalization factors   Smoothing coefficient μ
          thickness (m)   factor β      k1     k2     k3        case 1   case 2   case 3
SYN       0.3             0             1/3    1/3    1/3       1        0.83     0.9
Office    0.3             0.2           1/3    1/3    1/3       1        0.83     0.9
Museum    0.4             0.1           1/3    1/3    1/3       1        0.83     0.9


6.3.1 Qualitative evaluation

6.3.1.1 3D reconstructed models

Figure 6.16 shows the cell decompositions and the reconstructed models of the synthetic environment SYN and the two real buildings Office-2 and Museum. The environments are partitioned into a set of arbitrary polyhedral cells. The final models contain not only the final interior spaces (i.e., rooms, corridors) and structural elements (i.e., walls), but also exterior spaces located within the scope of the environments. Due to the lack of data representing the exterior surfaces of walls, the exterior walls are not reconstructed in the Office-2 model. Meanwhile, parts of the exterior walls are present in the Museum model, based on interpretation from the data points captured in other parts of the building and from knowledge of building architecture, such as the max wall thickness and the topological relations of walls with interior spaces. Both exterior and interior walls are reconstructed in the SYN model, as the input data contain points representing them. Non-terminal shapes are also modelled as inactive elements in the final model, which allows the hierarchical structure of shapes to be queried later at different granularities. Akin to exterior spaces, these inactive elements are excluded from the visualization.

Visual examination of the reconstructed models indicates that all navigable spaces (i.e., rooms and corridors) and structural elements (e.g., walls) are correctly reconstructed. However, this qualitative evaluation provides a subjective and potentially biased view of the quality of the models. Quantitative evaluation of these models, based on the comparison between each model (i.e., SYN, Office-2, Museum) and the corresponding ground truth, was carried out to provide insights into their geometric quality (Section 6.3.2).



Figure 6.16 Reconstructed models, including the 3D cell decomposition (left) and the final models (right): (a) SYN; (b) Office-2; (c) Museum. Green elements indicate interior spaces (i.e., rooms, corridors) and red elements indicate walls.


6.3.1.2 Reconstruction of topological relations

In addition to the generation of semantically rich 3D models of indoor environments, our proposed approach is able to establish the topological relations, i.e., adjacency, connectivity, and containment, between elements of the models, owing to the ability of the shape grammar to maintain topological correctness. Figure 6.17 shows the reconstructed topology graphs for the Museum, which are extracted at the granularity of non-terminal spaces. Visual inspection indicates that the topological relations between indoor elements are correctly modelled. In the topology graph, each element is represented by its assigned unique id in the model. The blue circles represent the containment relations between the final interior spaces (the rooms and the corridor) and their subspaces (yellow circles). The relations between the final interior spaces and their adjacent walls (brown circles) are also established. The adjacency relations are represented by brown dashed lines in Figure 6.17(a). The doors are manually inserted into the walls with ids 83, 84, 124, and 152 (green circles). This can be done simply by interactively invoking the containment rules to establish the containment attribute of the walls. The connectivity relations established between both final spaces and sub-spaces are represented as black solid lines, as shown in Figure 6.17(b). Figure 6.18 shows the connectivity at the granularity of the final spaces (i.e., rooms, corridors) of the Museum.



Figure 6.17 Topological graphs between interior spaces of the Museum: (a) adjacency and containment relations; (b) connectivity and containment relations.


Figure 6.18 The connectivity relations (black solid line) between final spaces (i.e., rooms, corridors) of the Museum.

6.3.2 Quantitative evaluation

Since the volumetric walls are represented by both visible surfaces and interpreted surfaces (see Chapter 5), the evaluation is based on the comparison between the visible wall surfaces bounding interior spaces (i.e., rooms and corridors) in the reconstructed model and the visible surfaces of the structural elements (i.e., walls, ceilings, and floors) bounding the corresponding interior spaces of the ground truth. Otherwise, the inclusion of interpreted surfaces of the sources in the comparison could lead to low correctness of the models. The quality is measured in terms of three quality metrics, i.e., completeness (MComp), correctness (MCorr), and accuracy (MAcc), for the buffer size b and the cut-off distance r in the range [1 cm, 10 cm]. The evaluation results for the three reconstructed models are shown in detail in Figure 6.19.



Figure 6.19 Quality evaluation of the final models in terms of completeness (a), correctness (b), and accuracy (c).


As can be seen in Figure 6.19, the quality curves reveal that the models of both the synthetic environment and the real buildings are reconstructed with high completeness (MComp ≥ 95% at b = 10 cm) and correctness (MCorr ≥ 95% at b = 10 cm). It can be verified by visual inspection that the majority of the ground truth elements are present in the reconstructed models. The SYN model is reconstructed with high accuracy (MAcc < 0.5 cm), which agrees with its random error; the accuracy is measured as the median absolute distance between the surfaces bounding the interior spaces (i.e., rooms, corridors) of the final model and their corresponding visible surfaces in the ground truth. Meanwhile, the accuracy of the Office and Museum models is MAcc ≈ 2.62 cm and MAcc ≈ 2.47 cm respectively.

We further locate the completeness and correctness errors in order to identify the additional and missing elements in the reconstructed models in comparison with the ground truths. Figure 6.20 illustrates the localization of the completeness and correctness errors of the Museum. The errors indicate that the entrance of the Museum is not well reconstructed. In fact, two surfaces on the sides of the entrance were not generated in the surface extraction process; consequently, these parts of the building are not present in the final model. Most of the surfaces (in blue colour) of this building part have low completeness (MComp ≤ 0.2) and low correctness (MCorr ≤ 0.2). These surfaces are therefore identified as missing surfaces (blue) in the ground truth and as additional surfaces (blue) in the reconstructed model.



Figure 6.20 Localization of completeness and correctness errors of the Museum: (a) the completeness errors localized in the ground truth; (b) the correctness errors localized in the reconstructed model.

6.4 Evaluation of the method for building change detection

In this section, the feasibility of the method for building change detection is demonstrated based on comparison between a 3D indoor model and a point cloud of TUB1 from the ISPRS benchmark data. We assume that the 3D model and the point cloud represent the indoor environment TUB1 at different stages.

In this experiment, we compare the visible building elements (i.e., walls, ceilings, and floors) in the 3D model with the point cloud, which contains data points of not only the visible building elements but also open and closed doors, windows, and a low level of clutter. As mentioned in Section 6.1 (Dataset), the point cloud was captured by a Viametris iMS3D mobile scanning system with 3 cm data accuracy (Khoshelham et al., 2017). Thus, the cut-off threshold can be set at 10 cm (r = 10 cm), which accounts for the errors of the data, the registration of the 3D building model to the data, and the error in the reconstruction of the 3D model. Figure 6.21 shows the 3D model and the corresponding point cloud of the TUB1 dataset from the ISPRS benchmark on Indoor Modelling. The ceilings are removed for better visualization.


Figure 6.21 The ISPRS Benchmark dataset – TUB1: (a) the 3D model with the visible surfaces marked as light grey; (b) the point cloud.

The results of point classification for the comparison between the 3D model and the point cloud of the TUB1 building of the ISPRS benchmark dataset are shown in Figure 6.22. As can be seen in Figure 6.22(a), the data points belonging to building structures (i.e., walls, ceilings, and floors) are close to the surfaces of the model (≤ 10 cm), while the points representing doors and windows, which were not reconstructed in the 3D model, and the clutter (e.g., people, furniture) have larger point-surface distances. The data points are then classified based on the point-surface distance. The points are labelled as existing points (type = 0) if their point-surface distances are smaller than the cut-off threshold (r = 10 cm). Otherwise, the points are classified as new (type = 1) due to their larger point-surface distance, as shown in Figure 6.22(b).
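This classification step amounts to a single threshold on the point-surface distances, as the sketch below illustrates. The distance values are made up for illustration; in practice each value would be the distance from a data point to the nearest model surface:

```python
import numpy as np

def classify_points(dist_to_model, cutoff_r=0.10):
    """Label data points by their distance to the nearest model surface.

    type = 0 (existing) if a point lies within the cut-off threshold of
    the model, type = 1 (new) otherwise.
    """
    d = np.asarray(dist_to_model, dtype=float)
    return np.where(d <= cutoff_r, 0, 1)

# Hypothetical distances (m): points on a wall (small values), an
# unmodelled door leaf, and a piece of furniture (large values).
d = [0.01, 0.03, 0.08, 0.25, 0.60]
print(classify_points(d).tolist())  # [0, 0, 0, 1, 1]
```

Points labelled as new then indicate candidate changes: elements present in the scanned scene but missing from the 3D model.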


Figure 6.22 Comparison results of the TUB1 dataset: (a) the point cloud colorized according to point-surface distances; (b) the point cloud with the result of point classification.

Figure 6.23 presents the colorization of the surfaces of the 3D model according to the coverage, and the location of redundant and existing surfaces of the model with respect to the point cloud.


Figure 6.23 Comparison results of the TUB1 of the ISPRS benchmark dataset: (a) the 3D model with surfaces colorized according to the coverage MCov; (b) and (c) the 3D model with the location of redundant elements (blue) and existing surfaces (yellow) with MCov ≈ 0.2 and MCov ≈ 0.3, respectively.


Figure 6.23(a) shows that several surfaces of the 3D model are reconstructed with fewer supporting points than others. Users can identify the redundant surfaces in the model based on the surface coverage. Figures 6.23(b) and (c) demonstrate the ability of the proposed method to locate the redundant surfaces based on coverage thresholds of 0.2 (MCov > 0.2) and 0.3 (MCov > 0.3) respectively. The thresholds are set empirically. The surfaces which do not reach the coverage threshold are classified as redundant surfaces in the model. As can be seen in Figure 6.23(c), a surface with MCov ≈ 0.3 is wrongly classified as redundant. This is due to the presence of furniture in the building, which leads to gaps in the point cloud.
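The coverage-based identification of redundant surfaces reduces to a simple threshold test, as sketched below. The surface ids and coverage values are hypothetical, and surfaces below the threshold are flagged as redundant, matching the rule stated above:

```python
def redundant_surfaces(coverage, threshold=0.2):
    """Split model surfaces into existing and redundant by coverage.

    `coverage` maps a surface id to MCov, the fraction of the surface
    supported by nearby data points; surfaces that do not reach the
    threshold are flagged as redundant (thresholds are empirical,
    cf. Figure 6.23).
    """
    existing = [s for s, c in coverage.items() if c >= threshold]
    redundant = [s for s, c in coverage.items() if c < threshold]
    return existing, redundant

cov = {"wall_01": 0.85, "wall_02": 0.05, "wall_03": 0.28}
print(redundant_surfaces(cov, threshold=0.2))  # only wall_02 flagged as redundant
print(redundant_surfaces(cov, threshold=0.3))  # wall_03 now also flagged
```

The second call illustrates the misclassification risk discussed above: a stricter threshold flags a genuinely existing surface (here wall_03) whose coverage is depressed by occlusion, such as furniture blocking the scanner.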

6.5 Summary

In this chapter, the approaches for procedural reconstruction of indoor models from point clouds, and for change detection of indoor models, were evaluated. The feasibility of the proposed approach for quality evaluation of indoor models was also demonstrated.

The evaluation of the shape grammar approach for modelling indoor environments with Manhattan world structures showed that the approach can generate not only building elements, but also navigable spaces and topological relations between them, in a hierarchical volumetric representation compatible with geometric data exchange standards. The ability of the approach to predict building elements and interior spaces that are not captured in the point cloud was also demonstrated. Experiments with several point clouds of both synthetic and real indoor environments demonstrated the potential of the shape grammar approach for generating indoor models with high geometric quality and rich semantics as well as topological relations which can be used for path planning and navigation in indoor environments.

The evaluation of the extended shape grammar and the rjMCMC process demonstrated the potential of the approach for generating indoor models from incomplete and inaccurate data of both Manhattan and non-Manhattan world indoor environments, which contain a high level of clutter. The results showed that this approach is able to generate high-quality 3D semantically rich indoor models, which contain not only building elements (e.g., walls, ceilings, and floors), but also navigable spaces and the topological relations between them. The quality evaluation of the 3D models reconstructed by the shape grammar approaches demonstrated the potential of the proposed approach for geometric quality evaluation of 3D indoor models in terms of completeness, correctness, and accuracy, and its ability to localize errors, i.e., missing and redundant elements in the 3D models. This enables the proposed method to detect changes and analyse temporal variations of a building through a comparison between 3D models of a building interior created at different times.

The evaluation of the approach for building change detection through comparison between a 3D model and a point cloud of an indoor environment demonstrated that the proposed approach can detect the discrepancies and locate the new elements of the building represented by the point cloud, which are missing in the 3D model, as well as redundant elements in the model, which do not exist in the point cloud. This enables the proposed approach to detect changes of an indoor environment and update the 3D model.


Chapter 7 Conclusion and future work

The two main aims of this thesis were: (1) automated 3D reconstruction of indoor models, and (2) quality evaluation and change detection of indoor models. The automated reconstruction is based on procedural modelling of indoor environments from lidar data. The quality evaluation of indoor models mainly focused on geometric information through a comparison between the model and its ground truth reference. An approach for building change detection via a comparison between a 3D building model and a point cloud was also presented.

This chapter summarizes the contributions, the limitations, and finally the future directions of this study.

7.1 Contributions

The contributions of this thesis are twofold, as follows:

Automated 3D reconstruction of indoor models: Two approaches were developed for procedural reconstruction of 3D models of indoor environments from lidar data. In comparison with existing approaches, the proposed approaches made the following contributions:

▪ A shape grammar for modelling indoor environments with Manhattan world designs. The approach can generate 3D semantically rich models, which contain not only building elements (e.g., walls, ceilings, and floors), but also navigable spaces and topological relations (e.g., adjacency, connectivity, and containment) between them. The output models are represented as a volumetric model in a hierarchical structure compatible with geometric data exchange standards. Thus, they are useful for complex spatial analysis and manufacture, as well as enabling the exchange of information between BIM-based systems. The shape grammar also provides the ability to infer and predict interior spaces from incomplete and inaccurate data, which are common characteristics of point clouds capturing indoor environments.

▪ A procedural approach for the reconstruction of generic 3D indoor models based on the combination of an extended shape grammar and a data-driven process using a stochastic algorithm, i.e., the rjMCMC. The shape grammar is an extension of the grammar proposed for Manhattan world environments and is applicable to both Manhattan and non-Manhattan world indoor environments. The combination of the shape grammar and the data-driven process using rjMCMC with the Metropolis-Hastings sampling algorithm allows automated application of the grammar rules, which provides flexibility in modelling different indoor architectures. The stochastic algorithm also enables the integration of local data properties with the global plausibility and constraints of the model at its intermediate states, which enhances the robustness of the reconstruction method to the inherent noise and incompleteness of the data.

Quality evaluation and change detection of indoor models: The contributions of this study regarding the quality evaluation of indoor models and building change detection are as follows:

▪ An approach for the quality evaluation and geometric comparison of an indoor model based on comparison with a ground-truth reference model. The approach enables quantitative measurement of the geometric quality of 3D indoor models in terms of accuracy, completeness, and correctness, and can localise geometric errors in the reconstructed elements. Consequently, it facilitates change detection in indoor models and temporal analysis of building changes. The approach is applicable to both surface-based and volumetric models generated from different data sources (e.g., point clouds, images, and trajectories).

▪ An approach for building change detection based on the comparison between a 3D model (e.g., an as-designed model) and a point cloud. This approach can detect discrepancies and locate new building elements that are represented in the point cloud but missing from the 3D model, via a point classification step. It can also identify redundant elements in the 3D model, which do not exist in the point cloud, via the measurement of surface coverage.
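The three quality measures can be operationalised, in a simplified point-sampled form, roughly as follows. The tolerance value, the brute-force nearest-neighbour search, and the function names are illustrative choices, not those of the thesis.

```python
import math

def nearest_dist(p, cloud):
    """Brute-force nearest-neighbour distance (adequate for small samples)."""
    return min(math.dist(p, q) for q in cloud)

def evaluate(model_pts, reference_pts, tol=0.05):
    """Point-sampled quality metrics:
    completeness -- share of the reference captured by the model,
    correctness  -- share of the model supported by the reference,
    accuracy     -- RMS distance of supported model points to the reference."""
    d_ref = [nearest_dist(p, model_pts) for p in reference_pts]
    d_mod = [nearest_dist(p, reference_pts) for p in model_pts]
    completeness = sum(d <= tol for d in d_ref) / len(d_ref)
    correctness = sum(d <= tol for d in d_mod) / len(d_mod)
    supported = [d for d in d_mod if d <= tol]
    accuracy = (math.sqrt(sum(d * d for d in supported) / len(supported))
                if supported else float("nan"))
    return completeness, correctness, accuracy

# A redundant model element lowers correctness; a missed reference
# element lowers completeness.
model = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]       # (5, 5) is redundant
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # (2, 0) is missed
comp, corr, acc = evaluate(model, reference)
```

Because completeness and correctness are computed in opposite directions, the pair exposes both missing and redundant elements, mirroring the two change-detection cases described above.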

7.2 Limitations

Although the experiments and results show the potential of the proposed approaches for procedural 3D reconstruction, quality evaluation, and change detection of indoor models using point clouds, the approaches have several inherent limitations:

▪ The application of the reconstruction and change detection approaches is limited to environments with planar surfaces. The building structures are assumed to consist of horizontal or vertical surfaces, which are parallel to the horizontal and vertical directions up to a small angle. Consequently, the proposed approaches are not applicable to environments that contain curved surfaces or planar surfaces that are neither horizontal nor vertical.

▪ The current shape grammars can reconstruct spaces and structural elements only, resulting in models with a relatively low level of detail. Other elements, such as doors, are manually added to the model by simply updating the containment attribute of the other elements, without reconstructing their geometry.

▪ The rjMCMC process for modelling non-Manhattan-world building interiors is computationally expensive, as it involves the generation of a large number of candidate indoor models. This makes the reconstruction of large indoor environments a time-consuming process.

▪ The approach for the quality evaluation and geometric comparison of indoor models is not designed for the evaluation of defective and topologically inconsistent models. The approach is also unable to identify shape variations among corresponding building elements of the two compared models. Meanwhile, the comparison of a 3D indoor model with a point cloud cannot provide semantic information (e.g., clutter, building elements, utilities) about the detected changes, because such semantic information is not available in the point cloud of the building.
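The planarity assumption in the first limitation can be made concrete with a small orientation check on surface normals. The function and the 5-degree tolerance are illustrative assumptions, not values taken from the thesis.

```python
import math

def classify_surface(normal, tol_deg=5.0):
    """Classify a plane normal under the (near-)Manhattan assumption:
    'horizontal' surfaces (floors/ceilings) have near-vertical normals,
    'vertical' surfaces (walls) have near-horizontal normals; anything
    else falls outside the scope of the reconstruction approaches."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # Angle between the normal and the vertical (z) axis, in degrees.
    tilt = math.degrees(math.acos(min(1.0, abs(nz) / length)))
    if tilt <= tol_deg:
        return "horizontal"   # floor or ceiling
    if abs(tilt - 90.0) <= tol_deg:
        return "vertical"     # wall
    return "unsupported"      # e.g. a sloped ceiling or curved patch

# A wall tilted 2 degrees is still "vertical"; a 45-degree ramp is not.
```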


7.3 Future work

To overcome the limitations of this study, the following suggestions for future research are put forward:

▪ Extension of the indoor grammar to buildings with non-planar elements (e.g., beams and curved walls). Investigation into extending the shape grammar to allow geometric reconstruction of details such as doors, windows, and stairs is also necessary. Improving the time efficiency of the shape grammar approaches for near real-time generation of as-built indoor models is a further topic for future research.

▪ An approach to validate a 3D indoor model as well as to detect and localise topological inconsistencies in it. Validation of an indoor model is necessary to ensure the conformity of the model before evaluating its quality. In addition to comparing the geometric quality of 3D indoor models, identifying shape variations (e.g., taller or shorter walls) among corresponding elements of the models can be useful for identifying systematic errors of each modelling approach.

▪ Adding semantic information to the changes detected through comparison between a 3D building model and a point cloud. Also, taking advantage of the detected changes to update the existing 3D model can result in higher efficiency in generating up-to-date 3D models of indoor environments.
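As a sketch of the suggested shape-variation analysis, corresponding elements matched by identifier could be compared as follows. The height-per-wall representation, the identifiers, and the tolerance are hypothetical; the thesis does not prescribe this scheme.

```python
def shape_variation(ref_heights, test_heights, tol=0.02):
    """Compare heights (in metres) of corresponding wall elements, matched
    by identifier, and flag per-element variations. The mean difference
    hints at a systematic error of the modelling approach."""
    diffs = {k: test_heights[k] - ref_heights[k]
             for k in ref_heights if k in test_heights}
    bias = sum(diffs.values()) / len(diffs)  # systematic offset across elements
    flags = {k: ("taller" if d > tol else "shorter" if d < -tol else "same")
             for k, d in diffs.items()}
    return bias, flags

# Walls in a reconstructed model that are consistently taller than the
# reference suggest a systematic modelling error rather than random noise.
bias, flags = shape_variation({"w1": 2.70, "w2": 2.70},
                              {"w1": 2.75, "w2": 2.74})
```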



Minerva Access is the Institutional Repository of The University of Melbourne

Author/s: Tran, Ha Thi Thu

Title: Procedural 3D reconstruction and quality evaluation of indoor models

Date: 2019

Persistent Link: http://hdl.handle.net/11343/233726

File Description: Final thesis file

Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.