Making sense of building data and building product data

Pieter Pauwels
Dept. of Architecture and Urban Planning
Ghent University
J. Plateaustraat 22
B-9000 Ghent, Belgium
[email protected]

Thomas Krijnen, Jakob Beetz
Dept. Built Environment
Eindhoven University of Technology
P.O. Box 513
NL-5600 MB Eindhoven, The Netherlands
[email protected]; [email protected]

Abstract

The architectural design and construction domain works with massive amounts of data (building data, engineering data, product manufacturer data, geographical data, regulation data) on a daily basis. More and more of this data is being handled using semantic web technologies. This position paper documents existing initiatives, focusing on the Industry Foundation Classes (IFC) ISO standard and the buildingSMART Data Dictionary (bSDD), and outlines how a multilingual lexicalized semantic network like BabelNet can make a useful contribution to this particular domain.

1 Introduction

Building data is modelled by many stakeholders involved in the building process, including architects, engineers, contractors and owners. Capturing the unambiguous meaning of the many concepts handled in the construction industry is one of the longest-standing challenges in this domain. The construction of many buildings using IT tools can be compared to the construction of the Tower of Babel, in which the building 'fails' because those working on it can no longer communicate properly [1]. With the advent of information technologies, this is typically also referred to as an interoperability problem.

This interoperability challenge has been addressed for decades through considerable efforts to produce a standardised data exchange format, the Industry Foundation Classes (IFC) (Liebich et al., 2013), which is modelled in the EXPRESS information language (ISO, 2004). IFC is standardised by buildingSMART International (bSI) and the International Organization for Standardization (ISO). IFC allows a building to be described semantically as a digital building model, including element types (walls, windows, spaces), complex 3D geometry, custom property sets and much more. Capturing such information in digital format is generally referred to as Building Information Modelling (BIM) (Eastman et al., 2008).

In addition to a core data exchange model, bSI produced the buildingSMART Data Dictionary [2] (bSDD), which can be considered a hierarchically structured encyclopedia of the different terms and concepts that are available in the international building product manufacturing market. This encyclopedia is instantiated by a concept repository following the ISO 12006 guidelines (ISO 12006, 2005), with an API that allows (1) creating multiple dictionaries, ontologies and other content and (2) mapping content in these ontologies and dictionaries. The bSDD is multilingual and contains tens of thousands of concepts and relationships representing international building classifications and codes.

Both the IFC data model and the bSDD are being made available as RDF graphs within bSI, allowing their use outside the restricted and closed construction industry domain. In this position paper, we give a brief overview of these efforts. We furthermore outline how the available sources can be enriched with links to the BabelNet [3] data, finishing with an outline of how this can benefit the construction industry expert.

[1] see also http://constructioncode.blogspot.be/2012/07/end-of-babel-ifc-promotional-video.html
[2] http://bsdd.buildingsmart.org/
[3] http://babelnet.org/

2 buildingSMART and semantic web technologies

The construction domain is now looking into the use of semantic web technologies to enable a decentralised building data management approach and to link building data more effortlessly with data in other domains (product manufacturer data, geographical data, regulation data). This has evolved into the W3C Linked Building Data Community Group [4] and the buildingSMART Linked Data Working Group [5] (LDWG). One of the results of these efforts is a conversion from EXPRESS to OWL (TBox) and from IFC (an EXPRESS schema) to a corresponding RDF graph (ABox) (Pauwels and Terkaj, 2016; Beetz et al., 2013). As a result, BIM models can now be made available as RDF graphs that comply with the ifcOWL ontology [6]. For reference, the ifcOWL ontology contains 1230 OWL classes, 1578 object properties, and 1627 individuals (Pauwels and Terkaj, 2016). A sample repository with open ifcOWL-compliant RDF graphs is available [7], but most sample RDF data is not publicly available.

The construction sector also works intensively on the buildingSMART Data Dictionary (bSDD [8]), which can be considered a hierarchically structured encyclopedia of the different element types that are used through classification systems in the international building product manufacturing market. This multilingual data dictionary is designed as a thesaurus for representing classifications in parallel in a common repository. It not only makes it possible to describe more precisely what types of building products are made available by manufacturers, but also makes the exchange and (multilingual) interpretation of these data easier. Efforts are also underway to make the bSDD data available as an RDF graph (Beetz, 2014). A sample bSDD dataset is temporarily available [9], containing a total of 986161 triples. However, there is no standard procedure for generating the RDF graphs from the bSDD API, nor are there any links with external vocabularies (like ifcOWL or BabelNet).

One of the lexical concepts available in the bSDD, namely the Calcium silicate board concept [10], is reproduced in Listing 1. As can be seen in this example, the concept contains a definition in English (@en) and Norwegian (@nb-no), it has a GUID that is maintained in the URI of the concept, and version information is included. Furthermore, this concept is typed as a 'subject'. Other concept types used are property, bag, document, classification, measure, unit, value, nest, activity, and so forth. This information is now explicitly contained as strings in the RDF graph, but it could clearly be represented in a multitude of alternative RDF graph representations. For example, properties could be listed as object and data properties if they are available for a particular subject, which is the case for the Calcium silicate board concept.

    a owl:Class ;
    rdfs:comment "Bygningsplate basert på kalsiumsilikat (sement, kisel og kalk), med armering av cellulosefiber. Platene fremstilles ved autoklavherding. Benyttes innendørs i miljøer hvor det stilles krav til fuktbestandighet og brannbeskyttelse."@nb-no ;
    rdfs:comment "Building board based on calcium silicate (cement, silica and mortar), with reinforcement of cellulose filament. The boards being produced by autoclave curing. Being used in environments with demands to moisture resistance and fire protection."@en ;
    rdfs:label "Calcium silicate board"@en ,
               "Kalsiumsilikatplate"@nb-no ;
    :conceptType "SUBJECT" ;
    :guid "3MyXi0NvmHt00000PR1IRl" ;
    :status "DRAFT" ;
    :versionDate "2007.09.10" ;
    :versionId "1 2007.09.10" .

Listing 1: RDF graph for the Calcium silicate board concept in the bSDD.

[4] http://w3.org/community/lbd/
[5] http://buildingsmart-tech.org/future/linked-data
[6] https://w3id.org/ifc/IFC4_ADD1#
[7] http://smartlab1.elis.ugent.be:8889/IFC-repo/
[8] http://bsdd.buildingsmart.org/
[9] http://bw-dssv16.bwk.tue.nl:8080/openrdf-workbench/repositories/bsdd/summary
[10] http://bsdd.buildingsmart.org/#concept/details/3MyXi0NvmHt00000PR1IRl

3 Combining IFC and bSDD with BabelNet

The bSDD contains lexical concepts that are made available from within the construction industry sector only. By providing the data as RDF graphs, however, the data can easily be enriched with lexical data. As a first step, a number of vocabularies have been semi-automatically pre-aligned (Shvaiko and Euzenat, 2013) in the context of the FP7 DURAARK project. In a second step, these pre-alignment relations between concepts in different vocabularies can be reinforced by experts or crowds. To facilitate this, a user interface has been created [11], of which a screenshot is shown in Fig. 1.

[11] http://bw-dssv19.bwk.tue.nl/interlink/

Figure 1: Screenshot of a dedicated concept mapping web interface for the bSDD.
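
The two-step alignment described above (automatic pre-alignment, then confirmation by experts or crowds) could be sketched roughly as follows. This is an illustration only, not the DURAARK procedure: the label-similarity heuristic based on Python's `difflib`, the 0.8 threshold, the `Candidate` type, the function names, the use of `skos:closeMatch` as the link predicate, and all URIs are assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Candidate:
    """A proposed link between a bSDD concept and an external concept."""
    source: str   # URI of the bSDD concept (placeholder URIs below)
    target: str   # URI of the external concept
    score: float  # label similarity in [0, 1]


def pre_align(source_labels, target_labels, threshold=0.8):
    """Step 1 (assumed heuristic): propose links between similar labels."""
    candidates = []
    for s_uri, s_label in source_labels.items():
        for t_uri, t_label in target_labels.items():
            score = SequenceMatcher(None, s_label.lower(), t_label.lower()).ratio()
            if score >= threshold:
                candidates.append(Candidate(s_uri, t_uri, score))
    # Highest-scoring proposals first, so reviewers see the best ones on top.
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def confirm(candidates, accepted_pairs):
    """Step 2: keep only proposals a reviewer accepted, as link triples."""
    return [
        (c.source, "skos:closeMatch", c.target)
        for c in candidates
        if (c.source, c.target) in accepted_pairs
    ]


# Hypothetical concept URIs and English labels, for illustration only.
bsdd = {
    "http://example.org/bsdd/calcium-silicate-board": "Calcium silicate board",
    "http://example.org/bsdd/sand": "Sand",
}
external = {
    "http://example.org/ext/sand": "sand",
    "http://example.org/ext/sandstone": "sandstone",
}

candidates = pre_align(bsdd, external)
links = confirm(
    candidates,
    {("http://example.org/bsdd/sand", "http://example.org/ext/sand")},
)
```

Keeping the proposal step separate from the confirmation step mirrors the division of labour in the text: the automatic pass only suggests links, and nothing enters the shared graph until a person has accepted it.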

This interface presents the user with a concept in the bSDD and a concept in an external vocabulary. The external vocabularies include the Getty Art & Architecture Thesaurus (AAT) [12], WordNet [13], the Getty Thesaurus of Geographic Names (TGN) [14], and DBpedia [15]. The interface then allows the user to specify the link between the two presented concepts; in the case of Fig. 1, these are the concept for Sand in the bSDD and in the Getty AAT.

As a result, the concepts of these diverse thesauri can be combined into a global semantic network of concepts. The same approach could also be followed for enriching the bSDD with lexical concepts available in BabelNet. Such a considerably enriched multilingual lexicalized semantic network, which includes bSDD, AAT, WordNet, BabelNet, DBpedia, and TGN, can be highly useful for construction industry stakeholders in the sense that it helps construction domain specialists get used to semantically structuring their data using state-of-the-art semantic technologies. It also allows extending initial data sources specific to the construction industry (bSDD as well as ifcOWL) with multilingual data from third-party vocabularies (e.g. BabelNet).

[12] http://vocab.getty.edu/aat/
[13] http://www.w3.org/2006/03/wn/wn20/instances/
[14] http://vocab.getty.edu/tgn/
[15] http://dbpedia.org/resource/

4 Conclusions

In this position paper, we made a case for:

1. making building data available as structured RDF graphs,
2. allowing links to externally available structured vocabularies such as BabelNet.

This case was presented for the IFC schema and the bSDD data dictionary. An interface was also presented that allows links to be created interactively from concepts in the bSDD to concepts in various outside schemas (AAT, TGN, DBpedia, WordNet). Making this effort can result in a global semantic network of concepts, thus considerably enlarging the set of concepts and descriptions that is currently available in the bSDD. This semantic network of concepts can then be used in addition to existing BIM tools and services to classify building elements and exchange well-defined information between construction industry stakeholders.

Of course, this is but an initial outline of what could be realised. Further research is needed, in particular on:

1. optimising and finalising the conversion from the information in the bSDD to a usable RDF graph,
2. linking concepts in the bSDD graph to outside concepts, and
3. designing an optimal strategy to maintain links, authority and ownership (see the four strategies presented in Beetz, 2014).

Acknowledgments

For their support, the authors would like to thank the Special Research Fund (BOF) of Ghent University, and the Eindhoven University of Technology.

References

T. Liebich, Y. Adachi, J. Forester, J. Hyvarinen, S. Richter, T. Chipman, M. Weise, and J. Wix. 2013. Industry Foundation Classes IFC4 Official Release. Available online: http://www.buildingsmart-tech.org/ifc/IFC4/final//index.htm.

Anders Ekholm. 2005. ISO 12006-2 and IFC: Prerequisites for Coordination of Standards for Classification and Interoperability. ITcon, 10:275–289.

Jakob Beetz, Jos Van Leeuwen, and Bauke de Vries. 2009. IfcOWL: a case of transforming EXPRESS schemas into ontologies. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 23(1):89–101.

Pieter Pauwels and Walter Terkaj. 2016. EXPRESS to OWL for construction industry: Towards a recommendable and usable ifcOWL ontology. Automation in Construction, 63:100–133.

ISO International Organization for Standardization. 2006. ISO 10303-11: Industrial automation systems and integration - Product data representation and exchange - Part 11: Description methods: The EXPRESS language reference manual. Available online: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38047.

ISO International Organization for Standardization. 2005. ISO 12006-3:2006 Building construction - Organization of information about construction works - Part 3: Framework for object-oriented information.

Charles M. Eastman, Paul Teicholz, Rafael Sacks, and Kathleen Liston. 2008. BIM handbook: a guide to building information modeling for owners, managers, architects, engineers, contractors, and fabricators. John Wiley & Sons, Hoboken, NJ, USA.

Jakob Beetz. 2014. A scalable network of concept libraries using distributed graph databases. Proceedings of the 2014 International Conference on Computing in Civil and Building Engineering, Orlando, FL, pp. 569–576.

P. Shvaiko and J. Euzenat. 2013. Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowledge and Data Engineering, 25(1):158–176.