Mobile Augmented Reality for Semantic 3D Models - A Smartphone-Based Approach with CityGML


Mobile Augmented Reality for Semantic 3D Models - A Smartphone-based Approach with CityGML

Christoph Henning Blut

Publication of the Geodetic Institute of RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen), Mies-van-der-Rohe-Straße 1, 52074 Aachen, No. 70, 2019, ISSN 0515-0574

Dissertation approved by the Faculty of Civil Engineering of RWTH Aachen University for the award of the academic degree of Doktor der Ingenieurwissenschaften, submitted by Christoph Henning Blut.

Reviewers: Univ.-Prof. Dr.-Ing. Jörg Blankenbach, Univ.-Prof. Dr.-Ing. habil. Christoph van Treeck
Date of the oral examination: 24.05.2019

This dissertation is available online on the websites of the university library.

Acknowledgments

This thesis was written during my employment as a research associate at the Geodetic Institute and Chair for Computing in Civil Engineering & Geoinformation Systems of RWTH Aachen University. First and foremost, I would like to express my sincere gratitude to my supervisor Univ.-Prof. Dr.-Ing. Jörg Blankenbach for his excellent support, the scientific freedom he gave me and the inspirational suggestions that helped me succeed in my work. I would also like to thank Univ.-Prof. Dr.-Ing. habil. Christoph van Treeck for his interest in my work and his willingness to take over the second appraisal. Many thanks go to my colleagues for their valuable ideas on my research and the fun after-work activities that will be remembered. Last but not least, I am grateful to my family for the support and motivation they gave me.
A very special thank you goes to my brother Timothy Blut for the inspiring, extensive discussions about my work throughout this journey, sometimes until late into the night, that were very helpful in finding great new ideas.

Aachen, June 2019
Christoph Henning Blut

Abstract

The increasing popularity of smartphones over the past 10 years has drastically propelled mobile technology forward, enabling innovative applications and experiences, for example in the form of mobile virtual reality (VR) and mobile augmented reality (AR). While early mobile AR systems were constructed from multiple large and costly external components carried in bulky and heavy backpacks, today low-cost off-the-shelf mobile devices, such as smartphones, are sufficient, since they provide all the necessary technology out of the box. However, realizing highly accurate and performant systems on such devices poses a challenge, since the inexpensive parts (e.g. sensors) are often prone to inaccuracies. Many AR systems are developed for entertainment purposes, but mobile AR also has potentially beneficial applications in more serious fields, such as archaeology, education, medicine and the military. For civil engineering and city planning, mobile AR is also promising, as it could enhance typical workflows and planning processes. A real-life example application is the visualization of planned building parts, to simplify planning processes and to optimize the communication between the participating decision makers. In this thesis, a concept for a mobile AR system aimed at the mentioned scenarios is presented, implemented and evaluated. For this, on the one hand a suitable mobile AR system and on the other hand appropriate data are necessary. A problem is that digital 3D building data typically lacks the required spatial referencing and important additional information, such as semantics or topology.
Some exceptions can be found in the construction sector and in the geographic information domain with the IFC and CityGML formats. While the focus of IFC lies primarily on individual, highly detailed building models, CityGML emphasizes more general, less detailed models in a broader context, thus enabling city- and room-scale visualizations. A proof-of-concept system was realized on an Android-based smartphone using CityGML models. It is fully self-sufficient and operates without external infrastructure. To process the CityGML data, a mobile data processing unit, consisting of a SpatiaLite database, a data importer and a data selection method, was implemented. The importer is based on an XML Pull parser which reads CityGML 1.0 and CityGML 2.0 data and writes it into the SpatiaLite-based CityGML database, which is modelled according to the CityGML schema. The selection algorithm efficiently filters the data that is relevant to the user at their current location from the entirety of the data in the database. To visualize the data and make the information of each object accessible, a customized rendering solution was implemented that aims at preserving the object information while maximizing the rendering performance. To prepare the geometry data for rendering, a customized polygon triangulation algorithm based on the ear-clipping method was implemented. To superimpose the physical objects with these virtual elements, a fine-grained (indoor) pose tracking system was implemented, using a combination of image-based and inertial measurement unit (IMU)-based methods. The IMU is used to determine initial coarse pose estimates, which are then optimized by the CityGML model-based optical pose estimation methods. For this, a 2D image-based door detector and a 3D corner extraction method that returns accurate corners of the door were implemented. These corners are then used for the pose estimation.
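The pull-parsing importer idea can be sketched as follows. This is a minimal illustration, not the thesis implementation: the element names, table schema and sample data are made up for the example, whereas the real importer maps the full CityGML 1.0/2.0 schema into a SpatiaLite database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Toy stand-in for a CityGML file (hypothetical, namespace-free elements).
CITYGML = """<CityModel>
  <Building id="b1"><name>Town hall</name></Building>
  <Building id="b2"><name>Library</name></Building>
</CityModel>"""

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE building (gml_id TEXT PRIMARY KEY, name TEXT)")

# A pull parser emits events as data is fed in, so large files can be
# streamed chunk by chunk instead of being loaded as one DOM tree.
parser = ET.XMLPullParser(events=("end",))
parser.feed(CITYGML)            # in practice: fed piecewise from a file
for _, elem in parser.read_events():
    if elem.tag == "Building":  # react to each fully parsed building
        db.execute("INSERT INTO building VALUES (?, ?)",
                   (elem.get("id"), elem.findtext("name")))
        elem.clear()            # drop processed subtree to bound memory

rows = db.execute("SELECT gml_id, name FROM building ORDER BY gml_id").fetchall()
print(rows)  # [('b1', 'Town hall'), ('b2', 'Library')]
```

The key property used here is that each object becomes available, and can be discarded, as soon as its closing tag is seen, which keeps memory usage flat on a smartphone.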
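The location-based selection can be pictured as a window query around the user's position. Again a simplified sketch with invented table names and coordinates: SpatiaLite would answer this with a true spatial (R-tree) index rather than the plain-SQL envelope comparison shown here.

```python
import sqlite3

# Hypothetical envelope table: one 2D bounding box per city object.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE envelope "
           "(gml_id TEXT, minx REAL, miny REAL, maxx REAL, maxy REAL)")
db.executemany("INSERT INTO envelope VALUES (?, ?, ?, ?, ?)", [
    ("near_building",  10,  10,  20,  20),
    ("far_building",  500, 500, 520, 520),
])

def select_near(db, x, y, radius):
    """Ids of objects whose envelope intersects a square query window
    of half-width `radius` centred on the user position (x, y)."""
    return [r[0] for r in db.execute(
        "SELECT gml_id FROM envelope "
        "WHERE maxx >= ? AND minx <= ? AND maxy >= ? AND miny <= ?",
        (x - radius, x + radius, y - radius, y + radius))]

print(select_near(db, 0, 0, 50))  # ['near_building']
```

Filtering in the database like this means only the handful of objects around the user is ever exported, parsed into geometry and handed to the renderer.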
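The triangulation step can be illustrated with a bare-bones ear-clipping routine. This is a textbook sketch for simple counter-clockwise polygons, without the degenerate-case handling (holes, collinear runs, numerical tolerances) a production implementation such as the one in the thesis would need.

```python
def cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def point_in_triangle(p, a, b, c):
    """True if p lies strictly inside counter-clockwise triangle abc."""
    return cross(a, b, p) > 0 and cross(b, c, p) > 0 and cross(c, a, p) > 0

def ear_clip(poly):
    """Triangulate a simple, counter-clockwise polygon by ear clipping."""
    verts = list(poly)
    triangles = []
    while len(verts) > 3:
        for i in range(len(verts)):
            a, b, c = verts[i-1], verts[i], verts[(i+1) % len(verts)]
            if cross(a, b, c) <= 0:           # reflex vertex: not an ear
                continue
            if any(point_in_triangle(p, a, b, c)
                   for p in verts if p not in (a, b, c)):
                continue                      # another vertex inside: not an ear
            triangles.append((a, b, c))       # clip the ear ...
            del verts[i]                      # ... and shrink the polygon
            break
    triangles.append(tuple(verts))
    return triangles

# Concave quadrilateral: four vertices yield two triangles.
print(ear_clip([(0, 0), (4, 0), (4, 4), (2, 1)]))
```

Ear clipping runs in O(n²) for n vertices, which is acceptable here because building facades and rooms in CityGML models are typically small polygons.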
Lastly, the mobile CityGML AR system was evaluated in terms of data processing and visualization performance and the accuracy and stability of the pose tracking solution. The results show that off-the-shelf low-cost mobile devices, such as smartphones, are sufficient to realize a fully-fledged, self-sufficient, location-based mobile AR system that qualifies for numerous AR scenarios, like the one described earlier.