Semantic Mapping in ROS XAVIER GALLART DEL BURGO
Master’s Thesis at CVAP/CAS, The Royal Institute of Technology, Stockholm, Sweden
Supervisor: Dr. Andrzej Pronobis
Examiner: Prof. Danica Kragic
August, 2013

Abstract

In the last few years, robots have become more popular in our daily lives. We can see them guiding people in museums, helping surgeons in hospitals and autonomously cleaning houses. To enable robots to cooperate with humans and to perform human-like tasks, we need to provide them with the capability of understanding human environments and representing the extracted knowledge in a way that humans can interpret.

Semantic mapping can be defined as the process of building a representation of the environment, incorporating semantic knowledge obtained from sensory information. Semantic properties can be extracted from various sources such as objects, topology of the environment, size and shape of rooms and room appearance.

This thesis proposes an implementation of semantic mapping for mobile robots which is integrated in a framework called Robot Operating System (ROS). The system extracts spatial properties like rooms, objects and topological information and combines them with common-sense knowledge into a probabilistic framework which is capable of inferring room categories. The system is tested in simulations and in real-world scenarios and the results show how the system explores an unknown environment, creates an accurate map, detects objects, infers room categories and represents the results in a map where each room is labelled according to its functionality.

Acknowledgements

First and foremost, I would like to express my gratitude to my supervisor, Andrzej Pronobis, for giving me the opportunity to develop my thesis in one of my passions, robots. Thank you for your guidance, your advice, your supervision, your patience and for introducing me to robotics research.
There are many people who have made my time in Sweden rich in enjoyable moments. I would like to thank all the people of the Computer Vision and Active Perception Lab for their kindness and hospitality, which made my time at KTH feel like home. Special thanks go to the “innebandy crew” for showing me this sport, teaching me and for all the great moments playing every Wednesday. To my Swedish family, Nina, Jessica and Nichlas, for helping me from the first day in Stockholm. Thank you for all the amazing things you have shown me and for being so kind and generous.

It was a pleasure to share my time in Stockholm with my neighbours, the “Kista family”. I will not forget the moments we spent together sharing dinners, watching movies, skating, going out, travelling, etc.

I would also like to thank all the people who have helped me during my studies in Barcelona, especially Gerard, Henar and Javi, for all the moments we have spent in class, in the laboratory, in the library and at the bar. All these years would not have been the same without you.

Last but not least, I want to thank my family; my parents and brothers for understanding all my decisions and giving me the chance to study abroad. Thank you for all your love and support.

Finally, I want to thank Júlia, for always being by my side. For your constant love in good and bad moments, for your patience during these months and for always helping me when I needed it most.

Xavier Gallart del Burgo
Stockholm, August 2013

List of Acronyms

ROS    Robot Operating System
SLAM   Simultaneous Localization and Mapping
HOG    Histogram of Oriented Gradients
SURF   Speeded Up Robust Features
SVM    Support Vector Machine
DGM    Directed Graphical Model
UGM    Undirected Graphical Model
PDGM   Partially Directed Graphical Model
DAG    Directed Acyclic Graph
CVAP   Computer Vision and Active Perception Laboratory

Contents

1 Introduction
  1.1 Related Work
  1.2 Contribution of the Thesis
  1.3 Outline
2 Background
  2.1 Mapping
  2.2 Object Detection
    2.2.1 Speeded Up Robust Features (SURF)
    2.2.2 Histogram of Oriented Gradients (HOG)
    2.2.3 Discriminatively Trained Deformable Part Models
  2.3 Probabilistic Graphical Models
    2.3.1 Factor Graph
3 Robot Operating System (ROS)
  3.1 ROS Concepts
  3.2 ROS Tools
  3.3 ROS Packages
    3.3.1 Coordinate Frames
    3.3.2 ROSARIA
    3.3.3 Gmapping
    3.3.4 RoboEarth
    3.3.5 OpenCV
4 Semantic Mapping Package for ROS
  4.1 Metric and Topological Mapping
  4.2 Object Detection
    4.2.1 RoboEarth
    4.2.2 Latent SVM
  4.3 Conceptual Mapping and Reasoning
5 Experiments
  5.1 Experimental Setup
    5.1.1 COLD Database
  5.2 Evaluating Topological Mapping in Simulation
  5.3 Real-world Experiments
    5.3.1 Metric and Topological Map
    5.3.2 Object Detections
    5.3.3 Conceptual Mapping and Reasoning
6 Conclusions and Future Work
  6.1 Future Work
Bibliography
List of Figures
List of Tables

Chapter 1
Introduction

Robots are increasingly present in our daily lives. In the near future we expect them to perform more advanced actions and to exhibit human-like behaviours. To achieve that goal, we need them to reason and understand the world in a human-like fashion. Moreover, if robots understand human concepts, human-robot communication becomes possible and we can transfer knowledge about the world to them.

An important capability for robots working in our environments is to understand them, which is called spatial understanding. Spatial understanding enables robots to perform more complex tasks such as exploration, navigation, object search and human-robot interaction.

There are different ways in which spatial knowledge can be represented. A metric map is a well-known strategy but it is limited to geometrical information about the environment.
Another spatial representation is a topological map, which segments space into discrete places that are connected to one another by paths. Our objective is to extend those representations by providing human-like conceptualization of space.

In indoor environments, humans tend to perceive space in terms of discrete areas such as rooms. Rooms play an important role in spatial understanding because we label them with specific names such as “corridor”, “bedroom”, “kitchen”, “office” or “living-room”, typically according to their specific functionality: for example, a kitchen is for cooking, a bedroom is for sleeping and an office is for working. The presence of certain objects in rooms conveys a lot of information about their functionality (e.g. a room where we find a microwave oven, a fridge and a cooker is a kitchen). Consequently, it is of vital importance for a robot to be able to extract this information from the environment.

Semantic knowledge can be extracted, not only from objects, but also from a wide variety of sources such as metric and topological properties, size, shape and geometry of rooms and human knowledge. Many recent advances permit robots to obtain all this information: mapping processes build maps using laser scanners, place classification systems perceive size, shape, geometry and appearance of places and computer vision algorithms detect objects in images. Human knowledge can be asserted directly by the user or retrieved from search engines (e.g. searching Google for a description of “kitchen”). The process of extracting semantic knowledge from an environment, combining it with other sources of knowledge and representing it all in an intuitive map is what we call semantic mapping.

In the last few years, an open source framework called ROS (Robot Operating System) has become increasingly popular among robotics researchers.
ROS is a framework for developing robotic software. It is based on a peer-to-peer network of processes and it provides various tools to manage the complexity of developing a robotic system. Management tools perform tasks such as navigating the file system, creating source code and defining dependencies. Visualization tools monitor the communications between ROS components and show the data flow. Moreover, ROS supports three programming languages: C++, Python and Lisp. Using all these features, robotics researchers have created a large number of robot applications which are integrated in ROS and are available online, including mapping, exploration, navigation, grasping, image processing and object manipulation. Another advantage of using ROS is that the same system can be easily run on numerous robot platforms.

The goal of this thesis is to create a ROS framework capable of performing semantic mapping using mobile robots. We aim to create a system that uses information from diverse sensory sources to extract spatial properties from the environment, combines the perceptions using semantic relationships and reasons about the explored areas. Our system extracts information from sensors such as laser scanners and cameras to detect objects, extract topological features and detect doors to segment space into different rooms. These properties are combined into a conceptual map which maintains relationships between sensed information and human concepts. A probabilistic chain graph models the conceptual map with common-sense knowledge to perform spatial reasoning and to infer room categories (e.g. a room which is connected to the corridor and contains a microwave and a fridge is more likely to be a kitchen than a meeting room). This allows us to jointly reason about the presence of objects, room formation and functional categories of rooms.

The system has been tested offline and online.
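
The idea that detected objects shift the belief about a room's category can be illustrated with a small sketch. It is not the chain graph model used in this thesis: it is a much simpler naive Bayes calculation that treats objects as independent given the room and ignores topology. The names `ROOM_PRIOR`, `OBJECT_GIVEN_ROOM` and `room_posterior`, as well as every probability value, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: naive Bayes over room categories.
# This is NOT the probabilistic chain graph described in the thesis,
# and all probabilities below are invented for the example.

# Prior belief over room categories before any object is seen.
ROOM_PRIOR = {"kitchen": 1 / 3, "office": 1 / 3, "meeting room": 1 / 3}

# Hypothetical P(object is detected | room category).
OBJECT_GIVEN_ROOM = {
    "microwave": {"kitchen": 0.80, "office": 0.05, "meeting room": 0.05},
    "fridge": {"kitchen": 0.90, "office": 0.10, "meeting room": 0.05},
    "monitor": {"kitchen": 0.05, "office": 0.90, "meeting room": 0.40},
}


def room_posterior(detected_objects):
    """Return P(room | detected objects).

    Assumes detections are independent given the room category and
    uses only the objects that were detected (absences are ignored).
    """
    scores = dict(ROOM_PRIOR)
    for obj in detected_objects:
        for room in scores:
            scores[room] *= OBJECT_GIVEN_ROOM[obj][room]
    total = sum(scores.values())
    return {room: score / total for room, score in scores.items()}


posterior = room_posterior(["microwave", "fridge"])
best_room = max(posterior, key=posterior.get)
print(best_room, round(posterior[best_room], 3))
```

With these made-up numbers, seeing a microwave and a fridge makes "kitchen" by far the most probable category. A chain graph can additionally express relations between rooms, such as connectivity to a corridor, which this sketch leaves out.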