A Multimodal Ouija Board for Aircraft Carrier Deck Operations
A Multimodal Ouija Board for Aircraft Carrier Deck Operations

by

Birkan Uzun
S.B., C.S. M.I.T., 2015

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Engineering in Computer Science and Engineering at the Massachusetts Institute of Technology

June 2016

Copyright 2016 Birkan Uzun. All rights reserved. The author hereby grants to M.I.T. permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole and in part in any medium now known or hereafter created.

Author: Department of Electrical Engineering and Computer Science, April 6, 2016
Certified by: Randall Davis, Professor, Thesis Supervisor
Accepted by: Dr. Christopher J. Terman, Chairman, Masters of Engineering Thesis Committee

Abstract

In this thesis, we present improvements to DeckAssistant, a system that recreates the traditional Ouija board interface by displaying a digital rendering of an aircraft carrier deck, assisting deck handlers in planning deck operations. DeckAssistant has a large digital tabletop display that shows the status of the deck, and it understands certain deck actions for scenario planning. To preserve the conventional way of interacting with the old-school Ouija board, where deck handlers move aircraft by hand, the system takes advantage of multiple modes of interaction. Deck handlers plan strategies by pointing at aircraft, gesturing, and talking to the system. The system responds with its own speech and gestures, and it updates the display to show the consequences of the actions taken by the handlers. The system can also be used to simulate certain scenarios during the planning process. The multimodal interaction described here creates a communication of sorts between deck handlers and the system. Our contributions include improvements in hand-tracking, speech synthesis, and speech recognition.

Acknowledgements

Foremost, I would like to thank my advisor, Professor Randall Davis, for his support of my work and for his patience, motivation, and knowledge. His door was always open whenever I had a question about my research. He consistently allowed this research to be my own work, but steered me in the right direction with his meaningful insights whenever he thought I needed it. I would also like to thank Jake Barnwell for helping with the development environment setup and documentation. Finally, I must express my gratitude to my parents and friends, who supported me throughout my years of study. This accomplishment would not have been possible without them.

Contents

1. Introduction
   1.1. Overview
   1.2. Background and Motivation
      1.2.1. Ouija Board History and Use
      1.2.2. Naval Push for Digital Information on Decks
      1.2.3. A Multimodal Ouija Board
   1.3. System Demonstration
   1.4. Thesis Outline
2. DeckAssistant Functionality
   2.1. Actions in DeckAssistant
   2.2. Deck Environment
      2.2.1. Deck and Space Understanding
      2.2.2. Aircraft and Destination Selection
      2.2.3. Path Calculation and Rerouting
   2.3. Multimodal Interaction
      2.3.1. Input
      2.3.2. Output
3. System Implementation
   3.1. Hardware
   3.2. Software
      3.2.1. Libraries
      3.2.2. Architecture
4. Hand Tracking
   4.1. The Leap Motion Sensor
      4.1.1. Pointing Detection
      4.1.2. Gesture Detection
5. Speech Synthesis and Recognition
   5.1. Speech Synthesis
   5.2. Speech Recognition
      5.2.1. Recording Sound
      5.2.2. Choosing a Speech Recognition Library
      5.2.3. Parsing Speech Commands
      5.2.4. Speech Recognition Stack in Action
6. Related Work
   6.1. Navy ADMACS
   6.2. Deck Heuristic Action Planner
7. Conclusion
   7.1. Future Work
8. References
9. Appendix
   9.1. Code and Documentation

List of Figures

Figure 1: Deck handlers collaboratively operating on an Ouija Board. Source: Google Images.
Figure 2: The ADMACS Ouija board. Source: Google Images.
Figure 3: DeckAssistant's tabletop display with the digital rendering of the deck [1].
Figure 4: A deck handler using DeckAssistant with hand gestures and speech commands [1].
Figure 5: The initial arrangement of the deck [1].
Figure 6: Deck handler points at the aircraft to be moved while speaking the command [1].
Figure 7: DeckAssistant uses graphics to tell the deck handler that the path to the destination is blocked [1].
Figure 8: DeckAssistant displays an alternate location for the F18 that is blocking the path [1].
Figure 9: The logic for moving aircraft [1].
Figure 10: Regions on an aircraft carrier's deck. Source: Google Images.
Figure 11: (a) Orange dot represents where the user is pointing. (b) Aircraft being hovered over is highlighted green [1].
Figure 12: (a) Single aircraft selected. (b) Multiple aircraft selected [1].
Figure 13: Aircraft circled in red, meaning there is not enough room in the region [1].
Figure 14: Alternate region to move the C2 is highlighted in blue [1].
Figure 15: The hardware used in DeckAssistant.
Figure 16: DeckAssistant software architecture overview.
Figure 17: The Leap Motion Sensor mounted on the edge of the tabletop display.
Figure 18: Leap Motion's InteractionBox, colored in red. Source: Leap Motion Developer Portal.
Figure 19: Demonstration of multiple aircraft selection with the pinch gesture.
Figure 20: A summary of how the speech recognition stack works.

List of Tables

Table 1: The set of commands recognized by DeckAssistant.

List of Algorithms

Algorithm 1: Summary of the pointing detection process in pseudocode.

1. Introduction

1.1. Overview

In this thesis, we present improvements to DeckAssistant, a digital aircraft carrier Ouija Board interface that aids deck handlers in planning deck operations. DeckAssistant supports multiple modes of interaction, aiming to improve the user experience over traditional Ouija Boards. Using hand-tracking, gesture recognition, and speech recognition, it allows deck handlers to plan deck operations by pointing at aircraft, gesturing, and talking to the system. It responds with its own speech using speech synthesis and updates the display, a digital rendering of the aircraft carrier deck, to show the results when deck handlers take action. The multimodal interaction described here creates a communication of sorts between deck handlers and the system. DeckAssistant has an understanding of deck objects and operations, and it can be used to simulate certain scenarios during the planning process.

The initial work on DeckAssistant was done by Kojo Acquah, and we build upon his implementation [1]. Our work makes the following contributions to the fields of Human-Computer Interaction and Intelligent User Interfaces:

● It discusses how using the Leap Motion Sensor is an improvement over the Microsoft Kinect in terms of hand-tracking, pointing, and gesture recognition (a minimal sketch of Leap-based pointing detection appears after this list).

● It presents a speech synthesis API that generates speech with high pronunciation quality and clarity. It also investigates several speech recognition APIs, argues which is the most applicable, and introduces a way of enabling voice-activated speech recognition (a sketch of voice-activated listening also follows this list).

● Thanks to the refinements in hand-tracking and speech, it provides a natural, multimodal way of interacting with the first large-scale Ouija Board alternative built to help plan deck operations.
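As a concrete illustration of the first contribution, the following is a minimal sketch of index-finger pointing detection using the Leap Motion Python SDK (the v2-era Leap module). The screen resolution and the axis-to-screen mapping are assumptions made for this example, not DeckAssistant's actual calibration, which is described in Chapter 4.

    # pointing_sketch.py -- a minimal sketch of pointing detection with the
    # Leap Motion Python SDK. The resolution below and the x/z-to-screen
    # mapping are illustrative assumptions.
    import time
    import Leap

    SCREEN_W, SCREEN_H = 1920, 1080  # assumed tabletop display resolution

    def pointing_position(controller):
        """Return (x, y) pixel coordinates for an extended index finger, or None."""
        frame = controller.frame()
        index_fingers = frame.fingers.extended().finger_type(Leap.Finger.TYPE_INDEX)
        if index_fingers.is_empty:
            return None
        tip = index_fingers[0].stabilized_tip_position
        # Normalize the tip into the sensor's InteractionBox, which maps each
        # axis into [0, 1] (see Figure 18); clamp points that fall outside it.
        norm = frame.interaction_box.normalize_point(tip, True)
        # Assuming the sensor lies flat and faces up, the display plane is x-z:
        # x runs across the table and z runs toward the user.
        return int(norm.x * SCREEN_W), int(norm.z * SCREEN_H)

    if __name__ == "__main__":
        controller = Leap.Controller()
        while True:  # poll frames; a real system would register a Leap.Listener
            pos = pointing_position(controller)
            if pos:
                print("pointing at pixel %s" % (pos,))
            time.sleep(0.05)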
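Similarly, the sketch below illustrates what voice-activated listening can look like, using the Python speech_recognition package as an illustrative stand-in. The thesis compares several recognition APIs in Chapter 5, and this sketch does not claim to show the one DeckAssistant ultimately uses; the example command in the comment is likewise hypothetical.

    # listen_sketch.py -- a minimal sketch of voice-activated recognition
    # using the Python `speech_recognition` package (an illustrative choice).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold so listen() triggers on speech
        # rather than on ambient room noise.
        recognizer.adjust_for_ambient_noise(source)
        while True:
            # listen() blocks until it detects a phrase and returns when the
            # speaker pauses -- voice activation, with no push-to-talk button.
            audio = recognizer.listen(source)
            try:
                command = recognizer.recognize_google(audio)
                print("heard: " + command)  # e.g. "move this F18 over there"
            except sr.UnknownValueError:
                pass  # speech detected but not transcribable; keep listening

The voice-activated loop matters for this setting because deck handlers have both hands occupied with pointing and gesturing, so the system must decide on its own when a spoken command begins and ends.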