Field Service Support with Google Glass and WebRTC

Support av Fälttekniker med Google Glass och WebRTC

PATRIK OLDSBERG

Degree Project in Computer Engineering, First Cycle, 15 credits. Supervisor at KTH: Reine Bergström. Examiner: Ibrahim Orhan. TRITA-STH 2014:68

KTH School of Technology and Health, 136 40 Handen, Sweden

Abstract

The Internet is dramatically changing the way we communicate, and it is becoming increasingly important for communication services to adapt to the context in which they are used. The goal of this thesis was to research how Google Glass and WebRTC can be used to create a communication system tailored for field service support. A prototype was created where an expert is able to provide guidance for a field technician who is wearing Google Glass. A live video feed is sent from Glass to the expert, from which the expert can select individual images. When a still image is selected it is displayed to the technician through Glass, and the expert is able to provide instructions using real time annotations. An algorithm that divides the selected image into segments was implemented using WebGL. This made it possible for the expert to highlight objects in the image by clicking on them. The thesis also investigates different options for accessing the hardware video encoder on Google Glass.

Sammanfattning

Internet har dramatiskt ändrat hur vi kommunicerar, och det blir allt viktigare för kommunikationssystem att kunna anpassa sig till kontexten som de används i. Målet med det här examensarbetet var att undersöka hur Google Glass och WebRTC kan användas för att skapa ett kommunikationssystem som är skräddarsytt för support av fälttekniker. En prototyp skapades som låter en expert ge väg- ledning åt en fälttekniker som använder Google Glass. En videoström skickas från Glass till experten, och den- ne kan sedan välja ut enstaka bilder ur videon. När en stillbild väljs så visas den upp på Glass för teknikern, och experten kan sedan ge instruktioner med hjälp av realtidsannoteringar. En algoritm som delar upp den utvalda bilden i seg- ment implementerades med WebGL. Den gjorde det möj- ligt för experten att markera objekt i bilden genom att klicka på dem. Examensarbetet undersöker också olika sätt att få tillgång till hårdvarukodaren för video i Google Glass.

Contents

1 Introduction
  1.1 Background
  1.2 Problem Definition
  1.3 Research Goals and Contributions
  1.4 Limitations

2 Literature Review
  2.1 Wearable Computers
    2.1.1 History of Wearable Computers
  2.2 Augmented Reality
    2.2.1 Direct vs. Indirect Augmented Reality
    2.2.2 Video vs. Optical See-Through Display
    2.2.3 Effect on Health
    2.2.4 Positioning
  2.3 Collaborative Communication
  2.4 Image Processing
    2.4.1 Edge Detection
    2.4.2 Noise Reduction Filters
    2.4.3 Hough Transform
    2.4.4 Image Region Labeling

3 Technology
  3.1 Google Glass
    3.1.1 Timeline
    3.1.2 Interaction
    3.1.3 Microinteractions
    3.1.4 Hardware
    3.1.5 Development
    3.1.6 Display
  3.2 Alternative Devices
    3.2.1 Meta Pro
    3.2.2 Vuzix M100
    3.2.3 Oculus Rift Development Kit 1/2
    3.2.4 Recon Jet
    3.2.5 XMExpert
  3.3 Web Technologies
    3.3.1 WebRTC
    3.3.2 WebGL
  3.4 Mario
    3.4.1 GStreamer
  3.5 Video Encoding
    3.5.1 Hardware Accelerated Video Encoding

4 Implementation of Prototype
  4.1 Baseline Implementation
  4.2 Hardware Accelerated Video Encoding
    4.2.1 gst-omx on Google Glass
  4.3 Ideas to Implement
    4.3.1 Annotating the Technician's View
    4.3.2 Aligning Annotations and Video
  4.4 Still Image Annotation
    4.4.1 WebRTC Data Channel
    4.4.2 Out of Order Messages
  4.5 Image Processing
    4.5.1 Image Processing using WebGL
    4.5.2 Image Segmentation using Hough Transform
    4.5.3 Image Segmentation using Median Filters
    4.5.4 Region Labeling
  4.6 Glass Application
    4.6.1 Configuration
    4.6.2 OpenGL ES 2.0
    4.6.3 Photographs by Technician
  4.7 Signaling Server
    4.7.1 Sessions and Users
    4.7.2 Server-Sent Events
    4.7.3 Image Upload

5 Result
  5.1 Web Application
  5.2 Google Glass Application

6 Discussion
  6.1 Analysis of Method and the Result
    6.1.1 Live Video Annotations
    6.1.2 Early Prototype
    6.1.3 WebGL
    6.1.4 OpenMAX
    6.1.5 Audio
  6.2 Further Improvements
    6.2.1 Image Segmentation
    6.2.2 Video Annotations
    6.2.3 UX Evaluation
    6.2.4 Gesture and Voice Input
    6.2.5 More Annotation Options
    6.2.6 Logging
  6.3 Reusability
  6.4 Capabilities of Google Glass
  6.5 Effects on Human Health and the Environment
    6.5.1 Environmental Impacts
    6.5.2 Health Concerns

7 Conclusion

Bibliography

Introduction

1.1 Background

The Internet is dramatically changing the way we communicate, and it is becoming increasingly important for communication services to be adaptable to the context in which they are used. They also need to be flexible enough to be able to integrate into new contexts without excessive effort. Using a wearable device allows a communication system to be tailored for the context to a greater extent. With additional information available, such as movement, heart rate or the perspective of the user, a richer user experience can be achieved. Wearable devices have huge potential in many different business fields. A rapidly emerging form of wearable device is the head-mounted display (HMD). HMDs have the advantage of being able to display information in a hands-free format, which has huge potential for businesses such as medicine and field service. Perhaps the most recognized wearable device at the moment is Google Glass, which for brevity will sometimes be referred to simply as ‘Glass’.

1.2 Problem Definition

The focus of the thesis involved a generalized use case where a field service technician has traveled to a remote site to solve a problem. The technician is equipped with Google Glass, or any equivalent HMD. While on his or her way to the site, the technician has information such as the location of the site and the support ticket available. The technician can look up information such as the hardware on the site, the expected spare parts to resolve the issue, and recent tickets for the same site. Once the technician has arrived at the site, the back office support will be notified. When at the site, the technician can view manuals and use the device to document the work. If the problem is more complicated than expected, or the technician is unable to resolve the issue for some other reason, the device can be used to call an expert in the back office support. The purpose of this thesis was to research how a contextual communication system can be tailored for this use case. The part that was in focus was the call to

the back office support, which is made after the technician has arrived on site and requires assistance to resolve the issue.

1.3 Research Goals and Contributions

The goal was to research collaborative communication and find ways to tailor a communication system for this specialized kind of call. Different ways to give instructions and display information to the wearer of an HMD were investigated, as well as how these could be implemented. An experimental prototype using some of the ideas was then constructed. The prototype was implemented using web real-time communication (WebRTC). At the time when the prototype was built there was no implementation of WebRTC available on Google Glass. A framework called Mario, developed internally at Ericsson Research, was used as the WebRTC implementation, as it runs on Android among other platforms. The prototype comprised several different subsystems that were all implemented from scratch:

• A web application built with HTML5, WebGL and WebRTC technology.

• A NodeJS server acting as web server and signaling server (a minimal sketch of this role is shown after the list).

• A Google Glass application using the Glass Development Kit (GDK).
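To make the signaling role of the server concrete, the sketch below shows one way a NodeJS process could relay WebRTC signaling messages between the participants of a session using server-sent events. It is only an illustration, not the thesis implementation; the endpoint names (/events, /signal) and the session query parameter are hypothetical.

```javascript
// Minimal signaling-relay sketch (assumed design, not the thesis code).
// Clients subscribe to a session via Server-Sent Events and POST JSON
// signaling messages that are forwarded to all subscribers in the session.
const http = require('http');
const url = require('url');

const sessions = {}; // sessionId -> list of open SSE responses

http.createServer((req, res) => {
  const { pathname, query } = url.parse(req.url, true);
  const session = query.session || 'default';

  if (req.method === 'GET' && pathname === '/events') {
    // Register this client for server-sent events.
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    });
    (sessions[session] = sessions[session] || []).push(res);
    req.on('close', () => {
      sessions[session] = sessions[session].filter((r) => r !== res);
    });
  } else if (req.method === 'POST' && pathname === '/signal') {
    // Relay the posted signaling message (SDP offer/answer or ICE
    // candidate) to every subscriber in the session.
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      (sessions[session] || []).forEach((client) => {
        client.write('data: ' + body + '\n\n');
      });
      res.writeHead(204);
      res.end();
    });
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```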

An evaluation of the prototype was done with focus on how it could be further improved, and whether any of the implemented ideas can be applied to similar use cases and devices. An evaluation of the capabilities of Google Glass with regard to media processing and the performance of Mario was also performed.

1.4 Limitations

The time limit of the thesis is ten weeks, therefore a number of limitations were made so that it would be completable within this limit. No in-depth evaluation of the user experience (UX) of the prototype would be done. The prototype was to be designed with UX in mind, using the result of the initial research, but no further evaluation would be done. A broad enough UX evaluation was not considered possible within the limited time frame. The prototype would not be optimized for battery life and bandwidth usage. These restrictions are of course important and the limitations imposed by them would be taken into consideration, but no analysis or optimization would be done to find optimal solutions with regard to these issues.

Literature Review

2.1 Wearable Computers

A wearable computer is an electronic device that is worn by the user. It often takes the form of a watch or head-mounted display, but can also be worn on other parts of the body or e.g. be sewn into the fabric of clothing.

“An important distinction between wearable computers and portable computers (handheld and laptop computers for example) is that the goal of wearable computing is to position or contextualize the computer in such a way that the human and computer are inextricably intertwined, so as to achieve Humanistic Intelligence —i.e. intelligence that arises by having the human being in the feedback loop of the computational process.” —Steve Mann [1]

The definition of what a wearable computer is has changed over time. By some definitions a digital clock with an alarm from the 90s is a wearable computer, but that is not what we think of as a wearable computer today.

2.1.1 History of Wearable Computers

The first wearable computer was invented by Edward O. Thorp and Claude Shannon [2]. They built a device which helped them cheat at roulette, with an expected gain of 44%. It was a cigarette-pack-sized analog device that could predict in which octant of the table the ball was most likely to stay. The next step in wearable computing was the first version of Steve Mann's EyeTap from 1981 [3]. It consisted of a computer in a backpack wired up to a camera and its viewfinder, which were mounted on a helmet. EyeTap has since then been significantly reduced in size and the only visible part is an eyepiece used to record and display video. The eyepiece uses a beam splitter to reflect some of the incoming light to a camera. The camera then records the incoming image and sends it to a computer, which in turn processes the image and sends it to a projector. The projector is then able to overlay images on top of the user's normal view by projecting them onto the other side of the beam splitter. In the mid 90s the ‘wrist computer’ was invented by Edgar Matias and Mike Ruicci, and it had a very different interaction method. The entire device was

strapped to the wrist and was marketed as a recording device, although it was significantly larger than today's smartwatches [4]. Even though there had been a number of great inventions, wearable devices had not gained any commercial traction at the turn of the millennium [5]. In the following years a few new devices were created, but it took another decade until wearable devices started getting public recognition. In 2009 Glacier Computers designed the W200, a wrist computer that could run both Linux and Windows CE and had a 3.5 inch color touch screen. It was intended for use in security, defence, emergency services, and field logistics and could be equipped with Bluetooth, Wi-Fi and GPS [6]. In 2010 and the following years there was a newfound interest in the idea of smart wearable technology. A more practical alternative to the W200 became available with the optional attachment for the 6th generation iPod Nano which turned it into a wrist computer. In April 2012, Pebble Technology launched a Kickstarter campaign for their Pebble smartwatch [7]. The initial fundraising target was set to $100,000, and it was reached within two hours of going live. It took six days for the project to become the highest funded Kickstarter project so far. The fundraising closed after a little over a month with $10,266,844 raised, more than a hundred times the initial goal. The success of the Pebble smartwatch was one of the indicators that wearable devices were getting some widespread recognition. Another example is the Oculus Rift, which is a head-mounted display [8]. Just like Pebble, the Oculus Rift had a successful Kickstarter campaign, raising over $2.4M with a $250,000 target. Another device unveiled in April 2012 was Google Glass, which we will look at in depth in the next chapter (3.1). These devices were only a few of many that had reached the market or were about to be released. Some other smartwatches are the Sony SmartWatch [9], Samsung's Galaxy Gear [10], and a rumored smartwatch from Apple [11, 12]. More important to this thesis were HMDs that were being developed or already released, such as the Meta Pro [13], Vuzix M100 [14], and Recon Jet [15].

2.2 Augmented Reality

Augmented reality (AR) is a variation of virtual reality (VR) where the real world is mixed with virtual objects [16]. VR completely immerses the user in a virtual world; the user is unable to see the real world. AR, on the other hand, keeps the user in the real world and superimposes virtual objects onto it. VR replaces reality whereas AR supplements it, enabling it to be used as a tool for communication and collaboration when solving problems in the real world. A similar concept is augmented virtuality (AV), which is a middle point between AR and VR. It allows the user to still be present in the real world, but it

is mixed with a virtual world to a further extent than in AR. Physical objects are dynamically integrated into the virtual world and shown to the user through their virtual representation [17]. AR can be divided into two categories, direct AR and indirect AR [18]. There are also multiple ways in which AR can be achieved; for this thesis the focus was on video and optical see-through displays.

2.2.1 Direct vs. Indirect Augmented Reality

Indirect AR is when the superimposed AR elements are not seen with the same alignment or perspective as the user's normal view [18]. Indirect AR is achieved by displaying an image of reality to the user, and overlaying elements on this image. It is possible to use this technique to achieve direct AR, but the image then has to be taken from the user's perspective, and displayed in alignment with the user's eyes. An example of indirect AR is the iOS and Android application Word Lens [19]. The application translates text by superimposing a translation on top of the original text. The text is recorded using the back facing camera on the device, and the result is displayed on the device's screen. This is indirect AR, because the perspective and alignment of the screen is different from the user's. Direct AR can be emulated using this application by holding the device in alignment with one's eyes. An example of direct AR is the system used to help workers heat up a precise line on a plate presented in [20]. The system uses AR markers to show a line where the worker is supposed to heat the plate. It uses a video see-through system that is aligned with the user's field of view and is mounted directly in front of one's eye. Even though Google Glass is often referred to as an AR device, and has a forward facing camera and a transparent display, it is not capable of direct AR. It is perhaps possible to calibrate an application to allow overlaying objects on the area that is behind the display, but this is an ineffective method as the display only takes up a very small area of the user's field of view. It would also be difficult, if not impossible, for the calibration to remain accurate as the display's position relative to the user's eyes will change when moving. Developing for direct AR devices provides more possibilities, but also places higher demands on the application. It is possible for an indirect AR application to show video with a high latency, or even still images. When using direct AR however, the latency is crucial, especially with a video see-through display [21]. Direct AR also requires precise calibration of the display, and sometimes a camera, in relation to the user's normal view [22].

2.2.2 Video vs. Optical See-Through Display

Video see-through can be used for both direct and indirect AR, while optical see-through can only be used for direct AR.


Optical see-through uses a transparent display that is able to show images independently on different parts of the screen. This makes it possible to display only the elements that are overlaid on reality, as the user is able to see the world through the transparent display [23]. Video see-through uses an opaque display, and the user's view is recorded with one or more cameras. The elements are superimposed on the recorded video, and the augmented video is displayed to the user. This technique requires very low latency and a precise alignment of the cameras with the user's usual field of view [21].

2.2.3 Effect on Health

When using a video see-through display for direct AR, it is important that the display is correctly calibrated for the user's normal field of view. When the perspective of the video displayed to the user differs heavily from the expected view, it takes a long time for the brain to adapt to the new perspective. But when the user takes off the display, it will only take a short time to restore the view. On the other hand, if the perspective is almost correct, but differs slightly, it will take the brain a short time to adjust, but an extended period of time to restore the view. During this period the user might experience motion sickness with symptoms such as nausea. Using an HMD for an extended period of time may cause symptoms such as nausea and eye strain. Age has some significance for the eye strain, with older users experiencing more apparent symptoms. It is also possible that females have a higher susceptibility to motion sickness caused by HMDs, although this might be the result of males being less likely to accurately report their symptoms [24].

2.2.4 Positioning

In many AR applications it is important to have knowledge of the device's position and orientation in relation to the surrounding space or objects. An AR system that has an accurate understanding of a user's position will be able to create a far more immersive experience, and often it is required for the system to be of any use at all. Some sensors that can be used to help determine the position and orientation of a device are:

• Gyroscope

• Compass

• Accelerometer

• Global Positioning System (GPS)

• Camera


Sensing Orientation

An accurate, responsive and normalized measurement of the device's orientation in relation to the earth can be acquired using combined data from the gyroscope, accelerometer and compass. The accelerometer is used to find the direction of gravity. It provides a noisy but accurate measurement of the device's tilt and roll in relation to the earth [25]. In order to remove the noise, a low pass filter has to be used. The downside is that a low pass filter will add latency to the sensor data, but this is solved using the gyroscope. The gyroscope provides a responsive measurement of the device's rotation, but it cannot be used to reliably determine the rotation over time since that would cause the perceived angle to drift. It also has no sense of the direction of gravity or north. The drift is a result of how the rotation is calculated using the angular speed received from the gyroscope. In order to calculate the angle of the device, the angular speed has to be integrated. With a perfect gyroscope, this would not be an issue, but all gyroscopes will add at least a small amount of noise. When noise is integrated it is converted to drift, see Figure 2.1. The drift is compensated for using the accelerometer and compass. The compass cannot be used on its own because the output is noisy, and it needs to be tilt compensated, i.e. aligned with the horizontal plane of the earth [26]. In the end, the gyroscope is used to calculate the orientation of the device, which is then corrected with regard to the direction of gravity by the accelerometer, and the direction of the magnetic north pole by the compass.

Figure 2.1: Noise vs. Drift
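The thesis does not list its fusion code, but the idea above (integrate the responsive gyroscope and continuously correct it with the drift-free but noisy accelerometer) can be illustrated with a single-axis complementary filter. The sketch below is written in JavaScript for consistency with the other examples in this document; the 0.98/0.02 weighting and the helper names are illustrative assumptions, not values from the prototype.

```javascript
// Complementary-filter sketch for one tilt axis (pitch): integrate the
// responsive but drifting gyroscope, and continuously nudge the result
// toward the noisy but drift-free angle derived from the accelerometer.
function createTiltEstimator(alpha = 0.98) {
  let angle = 0; // estimated pitch in radians

  return function update(gyroRate, accel, dt) {
    // gyroRate: angular speed around the pitch axis (rad/s)
    // accel:    { x, y, z } accelerometer sample (m/s^2)
    // dt:       time since the previous sample (s)
    const accelAngle = Math.atan2(accel.y, accel.z); // gravity direction
    const gyroAngle = angle + gyroRate * dt;         // integrated rotation
    angle = alpha * gyroAngle + (1 - alpha) * accelAngle;
    return angle;
  };
}

// Usage: const update = createTiltEstimator();
//        const pitch = update(0.01, { x: 0, y: 0.4, z: 9.7 }, 0.02);
```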

Sensing Position

Finding the position is a far greater challenge than determining the orientation of a device. GPS is only usable outdoors and provides a position with an error of a few meters [27]. This makes it unusable for any AR system that is not viewed at a large scale, e.g. navigation. The output from the accelerometer is acceleration, which can be integrated once to calculate speed, and twice to calculate position. Like the gyroscope, the output from the accelerometer is noisy, and as explained, integrating noise causes drift. This effect is amplified when noise is integrated twice; the perceived position of the

17 CHAPTER 2. LITERATURE REVIEW device can drift several inches in a second [28]. The largest problem is not the double integration of noise, but small errors in the computed orientation of the device. When calculating the position of the device, the linear acceleration needs to be used, i.e. the observed acceleration minus gravity. The value of the linear acceleration is obtained by subtracting gravity from the observed acceleration using the orientation calculated with the method described in the previous section. The dependency on a correct value of the orientation of the device turns out to be a huge problem when trying to calculate the position of the device. A small error in the orientation will lead to a very large error in the calculated position. If the computed orientation is o by 1%, the device will be perceived to move at 8 1 ms≠ [28]. This means that the only way to accurately determine the position of the device using the listed sensors is with the camera, using image processing. A common method of achieving this is by using markers, figures that are easily recognizable by a camera [29]. This technique will often only determine the markers’ position in relation to the device, and not the devices position in relation to the world. It is also possible to use data from the accelerometer to assist in marker detection with methods such as the one described in [30].

2.3 Collaborative Communication

A central part of the research was the interaction between the expert and the technician. The expert should have a better way of giving instructions than simply providing voice guidance while viewing a live video feed. The study presented in [31] explores different methods for remote collaboration. Two pilot studies were conducted that give an understanding of the difference in effectiveness between the different methods. The study did not come to a clear conclusion as to which method was the most effective, but it did rule out some solutions. The study compared the effectiveness of instructions given using a single pointer (a red dot controlled by the instructor) and real time annotations by drawing. It also compared the use of a live video feed and a shared still image. In total, the compared methods were:

1. Pointer on a Still Image

2. Pointer on Live Video

3. Annotation on a Still Image

4. Annotation on Live Video

The tests were carried out using a pair of software applications. The person that was receiving the instructions used a tablet to record video or capture images,

while the instructor used a PC application. The instructor was given an image of a number of objects arranged in a specific manner, while the other person had the same objects laying in front of them in a different arrangement. The instructor then used the remote collaboration tool to tell the other person how the objects should be arranged. The data presented was: task completion time, number of errors made, and total distance the mouse was moved by the instructor. The study clearly showed that method 1 was ineffective when considering the time taken to complete the task. It took on average 25% longer than any other method. The other results were less clear; methods 3 and 4 had a slight advantage over 2, while 3 and 4 were equivalent to each other. In a second test, where only methods 2 and 4 were included, it was shown that it was important to have a method of erasing the annotations. If the drawings were not removed, the old drawings would confuse the user. The tests also showed that the pointer based method caused the instructor to move the mouse roughly five times as far compared to the annotation based method.

2.4 Image Processing

As a part of the collaborative communication interface, one of the ideas chosen to be implemented in the prototype was highlighting of objects in an image. Therefore, different methods for detecting objects in an image were investigated, as well as methods of representing the region that constituted the object in a way that is simple to transmit over the network.

2.4.1 Edge Detection

Edge detection is a method for identifying points in an image where a given property of the components has discontinuities. The method can be used to detect changes in e.g. color or brightness and is an important tool in image processing [32]. An example of edge detection by color is displayed in figure 2.2. Figure 2.2a shows an arbitrary image, while figure 2.2b shows a possible result of an edge detection.

Figure 2.2: Edge detection by color. (a) Before. (b) After.


There are many different methods that can be used for edge detection. For this thesis, a very simple implementation was chosen, which is found in [33]. The method provides a baseline edge detection implementation which can be improved upon if required by the application.

The method is as follows:

For each pixel Ix, the average color difference of the opposing surrounding pixels is computed. Consider the arrangement of pixels surrounding any pixel Ix.

I2 I5 I8

I1 Ix I7

I0 I3 I6

The following formula is used to calculate the value of Ix.

$$I_x = \frac{|I_1 - I_7| + |I_5 - I_3| + |I_0 - I_8| + |I_2 - I_6|}{4} \tag{2.1}$$

The result from (2.1) is then multiplied by a scale factor s and compared to two thresholds, Tmin and Tmax, using the following formulas:

$$I'_x = s I_x \tag{2.2}$$

$$I''_x = \begin{cases} 0 & \text{if } I'_x < T_{\min} \text{ or } I'_x > T_{\max} \\ I'_x & \text{if } T_{\min} \le I'_x \le T_{\max} \end{cases} \tag{2.3}$$

The scale factor s and the thresholds Tmin and Tmax are all parameters to the filter, and will give varying results depending on their values.
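To make the filter concrete, the sketch below applies equations (2.1) to (2.3) to a single grayscale channel on the CPU. The prototype does its image processing in WebGL (see section 4.5.1); this plain JavaScript version is only an illustration, and the function and parameter names are assumptions.

```javascript
// Edge measure from equations (2.1)-(2.3) on a grayscale image stored
// row-major in a Float32Array. Border pixels are left at 0.
function edgeFilter(src, width, height, s, tMin, tMax) {
  const dst = new Float32Array(width * height);
  const at = (x, y) => src[y * width + x];

  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      // Average absolute difference of the four opposing pixel pairs.
      const ix = (
        Math.abs(at(x - 1, y) - at(x + 1, y)) +         // |I1 - I7|
        Math.abs(at(x, y - 1) - at(x, y + 1)) +         // |I5 - I3|
        Math.abs(at(x - 1, y + 1) - at(x + 1, y - 1)) + // |I0 - I8|
        Math.abs(at(x - 1, y - 1) - at(x + 1, y + 1))   // |I2 - I6|
      ) / 4;
      const scaled = s * ix;
      // Keep values inside [tMin, tMax], suppress everything else.
      dst[y * width + x] =
        (scaled >= tMin && scaled <= tMax) ? scaled : 0;
    }
  }
  return dst;
}
```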

2.4.2 Noise Reduction Filters

An important part of the image processing was to reduce the noise in the image, improving the result of the edge detection. A few methods of noise reduction were explored:

Gaussian Blur

Gaussian blur is a low-pass filter that blurs the image using a Gaussian function. It calculates the sum of all pixels in an area, after applying a weight to each pixel.


The weights are computed using the two-dimensional Gaussian function 2.5, a composition of the one-dimensional Gaussian function 2.4.

$$G(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \tag{2.4}$$

$$G(x, y) = G(x) \cdot G(y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{2.5}$$

When implementing a Gaussian filter, a matrix is computed using the Gaussian function and subsequently normalized. A new value of every pixel in the image is then calculated by multiplying it and the surrounding pixels by the corresponding matrix component. This operation can be divided into two steps, horizontal blur and then vertical, or vice versa. Doing this yields the same result, but fewer calculations are needed.
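The separable form described above can be sketched as follows. This is an illustrative JavaScript implementation, not code from the prototype; the kernel radius and sigma are parameters chosen by the caller.

```javascript
// Build a normalized 1-D Gaussian kernel from equation (2.4).
function gaussianKernel(radius, sigma) {
  const kernel = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const w = Math.exp(-(i * i) / (2 * sigma * sigma));
    kernel.push(w);
    sum += w;
  }
  return kernel.map((w) => w / sum); // weights sum to 1
}

// Convolve in one direction with the 1-D kernel; running it again in the
// other direction completes the separable blur, equivalent to using (2.5).
function blur1D(src, width, height, kernel, horizontal) {
  const dst = new Float32Array(src.length);
  const r = (kernel.length - 1) / 2;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let acc = 0;
      for (let k = -r; k <= r; k++) {
        const sx = horizontal ? Math.min(width - 1, Math.max(0, x + k)) : x;
        const sy = horizontal ? y : Math.min(height - 1, Math.max(0, y + k));
        acc += kernel[k + r] * src[sy * width + sx];
      }
      dst[y * width + x] = acc;
    }
  }
  return dst;
}
```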

Median Filter

A median filter is a nonlinear filtering technique which is often used to remove noise from images. It preserves edges in an image, which makes it well suited for segmenting an image. A filter such as Gaussian blur will remove noise, but also blur edges, which can make a subsequent edge detection filter less effective [34]. A median filter is similar to Gaussian blur in that each pixel's value is calculated using the surrounding pixels, but instead of calculating a weighted average, the median of the surrounding pixels is used [35]. This causes common colors in an area to become more dominant, while removing noise.
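A straightforward, unoptimized 3x3 median filter can be sketched as below. The prototype's WebGL version differs; this is only meant to show the operation itself.

```javascript
// 3x3 median filter on a grayscale image: each output pixel is the median
// of its neighborhood, which removes noise while keeping edges sharper
// than a Gaussian blur would. Border pixels are left at 0.
function medianFilter3x3(src, width, height) {
  const dst = new Float32Array(src.length);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const window = [];
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          window.push(src[(y + dy) * width + (x + dx)]);
        }
      }
      window.sort((a, b) => a - b);
      dst[y * width + x] = window[4]; // middle of the 9 sorted values
    }
  }
  return dst;
}
```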

2.4.3 Hough Transform

The Hough transform is a technique which can be used to isolate features of a particular shape within an image [36]. It is applied to a binary image, typically after applying an edge detection method. The simplest form is the Hough line transform, used to detect lines. The idea of the Hough transform is that each point in the image that meets a certain criterion is selected, e.g. all white pixels. Each point then votes for every line that could pass through that point, which in practice is a discrete number of lines selected depending on the desired performance and resolution. When every point has voted, the votes are counted and lines are selected using the result of the vote. The lines that are selected span the entire image, which means that a second algorithm has to be employed in order to find the line segments that were present in the image. An efficient implementation of the Hough transform requires the possibility to represent a line using two variables. For example, in the Cartesian coordinate system a line can be represented with the parameters (m, b), using y = mx + b. In the polar coordinate system a line is represented with the parameters (r, θ), where r = x cos θ + y sin θ. Because it is impossible to represent a line which is parallel

to the y-axis using y = mx + b, polar coordinates are used to represent a line in a Hough transform. Polar coordinates were not used initially; this improvement was suggested by Duda and Hart in [37]. The voting step of the Hough transform is typically implemented using a two-dimensional array called an accumulator. Each of the two dimensions represents all possible values of r and θ respectively. All fields in the accumulator are initialized to 0. When a vote is cast for a line represented by two distinct values of r and θ, the corresponding field in the accumulator is incremented. Once all votes are counted, the field in the accumulator with the highest value gives the r and θ values of the most prominent line in the image. Several lines can be found using various techniques such as finding local maxima, selecting a set number of lines, or finding all values above a threshold.
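A minimal sketch of the voting step and line selection might look as follows. The accumulator resolution and the vote threshold are illustrative parameters, and real implementations usually add local-maximum suppression.

```javascript
// Hough line transform sketch: every foreground pixel votes for all
// (r, theta) pairs describing lines through it, using r = x*cos(theta)
// + y*sin(theta). Lines with more votes than `threshold` are returned.
function houghLines(binary, width, height, thetaSteps = 180, threshold = 100) {
  const maxR = Math.ceil(Math.hypot(width, height));
  // Accumulator indexed by [thetaIndex][r + maxR] (r can be negative).
  const acc = Array.from({ length: thetaSteps }, () =>
    new Uint32Array(2 * maxR + 1));

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (!binary[y * width + x]) continue; // only foreground pixels vote
      for (let t = 0; t < thetaSteps; t++) {
        const theta = (t * Math.PI) / thetaSteps;
        const r = Math.round(x * Math.cos(theta) + y * Math.sin(theta));
        acc[t][r + maxR]++;
      }
    }
  }

  // Collect every (r, theta) whose vote count exceeds the threshold.
  const lines = [];
  for (let t = 0; t < thetaSteps; t++) {
    for (let ri = 0; ri < acc[t].length; ri++) {
      if (acc[t][ri] > threshold) {
        lines.push({ r: ri - maxR, theta: (t * Math.PI) / thetaSteps });
      }
    }
  }
  return lines;
}
```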

2.4.4 Image Region Labeling

Region labeling, or connected-component labeling, is used to connect and label regions in an image. What connects the different regions can vary, but a binary image such as the one in figure 2.3 is typically used. In this case all the white regions are labeled [38]. An algorithm that connects the pixels in an image based on 4-way connectivity is presented in [39]. The algorithm is split into two passes: the first assigns temporary labels and remembers which labels are connected, while the second pass resolves all connected labels so that each region only has one label. The pixels are iterated through in row-major order, from the top left to the bottom right (see figure 2.3 for an example of labeled regions). Each pixel is checked for connectivity to the north and west, the pixels marked in table 2.1. For every pixel that is encountered that should be labeled, i.e. is white, the following steps are carried out:


N: the pixel directly above (north of) the current pixel
W: the pixel directly to the left (west of) the current pixel

Table 2.1: Connected Pixels

1. If neither the North nor the West pixel is labeled, assign a new label to the pixel and move on to the next pixel. Otherwise, move to the next step.

2. If only one of the North or West pixels is labeled, assign that label to the pixel and move on to the next pixel. Otherwise, move to the next step.

3. If both the North and West pixels are labeled, take the following steps.

   a) If the North and West pixels have the same label, assign that label to the pixel and move on to the next pixel. Otherwise, move to the next step.

   b) The North and West pixels must have different labels; record that the two labels are equivalent, assign the lower label of the two to the pixel, and move on to the next pixel.

The second pass consists of iterating through every pixel in the image once more. Any pixel whose label has an equivalent label with a lower value has its label reassigned to the lower value. After the second pass, all pixels are guaranteed to have the same label if and only if they are connected to each other.
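The two-pass algorithm can be sketched as below. The equivalence table is kept with a small union-find structure, which is one common way to implement the "record that the two labels are equivalent" step; the thesis does not prescribe this particular bookkeeping.

```javascript
// Two-pass, 4-connectivity region labeling on a binary image.
function labelRegions(binary, width, height) {
  const labels = new Int32Array(width * height); // 0 = background
  const parent = [0];                            // equivalence table
  const find = (l) => (parent[l] === l ? l : (parent[l] = find(parent[l])));
  let next = 1;

  // First pass: assign provisional labels and record equivalences.
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (!binary[y * width + x]) continue;
      const west = x > 0 ? labels[y * width + x - 1] : 0;
      const north = y > 0 ? labels[(y - 1) * width + x] : 0;
      if (!west && !north) {
        parent[next] = next;               // brand new label
        labels[y * width + x] = next++;
      } else if (west && north && west !== north) {
        const a = find(west), b = find(north);
        const lo = Math.min(a, b), hi = Math.max(a, b);
        parent[hi] = lo;                   // remember the labels touch
        labels[y * width + x] = lo;
      } else {
        labels[y * width + x] = west || north;
      }
    }
  }

  // Second pass: replace every label with the representative of its set.
  for (let i = 0; i < labels.length; i++) {
    if (labels[i]) labels[i] = find(labels[i]);
  }
  return labels;
}
```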


Technology

This chapter describes the hardware and software involved in the thesis.

3.1 Google Glass

Google Glass is a wearable computer in the form of an HMD. It has a camera, a horizontal touchpad, and an optical see-through display. Audio is received by the user through a bone conduction transducer, and there are one or two internal microphones [40]. The battery capacity is enough for a day of normal use, but will drain rapidly when e.g. recording a video [41]. The device is delivered with a pair of attachable shades, nose pads of varying size, and a Micro USB mono headphone. Google Glass runs a modified version of Android. At the beginning of the work on this thesis, the base Android version was 4.0.4 (Ice Cream Sandwich), but this was later changed to version 4.4 (KitKat) with the XE16 update. The device is configured using either a website or a companion application. The configuration is done either using Bluetooth, if using the smartphone application, or with a QR code, if done through the web page.

3.1.1 Timeline

The timeline is the central part of the interaction with Google Glass. It consists of cards, with each card being some content attached to a point in time. Cards can include simple static content such as a URL or an image. They can also be more complex and display dynamic content; such cards are called live cards. Live cards are used e.g. in the compass or navigation applications. The centerpiece of the timeline is the clock screen, which is the default screen to be shown when the display is activated. From this screen, the timeline can be scrolled in two directions, left and right. Scrolling to the right represents going back in time. When new cards are inserted in the timeline, they are placed next to the center. As newer cards are inserted, the older cards are gradually pushed to the right and eventually disappear once they are too old. Scrolling to the left reveals cards that are happening “now”. Some of the cards shown here are permanent, such as the weather, calendar, and settings cards. They

are mixed with cards that are temporarily inserted by the user, e.g. the navigation, stopwatch, and compass cards.

3.1.2 Interaction

The main methods of interaction with Google Glass are the touchpad and voice commands. The multi-touch touchpad is located on the outside of the device along the bearer's temple. The touchpad is currently the only way to navigate through the timeline, although newly inserted cards are shown if the screen is activated soon after they are received. Some of the cards in the timeline have a text at the bottom reading “ok glass”; the user can either tap the touchpad or say “ok glass” to activate these cards. If the card is activated using voice input, a list of all voice commands that are available to the user will be shown. If it is activated through touch input, the user can scroll through the same commands and activate them by tapping once more. The clock card also has the “ok glass” text at the bottom. When this card is activated it shows a list of all the voice commands that can be used to start new interactions with the device, such as “take a picture” and “get directions”. Developers can add their own applications to this list, as long as they adhere to a predetermined set of voice commands. There are two intended ways of activating the screen while wearing Google Glass: tilting the head backwards, or tapping the touchpad. Since the display can be activated by simply tilting one's head, it is possible to perform most common tasks using only voice input. This allows the device to be used completely hands-free. This has great potential for the field support use case, but if voice is used as the only input method, the application will be vulnerable to noisy environments, or crowded environments where the technician would prefer not to use voice input. A less apparent way of interacting with the device is an inward facing IR sensor. The sensor is able to detect when the user blinks. This is used by functionality built into the system which lets the user take pictures by winking [42].

3.1.3 Microinteractions

The design of Google Glass is focused around microinteractions [43]. The goal is to allow the user to go from intent to action in a short time. When a user takes a note there should only be a few seconds between when the user realizes that a note needs to be written down and when the note has been completed. When a wearer of Google Glass receives an email, it will take only a few seconds before the user is looking at the email and determining whether it is important, provided the user is not preoccupied. This reduction in time for simple interactions results in the removal of the barrier towards the world that technology creates around the user. When using a smartphone, if the user has invested 20 seconds to retrieve the device and navigate to the appropriate application, she is bound to devote even more time while it is in

her hand. An incoming SMS leads to a quick scan of the inbox, and why not check the email, and social media as well? By limiting the time it takes to do these small tasks, the time in which technology is out of the way can be increased [43].

3.1.4 Hardware

The hardware in Google Glass has not been made official, but a teardown and an analysis with ADB have revealed some information about what components are inside [40]:

• OMAP4430 SoC

• 1 GB RAM, 682 MB unreserved

• 16 GB Flash storage, 4 GB reserved by system

• 640x360 pixel display

• 5 MP camera capable of recording up to 720p

• Bone conduction transducer

• One or two internal microphones

• 802.11b/g Wi-Fi

• Bluetooth 4.0

• 3 axis gyroscope, accelerometer, magnetometer

• Ambient light sensor

• Proximity sensor

The CPU and RAM are comparable to a low-end smart phone such as the Motorola Moto G [44]. A notable component that is absent is a 3G or 4G modem. This is because Google Glass is not intended to be used as the only device carried by the user.

3.1.5 Development

Initially, the only way to develop applications for Google Glass that could run without rooting the device was through the Mirror API [45]. It allows developers to build web services, called Glassware, that interact with the users' timelines. Static cards can be added, removed and updated using the provided REST API. In late November 2013 the Glass Development Kit (GDK) was released. It was built on top of Android 4.0.4 and provides functionality such as live cards, simplified

touch gestures, user interface elements, and other components that streamline application development for Google Glass. In April 2014, the XE16 update was released. This update brought the Android version up to 4.4, which includes the new WebView and MediaRecorder APIs.

3.1.6 Display

The display in Google Glass is a 640 by 360 pixel display and is roughly the equivalent of looking at a 25 inch screen eight feet away. It is a transparent optical display, and it takes up a very small portion of the user's field of view, approximately 14° [46]. The display is positioned above the user's natural field of view, which means that in order to look at the display the user needs to look up. This keeps the display from obscuring the user's vision, even while it is active.
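As a rough check of the stated equivalence, assuming the 25 inch figure refers to the screen's diagonal and taking eight feet as 96 inches, the subtended angle is

$$\theta = 2\arctan\left(\frac{25/2}{96}\right) \approx 14.8^{\circ},$$

which is close to the reported 14° field of view.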

3.2 Alternative Devices

Other devices than Google Glass were explored to gain a broader understanding of wearable devices and to provide alternatives. The type of wearable computer that was the focus of this thesis was the head-mounted display, which meant that only such devices were evaluated. It is possible to use e.g. a smartwatch with a camera, or a chest-mounted projector such as SixthSense [47], for the use case, but HMDs were prioritized as it is more likely that a solution for Google Glass is reusable on another HMD. The list was also limited to devices that were commercially available or soon to be commercially available. The devices were compared to Google Glass with a few main aspects in mind:

• How similar is the hardware?

• What display technology is used, and how does it differ?

• How do the development platforms compare?

3.2.1 Meta Pro

Although the Meta Pro [13] might seem similar to Google Glass, using these would significantly change the user interaction. They provide direct AR through optical see-through displays as well as motion detection using depth sensors. They have significantly more powerful hardware directly available, as all the processing is done on a separate high-performance pocket computer capable of running desktop operating systems [48]. Applications are developed either using Meta's internally developed SDK or with Unity3D [49]. It is unlikely that any non-native Android application that is developed for Google Glass will be easily portable to this device. In the end this does not matter, as the user interaction is so fundamentally different that any

application designed with Google Glass in mind would have to be redesigned from the ground up.

3.2.2 Vuzix M100

The Vuzix M100 [14] is very similar to Google Glass; both have an OMAP 4 CPU, although the M100 has a slightly faster OMAP4460. The M100 also runs Android 4.0.4, has a similar display which only makes it capable of indirect AR, and shares many other hardware features. Any application developed for Google Glass would likely be easy to port to the M100. Differences between the two are that the M100 has an opaque display and multiple buttons instead of a touchpad.

3.2.3 Oculus Rift Development Kit 1/2

The Oculus Rift [8] is primarily intended for use as an alternative to a conventional display when playing computer games. The available development version does not have a transparent display or a front-facing camera, which makes it incapable of performing AR. This can be solved using custom modifications such as attaching a front-facing camera, which is potentially going to be included in the consumer version [50]. The Oculus Rift has to be connected to a control box, which in turn receives images from an external device. This makes it very impractical to use and unlikely to fit the use case.

3.2.4 Recon Jet

The Recon Jet [15] has a lot in common with Google Glass: it has a small optical HMD and is also incapable of direct AR. Since it also runs Android, but with an unknown CPU, it would likely be easy to port any application from Glass to run on the Recon Jet. This device has an advantage over Glass in that the touch sensor works in all weather conditions, and also while wearing gloves [51]. This can be a big advantage depending on the typical work environment of the field technician.

3.2.5 XMExpert

The XMExpert [52] is a complete solution for field service support, and is included here as a parallel solution rather than an alternative device. It includes a portable workstation for the expert, and a helmet-mounted AR system for the field technician. Instructions are given by the expert using his hands, which are overlaid on the technician's view in a similar way to the Vipaar system [53]. Just like the Oculus Rift, the AR helmet system was in itself quite large, and the expert also has to carry a backing unit when moving around. This makes the system

less suited for the entire use case, where the technician is able to use the same device to display information when traveling to the location as well as when working.

3.3 Web Technologies

The number of technologies available in web browsers has grown significantly over the past years; the days when browsers could only view static HTML pages are long gone. HTML5 is the latest version of HTML, the previous version being 4.01, which was released in 1999 [54]. It adds multiple features that can be used to allow rich content in the browser without the need for plugins.

3.3.1 WebRTC

RTC stands for Real-Time Communication, technology that enables peer-to-peer audio, video, and data streaming. WebRTC provides browsers with RTC capabilities, allowing browser-to-browser teleconferencing and data sharing without requiring plug-ins or third-party software [55]. WebRTC is currently supported by Google Chrome, Mozilla Firefox, Opera, and Opera Mobile. The implementation of WebRTC varies between the different browsers; not all features are implemented everywhere and some lack interoperability. This is partly because the WebRTC standard is still under debate [56]. The WebRTC API has three important interfaces: getUserMedia, RTCPeerConnection and RTCDataChannel [57]. getUserMedia is used to gain access to the user's video and audio recording devices through a LocalMediaStream object. When getUserMedia is called, the user is prompted to allow the use of the recording devices. The user is able to deny the request, protecting against malicious websites and recording at inconvenient times. The RTCPeerConnection interface handles the connection to another peer. The LocalMediaStream object retrieved with getUserMedia can be added to an RTCPeerConnection in order to allow the remote peer to receive a media stream from the user. Likewise, the RTCPeerConnection will receive remote streams when these are added by the peer. An RTCPeerConnection instance will emit events with signaling data that needs to be received by the other peer. How these signals should be transmitted is not included in the WebRTC specification, and needs a separate implementation. The RTCDataChannel interface provides a real-time data channel which transmits text over the Stream Control Transmission Protocol (SCTP) [58]. SCTP supports multiple channels that can be configured depending on what kind of reliability is required. By default the channels will retransmit lost packets and guarantee that packets arrive in order, but both of these behaviors can be switched off.
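A minimal browser-side sketch of how the three interfaces fit together is shown below, using the callback-style API that browsers exposed at the time (vendor prefixes omitted). sendToPeer and attachToVideoElement are assumed application helpers, and the handling of the remote peer's offer/answer is left out.

```javascript
// Sketch only: one side of a WebRTC call with a data channel.
const pc = new RTCPeerConnection({ iceServers: [] });

// Data channel, e.g. for annotations; ordered and reliable by default.
const channel = pc.createDataChannel('annotations');
channel.onmessage = (event) => console.log('annotation:', event.data);

// Forward locally gathered ICE candidates to the peer via the
// application's own signaling transport.
pc.onicecandidate = (event) => {
  if (event.candidate) sendToPeer({ candidate: event.candidate });
};
// Remote media streams arrive as the peer adds them.
pc.onaddstream = (event) => attachToVideoElement(event.stream);

// Capture camera and microphone, then send an SDP offer to the peer.
navigator.getUserMedia({ video: true, audio: true }, (stream) => {
  pc.addStream(stream);
  pc.createOffer((offer) => {
    pc.setLocalDescription(offer);
    sendToPeer({ sdp: offer });
  }, console.error);
}, console.error);
```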


3.3.2 WebGL

WebGL (Web Graphics Library) enables browsers to access graphics hardware on the host machine in order to render accelerated real time graphics. It is primarily used to render 3D scenes, but can be used for other purposes as well, such as image processing. The WebGL API is based on OpenGL ES 2.0 [59]. OpenGL ES is in turn a subset of OpenGL [60], aimed at embedded devices. The API uses an HTML5 canvas element [61] as its drawing surface.

Figure 3.1: Simplified WebGL Pipeline

WebGL uses a pipeline to do its work; a simplified version of the WebGL pipeline can be seen in figure 3.1. The blue elements are set with the WebGL API, while the red elements are fixed, although in some cases their properties can be changed. The green elements are called shaders, and it is through these that the developer can program the GPU. The shader programs are written in the OpenGL Shading Language (GLSL) [62]. WebGL is essentially a 2D API, and not a 3D API [63]. The output of a WebGL pipeline is a 2D surface; the developer is responsible for creating the 3D effect. A program that wishes to display an object that is specified in 3D coordinates has to project the object onto a 2D plane called clipspace. The coordinates of the clipspace plane are always -1 to +1 along both the x-axis and y-axis. The vertex shader is responsible for converting 3D points (vertices) to clipspace coordinates. It takes a single vertex as input and outputs the clipspace coordinates of the vertex. Each vertex is a part of a primitive shape, such as a triangle or line. When the vertex shader has computed the position for all points in a primitive, it is rasterized. Rasterization is the process of converting a shape specified by coordinates to a raster image (pixels). An illustration of this is shown in figure 3.2. When the rasterization is complete, every pixel that is produced is sent to the fragment shader (this is not always the case, e.g. if multisampling is used [64]).


Figure 3.2: Illustration of Rasterization

The fragment shader is responsible for setting the color of each pixel it receives and can also discard the pixel. When OpenGL is used for image processing, this is commonly where all the processing is done. The end of the pipeline, the framebuffer [65], is by default the buffer that is displayed to the user once the rendering is complete. It is however possible to replace the default framebuffer with one specified by the programmer. Using this feature it is possible to render a scene to a texture, which then can be used when rendering the next scene.
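A minimal sketch of using WebGL for image processing is shown below: a full-screen quad is drawn, and the per-pixel work happens in the fragment shader (here a simple grayscale conversion stands in for e.g. the edge filter from section 2.4.1). It assumes a canvas element is present; texture upload, framebuffer setup and error handling are omitted, and the code is an illustration rather than the prototype's implementation.

```javascript
const gl = document.querySelector('canvas').getContext('webgl');

// Pass-through vertex shader: positions are already in clipspace.
const vertexSrc = `
  attribute vec2 position;
  varying vec2 texCoord;
  void main() {
    texCoord = position * 0.5 + 0.5;        // clipspace -> [0,1] texture space
    gl_Position = vec4(position, 0.0, 1.0);
  }`;

// Fragment shader: per-pixel processing (grayscale as a placeholder).
const fragmentSrc = `
  precision mediump float;
  uniform sampler2D image;
  varying vec2 texCoord;
  void main() {
    vec4 c = texture2D(image, texCoord);
    float gray = dot(c.rgb, vec3(0.299, 0.587, 0.114));
    gl_FragColor = vec4(vec3(gray), 1.0);
  }`;

function compile(type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  return shader;
}

const program = gl.createProgram();
gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSrc));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSrc));
gl.linkProgram(program);
gl.useProgram(program);

// Two triangles covering the whole clipspace square.
const quad = new Float32Array([-1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1]);
gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
gl.bufferData(gl.ARRAY_BUFFER, quad, gl.STATIC_DRAW);
const position = gl.getAttribLocation(program, 'position');
gl.enableVertexAttribArray(position);
gl.vertexAttribPointer(position, 2, gl.FLOAT, false, 0, 0);

gl.drawArrays(gl.TRIANGLES, 0, 6); // runs the fragment shader per pixel
```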

3.4 Mario

Mario is an implementation of WebRTC. It is built using the GStreamer library and runs on OS X, Android, iOS and Linux. Mario can be used to write both web-based and native applications. A C API is used for native applications, while web-based applications use a JavaScript API. The JavaScript API uses remote procedure calls (RPC) [66] to send commands to a server running Mario in the background.

3.4.1 GStreamer

GStreamer [67] is an open source library that runs on Linux, OS X, iOS, Windows, and many other platforms. GStreamer relies on GObject [68] to provide object oriented abstractions. The main purpose of GStreamer is to construct media processing pipelines using a large number of provided elements. GStreamer is designed with a plugin architecture which allows plugins to be dynamically loaded upon request. Each plugin contains one or more elements that can be linked together. An example pipeline of a simple MP3 player with an audio visualizer can be seen in figure 3.3. The pipeline does not function unless a queue element is inserted, which is left as an exercise for the reader.


Each element in the pipeline has its own function:

filesrc: reads a file and passes on a raw byte stream.
mpegaudioparse: parses the raw data into an encoded MP3 stream.
mpg123audiodec: decodes the MP3 stream and passes on raw audio data.
tee: splits the stream.
autoaudiosink: automatically selects a suitable method for audio playback and plays the received audio stream.
wavescope: creates a wave visualization of the audio stream.
autovideosink: automatically selects a suitable method for video playback and shows the received video stream.

Figure 3.3: Example GStreamer Pipeline

This pipeline architecture makes it simple to swap out parts of the pipeline while keeping the rest intact. For example, if the previous MP3 player wanted to play back a Vorbis stream instead, this could be achieved by replacing mpegaudioparse and mpg123audiodec with vorbisparse and vorbisdec respectively. This is used in Mario to be able to encode and decode different video and audio formats.

3.5 Video Encoding

The current implementation of Mario on Android uses software encoders, some of which are NEON [69] optimized. A high-end consumer smartphone has enough processing power to encode video in real time using a software encoder. However, the OMAP 4430 processor in Google Glass does not have enough processing power to encode video at an acceptable resolution and frame rate. The desired resolution was VGA (640 x 480 pixels), with a frame rate of 25 frames per second. When using the default software encoder with these settings, no video was received at all. The NEON optimized encoder was able to transmit video at a few frames per second with a very low video quality.


While this was enough for initial testing, the performance of the video encoder would have to be improved significantly if used in a real application.

3.5.1 Hardware Accelerated Video Encoding

The most computationally intensive task in a real time multimedia communication application is normally video encoding [70]. Therefore, many mobile devices have dedicated hardware that can be used for video (and audio) encoding. Google Glass is no exception; its OMAP 4430 SoC has an IVA3 multimedia hardware accelerator. If hardware video encoding was to be used, it would need to be incorporated in the GStreamer pipeline in Mario. The only sensible way to enable hardware encoded video was by using an existing GStreamer plugin. Three possible alternatives were found: gst-ducati, gst-omx and the androidmedia plugin. All the listed plugins access the hardware at different layers: gst-ducati uses libdce (distributed codec engine), which is the lowest level; gst-omx uses OpenMAX IL, a middle layer; and androidmedia uses the Android MediaCodec API, the highest layer. The layer that is chosen has an effect on where the application will be able to run. Using gst-ducati will limit the use to devices with an OMAP processor, while gst-omx and androidmedia should theoretically allow the application to run on any Android device [71].

androidmedia

This plugin uses the Android MediaCodec API, which was added in Android 4.1. The API is built on top of OpenMAX IL [71], and makes it easier to provide cross compatibility. Since Google Glass initially ran Android 4.0.4, it could not be used.

gst-ducati

gst-ducati can be found at [72]; it depends on the following libraries: libdce, libdrm, a custom branch of gst-plugins-bad, and libpthreadstub, which were all also available at [72]. The Distributed Codec Engine (DCE) which gst-ducati uses is written specifically for the IVA-HD subsystem in the OMAP4 processor. It interfaces directly with the co-processor without the need for OpenMAX.

gst-omx

OMX is an abbreviation of OpenMAX (Open Media Acceleration), which is a cross-platform C-language interface that provides abstractions for audio, video, and image processing. OpenMAX provides three different interface layers: application layer (AL), integration layer (IL) and development layer (DL). A number of different GStreamer plugins exist for OMX, which all use OpenMAX IL. The most actively maintained alternative was chosen, which can be found at [73].

Implementation of Prototype

The research goals of the thesis were achieved by reviewing existing research, and creating and evaluating a prototype. The information gathered from the literature review was complemented with original ideas and used in the construction of the prototype. This chapter describes the process of implementing the prototype, the problems encountered, and how they were solved.

4.1 Baseline Implementation

The work began by implementing a basic version of the prototype, with minimal functionality. A web server was built using NodeJS [74]. It served a static web page and also acted as signaling server for the WebRTC call. The web page had controls to initiate a simple two-way audio and video WebRTC call. Basic session and user handling was added where any participant could freely choose to join any session using any username. A basic Android application for Google Glass was created, which provided a minimal implementation of WebRTC using Mario. The application could join a session on the server and set up a WebRTC call. The early prototype revealed several issues:

• The encoded video on Glass had to have a very low resolution, or no video would be received.

• The Glass application became sluggish over time and would eventually freeze completely.

• Glass became very warm and displayed a warning message.

These problems all hinted that the CPU in Google Glass was not powerful enough to allow software encoding of video. Different alternatives for hardware encoded video on Google Glass were therefore researched.


4.2 Hardware Accelerated Video Encoding

Mario was built using GStreamer version 1.0, so any plugin that provided hardware encoding would also have to be built on top of GStreamer 1.0. This was a significant issue; most of the libraries discovered were aimed at GStreamer 0.10. Three plugins were, however, found that used GStreamer 1.0: gst-omx, gst-ducati, and androidmedia, which were described in the previous chapter. Each of the plugins was built and tested, but none of them worked without modifications. It was possible to build androidmedia and load it into Mario, but the plugin used the Android media API, which was only available from Android 4.1. Since Glass at the time ran Android 4.0.4, the plugin could not be used. gst-omx could be built and loaded, but the plugin did not work and printed a long list of error messages. It was possible to build gst-ducati and all of its dependencies, but trying to link [75] it with Mario caused a large number of symbol conflicts. Out of the three plugins tested, it was concluded that gst-omx was the one that most likely could be adapted and integrated into Mario and Google Glass. androidmedia was limited by the Android version, and gst-omx was still actively developed and seemed to have a larger community than gst-ducati.

4.2.1 gst-omx on Google Glass

The TI E2E community [76] was a valuable source of information. A post by a TI employee explained how to debug OpenMAX on devices with an OMAP4 CPU: a log on the device, located at /d/remoteproc/remoteproc0/trace1, logged errors from the remote processor, including errors generated by OpenMAX. This was a vital tool as it reported errors that were not shown in the GStreamer log. A number of issues were found and resolved, enabling the use of hardware encoded video through gst-omx. Using hardware accelerated encoding, VGA video could be streamed at 25 frames per second, which was adequate for the prototype.

4.3 Ideas to implement

Ideas for how to continue the development of the prototype were explored. An aspect which was apparent from the beginning was that a remote view of the expert was not very useful to the technician. A solution similar to XMReality and VIPAAR was considered, where an augmented view of the expert is shown to the technician. This idea was abandoned, as it would require a more sophisticated setup on the expert side, and it would be difficult to provide something new to the solution. The idea that was chosen instead was that the expert would use simple annotation tools to annotate the view of the technician.


4.3.1 Annotating the Technician’s View

The idea was to allow the expert to draw on top of a still image or a video feed, and that the annotations would be sent to the technician. It was undecided whether this should be done on a video feed or on still images. In the collaborative communication study (see 2.3), it was concluded that annotating live video and annotating still images are roughly equally efficient. An important aspect of the test conducted in that research was that the person receiving the instructions had the same viewpoint all the time. Annotating live video allowed the person to align the annotations with the view, solving the problem of keeping the annotations stable. In the field technician use case, it would be beneficial if the technician is able to do work that requires moving around while receiving instructions. This meant that if live video annotations were to be used, some form of alignment of the annotations and the video would have to be done.

4.3.2 Aligning Annotations and Video

If live video was to be annotated, there would have to be a way to track annotations so that e.g. a circle around an object would stay around the object even as the technician moves around. An approach considered was to track the technician’s position. It does not solve the problem of finding the annotations’ position in relation to the real work objects, but it simplifies the problem. The concept was tested with a simple prototype application that aimed at showing a 3D cube in the real world using AR. A combination of the gyroscope, accelerometer and compass was used to track the rotation of the device, as described earlier (see 2.2.4). This worked well; the cube stayed in the same position when rotating one’s head, without any noticeable delay. The challenge proved to be to track the position of the device. As was also concluded earlier (see 2.2.4), an accelerometer cannot be used to accurately track the position of an object. This meant that image processing would have to be used to track the location of the technician, as well as to find the annotations’ location relative to the real world objects. A number of different approaches to this issue were considered:

1. Do all video processing on the expert side and send annotated video to the technician.

2. Recognize points in the video that can be used as anchors, send annotations relative to these points.

3. Same approach as 1, but transfer the annotations from the incoming video to a live feed on Glass.


When the video processing is said to be done on the expert or technician side, it is not required to be done on the actual device. It could be offloaded to another device or server, as long as it can be done with very low latency.

Alternative 1 would cause a big delay between the technician’s real view and his or her view in Glass. It would also require twice the bandwidth, as video is sent in both directions, and use more processing power on Glass to decode the incoming video. Alternative 2 was to synchronize the annotations using image recognition. This could be done by recognizing points in the video on both the expert and technician side, and then sending annotation coordinates that are relative to the recognized points. This is perhaps the most realistic and optimal solution, if it is possible. It would most likely require the image processing on Glass to be offloaded to a nearby device. Alternative 3 solves the latency issue present in alternative 1, but it increases the complexity significantly. It would likely require the computation to be offloaded to another device because of the limited processing power of Google Glass.

In the end none of the alternatives were implemented. Alternatives 1 and 3 require the ability to modify an incoming video stream and to transmit that stream to another peer, which is not supported by WebRTC. Alternative 2 might be possible to implement, but it was considered too complex to implement within the time frame. It was also not thought of from the beginning and is left as a possible path for future improvement. It was decided that still image annotation would be used instead. According to the research found, it should be at least as efficient as annotated video. The earlier mentioned experiment had also shown that humans are very good at correlating objects in an image with the real world, even if the image is from a different perspective, which would be the case when using still images.

4.4 Still Image Annotation

The implementation of still image annotation presented a few questions and challenges: how does the expert select the image, what types of annotations should be possible, and how are the annotations transmitted to the technician?

The problem of selecting an image was solved by allowing the expert to select a video frame from a few seconds in the past. The live video was shown in a section of the web application, with each shown video frame being pushed onto a queue. Whenever the user pressed a button on the page, the queued video was displayed in a separate section. The user could then seek through the queued video and select a suitable frame. The solution made it possible to always display the live video feed, while also being able to go back in time to select an optimal video frame. How the images were sent to the technician is described later in this chapter.

The first annotation method to be implemented was free drawing. The drawing was done using a 2D canvas in both the Glass application and the web application. The Android Canvas API [77] is very similar to the HTML5 Canvas API [78], which made it very simple to implement a drawing system with a consistent result for the same data. Both canvases also supported drawing text, which was added at a later stage.
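A minimal sketch of the frame queue described above is shown here. The element ID, queue length, and frame rate are assumptions used for illustration; they are not taken from the prototype.

```javascript
// Keep the last few seconds of rendered frames so the expert can seek back in time.
const remoteVideo = document.querySelector('#remote-video');
const frameQueue = [];
const MAX_FRAMES = 5 * 25;            // e.g. five seconds at 25 frames per second

function grabFrame() {
  const canvas = document.createElement('canvas');
  canvas.width = remoteVideo.videoWidth;
  canvas.height = remoteVideo.videoHeight;
  canvas.getContext('2d').drawImage(remoteVideo, 0, 0);
  frameQueue.push(canvas);
  if (frameQueue.length > MAX_FRAMES) frameQueue.shift();   // drop the oldest frame
}
setInterval(grabFrame, 1000 / 25);

// The slider in the web application simply indexes into the queue; the returned
// canvas is shown in the preview area and can be selected for annotation.
function frameAt(sliderValue) {
  return frameQueue[Math.min(sliderValue, frameQueue.length - 1)];
}
```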

4.4.1 WebRTC Data Channel

WebRTC data channels were chosen as the transport for the annotation data. This assured the lowest possible latency. It also meant that the annotation data was transmitted by the same means as the audio data, ensuring that the audio and annotations were synchronized. Mario had not implemented WebRTC data channels at the time; however, this would be added in the future. A simple solution that allowed data to be sent directly over UDP was implemented instead. The transport of the annotation data was designed with the assumption that SCTP would later be used. A protocol was created that assumed that all data was eventually delivered, but could handle out of order delivery. This reduces the latency, and more importantly it removes the spikes caused when a message is lost and has to be retransmitted. This was also a necessity as the messages temporarily had to be transported over UDP, although no retransmission of messages was implemented.
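Once data channels are available over SCTP, the desired semantics (reliable but unordered delivery) can be requested directly when the channel is created. The snippet below is a sketch of that configuration, assuming pc is the established RTCPeerConnection; the channel label and the message handler are hypothetical.

```javascript
// Reliable but unordered: every message is eventually delivered, possibly out
// of order, which avoids the latency spikes caused by in-order retransmission.
const annotationChannel = pc.createDataChannel('annotations', { ordered: false });

annotationChannel.onmessage = e => handleAnnotationMessage(JSON.parse(e.data));

function sendAnnotation(msg) {
  // Messages are small JSON objects (line, point, clear); see section 4.4.2.
  annotationChannel.send(JSON.stringify(msg));
}
```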

4.4.2 Out of Order Messages

A method that allowed the annotation data to be received out of order was devised. Each time the expert started drawing, a message was sent marking the beginning of a new line and any properties of the line, such as color. When the expert had moved the cursor far enough (10 px), a message was sent with the coordinates of the point the cursor had been moved to. This limited the amount of data sent by dividing the drawing into segments instead of pixels. In order to support out of order delivery, each line and point was given an index, with the point indices starting at 0 for each line being drawn. Every start of line message included the index of the line, while every point message contained the line index and its own index. A point message buffer was also added in the Glass application that buffered any point message that was received before the associated line message. As shown in the research on collaborative communication, it was important that the expert was able to remove old annotations to avoid confusing the technician by cluttering the scene. This was achieved by sending a clear message. The clear message supported out of order delivery by using the same incrementing index as was used by the line messages.


Figure 4.1: Out of Order Delivery of Messages

A method for ensuring that the correct annotations were cleared had to be implemented in the Glass application. This was achieved by keeping track of the minimum allowed index, as illustrated in figure 4.1. Whenever a clear message is received, the minimum index is set to the index of the clear message. At the same time, any annotation with an index lower than the new minimum is removed. Any message that is received with an index below the minimum is ignored. This solution was later expanded to allow clearing of individual colors by keeping track of a minimum index for each color.
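The receiver-side bookkeeping can be summarized as in the sketch below. The message field names and the per-color clear semantics are assumptions made for illustration; the actual protocol used in the prototype is only described in prose above.

```javascript
const lines = new Map();             // line index -> { color, points: [] }
const pendingPoints = [];            // points that arrived before their line message
const minIndexPerColor = new Map();  // lines below this index have been cleared

const minIndexFor = color => minIndexPerColor.get(color) || 0;

function handleAnnotationMessage(msg) {
  if (msg.type === 'clear') {
    // Raise the minimum index and drop every older line of that color.
    minIndexPerColor.set(msg.color, msg.index);
    for (const [idx, line] of lines) {
      if (line.color === msg.color && idx < msg.index) lines.delete(idx);
    }
  } else if (msg.type === 'line') {
    if (msg.index < minIndexFor(msg.color)) return;        // already cleared, ignore
    const line = { color: msg.color, points: [] };
    lines.set(msg.index, line);
    // Attach any points that were buffered while waiting for this line message.
    for (const p of pendingPoints.filter(q => q.line === msg.index)) {
      line.points[p.index] = p;
    }
  } else if (msg.type === 'point') {
    const line = lines.get(msg.line);
    if (line) line.points[msg.index] = msg;
    else pendingPoints.push(msg);                          // line message not seen yet
  }
}
```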

4.5 Image Processing

An idea that was brought up during the development of the prototype was the ability to highlight objects in the selected image by clicking on them. This was chosen as the feature that would be implemented after annotation by free drawing. Multiple different methods for selecting objects were considered:

1. The user draws a rough outline around the object, and an algorithm then refines the edge.

2. Recognize objects in the image and allow the user to select them.

3. The image is split into different regions. When a user clicks inside a region it is selected.

Out of the options listed, number 1 would provide the most accurate outline, but the goal of the object selection method was to provide a quick way for the expert to select an object. Alternative 1 defeats the purpose of the object highlighting, as the rough drawing would signal the object of interest just as well as a precise outline. Alternatives 2 and 3 were both chosen and methods for implementing them were researched.


The image processing had to be done either directly in the web application or offloaded to a server. A web based solution was preferred, as this would minimize the delay and the server load. Different libraries were found that could provide face detection or hand detection, but no library was found that provided either detection of objects that stood out in an image or some method of segmenting an image by color. An attempt was instead made to implement alternative 2 or 3 using WebGL.

4.5.1 Image Processing using WebGL

The image processing was implemented using fragment shaders. While they only output one color, fragment shaders can read multiple points from many different textures that are loaded onto the GPU. A series of fragment shaders can be composited with the help of framebuffers. The image is first loaded into a texture. The texture is then rendered to a framebuffer with an attached texture using the first shader. The new texture is then rendered to another framebuffer using the second shader, and so on. The result obtained in the final step can then be rendered to the screen or read into main memory.

Figure 4.2: Ping-Pong Framebuffers

Because a framebuffer cannot render to itself, at least two framebuffers had to be used. The render target alternated between the two framebuffers, using the unbound framebuffer’s attached texture as input to the fragment shader. An illustration of this is shown in figure 4.2. The figure also shows how new framebuffers can be created to save the intermediate steps in the process. This method was used as a debugging tool to be able to look through the different steps afterwards.
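The ping-pong structure can be expressed roughly as below. The helpers createFramebufferWithTexture() and drawQuad() stand in for the usual WebGL boilerplate (framebuffer and texture creation, quad geometry, attribute setup) and are not part of any library used in the thesis.

```javascript
// Run a chain of fragment-shader passes, alternating between two framebuffers.
function runPasses(gl, sourceTexture, shaderPrograms, width, height) {
  const ping = createFramebufferWithTexture(gl, width, height);
  const pong = createFramebufferWithTexture(gl, width, height);
  let input = sourceTexture;
  let target = ping;

  shaderPrograms.forEach(program => {
    gl.bindFramebuffer(gl.FRAMEBUFFER, target.framebuffer);
    gl.useProgram(program);
    gl.bindTexture(gl.TEXTURE_2D, input);
    drawQuad(gl);                                   // render a full-screen quad

    // The texture just rendered to becomes the input of the next pass,
    // and the other framebuffer becomes the new render target.
    input = target.texture;
    target = (target === ping) ? pong : ping;
  });

  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  return input;                                     // texture holding the final result
}
```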

4.5.2 Image Segmentation using Hough Transform

A challenge faced when implementing the image segmentation was how the selection of a region would be shared with the technician. Even if the image could be divided into suitable regions, the preferred way of sending the annotations to the technician was through the text based WebRTC data channel. For this reason, the first method that was tried was a form of object recognition using a Hough line transform (see 2.4.3). The intention was that a Hough transform would find line segments in the image that could then be stitched together, creating a graph. With some basic collision detection, the coordinates of a mouse click could be used to find the smallest subgraph that encompassed the click. The coordinates of the line segments could then be sent to the expert, and it would be trivial to render the outline.

The image was preprocessed by applying a Gaussian blur and then the simple edge detection algorithm described in the literature review (see 2.4.1). The Hough line transform was then performed, revealing any lines in the image. The lines were then supposed to be split into the actual line segments in the image, and the segments would then be combined into a graph.

This method proved to have some big disadvantages. The biggest one was that a Hough transform can only reveal objects with a predetermined shape; in this case it would only be able to detect objects with straight and sharp edges. This was known from the beginning, but it became apparent how limiting it was once the algorithm was run on some sample images. It also proved to be difficult to pick out the lines in the image by finding relevant local maximum points. Many of the sample images used had lines that were similar to each other but did not originate from the same object, e.g. two tables standing next to each other at slightly different angles.

4.5.3 Image Segmentation using Median Filters

While experimenting with the Hough transform, a median filter was tested to see if it would be a more suitable preprocessing step than the Gaussian blur filter. It turned out that a decent segmentation of the image could be achieved by applying the median filter several times and then running an edge detection shader. This method was further improved upon by creating a second type of median filter that sampled the surrounding pixels in a circular pattern instead of a square. The two different median filters were applied after each other a number of times, and by trial and error a pattern that yielded an acceptable result was found. The method provided a decent segmentation of the image. It was not able to properly identify objects when an edge was blurred, e.g. if out of focus.
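The edge detection pass run after the filtering can be illustrated with a fragment shader along the lines of the one below. The neighbour pattern and threshold are generic choices made for illustration; they are not necessarily the exact algorithm referenced in the literature review.

```javascript
// WebGL (GLSL ES 1.0) fragment shader: white inside regions, black on borders.
const edgeShaderSource = `
  precision mediump float;
  uniform sampler2D u_image;
  uniform vec2 u_texelSize;        // 1.0 / image resolution
  varying vec2 v_texCoord;

  void main() {
    vec3 c  = texture2D(u_image, v_texCoord).rgb;
    vec3 dx = texture2D(u_image, v_texCoord + vec2(u_texelSize.x, 0.0)).rgb - c;
    vec3 dy = texture2D(u_image, v_texCoord + vec2(0.0, u_texelSize.y)).rgb - c;
    float edge = step(0.1, length(dx) + length(dy));
    gl_FragColor = vec4(vec3(1.0 - edge), 1.0);
  }
`;
```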


Metropolis-Hastings Algorithm

An attempt was made to implement the Metropolis-Hastings algorithm [79]. Good results were achieved with a slow cooling process and an iteration count of several thousand, but this resulted in a computation time far above what was acceptable for the application. The shortcut method with a fast cooling process that results in domain fragmentation was never achieved, most likely due to bad parameters or errors in the implementation.

Figure 4.3: Result of Median Filters and Edge Detection — (a) before, (b) after

4.5.4 Region Labeling

An example of a resulting image after the median filters and the edge detection have been applied can be seen in figure 4.3. The regions found in the image are white, while the borders between the regions are black. A way to select a region and to share this selection with the technician had to be implemented. This was achieved with the help of the labeling algorithm described earlier (see 2.4.4). Initially another, more naive algorithm was used, which took roughly 1.3 s to label an image with VGA resolution. When using the new algorithm the time was reduced to below 1 s. This was still too slow to be an acceptable solution, especially since the web browser is frozen while the algorithm is running.

The solution was to use a web worker [80], which can be thought of as a thread in JavaScript. Web workers are more closely related to processes, however, as they run in a separate context and can only share immutable data with other contexts. When moving the computation to a web worker, the goal was to allow the main thread in the browser to keep running while the computation was being executed. It turned out to have a side effect: the execution time went down to approximately 30 ms. It was guessed that isolating the code in an otherwise empty context allowed better optimizations to be made, but this was never studied further.

Once the regions had been labeled, an outline around a single region could be rendered using the region label and an edge detection shader. The same method was also used to send the label to the technician.
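The division of work between the main thread and the web worker can be sketched as below. The worker file name and message format are made up, and the labeling shown is a simple stack-based flood fill rather than the exact algorithm from the literature review; it assumes the edge image has white region pixels and black border pixels.

```javascript
// Main thread: hand the pixel data (e.g. from gl.readPixels or getImageData)
// to the worker; the underlying buffer is transferred rather than copied.
const worker = new Worker('labeling-worker.js');
worker.onmessage = e => renderOutline(e.data.labels);       // Int32Array of labels
worker.postMessage({ pixels: imageData.data.buffer, width, height },
                   [imageData.data.buffer]);

// labeling-worker.js
onmessage = e => {
  const { pixels, width, height } = e.data;
  const data = new Uint8ClampedArray(pixels);
  const labels = new Int32Array(width * height);            // 0 = unlabeled
  let nextLabel = 1;

  for (let start = 0; start < labels.length; start++) {
    if (labels[start] !== 0 || data[start * 4] === 0) continue;   // skip border pixels
    labels[start] = nextLabel;
    const stack = [start];
    while (stack.length > 0) {
      const p = stack.pop();
      const x = p % width, y = (p / width) | 0;
      for (const [nx, ny] of [[x - 1, y], [x + 1, y], [x, y - 1], [x, y + 1]]) {
        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
        const n = ny * width + nx;
        if (labels[n] === 0 && data[n * 4] !== 0) {          // unlabeled region pixel
          labels[n] = nextLabel;
          stack.push(n);
        }
      }
    }
    nextLabel++;
  }
  postMessage({ labels }, [labels.buffer]);
};
```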

4.6 Glass Application

The development of the Glass application was very straightforward, as it did not have much of a graphical user interface (GUI) to speak of. In most cases the implementation simply mimicked the web application.

4.6.1 Configuration

The address of the server was initially hardcoded in the application. This meant that every time a different network was used, the address in the code had to be changed. This quickly became a tedious task, so a method of configuring the address was added. QR codes are used to configure the Wi-Fi network on Glass, and this idea was copied when adding the address configuration option. A library to generate QR codes was added to the web application. When clicking a button, a QR code is shown that contains the current address of the server. Since Android does not provide a QR code reader, a library called ZXing [81] was used to provide that functionality. The QR code scanner could be accessed through a context menu in the Glass application. When scanning the QR code provided by the web application, the received address was stored in the shared preferences of the application. The next time the application tried to connect to the server it would use the updated address.

4.6.2 OpenGL ES 2.0

The OpenGL version used in the Glass application was OpenGL ES 2.0, which WebGL is based upon. This meant that all the methods used for the rendering could be reused, and the shaders could be reused without any modification. The Glass application did not mirror the image processing functionality; only the methods used to render an outline around an image region were implemented.

4.6.3 Photographs by Technician

When testing the system it was realized that it would be useful for the technician to be able to take photographs and send them to the expert. This would allow the expert, using annotations or voice guidance, to instruct the technician to e.g. take a close-up picture of an object. With the solution at the time, the technician would have to simply look at the object for some time so that the expert could then select a suitable frame.

This proved to be a technical challenge, as the camera has to be acquired before a photograph can be taken, and it is already in use by Mario. A possible solution would be to end the call, take the picture, and then resume the call. This seemed like a solution that would make the feature pointless, as it would take much longer to take the picture than to simply look at the object for some time. An alternative solution was found that allowed Mario to keep running while a photograph is taken. GStreamer provides methods for intercepting buffers that are being sent through the pipeline. This was used to add a function to Mario which copied the next video frame sent from the camera. This meant that the picture had the same resolution as the video, but it had better quality as it was extracted before the video was encoded. This solution also allowed photographs to be taken with a very small delay, as the camera is already recording video.

4.7 Signaling Server

This section describes the implementation of the signaling server. The server had several tasks to handle:

• Keeping track of sessions and users.

• Providing a signaling channel between the users for WebRTC.

• Allowing images and photographs to be uploaded to a session.

• Sending notifications to users on events such as image uploads or users joining and leaving.

4.7.1 Sessions and users

The session and user handling was very simple. No authentication was implemented, as this was not a needed feature in the prototype. Session IDs and usernames are chosen by the users when connecting to the server. Users receive events when another user joins or leaves their session, and are able to send messages to any user in the same session.
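On the server this amounts to little more than a map of sessions. The sketch below is hypothetical (the names and structure are not taken from the prototype); sendEvent() is shown in the server-sent events sketch in the next section.

```javascript
const sessions = new Map();   // session id -> { users, photos, expertImage }

function getSession(id) {
  if (!sessions.has(id)) {
    sessions.set(id, { users: new Map(), photos: new Map(), expertImage: null });
  }
  return sessions.get(id);
}

function join(sessionId, username, response) {
  // The response is kept open and used as the user's server-sent event stream.
  getSession(sessionId).users.set(username, { response });
  broadcast(sessionId, 'user-joined', { username });
}

function broadcast(sessionId, event, data) {
  for (const [, user] of getSession(sessionId).users) {
    sendEvent(user.response, event, data);
  }
}
```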

4.7.2 Server-sent events

The signaling channel was to be implemented using either web sockets [82] or server-sent events [83]. Server-sent events are implemented in the browser using the EventSource API, while web sockets use the WebSocket API. The biggest difference between server-sent events and web sockets is that web sockets are bidirectional, while server-sent events only allow data to be sent from the server to the client. Data can still be sent to the server when using server-sent events, but this has to be done separately using regular HTTP requests. This causes a large overhead compared to web sockets if small chunks of data are sent. The server implementation of server-sent events would also be simpler than the web socket implementation, since server-sent events simply keep an HTTP request open.


Before the XE16 update to Google Glass, which brought the new Android 4.4 WebView, neither of these technologies was supported by the WebView in the GDK. This could be solved using a polyfill, which is a drop-in script that implements some missing functionality in the browser [84]. Using a polyfill caused the browser to fall back to long polling. Long polling is a form of polling where the browser requests the resource at a much slower rate; if the server does not have any response available when the request is received, it is left open until there is something to send. In the end this issue disappeared when the XE16 update was released.

Server-sent events were chosen because the data sent to the server was sent in large chunks and infrequently; only the notifications from the server to the clients were small and frequent. On the server they were implemented by keeping one request open for each connected user. When the web application had finished loading, a request was sent to the path /join along with the required parameters. Upon receiving the request, the response was simply left open and the content type was set to text/event-stream. This allowed events to be sent using the open response. Server-sent events were used to implement both the server notifications, such as users disconnecting and images being uploaded, and the signaling channel that was used to initiate the WebRTC call.
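A minimal sketch of the /join handler and the event format is shown below, assuming a plain Node HTTP server; routing, parameter parsing, and the other endpoints are omitted, and the event names are illustrative.

```javascript
const http = require('http');

// The text/event-stream format: an optional event name followed by data lines.
function sendEvent(response, event, data) {
  response.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
}

http.createServer((req, res) => {
  if (req.url.startsWith('/join')) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });
    // The response is left open; events are written to it as they occur.
    sendEvent(res, 'connected', { ok: true });
    // join(sessionId, username, res) from the session sketch would be called here.
  }
  // Static files, signaling messages, and image upload are handled elsewhere.
}).listen(8080);

// Browser side: the EventSource API delivers the named events as they arrive.
// const source = new EventSource('/join?session=1&user=expert');
// source.addEventListener('image-uploaded', e => loadImage(JSON.parse(e.data)));
```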

4.7.3 Image upload

The image upload was implemented by saving images that were sent to a specific path using a PUT request. The images were saved per session and were stored in memory only. The photographs uploaded by the technician were assigned a unique ID, and all uploaded photographs were saved. Of the images uploaded by the expert, only the most recent one was saved. Each time an image was uploaded, an event was sent to all users in the session. Different events were emitted depending on whether the expert had uploaded an image or the technician had uploaded a photograph. This allowed the clients to start downloading images as soon as they had been uploaded. An improvement that was considered was to send the image events as soon as the image upload had started. The image would then be streamed to any involved client while it was still being uploaded, resulting in reduced latency. This was never implemented due to time constraints.
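The upload handling can be sketched as follows, reusing getSession() and broadcast() from the earlier sketches; the paths, event names, and ID scheme are assumptions.

```javascript
// Handle a PUT request with an image body; images are kept in memory per session.
function handleImageUpload(req, res, sessionId, fromTechnician) {
  const chunks = [];
  req.on('data', chunk => chunks.push(chunk));
  req.on('end', () => {
    const image = Buffer.concat(chunks);
    const session = getSession(sessionId);
    if (fromTechnician) {
      const id = session.photos.size + 1;          // unique id; every photograph is kept
      session.photos.set(id, image);
      broadcast(sessionId, 'photo-uploaded', { id });
    } else {
      session.expertImage = image;                 // only the most recent expert image is kept
      broadcast(sessionId, 'image-uploaded', {});
    }
    res.writeHead(204);
    res.end();
  });
}
```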

Result

This chapter provides a summary of the final implementation of the prototype from a user point of view.

5.1 Web Application

Figure 5.1 shows a view of the web application when it has just loaded. The configure button (1) displays a QR code which can be scanned by the technician to set the server address to the address that the expert is currently connected to. Once the technician has connected to the same session, an “Initiate Call” button will be displayed in area A. Area A is where the remote view of the technician is displayed. When button 2 is clicked, a scan through the last few seconds of video is initiated. While scanning through the video with the slider in area 3, the frames are displayed in area B. When the technician takes a photograph it will appear in area D. The expert is able to preview the pictures by hovering over them.

At any point during a call, images can be dragged from areas A, B, and D to area C to select an image for annotation. The selected image is immediately displayed to the technician through the display on Google Glass. Images can also be dragged from other sources such as other websites or the file system.

When the expert has selected an image, it can be annotated using any of the methods seen in area 5: text, highlighting, or free drawing. The color of the annotations is selected with the color picker (4). In free drawing mode the expert annotates the image by drawing with the mouse on top of the image. Highlight mode allows the expert to select regions in the image that have been found using an image segmentation algorithm. When the expert hovers over the regions of the image, a preview of the highlighting is shown. If the expert clicks the region, it is selected and the highlighting is sent to the technician. When in text mode the expert can click anywhere within the image to display a box where text can be inserted. If the expert hits enter, the text is converted to an annotation and displayed to the technician, while escape aborts the text input.

The erase buttons (6) can be used to remove the annotations. The “Erase Color” button only removes annotations of the currently selected color, while “Erase All” removes all annotations. All annotations are automatically removed if a new image is selected. Buttons 7 and 8 are debug tools and are used to display sample images and enable debug keybindings. The debug keys can be used to go back and view the different steps of the image segmentation process.

Figure 5.1: Web Application

5.2 Google Glass Application

The user opens the application using a voice command or by navigating to it in the command list. When inside the application, the user is always able to quickly tap the touchpad to bring up the context menu. Through the menu the user is able to stop the application, which alternatively can be done by swiping down on the touchpad. The menu also provides access to a QR code scanner, which is used to configure the server address by scanning the code generated by the web application. Once the application has started, it automatically tries to connect to the server using the configured address. The user gets feedback through a large message in the middle of the screen which shows the connection status. If the application successfully connects to the server, a message saying “Connected” is shown.


If the connection is not successful, the user can tap the touchpad with two fingers to attempt to reconnect. Once the application is successfully connected to the server, the technician has to wait for the expert to initiate the call. When the call is started, the text disappears and the technician is able to speak to the expert. As soon as the call has started, the technician is able to long press the touchpad to send a photograph to the expert. When the expert has selected an image to annotate, the image is displayed in full screen to the technician. As soon as the expert adds or removes any annotations, they are instantly sent to the technician and displayed on top of the image.


Discussion

This chapter analyzes the work and results of the thesis. In addition to implementing a prototype of a collaborative communication system, a research goal was to evaluate the prototype from the viewpoint of further improvements, reusability, and how well suited Google Glass is for the task.

6.1 Analysis of Method and the Result

This section discusses some of the choices made when implementing the prototype and parts that could have been done differently.

6.1.1 Live Video Annotations

The choice of sending still images to the technician instead of video was sensible given the time frame, but it would have been interesting to investigate the use of live video annotations. If the time spent implementing the image segmentation had instead been spent investigating live video annotation, it is possible that the discoveries made would have been more closely related to the goal of the thesis. In either case it would have been a good idea to do a simple implementation of live video annotations, without any compensation for delay or movement, if only to get a better understanding of the issues and benefits of live video annotations.

6.1.2 Early Prototype

Implementing an early prototype with minimal functionality provided valuable information about the performance issues of Google Glass. It allowed the planning to be changed at an early stage to ensure that a solution was found. If the implementation of the prototype had not started until later, it is likely that the unexpected issues would have had a bigger impact on the result.

6.1.3 WebGL

An issue with WebGL on displays with a high pixel density was discovered after the prototype had been completed: it took a very long time, up to several seconds, to read the image data needed by the region labeling algorithm from the GPU. Given the current state of WebGL, which is mostly stable but has performance issues in some edge cases, it would have been favorable to run the image segmentation algorithm on a separate server. Even so, going back with the knowledge of the issues with WebGL, it would still be compelling to use. Implementing an image processing algorithm in WebGL was an interesting example of the possibilities of new web technologies.

6.1.4 OpenMAX

Once the Android MediaCodec API was available on Google Glass, the androidmedia plugin was tested. While simply plugging an androidmedia video encoder into the pipeline did not work, it is possible that using androidmedia would have been a lot easier than gst-omx, while also providing a more portable solution. This was not something that could have been done differently, however, as the MediaCodec API was made available toward the end of the thesis work.

6.1.5 Audio

The visual component of the communication system received far more attention than the audio, which is perhaps justified. Some time should nevertheless have been spent on a small-scale comparison of different audio codecs and settings, to find a combination that works particularly well with the bone conduction transducer used by Google Glass.

6.2 Further Improvements

This section lists possible improvements and some ideas that were never implemented.

6.2.1 Image Segmentation

The most direct improvement to the prototype would be to replace the image segmentation algorithm. The implemented algorithm is good enough to demonstrate the concept, but it has a lot of flaws. It was very vulnerable to bad focus and soft edges on objects, and would often fail to outline an object that was clearly distinguishable from the background. A method that was discovered towards the end of the work was mean shift; it would probably be a big improvement over the current solution. Alternatively, the implementation of the Metropolis-Hastings algorithm could be fixed and/or tuned.

6.2.2 Video Annotations

Something that should be investigated is how the still image solution compares to a live video solution. This would mean implementing live video annotations, which could be considered an entirely new system rather than an improvement to the existing prototype.

6.2.3 UX Evaluation

A thorough evaluation of the user experience would also have to be done. This evaluation should focus on the technician’s work: how easy it is to adapt to the use of Google Glass, and how easy it is to follow the instructions provided.

6.2.4 Gesture and Voice Input

An idea that was considered was adding some form of gesture or voice input for the technician. This would allow the technician to provide input in situations where it might not be possible otherwise.

6.2.5 More Annotation Options

The usability of the annotation interface could be improved by adding features such as drawing shapes, undo and redo, and working with multiple images simultaneously.

6.2.6 Logging

A system that records the technician’s actions could also be implemented. The video and audio would be recorded, along with other data such as facing, position, temperature, and bandwidth usage. This would require a more sophisticated setup where the video is transmitted via an intermediate server where it can be recorded.

6.3 Reusability

The system can easily be adapted for any other device that is similar to Google Glass, such as the Recon Jet or Vuzix M100. Because these devices also run Android, the application could likely be ported with minimal effort, only replacing the GDK specific code with a solution written with their SDKs. It would however be difficult to adapt the prototype to any direct AR device, such as the Meta Pro. It does not make sense to use still images to give instructions to someone wearing a direct AR device. The images could perhaps be displayed on a simulated floating screen just like on the other devices, but this would just cause a loss of detail compared to the other options. The Meta Pro also provides depth sensors and other tools that would make live video annotation a lot simpler to implement.


6.4 Capabilities of Google Glass

It was obvious from the beginning that a big issue would be the processing power available on Google Glass. What was also noticed early was the problem of overheating. When using the built in video recording software, it took only a few minutes for the device to become quite warm. When developing the prototype, Glass had to be removed from time to time because it became too hot to be comfortable to wear. Once the hardware could be accessed to accelerate the video encoding, it was proven that it was possible to use Glass for the use case. The device did not overheat as quickly, although it still happened after 5–10 minutes of use. In its current state it is not suitable for long calls to the back office support, but it can definitely be used for short calls that do not last longer than about 5 minutes.

The battery capacity was most often not an issue, although this was most likely because the device overheated before the battery could be depleted. Battery life is an issue that could quite easily be resolved in the use case. The technician could e.g. charge the device while driving to the location. There are also several products that aim to extend the battery life of Google Glass by attaching an extra battery.

An issue that was quickly recognized was the audio quality. Glass uses a bone conduction transducer, which results in quite bad audio playback. One of the design goals of the device is that it should not shut out the user from his or her surroundings. This feature comes at a cost, as the audio quality suffers.

6.5 Effects on Human Health and the Environment

This section discusses possible effects on the user and the environment.

6.5.1 Environmental Impacts

Any communication system that is able to replace the need to be at a location in person has the potential to reduce both environmental and economic costs. When the problem at a site turns out to be too complicated for the technician to solve alone, he or she can call support instead of having to give up and send someone else to the site. If the issue can be solved remotely using the instructions from an expert, unnecessary pollution and expenses arising from traveling to the site can be avoided. The better adapted the communication system is, the faster and more reliably problems will be solved. Failures are avoided by providing the expert with the tools necessary to give comprehensive instructions to the technician, which is what the proposed system aims to do.

6.5.2 Health Concerns

The effect the prototype has on the health of the technician is minimal. The indirect AR used in the prototype does not cause simulation sickness, which is a risk when using a video see-through direct AR system. The effects are likely limited to eye strain caused by looking at the screen, and some possible discomfort if Google Glass becomes overheated.


Conclusion

In conclusion, the prototype that was created shows that it is possible to create a communication system involving Google Glass that is well adapted to the use case. With some work, the system is likely to be an improvement over any smartphone or tablet based solution. Glass is also well suited for the entire use case, as it can be used for navigation when driving to the remote site, as well as for notifying the technician of any events. There are however some issues with the hardware on Google Glass, the most apparent being the problem of overheating. One should not forget, however, that the version of Google Glass used is the Explorer Edition, and much can change before the device is officially released to the market. It is also the first generation of a new type of device, and it is likely that future versions will fill in the gaps.


Bibliography

[1] Mann S. In: Soegaard M, Dam RF, editors. Wearable Computing. Aarhus, Denmark: The Interaction Design Foundation; 2013.

[2] Thorp EO. The Invention of the First Wearable Computer. In: ISWC ’98 Proceedings of the 2nd IEEE International Symposium on Wearable Computers. Pittsburgh, PA, USA: IEEE Computer Society Press; 1998. p. 4–8.

[3] Peng J, Seymour S. Envisioning the Cyborg in the 21st Century and Beyond; p. 18.

[4] The history of wearable computing [homepage on the Internet]. Dr. Holger Kenn; [cited 2014 May 24]. Available from: http://www.cubeos.org/lectures/W/ln_2.pdf.

[5] The History of Wearable Technology [homepage on the Internet]. Chicago: The Association; c1995-2002 [updated 2001 Aug 23; cited 2002 Aug 12]. Available from: http://217.199.187.63/click2history.com/?p=21.

[6] Glacier Computer – Press Release [homepage on the Internet]. Glacier Computer; [updated 2014 May 24; cited 2014 May 25]. Available from: https://github.com/zxing/zxing.

[7] Pebble: E-Paper Watch for iPhone and Android [homepage on the Internet]. Kickstarter, Inc.; [cited 2014 May 25]. Available from: https://www.kickstarter.com/projects/597507018/pebble-e-paper-watch-for-iphone-and-android.

[8] Oculus Rift [homepage on the Internet]. Oculus VR, Inc.; [cited 2014 May 25]. Available from: http://www.oculusvr.com/rift/.

[9] SmartWatch [homepage on the Internet]. Sony Mobile Communications; [cited 2014 May 25]. Available from: http://www.sonymobile.com/global-en/products/accessories/smartwatch/.

[10] U.S. News Center [homepage on the Internet]. Samsung; [updated 2013 Sep 05; cited 2014 May 25]. Available from: http://www.samsung.com/us/news/newsRead.do?news_seq=21647.


[11] Apple iWatch release date, news and rumors [homepage on the Internet]. TechRadar; [updated 2014 April 09; cited 2014 May 25]. Available from: http://www.techradar.com/news/portable-devices/apple-iwatch-release-date-news-and-rumours-1131043.

[12] Apple iWatch: Price, rumours, release date and leaks [homepage on the Internet]. Beauford Court, 30 Monmouth Street, Bath BA1 2BW: Future Publishing Limited; [updated 2014 Apr 09; cited 2014 May 25]. Available from: http://www.t3.com/news/apple-iwatch-rumours-features-release-date.

[13] Meta – SpaceGlasses [homepage on the Internet]. META; [cited 2014 May 25]. Available from: https://www.spaceglasses.com/.

[14] M100 – Smart Glasses | Vuzix [homepage on the Internet]. Vuzix; [cited 2014 May 25]. Available from: http://www.vuzix.com/consumer/products_m100/.

[15] Recon Jet [homepage on the Internet]; [cited 2014 May 25]. Available from: http://reconinstruments.com/products/jet/.

[16] Azuma R, Baillot Y, Behringer R, Feiner S, Julier S, MacIntyre B. Recent Advances in Augmented Reality. IEEE Comput Graph Appl. 2001 Nov;21(6):34–47.

[17] Milgram P, Takemura H, Utsumi A, Kishino F. Augmented Reality: A Class of Displays on the Reality-Virtuality Continuum; 1994. p. 282–292.

[18] Liestol G, Morrison A. Views, alignment and incongruity in indirect augmented reality. In: Mixed and Augmented Reality - Arts, Media, and Humanities (ISMAR-AMH), 2013 IEEE International Symposium on; 2013. p. 23–28.

[19] Word Lens on App Store on iTunes [homepage on the Internet]. Apple Inc.; [updated 2014 Apr 18; cited 2014 May 25]. Available from: https://itunes.apple.com/en/app/word-lens/id383463868?mt=8.

[20] Iwamoto K, Kizuka Y, Tsujino Y. Plate bending by line heating with interactive support through a monocular video see-through head mounted display. In: Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on; 2010. p. 185–190.

[21] Kijima R, Ojika T. Reflex HMD to compensate lag and correction of derivative deformation. In: Virtual Reality, 2002. Proceedings. IEEE; 2002. p. 172–179.

[22] Genc Y, Sauer F, Wenzel F, Tuceryan M, Navab N. Optical see-through HMD calibration: a stereo method validated with a video see-through system. In: Augmented Reality, 2000. (ISAR 2000). Proceedings. IEEE and ACM International Symposium on; 2000. p. 165–174.


[23] Rolland JP, Fuchs H. Optical Versus Video See-Through Head-Mounted Dis- plays in Medical Visualization. Presence: Teleoper Virtual Environ. 2000 jun;9(3):287–309.

[24] Hakkinen J, Vuori T, Paakka M. Postural stability and sickness symptoms after HMD use. In: Systems, Man and Cybernetics, 2002 IEEE International Conference on. vol. 1; 2002. p. 147–152.

[25] Sachs D. Sensor Fusion on Android Devices: A Revolution in Motion Pro- cessing; 2010. 15:20 - 16:20. [Google Tech Talk]. Available from: https: //www.youtube.com/watch?v=C7JQ7Rpwn2k.

[26] Sachs D. Sensor Fusion on Android Devices: A Revolution in Motion Pro- cessing; 2010. 21:30 - 23:10. [Google Tech Talk]. Available from: https: //www.youtube.com/watch?v=C7JQ7Rpwn2k.

[27] Deak G, Curran K, Condell J. A survey of active and passive indoor localisation systems. Computer Communications. 2012;35(16):1939 – 1954.

[28] Sachs D. Sensor Fusion on Android Devices: A Revolution in Motion Pro- cessing; 2010. 23:10 - 27:40. [Google Tech Talk]. Available from: https: //www.youtube.com/watch?v=C7JQ7Rpwn2k.

[29] Kato H, Billinghurst M. Marker tracking and HMD calibration for a video- based augmented reality conferencing system. In: Augmented Reality, 1999. (IWAR ’99) Proceedings. 2nd IEEE and ACM International Workshop on; 1999. p. 85–94.

[30] Skoczewski M, Maekawa H. Augmented Reality System for Accelerometer Equipped Mobile Devices. In: Computer and Information Science (ICIS), 2010 IEEE/ACIS 9th International Conference on; 2010. p. 209–214.

[31] Kim S, Lee GA, Sakata N. Comparing pointing and drawing for remote collab- oration. In: Mixed and Augmented Reality (ISMAR), 2013 IEEE International Symposium on; 2013. p. 1–6.

[32] Umbaugh SE. Digital Image Processing and Analysis: Human and Computer Vision Applications with CVIPtools, Second Edition. 2nd ed. Boca Raton, FL, USA: CRC Press, Inc.; 2010.

[33] Edge detection Pixel Shader [homepage on the Internet]. Agnius Vasiliauskas; [updated 2010 Jun 3; cited 2014 May 25]. Available from: http://coding-experiments.blogspot.se/2010/06/edge-detection.html.

[34] Pinaki Pratim Acharjya DG Soumya Mukherjee. Digital Image Segmentation Using Median Filtering and Morphological Approach. The Annals of Statistics. 2014 01;4(1):552–557.


[35] Arias-Castro E, Donoho DL. Does median filtering truly preserve edges better than linear filtering? The Annals of Statistics. 2009 06;37(3):1172–1206.

[36] Strzodka R, Ihrke I, Magnor M. A graphics hardware implementation of the generalized Hough transform for fast object recognition, scale, and 3D pose detection. In: Image Analysis and Processing, 2003.Proceedings. 12th Inter- national Conference on; 2003. p. 188–193.

[37] Duda RO, Hart PE. Use of the Hough Transformation to Detect Lines and Curves in Pictures. Commun ACM. 1972 Jan;15(1):11–15.

[38] Dillencourt MB, Samet H, Tamminen M. A General Approach to Connected- component Labeling for Arbitrary Image Representations. J ACM. 1992 Apr;39(2):253–280.

[39] Connected Components Labeling [homepage on the Internet]. R. Fisher, S. Perkins, A. Walker and E. Wolfart; [cited 2014 May 25]. Available from: http://homepages.inf.ed.ac.uk/rbf/HIPR2/label.htm.

[40] Google Glass Teardown [homepage on the Internet]. Scott Torborg, Star Simp- son; [cited 2014 May 25]. Available from: http://www.catwig.com/ google-glass-teardown/.

[41] Tech specs - Google Glass Help [homepage on the Internet]. Google; [cited 2014 May 25]. Available from: https://support.google.com/glass/ answer/3064128?hl=en.

[42] Wink - Google Glass Help [homepage on the Internet]. Google; [cited 2014 May 25]. Available from: https://support.google.com/glass/answer/ 4347178?hl=en.

[43] Starner T. Project Glass: An Extension of the Self. Pervasive Computing, IEEE. 2013 April;12(2):14–16.

[44] Motorola Moto G – Full phone specifications [homepage on the Internet]. GS- MArena.com; [cited 2014 May 25]. Available from: http://www.gsmarena. com/motorola_moto_g-5831.php.

[45] Alain Vongsouvanh JM. Building Glass Services with the Google Mirror API; 2013. 40:00 - 40:30. [Google I/O 2013]. Available from: https://www. youtube.com/watch?v=CxB1DuwGRqk.

[46] They’re No Google Glass, But These Epson Specs Offer A New Look At Smart Eyewear [homepage on the Internet]. Say Media Inc.; [updated 2014 May 20; cited 2014 May 25]. The New Reality; [about 3 screens]. Available from: http://readwrite.com/2014/05/20/augmented-reality-epson-moverio-google-glass-oculus-rift-virtual-reality.


[47] Mistry P, Maes P. SixthSense: A Wearable Gestural Interface. In: ACM SIGGRAPH ASIA 2009 Sketches. SIGGRAPH ASIA ’09. New York, NY, USA: ACM; 2009. p. 11:1–11:1.

[48] Meta Pro AR Goggles Kick Google’s Glass - Yahoo News [homepage on the Internet]. Yahoo News; c1995-2002 [updated 2014 Jan 8; cited 2014 May 25]. Available from: http://news.yahoo.com/meta-pro-ar-goggles-kick-122918255.html.

[49] meta: The Most Advanced Augmented Reality Glasses [homepage on the Internet]. Kickstarter, Inc.; [cited 2014 May 25]. The Tech Specs – Software; [about 2 screens from the bottom]. Avail- able from: https://www.kickstarter.com/projects/551975293/ meta-the-most-advanced-augmented-reality-interface.

[50] The Tantalizing Possibilities of an Oculus Rift Mounted Camera [homepage on the Internet]; [updated 2014 May 14; cited 2014 May 25]. Available from: http://www.roadtovr.com/oculus-rift-camera-mod-lets-you-bring-the-outside-world-in/.

[51] Tech Specs | Recon Jet [homepage on the Internet]. Recon Instruments; [cited 2014 May 25]. Available from: http://www.reconinstruments.com/ products/jet/tech-specs/.

[52] XMReality – XMExpert consists of two parts [homepage on the Internet]. XMReality; [cited 2014 May 25]. Available from: http://xmreality.se/ product/?lang=en.

[53] Mobile Video Platform for Field Service [homepage on the Internet]. VIPAAR; [cited 2014 May 25]. Available from: http://www.vipaar.com/platform.

[54] HTML5 Introduction [homepage on the Internet]. Refsnes Data; [cited 2014 May 25]. Available from: http://www.w3schools.com/html/html5_ intro.asp.

[55] WebRTC 1.0: Real-time Communication Between Browsers [homepage on the Internet]. World Wide Web Consortium; [updated 2014 Apr 10; cited 2014 May 24]. Available from: http://dev.w3.org/2011/webrtc/editor/ webrtc.html.

[56] Is WebRTC ready yet? [homepage on the Internet]; [cited 2014 May 24]. Available from: http://iswebrtcreadyyet.com/.

[57] WebRTC | MDN [homepage on the Internet]. Mozilla Developer Network; [updated 2014 May 15; cited 2014 May 24]. Available from: https:// developer.mozilla.org/en-US/docs/WebRTC.


[58] RFC 4960 — Stream Control Transmission Protocol [homepage on the Inter- net]. Internet Engineering Task Force; [updated 2007 Sep; cited 2014 May 24]. Available from: http://www.ietf.org/rfc/rfc2960.txt.

[59] The Standard for Embedded Accelerated 3D Graphics [homepage on the Inter- net]. Beaverton, OR, USA: The Khronos Group; [cited 2014 May 22]. Available from: http://www.khronos.org/opengles/2_X/.

[60] The Industry’s Foundation for High Performance Graphics [homepage on the Internet]. Beaverton, OR, USA: The Khronos Group; [cited 2014 May 22]. Available from: http://www.khronos.org/opengl/.

[61] HTMLCanvasElement [homepage on the Internet]. Mozilla Developer Network; [updated 2014 Mar 21; cited 2014 May 22]. Avail- able from: https://developer.mozilla.org/en-US/docs/Web/ API/HTMLCanvasElement.

[62] Marroquim R, Maximo A. Introduction to GPU Programming with GLSL. In: Proceedings of the 2009 Tutorials of the XXII Brazilian Symposium on Com- puter Graphics and Image Processing. SIBGRAPI-TUTORIALS ’09. Wash- ington, DC, USA: IEEE Computer Society; 2009. p. 3–16.

[63] WebGL Fundamentals [homepage on the Internet]. Gregg Tavares; [updated 2012 Feb 9; cited 2014 May 22]. Available from: http://www.html5rocks.com/en/tutorials/webgl/webgl_fundamentals/.

[64] Litherum: How Multisampling Works in OpenGL [homepage on the Inter- net]. Litherum; [updated 2014 Jan 6; cited 2014 May 24]. Available from: http://litherum.blogspot.se/2014/01/how-multisampling- works-in-opengl.html.

[65] OpenGL Wiki - Framebuffer [homepage on the Internet]; [cited 2014 May 22]. Available from: http://www.opengl.org/wiki/Framebuffer.

[66] Birrell AD, Nelson BJ. Implementing Remote Procedure Calls. ACM Trans Comput Syst. 1984 Feb;2(1):39–59.

[67] GStreamer [homepage on the Internet]; [cited 2014 May 22]. Available from: http://gstreamer.freedesktop.org/.

[68] GObject [homepage on the Internet]. The GNOME Project; [cited 2014 May 22]. Available from: https://developer.gnome.org/gobject/ stable/.

[69] NEON – ARM [homepage on the Internet]. ARM Ltd.; [cited 2014 May 24]. Available from: http://www.arm.com/products/processors/ technologies/neon.php.


[70] Chandra S, Dey S. Addressing computational and networking constraints to enable video streaming from wireless appliances. In: Embedded Systems for Real-Time Multimedia, 2005. 3rd Workshop on; 2005. p. 27–32.

[71] Android - Media [homepage on the Internet]. Android Open Source Project; [cited 2014 May 22]. Available from: https://source.android.com/devices/media.html.

[72] - [homepage on the Internet]; [cited 2014 May 22]. Available from: https://gitorious.org/gstreamer-omap.

[73] gstreamer/gst-omx [git repository]; [cited 2014 May 22]. Available from: http://cgit.freedesktop.org/gstreamer/gst-omx/.

[74] NodeJS [homepage on the Internet]. Joyent, Inc.; [cited 2014 May 22]. Available from: http://nodejs.org/.

[75] GNU Development Tools — LD [homepage on the Internet]. Panagiotis Christias; 1994 [cited 2014 May 23]. Available from: http://unixhelp.ed.ac.uk/CGI/man-cgi?ld.

[76] TI E2E Community [homepage on the Internet]. Texas Instruments; [cited 2014 May 23]. Available from: http://e2e.ti.com/.

[77] Android API Reference — Canvas [homepage on the Internet]; [updated 2014 May 20; cited 2014 May 24]. Available from: http://developer.android.com/reference/android/graphics/Canvas.html.

[78] Mozilla Developer Network — Canvas [homepage on the Internet]. Mozilla; [updated 2014 May 23; cited 2014 May 24]. Available from: https://developer.mozilla.org/en-US/docs/Web/HTML/Canvas.

[79] Abramov A, Kulvicius T, Wörgötter F, Dellen B. Real-Time Image Segmentation on a GPU. In: Keller R, Kramer D, Weiss JP, editors. Facing the Multicore-Challenge. vol. 6310 of Lecture Notes in Computer Science. Springer Berlin Heidelberg; 2010. p. 131–142.

[80] Mozilla Developer Network — Using Web Workers [homepage on the Internet]. Mozilla; [updated 2014 May 21; cited 2014 May 24]. Available from: https://developer.mozilla.org/en-US/docs/Web/Guide/Performance/Using_web_workers.

[81] Official ZXing (“Zebra Crossing”) project home [homepage on the Internet]. GitHub, Inc.; [updated 2014 May 24; cited 2014 May 24]. Available from: https://github.com/zxing/zxing.

[82] The Web Sockets API [homepage on the Internet]. World Wide Web Consortium; [cited 2014 May 24]. Available from: http://www.w3.org/TR/2009/WD-websockets-20091222/.


[83] Server-Sent Events [homepage on the Internet]. World Wide Web Consortium; [updated 2012 Dec 11; cited 2014 May 24]. Available from: http://www.w3.org/TR/eventsource/.

[84] Where polyfill came from / on coining the term [homepage on the Internet]. Remy Sharp; [updated 2010 Oct 8; cited 2014 May 24]. Available from: http://remysharp.com/2010/10/08/what-is-a-polyfill/.
