LiU-ITN-TEK-A--17/012--SE

Telepresence and remote communication through virtual reality

Gabriella Rydenfors

2017-06-09

Department of Science and Technology (Institutionen för teknik och naturvetenskap)
Linköping University (Linköpings universitet)
SE-601 74 Norrköping, Sweden

LiU-ITN-TEK-A--17/012--SE

Telepresence and remote communication through virtual reality

Master's thesis in Computer Engineering carried out at the Institute of Technology, Linköping University

Gabriella Rydenfors

Supervisor: Karljohan Lundin Palmerius
Examiner: Camilla Forsell

Norrköping, 2017-06-09


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Gabriella Rydenfors

Master Thesis

Telepresence and remote communication through virtual reality

Author: Gabriella Rydenfors

Supervisors: Jonathan Nilsson, Voysys AB; Torkel Danielsson, Voysys AB; Karl-Johan Lundin Palmerius, Linköping University

Examiner: Camilla Forsell

in the

Department of Science and Technology

June 2017

LINKÖPING UNIVERSITY

Abstract

Institute of Technology, Department of Science and Technology

Master of Science in Computer Science and Media Technology

Telepresence and remote communication through virtual reality

by Gabriella Rydenfors

This Master Thesis concerns a telepresence implementation which utilizes state-of-the-art virtual reality combined with live 360° video. Navigation interfaces for telepresence with virtual reality headsets were developed and evaluated through a user study. An evaluation of telepresence as a communication medium was performed, comparing it to video communication. The result showed that telepresence was a better communication medium than video communication.

Contents

Abstract

List of Figures

List of Tables

Abbreviations

1 Introduction
  1.1 Background
  1.2 Problem description
  1.3 Purpose
  1.4 Research questions
  1.5 Method
  1.6 Limitations

2 Background and related work
  2.1 Virtual reality
  2.2 Telepresence
  2.3 Hardware overview
    2.3.1 HTC Vive
    2.3.2 Telepresence robot
  2.4 Software overview
    2.4.1 OdenVR
    2.4.2 OpenVR
    2.4.3 GStreamer
  2.5 Related works and technology
    2.5.1 Navigation algorithms in telepresence
    2.5.2 Telepresence as a communication tool

3 Telepresence robot
  3.1 Robot description
    3.1.1 Hardware
      3.1.1.1 Stepper Motor
      3.1.1.2 Zoom H2n
      3.1.1.3 OdenVR Cube
      3.1.1.4 Router


    3.1.2 Software
      3.1.2.1 TCP server
      3.1.2.2 Network protocol
      3.1.2.3 Automatic startup for required programs
  3.2 Audio communication
    3.2.1 GStreamer
    3.2.2 Audio communication
      3.2.2.1 Windows
      3.2.2.2 Raspberry Pi
  3.3 Navigation interfaces
    3.3.1 Button based navigation interfaces
      3.3.1.1 Car inspired remote control
      3.3.1.2 Touch based remote control
    3.3.2 Hand orientation based navigation interfaces
      3.3.2.1 Point based navigation
      3.3.2.2 Crosshair inspired navigation
    3.3.3 Head orientation based navigation interfaces
      3.3.3.1 Gaze based navigation
      3.3.3.2 Touch based remote control relative head direction

4 User evaluation
  4.1 Evaluation of navigation interface
    4.1.1 Procedure
  4.2 Evaluation of telepresence as a communication tool
    4.2.1 Measurement method
    4.2.2 Procedure

5 Results
  5.1 Steering interfaces
  5.2 Telepresence experience
    5.2.1 The interactant
    5.2.2 The inhabitor
    5.2.3 The combined result

6 Discussion
  6.1 Navigation interface evaluation
  6.2 Telepresence compared to video communication
    6.2.1 Measurement method
    6.2.2 Evaluation procedure
    6.2.3 Result
      6.2.3.1 The interactant
      6.2.3.2 The inhabitor
  6.3 Telepresence robot
    6.3.1 ZOOM H2N microphone
    6.3.2 Camera height
  6.4 Audio communication

7 Conclusion and future work
  7.1 Navigation interfaces for telepresence
  7.2 Measurement method for communication
  7.3 Telepresence as a communication tool in comparison with video communication
  7.4 Future work
    7.4.1 Semi autonomic navigation
    7.4.2 Image stabilization
    7.4.3 Global network

A Communication measurement method
  A.1 Efficiency
  A.2 Participation
  A.3 Process Satisfaction
  A.4 Solution Satisfaction
  A.5 Negative Socio-Emotional Behavior
  A.6 Co-presence
  A.7 Cognitive load

B Evaluation Questionnaire

Bibliography

List of Figures

3.1 Telepresence robot
3.2 Stepper motor
3.3 OdenVR Cube
3.4 A GStreamer pipeline
3.5 Sound pipeline on Windows
3.6 Sound pipeline on Raspberry Pi
3.7 Car inspired remote control
3.8 Touch based remote control
3.9 Point based navigation
3.10 Crosshair inspired navigation
3.11 Gaze based navigation

4.1 Navigation interface study
4.2 Telepresence evaluation

5.1 Navigation interface test result
5.2 Preferred communication method by the interactant
5.3 Preferred communication method by the inhabitor

6.1 The interactant's process satisfaction for the telepresence communication
6.2 The interactant's process satisfaction for the video communication

List of Tables

3.1 List of components on the telepresence robot

4.1 The test order in evaluation of telepresence and video as communication methods

5.1 Time and correctness comparison between telepresence and video communication
5.2 Summary of measured variables for the interactant
5.3 Summary of measured variables for the inhabitor
5.4 Summary of measured variables for both the interactant and inhabitor

7.1 Preferable communication method with regard to the measured variable

Abbreviations

VR    Virtual Reality
HMD   Head Mounted Display
IMU   Inertial Measurement Unit
USB   Universal Serial Bus
FOV   Field Of View
RTP   Real-time Transport Protocol
TCP   Transmission Control Protocol
SDK   Software Development Kit
API   Application Programming Interface
LiPo  Lithium Polymer
UDP   User Datagram Protocol
LGPL  Lesser General Public License

Chapter 1

Introduction

This master thesis concerns telepresence as a communication tool and was carried out for the Department of Science and Technology at Linköping University and Voysys AB. This chapter includes an introduction to telepresence, the background of this project and how it relates to Voysys.

1.1 Background

The thought of traveling instantly (like teleporting) has enchanted humanity and inspired the invention of ever faster transportation options. With the invention of the telephone, a paradigm shift occurred. Instead of physically traveling to the person you wished to speak with, your voice could electronically travel to that person almost instantly. This type of communication effectively replaced some of our travels, as some questions could be sorted out over the phone instead of having to set up a physical meeting.

With video technology and the world wide web, this kind of communication has grown even larger with video communication tools such as Skype, FaceTime, Hangouts and more.

Voysys AB in Norrköping, Sweden, works with live video streaming in 360°. This technique makes it possible to digitally transport visual content in all directions at the same time and lets the viewer decide what is relevant.


Combined with the rise of Virtual Reality (VR) display systems, this kind of technique can be used to achieve telepresence: instead of traveling to the desired location, the sensory impressions from that location are digitally transported to the place where the person is located.

One use case of this technique is to replace the crane cabin on a lumber truck with cameras and let the truck driver control the crane from the passenger seat of the truck. The truck driver uses a Head Mounted Display (HMD) to get the view from the place where the crane cabin used to be, and can thus continue his/her work as usual with the benefits of more space for timber, a safer and more comfortable workplace, and a lighter trailer.

This lumber truck setup can be viewed as human-to-machine interaction. Voysys now desires to expand this type of telepresence experience to also involve human-to-human interaction by creating a telepresence robot.

1.2 Problem description

Today, remote human-to-human communication is usually done through voice or video. With the emerging VR technology a more immersive option for communication is made possible: telepresence. This thesis aims to build a communication solution which utilizes state-of-the-art VR technology combined with Voysys' live 360° video stitching and streaming solution.

However, telepresence may not necessarily be a better option for human-to-human communication. In order to evaluate this new communication method, this thesis will compare it against an existing communication method, namely video communication. A set of measurable variables on communication will be produced and used for this evaluation.

1.3 Purpose

This thesis aims to build a communication solution which utilizes state-of-the-art VR technology. This includes building a robot with live 360° cameras and connecting it to an HMD. The idea is that a user should be able to take the form of the robot and thus feel present in the remote location in a way that traditional video communication cannot offer.

1.4 Research questions

The project will investigate three main questions:

• Given an avatar robot on wheels that has an omnidirectional camera feed linked to an HMD, how should an intuitive navigation interface be designed?

• How good is the developed telepresence robot as a communication tool, compared to video calls?

• How can a remote communication experience be evaluated in a way that allows comparison between different modalities in a remote communication system?

1.5 Method

The telepresence robot has been developed incrementally following an iterative process.

The first iteration started with setting up the video streaming between the robot and the HMD. Then the robot’s wheels had to be controlled from a Raspberry Pi that would work as the main computer on the robot. At the same time a network server was developed so that the computer could send control signals to the robot.

In the second iteration several navigation interfaces were developed, as described in section 3.3. The interfaces were evaluated through a user study as described in section 4.1. At this point in time the robot was still using a wired network and power solution. However, it was still possible to evaluate the navigation interfaces using very long cables.

In the third iteration two-way audio communication was implemented, as described in section 3.2. In this iteration the robot was also made wireless.

In the last iteration, the telepresence robot was evaluated as a communication tool. The evaluation was performed as a user study where the telepresence robot was compared against the video communication tool Skype.

1.6 Limitations

The robot's power supply was based on two 2200 mAh LiPo (Lithium Polymer) batteries. This limited the robot's running time to about one hour before the batteries had to be recharged. However, the project had several batteries available, so the development could continue while the batteries recharged.

The network setup was limited to a local network where the robot and the computer were connected to the same router. This limited the range of the telepresence experience; however, it was still good enough to test the concept.

Chapter 2

Background and related work

This chapter presents some of the basic theory behind the thesis and provides information about VR and telepresence.

This thesis has chosen to adopt the terms inhabitor and interactant from Pang et al. [1]. An inhabitor refers to a person controlling a telepresence robot from a remote location, and an interactant is a person interacting with the telepresence robot.

2.1 Virtual reality

Virtual reality (VR) is a computer-generated environment that simulates several senses and lets the user interact with the content in a manner that is similar to a physical place [2]. Virtual reality can be described as a technology that combines 3D graphics with immersive display and tracking technology, in order to create a system where the displayed content matches the user's viewpoint. However, Riva [3] argues that this description is too focused on hardware and would rather describe it as the inclusive relationship between the virtual content and the user. According to Riva, VR can be considered a form of computer-mediated communication in a multi-user scenario. Riva also believes that VR has the potential to become the next dominant medium, outperforming television and telephones.


2.2 Telepresence

Telepresence is the notion of being present at another location with the help of technology. The term is often used in applications that create face-to-face meetings without travel, like remote conference rooms. The technologies range from a simple web camera to 3D reconstructions with multiple depth cameras. [4]

This project will focus on the type of telepresence that uses remotely controlled robots. In this type of communication the robot's surroundings are presented to a human who controls the robot remotely.

2.3 Hardware overview

This project contains two major hardware components: a VR-ready desktop computer with the HTC Vive HMD, and a telepresence robot.

2.3.1 HTC Vive

HTC Vive is an HMD with room-scale tracking developed as a collaboration between HTC and Valve. The headset has an OLED screen with a total resolution of 2160x1200 and a 110-degree field of view (FOV). Integrated in the headset are 32 sensors for the tracking system SteamVR. The system also contains 2 wireless hand controllers that each have 24 sensors for the motion tracking. [5]

2.3.2 Telepresence robot

The telepresence robot has been developed continuously throughout the project. Basically, it consists of cameras and a microphone on wheels. The main computer on the robot is a Raspberry Pi. The robot is fully wireless and communicates through WiFi. Two LiPo batteries are used to power the robot. A full description of the robot with all components can be found in section 3.1.

2.4 Software overview

Several software development kits (SDK) were used during the development. Here the main SDKs are introduced together with a description of how they are used in the project.

2.4.1 OdenVR

OdenVR is a 3D graphics engine for live 360° video developed by Voysys. The program is able to stitch, stream and display a multi-camera feed in an Oculus Rift with a latency below 200 ms. The program is written in C++ and built on a structure of entities that can be added to a scene graph for standalone or combined purposes. In this project, two entities have been created within the software: one that handles the HTC Vive and one that handles the navigation interfaces. The entities themselves rely on other entities within the OdenVR software for live streaming and stitching of the camera input.

2.4.2 OpenVR

OpenVR is a C++ API developed by Valve that enables software to communicate with virtual reality headsets without relying on a specific hardware vendor's SDK [6]. This API was used for integrating the HTC Vive headset into the OdenVR system. By using OpenVR, OdenVR could access various states of the HMD system, like orientation and actions. The API also contains models for the hand controllers (and other modules belonging to the tracking system) which were used in order to render a 1:1 relation between the real and virtual controllers in the system.

2.4.3 GStreamer

The audio communication between the robot and the HMD has been implemented as a separate system. This system is based on the GStreamer framework because of its low latency. The system consists of a GStreamer pipeline which takes the active microphone on the computer and transmits it to the speakers on the robot through UDP. It also listens for audio from the robot's microphone and plays it on the active speaker on the computer. For more information about GStreamer and the audio pipeline, see section 3.2.

2.5 Related works and technology

This section will go through related work on navigation and on how to evaluate telepresence communication.

2.5.1 Navigation algorithms in telepresence

Pang et al. [1] have studied the navigation experiences for telepresence robots. In this study the inhabitor was using a computer interface to control the robot. Pang et al. [1] emphasize that the robot's speed needs to match human walking speed, with a maximum forward speed of 0.6 m/s and a maximum rotation speed of 0.9 rad/s. Pang et al. [1] have also developed high-level navigational autonomy, such as having the robot automatically follow an interactant and avoid obstacles while moving. In order to achieve high navigational autonomy, Pang et al. [1] use a Kinect RGB-D camera, both for human and obstacle detection.
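As a minimal illustration of these limits, the sketch below clamps a commanded velocity pair to the maximum speeds reported by Pang et al. [1]. The function itself is a hypothetical sketch, not part of their system or of this project's code:

    /// Maximum speeds matching human walking speed, from Pang et al. [1].
    const MAX_FORWARD_M_PER_S: f32 = 0.6;
    const MAX_ROTATION_RAD_PER_S: f32 = 0.9;

    /// Clamp a commanded (forward, rotation) velocity pair to these limits.
    fn clamp_velocity(forward: f32, rotation: f32) -> (f32, f32) {
        (
            forward.clamp(-MAX_FORWARD_M_PER_S, MAX_FORWARD_M_PER_S),
            rotation.clamp(-MAX_ROTATION_RAD_PER_S, MAX_ROTATION_RAD_PER_S),
        )
    }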

Pang et al. [1] recognize the need for manual control while navigating the robot in crowded places, but also argue that high-level navigational autonomy is preferred in most situations since it allows the inhabitor to concentrate on the conversation instead of navigation. However, the manual navigation interface did not seem to be very intuitive in the study of Pang et al. [1]: all participants had to perform a training session in steering the robot before performing the actual test sessions. It would be interesting to know if a more intuitive manual navigation interface would reduce the need for autonomous navigation. Due to time constraints this study will not include high-level navigational autonomy, but Pang et al.'s [1] findings on manual navigation will serve as a basis when designing the navigation system.

Jia Yunde et al. [7] designed a telepresence system where the user controls a robotic avatar through a tablet and interacts with objects in the remote location. Yunde et al. [7] focused on the design of the interface for controlling the robot. They found that using finger gestures directly on the video images gave the inhabitor better performance while executing different tasks through the robot, compared to using buttons to control the robot's movements. The participants' facial expressions also indicated that the cognitive workload was lower when interacting with the video images than when using buttons.

Overall, there has been a lot of work on telepresence robotics and on self-navigating robotics. However, these systems have been computer screen based and therefore lost a lot of the inhabitor's immersion.

Some projects have acknowledged this problem as well. Carlos Pérez Mejías [8] has taken the step to implement a telepresence robot where the user is wearing the head mounted display Oculus Rift. Mejías's [8] primary goal was to improve the user immersion by implementing free-look control. However, the robot in that project does not have any cameras. Instead a 3D environment is provided by two RGB-D cameras in the room, and presented to the user from the robot's perspective. Even though the camera setup was distinctly different, the project contained relevant information on the movement restrictions for an avatar robot that moves on wheels.

2.5.2 Telepresence as a communication tool

Ralph Schroeder [9] studied the area of social interaction in virtual environments as early as the beginning of the 21st century. He categorized the area into four categories:

• Presence

• Co-presence

• Communication

• Group dynamics

Unfortunately Schroeder [9] did not present any concrete way to measure these areas. However, he presented some interesting findings regarding communication. Schroeder [9] found that the quality of the audio source strongly influenced the amount of collaboration between participants in a shared virtual environment. Schroeder [9] also found that the social interaction between participants in virtual environments follows several conventions of the real world. These conventions include looking at the person that is speaking and keeping a distance between each other.

Stephen Green et al. [10] presented a more concrete way to measure communication. In their study they compared the communication in three different forms of group decision making. Green et al. [10] developed five communication variables that could be measured by a self-report questionnaire after a group exercise:

• Amount of participation

• Negative Socio-Emotional Behavior

• Process Satisfaction

• Solution Satisfaction

• Informal Leadership

Even though the study did not involve remote communication, these measurements of communication are still relevant as they enable a comparison between different ways to communicate.

Martin Hassell et al. [11] demonstrated the flexibility of these variables by using them in a much more closely related area. Hassell et al. [11] studied the effect of seeing a mirrored version of oneself in video communication. They constructed an experiment where small groups of people performed a group decision exercise using video communication. Half of the groups were able to see their own video feed along with the videos of the other group members; the other half only saw the other group members' video feeds.

Hassell et al. [11] compared the two types of video communication using a self-report questionnaire that covered a subset of Green et al.'s [10] variables:

• Participation

• Process Satisfaction

• Solution Satisfaction

In addition to the self-report questionnaire, records were kept of how long it took to complete the task and whether the group came to the right answer.

Chapter 3

Telepresence robot

This chapter will go through the telepresence robot. It is structured into three sections: Robot description, Audio communication and Navigation interface.

3.1 Robot description

This section will describe the telepresence robot used in this project, see Figure 3.1.

The robot has 3 plastic wheels, 2 of which are driven by one stepper motor each. Each stepper motor is connected to its wheel by angled gears that are pressed against each other by a spring; the combination of the spring and the angled gears guarantees a constant contact area and therefore an instant response between the wheel and the motor.

Each stepper motor is connected to a driver that is controlled by an Arduino. The Arduino in turn gets control signals from a server on a Raspberry Pi. The same Raspberry Pi has a microphone and a speaker connected for audio communication. The Raspberry Pi is connected to a switch, which is connected to a router through a WiFi bridge. The switch also connects the OdenVR Cube, which sends camera data to the computer independently of the server.
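Converting a steering command into individual wheel speeds for a robot like this is standard differential-drive kinematics. The sketch below illustrates the idea in Rust (the language of the robot's server); the wheel radius, track width and steps-per-revolution values are placeholders, not measurements from the actual robot:

    use std::f32::consts::PI;

    // Hypothetical dimensions; the real values are not documented here.
    const WHEEL_RADIUS_M: f32 = 0.05; // wheel radius
    const TRACK_WIDTH_M: f32 = 0.30;  // distance between the driven wheels
    const STEPS_PER_REV: f32 = 200.0; // stepper steps per wheel revolution

    /// Convert a (forward m/s, rotation rad/s) command into pulse rates
    /// (steps/s) for the left and right stepper motors.
    fn wheel_step_rates(forward: f32, rotation: f32) -> (f32, f32) {
        // Linear speed of each wheel in a differential drive.
        let v_left = forward - rotation * TRACK_WIDTH_M / 2.0;
        let v_right = forward + rotation * TRACK_WIDTH_M / 2.0;
        // Translate linear wheel speed into stepper pulses per second.
        let to_steps = |v: f32| v / (2.0 * PI * WHEEL_RADIUS_M) * STEPS_PER_REV;
        (to_steps(v_left), to_steps(v_right))
    }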


Figure 3.1: The robot used for the telepresence communication.

3.1.1 Hardware

A complete list of components for the robot is presented in Table 3.1. The following subsections will describe some of the components in more detail.

3.1.1.1 Stepper Motor

The stepper motor used in this project is the 42BYGH47-401A. This type of stepper motor is mainly used in mechanical motion control, like in laser cutters or 3D printers; however, it can also be used in robotics [12].

A stepper motor is a type of motor with very high precision. It uses magnetism introduced by electrical coils to move an inner gear a discrete number of steps [13].

Component               Quantity
Wheels                  3
Stepper motors          2
Stepper motor drivers   2
Arduino                 1
Raspberry Pi            1
12 V to 5 V converter   2
LiPo batteries          2
8-port switch           1
WiFi bridge             1
OdenVR Cube             1
Zoom H2n                1
Speakers                2
Wooden pole             1
Router                  1

Table 3.1: List of components on the telepresence robot.

By designing the inner gear so that it only fits one coil at a time, the gear can be rotated a fixed number of steps by energizing the coils in sequence, see Figure 3.2.

Figure 3.2: The top coil is active, which makes the inner gear move so that it is as close as possible to that coil. By continuously changing which coil is active, the gear can be rotated with a very precise velocity.

Because one coil is always activated, the stepper motor will consume electricity even while standing still. However, a stepper motor has its maximum torque at low speeds, which makes it ideal for applications requiring high precision at low speed, like a robot moving at walking speed. [14]
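The stepping principle in Figure 3.2 can be summarized as cycling through a fixed coil-energizing sequence. The sketch below shows a generic full-step pattern for a four-coil motor; it is illustrative only, not the firmware actually running on the robot's driver boards:

    /// Full-step sequence for a motor with four coils: exactly one coil
    /// is energized at a time (bit i set means coil i is active).
    const FULL_STEP_SEQUENCE: [u8; 4] = [0b0001, 0b0010, 0b0100, 0b1000];

    /// Advance one step forward (+1) or backward (-1) and return the
    /// coil pattern to energize next.
    fn next_pattern(step_index: &mut usize, direction: i8) -> u8 {
        *step_index = (*step_index as i64 + direction as i64).rem_euclid(4) as usize;
        FULL_STEP_SEQUENCE[*step_index]
    }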

3.1.1.2 Zoom H2n

The Zoom H2n is a portable recording device with five integrated microphones. Three of the microphones are arranged in a Mid-Side pattern and the other two as an X/Y pair. By combining these two stereo modes, the H2n can create a surround sound recording and compress it into a stereo signal.

3.1.1.3 OdenVR Cube

The OdenVR Cube is a development kit designed by Voysys to test 360° streaming with low-budget cameras before buying an expensive 360° camera rig. The cube consists of 6 Raspberry Pi cameras placed in a cubic formation, see Figure 3.3. The camera lenses have a 160° FOV each, resulting in a total 360° × 360° FOV. The streaming resolution is 1440 × 1080 px per camera. This gives a total spherical resolution of 4300 × 2150 px, i.e. 12 px/degree horizontally and 12 px/degree vertically. Each camera is connected to a separate Raspberry Pi with a program which allows OdenVR to connect to them and acquire their ID and camera stream over the Real-time Transport Protocol (RTP). The latency of the system is around 200 ms depending on network conditions. [15]

Figure 3.3: The OdenVR Cube, a development kit designed by Voysys to test 360° streaming with low-budget cameras before buying an expensive 360° camera rig.

3.1.1.4 Router

In this setup the robot and the computer are connected on a local network. The project does not rely on any specific type of router, as long as it is able to hand out network addresses. During the first half of the project a D-Link DIR-655 router was used; however, due to external circumstances this router could no longer be used and was replaced by an Apple Time Capsule router.

3.1.2 Software

Aside from hardware, the robot also required custom software, which is described in the following subsections.

3.1.2.1 TCP server

A Transmission Control Protocol (TCP) server, implemented in the programming language Rust, was created in order to receive messages containing steering information from the user. The server manages one connection at a time in order to prevent several users from conflicting with each other over the same robot unit.
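A minimal sketch of this single-connection policy, using only the Rust standard library; the port number and the message handling are placeholders, since the actual implementation details are not documented here:

    use std::io::Read;
    use std::net::TcpListener;

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("0.0.0.0:9000")?; // placeholder port

        // Accept one connection at a time; a second user cannot take
        // over the robot until the current connection is closed.
        for stream in listener.incoming() {
            let mut stream = stream?;
            let mut buf = [0u8; 64];
            loop {
                match stream.read(&mut buf) {
                    Ok(0) | Err(_) => break, // client disconnected
                    Ok(n) => handle_message(&buf[..n]),
                }
            }
        }
        Ok(())
    }

    /// Placeholder: forward steering data to the Arduino.
    fn handle_message(_bytes: &[u8]) {}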

3.1.2.2 Network protocol

A basic network protocol has been created in order for the server and client to communicate steering information, connections and disconnections between each other.
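The exact wire format is not documented here; purely as an illustration, such a protocol could be modeled as below, where the message types follow the description above but the byte layout is hypothetical:

    /// Hypothetical message set for the steering protocol.
    enum Message {
        Connect,
        /// Forward and rotation speed as fixed-point values.
        Steer { forward: i16, rotation: i16 },
        Disconnect,
    }

    fn encode(msg: &Message) -> Vec<u8> {
        match msg {
            Message::Connect => vec![0x01],
            Message::Steer { forward, rotation } => {
                let mut out = vec![0x02];
                out.extend_from_slice(&forward.to_be_bytes());
                out.extend_from_slice(&rotation.to_be_bytes());
                out
            }
            Message::Disconnect => vec![0x03],
        }
    }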

3.1.2.3 Automatic startup for required programs

The audio system and the server both consist of programs that have to be started in order for the system to work. For a long time this was done by accessing the Raspberry Pi through SSH and starting them manually. However, this was a tedious task, so a script was developed to start all programs when the Raspberry Pi boots.

3.2 Audio communication

This section will cover the audio communication solution used in the project. It starts by covering the basics of GStreamer, as this is required knowledge in order to understand the audio setup.

3.2.1 GStreamer

GStreamer is a framework, released under the Lesser General Public License (LGPL), that can create pipelines of media-handling components. The framework's core functionality is to link plugins that provide various functionalities. The pluggable components can be mixed and matched into arbitrary pipelines in order to satisfy unique needs. The pipeline is designed to have as little overhead as possible above what the applied plugins bring. This makes GStreamer a good framework for applications which require low latency. [16] [17]

GStreamer is built around the concept of elements; the pipeline is essentially just a chain of elements linked together. All plugins to GStreamer are built upon a GStreamer element in order for them to fit into the pipeline. An element has one specific function, such as reading data from a file or decoding incoming data. By chaining the elements together in a pipeline, a complex task can be accomplished, as illustrated in Figure 3.4. [18]

Figure 3.4: An example pipeline in GStreamer consisting of 6 elements. Together the elements create a basic Ogg player. Image taken from [18]
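To illustrate how such an element chain is expressed in code, the sketch below builds the Ogg player from Figure 3.4 using the GStreamer Rust bindings (the gstreamer crate); the bindings and the file name are assumptions made for the sake of the example, and the thesis itself does not state how GStreamer was invoked:

    use gstreamer as gst;
    use gst::prelude::*;

    fn main() {
        gst::init().expect("failed to initialize GStreamer");

        // The element chain from Figure 3.4: file source, Ogg demuxer,
        // Vorbis decoder, converter and audio sink.
        let pipeline = gst::parse_launch(
            "filesrc location=music.ogg ! oggdemux ! vorbisdec ! audioconvert ! autoaudiosink",
        )
        .expect("failed to build pipeline");

        pipeline
            .set_state(gst::State::Playing)
            .expect("failed to start pipeline");
    }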

In order for the elements to be linked together, all elements contain at least one pad. A pad is an analogy to a “plug” or “port”. A pad can either be an incoming (sink) or outgoing (source) port from the element as illustrated in figure 3.4. A pad restricts the type of data that flows through the element so that it fits the element’s implementation. Chapter 3. Telepresence robot 17

Depending on the element, a single pad can accept several types of data formats. In order for a link to be established between two elements, the second element's sink pad must be able to accept the data format on the first element's source pad. The GStreamer pipeline uses a process called caps negotiation in order to get the pads to settle on a common data type to stream between the elements. [18]

In order to accomplish a complex task, a lot of elements might be required in the GStreamer pipeline. For these kinds of scenarios GStreamer provides a special kind of element called a bin, which contains several elements. A bin essentially wraps a pipeline of several elements in the form of a single element. The bin can be used as an ordinary element and serves as an abstraction in the main pipeline in order to make complex structures easier to overview. [18]

GStreamer provides a large number of elements and bins by default, which makes it possible to handle the majority of needs. However, if a required functionality is missing, it is possible to write custom elements and use them in a pipeline together with the provided elements from GStreamer. [18]

3.2.2 Audio communication

The sound solution consists of two GStreamer pipelines: one on the Windows platform, and one on the Raspberry Pi located on the robot.

3.2.2.1 Windows

The pipeline on the Windows side listens to the main microphone through a directsoundsrc element and sends that through a udpsink element to the Raspberry Pi on the robot. This pipeline also listens for audio sent through UDP using the udpsrc element and plays it on the main speakers through the directsoundsink element. The pipeline is illustrated in Figure 3.5.

When the HTC Vive is connected to the computer, its integrated speakers and microphone are automatically set as the default output and input devices in Windows. However, as described in section 6.4, the frequency setting might have to be adjusted manually to 48 kHz.

Figure 3.5: The final GStreamer pipeline for the sound on the Windows side.
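Written out in gst-launch pipeline syntax, the Windows side of Figure 3.5 could look like the sketch below. Only the directsoundsrc, udpsink, udpsrc and directsoundsink elements are documented above; the converters, caps, IP address and ports are illustrative placeholders:

    // Hypothetical pipeline descriptions for the Windows side (Figure 3.5).
    const SEND_PIPELINE: &str =
        "directsoundsrc ! audioconvert ! audioresample ! \
         udpsink host=192.168.0.10 port=5000";

    const RECEIVE_PIPELINE: &str =
        "udpsrc port=5001 caps=audio/x-raw,format=S16LE,rate=48000,channels=2 ! \
         audioconvert ! directsoundsink";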

3.2.2.2 Raspberry Pi

The pipeline on the Raspberry Pi side listens to the ZOOM H2n microphone through the alsasrc element and sends that to the webrtcdsp element, which removes echo as described in section 6.4. The echo-free sound is then sent to the computer through a udpsink element using the computer's IP address. This pipeline also listens for audio from the HMD using the udpsrc element and lets the webrtcechoprobe element analyse it in order for the webrtcdsp element to work. The sound is then sent to the plugged-in speakers through the alsasink element. The pipeline is illustrated in Figure 3.6.

Figure 3.6: The final GStreamer pipeline for the sound on the Raspberry Pi side.
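The corresponding Raspberry Pi side of Figure 3.6, again sketched in gst-launch syntax; the element names follow the description above, while caps, IP address and ports are placeholders:

    // Hypothetical pipeline descriptions for the Raspberry Pi side (Figure 3.6).
    const SEND_PIPELINE: &str =
        "alsasrc ! webrtcdsp ! audioconvert ! \
         udpsink host=192.168.0.20 port=5001";

    const RECEIVE_PIPELINE: &str =
        "udpsrc port=5000 caps=audio/x-raw,format=S16LE,rate=48000,channels=2 ! \
         webrtcechoprobe ! audioconvert ! alsasink";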

3.3 Navigation interfaces

Several navigation interfaces were designed and tested during the development. The developed navigation interfaces were divided into categories based on their characteristics:

• Button based navigation interfaces

• Hand orientation based navigation interfaces

• Head orientation based navigation interfaces

This section describes the different interfaces and their advantages and disadvantages.

3.3.1 Button based navigation interfaces

The first interfaces implemented were based solely on the Vive controller's buttons.

The biggest drawback of this type of interface was that the user had to face the same direction as the robot in order for the controller to make sense. If the user was facing another direction, steering the robot was very difficult and confusing.

3.3.1.1 Car inspired remote control

This interface used the trigger button on the back of the controller as a speed input (like a gas pedal) and the trackpad on the front in order to turn, see Figure 3.7.

This interface was rejected early, as it required a lot of cognitive work from the user to keep track of both the trackpad and the trigger button to achieve a desired movement. This interface also did not support moving backwards.

3.3.1.2 Touch based remote control

This interface used the distance from the middle of the trackpad to the finger to determine both speed and direction of the robot, see Figure 3.8. The trigger button on the back of the controller was used to increase speed if the user wanted to move faster.

This interface required less cognitive work from the user, as the robot could be controlled with only one input, and it also allowed the user to move backwards. The most appreciated ability of this interface was sliding a finger over the trackpad to steer. However, when the user slid from the upper half to the lower half of the trackpad, the robot started to move backwards, which could be confusing. A sketch of this mapping follows after the figure captions below.

Figure 3.7: Car inspired remote control. The trackpad on the front (seen to the left) was used to turn, the trigger button on the back (seen to the right) was used to control the speed. Image courtesy of developer.viveport.com.

Figure 3.8: Touch based remote control. When the thumb rested in the middle of the trackpad (or did not touch the trackpad) the robot was still, the further out to the edge the thumb was located, the faster the robot was moving in that direction. The trigger button on the back of the controller was used to increase the speed if the user wanted to move even faster. Image courtesy of developer.viveport.com.
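A sketch of the touch based mapping described above: the thumb's offset from the trackpad center gives direction and speed, and the trigger raises the speed ceiling. The function and its constants are illustrative placeholders:

    /// Map a trackpad touch and trigger value to a steering command.
    /// `(x, y)` is the thumb position with the pad center at (0, 0) and
    /// the edges at +/-1; `trigger` is 0.0..1.0. Returns (forward, rotation).
    fn touch_to_command(x: f32, y: f32, trigger: f32) -> (f32, f32) {
        const BASE_FORWARD: f32 = 0.4;  // m/s without the trigger (placeholder)
        const MAX_FORWARD: f32 = 0.6;   // m/s with the trigger fully pressed
        const MAX_ROTATION: f32 = 0.9;  // rad/s

        // Speed grows with distance from the pad center; the trigger
        // raises the speed ceiling. Negative y moves the robot backwards.
        let max_speed = BASE_FORWARD + trigger * (MAX_FORWARD - BASE_FORWARD);
        (y * max_speed, x * MAX_ROTATION)
    }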

3.3.2 Hand orientation based navigation interfaces

These interfaces used the orientation and/or the position of the controller as an input in order to decide where to move.

3.3.2.1 Point based navigation

This interface used the controller's orientation to determine which direction to move the robot, and the trigger button on the back of the controller as velocity input, see Figure 3.9. Only one controller acted as input.

Figure 3.9: Point based navigation. The direction of the controller determined which direction the robot would move in. The trigger button on the back of the controller was used to control the speed. Image courtesy of developer.viveport.com.
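A sketch of the point based mapping: the controller's pointing direction sets the direction of travel and the trigger sets the speed. The coordinate convention (forward along negative z, as in OpenVR) and the turn gain are assumptions:

    /// `controller_forward` is the controller's forward vector expressed in
    /// a robot-aligned frame (the real transform lives inside OdenVR).
    /// Returns a (forward m/s, rotation rad/s) command.
    fn point_to_command(controller_forward: [f32; 3], trigger: f32) -> (f32, f32) {
        const MAX_FORWARD: f32 = 0.6; // m/s
        const TURN_GAIN: f32 = 1.0;   // rad/s per radian of heading error

        // Project the pointing direction onto the horizontal plane and
        // take its heading angle; 0 means straight ahead of the robot.
        let heading = controller_forward[0].atan2(-controller_forward[2]);

        // Drive at trigger speed while turning toward the pointed heading.
        (trigger * MAX_FORWARD, TURN_GAIN * heading)
    }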

3.3.2.2 Crosshair inspired navigation

This navigation interface used the positions of both the headset and the controller. The robot would turn in the direction of the controller relative to the head, as if aiming with a crosshair, see Figure 3.10. The speed of the robot was determined by the distance between the controller and the headset in the horizontal plane, so that moving the controller further out made the robot move faster.
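A sketch of the crosshair mapping: the horizontal offset between controller and headset gives the direction of travel, and the offset's magnitude gives the speed. The gain and speed cap are placeholders:

    /// Positions are in meters in the tracking space. Returns the speed
    /// (m/s) and the heading angle (rad, 0 = straight ahead) to move in.
    fn crosshair_to_command(head: [f32; 3], controller: [f32; 3]) -> (f32, f32) {
        const SPEED_GAIN: f32 = 1.0;  // (m/s) per meter of arm extension
        const MAX_FORWARD: f32 = 0.6; // m/s

        // Horizontal offset of the controller relative to the head.
        let dx = controller[0] - head[0];
        let dz = controller[2] - head[2];

        let distance = (dx * dx + dz * dz).sqrt();
        let heading = dx.atan2(-dz);

        ((SPEED_GAIN * distance).min(MAX_FORWARD), heading)
    }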

3.3.3 Head orientation based navigation interfaces

These interfaces used the orientation and/or the position of the head as an input in order to decide where to move.

3.3.3.1 Gaze based navigation

This interface used the head orientation to determine the direction of the robot and the trigger button on the back of the controller as velocity input, see Figure 3.11.

In order for this input system to work, the head tracking coordinate system had to be aligned with the cameras' coordinate system relative to the robot. This was done by rotating the cameras' local coordinate system until it matched the SteamVR coordinate system. This calibration lasts as long as the cameras' position on the robot is fixed.

Figure 3.10: Crosshair inspired navigation. The position of the controller relative to the head determined which direction the robot would move in. In this example the robot would move straight forward. The speed of the robot was determined by the distance between the controller and the headset in the horizontal plane. Image courtesy of developer.viveport.com.

Figure 3.11: Gaze based navigation. The robot moved in the direction that the headset was oriented. The trigger button on the back of the controller was used to control the speed. Image courtesy of developer.viveport.com.

3.3.3.2 Touch based remote control relative head direction

This navigation interface is an extension to Touch based remote control explained in section 3.3.1.2.

This control system was designed to ease steering of the robot in cases where the user has turned his/her body in the real world. The straight-forward command on the controller will always correspond to the direction the user is looking, so the user no longer has to keep track of where forward is in the real world.

However, it turned out that users do not usually turn around much while steering the robot, and the dual input system made it harder to steer the robot than using only the buttons.

Chapter 4

User evaluation

Two evaluations were made during this project. The first was about the navigation interface. The result from this evaluation was used in order to choose a navigation interface for the second evaluation, that compared the telepresence experience with video communication. This chapter will describe how the two evaluations were performed.

4.1 Evaluation of navigation interface

In order to answer the question of how to design an intuitive navigation interface, the developed navigation interfaces were evaluated through a user study.

A qualitative test was performed with 5 participants, 3 male and 2 female. The test persons were of different ages, between 26 and 49 years old. Half of the test group had previous experience of controlling remote virtual characters in computer games, while the other half had little or no previous experience of this type of interaction.

In order to shorten the test procedure to a reasonable length, only one navigation interface from each category was chosen for the user test. The chosen navigation interfaces were: Touch based remote control (see 3.3.1.2), Point based navigation (see 3.3.2.1) and Gaze based navigation (see 3.3.3.1). The test took 20-30 minutes per participant.


4.1.1 Procedure

The evaluation consisted of 3 iterations, one for each steering interface. Each iteration consisted of a quick explanation of the steering interface, whereupon the test person was instructed to perform a task with the robot. When the first task was fulfilled, the test person was instructed with the next task while still wearing the HMD. For the first task, the test person was instructed to follow the path described in Figure 4.1. The second task was to find an object placed in an area close by using vision, and if necessary move closer to the object for identification. The time it took to complete each task was measured with a stopwatch by the test leader. When both tasks were performed, the user was instructed to remove the HMD and answer a questionnaire on a computer. This procedure was repeated for each steering interface.

Figure 4.1: The first task is to move the robot from the start position (marked with S) to point A and further on to point B.

After all the test iterations were done, a few final questions were asked about which steering interface the user preferred. The full questionnaire used in the test is included in Appendix B. By alternating the tests with questions, the user would easily remember the experience the questions referred to. It also gave the user some rest between the tests, giving the user the ability to enter the next test with fresh eyes. The order in which the steering interfaces were tried was altered between the test persons. The test persons were also instructed to abort the test if they experienced too much dizziness; however, none of the participants felt the need to abort the test.

4.2 Evaluation of telepresence as a communication tool

In order to evaluate the telepresence robot as a communication tool compared to a video call, the two communication methods were evaluated through a user study. The video communication tool used in this evaluation was Skype.

A subset of the measurable variables suggested in section 2.5.2 was chosen. The selected variables were:

• Efficiency

• Amount of participation

• Process Satisfaction

• Solution Satisfaction

• Co-presence

• Cognitive load

• Negative socio-emotional behavior

The evaluation study was performed with 24 participants, 16 male and 8 female. The test persons were of different ages, between 20 and 53 years old, with a mean age of 28. The test group had a lot more previous experience of video calls than of controlling remote virtual characters.

4.2.1 Measurement method

The communication variables were measured in different ways. Efficiency was measured through observations by the test leader. The rest of the variables were measured through a questionnaire to the participants. Appendix A contains a complete description of how the variables were measured.

4.2.2 Procedure

The evaluation took place in 2 rooms and required two participants per session. The evaluation consisted of 2 iterations, one for evaluating the telepresence communication and one for evaluating the video communication. Each iteration consisted of a cuboidal puzzle where the remote participant saw the solution and had to communicate it to the participant who was at the puzzle, see Figure 4.2. After the participants had solved the puzzle, they answered a questionnaire about how they perceived the communication. The time it took to complete the puzzle was measured with a stopwatch by the test leader. It was also noted whether the puzzle was laid correctly or if there were any mistakes.

Figure 4.2: To the left is the remote person's view when using video communication; the laptop displayed both the video feed and a 3D model of the complete puzzle. To the right is the remote person's view when using telepresence communication; a 3D model of the complete puzzle was positioned above the user's hand control (shown in blue).

           Telepresence   Video
Puzzle 1   3              3
Puzzle 2   3              3

Table 4.1: The order of which puzzle and communication method to start with was varied as much as possible. Which puzzle to perform for each communication method was also varied.

In the user study twelve tests were performed in total. Since the tests consisted of trying both the telepresence and the video communication, two puzzles were created so that the interactant would not know the solution beforehand when using the next communication method. The order of which puzzle and communication method to start with was varied as seen in Table 4.1. The table shows that 3 of the tests started with the task of solving puzzle 1 using telepresence communication. The other half of those tests would therefore consist of solving puzzle 2 using video communication.

Chapter 5

Results

5.1 Steering interfaces

The result displayed in Figure 5.1 shows that point based navigation was the most popular navigation interface.

Figure 5.1: The answer to which navigation interface the user liked the best.

There was no particular difference in performance between the participants with more technical experience and the ones without. The participants did increase their skill between the trials, but that did not affect their perception of which movement system they liked the best.


5.2 Telepresence experience

The telepresence communication was evaluated by 7 variables: Efficiency, Participation, Process Satisfaction, Solution Satisfaction, Co-presence, Cognitive load and Negative Socio-emotional behavior.

Efficiency was measured with time and accuracy. The result in Table 5.1 shows that the telepresence communication was clearly more efficient than video communication.

Variable      Telepresence   Video
Correctness   100%           83%
Mean time     5:10           5:43

Table 5.1: Time and correctness comparison between telepresence and video communication. The given time is the mean value of the time it took to complete the puzzle.

The other values were measured by a questionnaire answered by the participants. This enables these variables to be analyzed separately for the inhabitor and the interactant.

5.2.1 The interactant

Table 5.2 contains a summary of the interactants' answers to the questions in Appendix A. The answers indicate that the telepresence experience had little impact on participation, process satisfaction, solution satisfaction, and negative socio-emotional behavior for the interactant.

Variable                            Telepresence   Video
Amount of participation             2.95           2.87
Process Satisfaction                3.23           3.38
Solution Satisfaction               3.54           3.42
Negative Socio-emotional Behavior   1.31           1.42
Co-presence                         3.16           2.89
Cognitive Load                      2.00           2.58

Table 5.2: The mean of the measured variables from the interactants' answers. The range of all variables was 1-4; the higher the number, the more participation, satisfaction, etc. Values rounded to two decimal places.

The interesting variables in the result are co-presence and cognitive load. The telepresence experience showed an increased perception of co-presence compared to video communication. It also put less of a cognitive load on the interactant, which is desirable. These two variables give the telepresence experience a great advantage compared with traditional video communication.

When directly asking the test persons which communication tool they preferred, the interactants greatly favored the telepresence experience, as seen in Figure 5.2. The main reasons mentioned by the test persons were that the robot had a better view, and that it was easier to concentrate on the task at hand. One of the main problems with the video communication was that the interactant always had to think about how to direct the camera.

Figure 5.2: The interactant’s preferred communication method.

5.2.2 The inhabitor

Table 5.3 contains a summary of the inhabitors' answers to the questions in Appendix A. It shows that the inhabitors were more satisfied with the process and solution when using video communication. The telepresence communication also increased the cognitive load of the inhabitor. The amount of participation and negative socio-emotional behavior were slightly in favor of the telepresence communication; however, the difference was too small to draw any conclusions from it.

The variable with the largest difference in the result was co-presence. The telepresence communication showed a big increase in perceived co-presence compared to video communication.

Variable                            Telepresence   Video
Amount of participation             2.95           2.90
Process Satisfaction                3.06           3.23
Solution Satisfaction               3.58           3.79
Negative Socio-emotional Behavior   1.11           1.20
Co-presence                         3.50           2.69
Cognitive Load                      2.08           1.75

Table 5.3: The mean of the measured variables from the inhabitors' answers. The range of all variables was 1-4; the higher the number, the more participation, satisfaction, etc. Values rounded to two decimal places.

When asking the test persons which communication tool they preferred, a majority of the inhabitors answered the telepresence experience, as seen in Figure 5.3. The main reason was that the robot offered a better overview since the field of view was superior in the telepresence experience. The increased immersion and the ability to control the camera position were also strongly contributing factors for the telepresence experience. A drawback in the telepresence experience was the limitations in movement.

Figure 5.3: The inhabitor's preferred communication method.

5.2.3 The combined result

In order to determine an overall result for the communication tools, the answers from the interactant and inhabitor were combined, as illustrated in Table 5.4.

The overall participation and solution satisfaction were quite similar between the two communication methods. The telepresence experience placed less cognitive load on the users, and there was also less negative socio-emotional behavior in that communication.

Variable                            Telepresence   Video
Amount of participation             2.93           2.88
Process Satisfaction                3.15           3.30
Solution Satisfaction               3.56           3.60
Negative Socio-emotional Behavior   1.21           1.35
Co-presence                         3.33           2.79
Cognitive Load                      2.04           2.15

Table 5.4: The mean of the measured variables for the combination of both the interactants' and inhabitors' answers. The range of all variables was 1-4; the higher the number, the more participation, satisfaction, etc. Values rounded to two decimal places.

The telepresence experience also showed a big increase in perceived co-presence compared to video communication. However, video communication still holds a stronger process satisfaction.

Chapter 6

Discussion

This chapter discusses the results and observations from the evaluation tests as well as other factors that influenced the project.

6.1 Navigation interface evaluation

When it came to steering the robot, Gaze based navigation and Point based navigation were the easiest navigation interfaces. The test group found that the gaze based navigation gave the best immersion; however, it also increased the risk of dizziness, since the view rotated when the user turned his/her head. Because of the robot's slow movement speed, the users also liked to be able to look around while moving in a fixed direction.

In the gaze based steering interface, one user expected the robot to always turn so that it matched where the user was facing, and thereby be ready to move in the desired direction on cue. In the user test, the robot did not move until the user commanded it to, resulting in a sharp turn if the user had moved his/her head through a large angle while the robot was still. However, constructing this kind of interaction was considered desirable, as the users said that the biggest source of dizziness in this interaction was the turns.

The users also commented that the cameras' color perception was bad and that it was hard to see in dark areas. Another comment was that the image stitching needed to improve, especially near the ground. These drawbacks did not affect the steering ability; however, they did increase the dizziness factor of the experience.

To summarize, Point based movement was the best steering interface in this user test. Overall the users did feel present at the remote location. However, there are still things to improve, like using an inertial measurement unit (IMU) to stabilize the images for the user while the robot turns.

As a result of the study, an attempt was made to make the steering interface more understandable by adding visual cues showing how the robot would drive. The visual cues were designed as abstract wheel tracks that pointed in the direction the robot was going to move (see the gray line in Figure 4.2). This tool proved helpful when navigating the robot for the first time.

6.2 Telepresence compared to video communication

6.2.1 Measurement method

The measurement methods for the communication variables described in Appendix A were developed in different ways.

The measurement method for the variables Participation, Process Satisfaction, Solution Satisfaction and Negative Socio-Emotional Behavior followed the method for measuring communication presented by Green et al. [10].

The measurement method for Efficiency was inspired by the Hassell et al. [11] study on video communication.

The measurement method for Co-presence and Cognitive load was developed by the author.

6.2.2 Evaluation procedure

Since the test consisted of trying both the telepresence and the video communication, two puzzles were created so that the interactant would not know the solution beforehand when performing the next communication method. However, when creating two puzzles there is no guarantee that the puzzles will have exactly the same difficulty. Also, the participant might still have become better at the task between the tests, which might have an influence on the result. In order to decrease the influence of learning and difficulty variances, the test order was varied as much as possible, as displayed in Table 4.1.

6.2.3 Result

The users were more satisfied with the process when using video communication, which is quite surprising given that video communication provided more negative socio-emotional behavior and a larger cognitive load than the telepresence communication.

The telepresence communication increased the cognitive load of the inhabitor, however not as much as it eased the load of the interactant. When combining the result, the total cognitive load was less in the telepresence communication.

6.2.3.1 The interactant

For the interactant telepresence communication was a clear winner in the comparison.

The telepresence experience was said to be more engaging to interact with, since the interactant used their hand to point at objects. This was perceived to be a more natural way to convey information than in the video communication, where the interactant pointed at an object by directing the camera towards that object.

Some of the participants perceived video communication to be faster and more flexible when it comes to moving around. Another advantage of video communication was that the interactant could see the inhabitor's face. Still, a majority of the participants liked the telepresence communication better because of the cognitive relief of not having to concentrate on the camera.

One interesting finding in Table 5.2 is that while Participation, Solution Satisfaction and Negative Socio-Emotional Behavior have slightly better values for the telepresence experience than for the traditional video communication method, the interactant seems to slightly prefer the video communication process.

As described in Appendix A, process satisfaction was measured by 4 questions. By analyzing Figures 6.1 and 6.2 one can see that the most contributing reasons to this result are that the participants found the video solution more understandable and effective.

Figure 6.1: The interactant's process satisfaction for telepresence, average 3.23.
Figure 6.2: The interactant's process satisfaction for the video communication, average 3.38.

As explained in section 4.2, the test group was more familiar with video communication than with telepresence communication. With that in mind, one can argue that the test subjects' familiarity with video communication could have influenced the result. On the other hand, it is likely that telepresence communication will face similar conditions within the next few years, and it should therefore be ready to overcome this challenge.

Another interesting observation was that the users were more satisfied with the solution when using video communication, while in reality more errors were committed with video communication. As seen in Table 5.1, all puzzles completed with the telepresence robot were correct, while only 83% of the puzzles completed using video communication were correct.

To summarize, the telepresence experience was the best communication tool for the interactant, both with regard to the measured variables and in the users' opinions.

6.2.3.2 The inhabitor

For the inhabitor, telepresence communication was the winner in the comparison, although by a small margin.

Among the participants who preferred video communication, familiarity with the technology was one of the reasons. Many of the users had not tried VR before, and some thought that the experience in itself was a distraction from the task. Another reason for preferring video was the limited movement of the telepresence robot. A human is much more flexible when it comes to moving around objects compared to a robot.

Several of the inhabitors' comments about the telepresence experience concerned the physical limitations of the robot. The most sought-after function was the ability to point at objects. One person felt so limited by the lack of a pointing option that it was the decisive factor for choosing Skype over the telepresence robot when asked which communication tool they preferred. One suggestion was to implement a laser pointer for the inhabitor to control with the VR wand.

To summarize, the telepresence experience in its current condition was not the obvious communication tool for the inhabitor with regard to the measured variables. However, it was still slightly preferred over the video communication tool in the users' opinions. It can be claimed that, from the inhabitor's perspective, telepresence is about as good a communication tool as video.

6.3 Telepresence robot

The biggest challenge when assembling the robot was to supply power to all of its electrical components. There was already an existing power supply for the stepper motors and the video cameras, but this solution needed to be reworked with the addition of a Raspberry Pi, microphone, speaker, WiFi bridge and a switch. Soldering of various wires and adapters was required in order to make the robot fully wireless.

6.3.1 ZOOM H2N microphone

The ZOOM H2N microphone is a very practical microphone with a small size and good sound quality. However, in a setup where the microphone is connected to a computer through USB, one needs to manually start its microphone feature each time the microphone powers up. This is because the microphone lets the user choose between transferring files to the computer and acting as a microphone. Ideally, the microphone feature would be activated automatically, so that no extra manual steps are required after powering the system.

6.3.2 Camera height

During the evaluation described in section 4.1, the robot had a fixed camera position 182 cm above the ground, which matches the eye level of a person around 197 cm tall (eye level sits roughly 15 cm below the top of the head). The test group were between 166 and 183 cm tall, and all but one commented on the height difference while driving the robot. None of the participants found the height difference distracting while driving, but it reduced the immersion of the experience, as it made it more obvious that the robot was not their own body.
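As a rough worked example (the eye-height ratio of about 92-93% of stature is a common anthropometric rule of thumb, not a figure from the thesis):

# Stature whose eye level matches a given camera height, assuming
# eye level sits at roughly 92.5% of body height.
EYE_HEIGHT_RATIO = 0.925

def matching_stature(camera_height_cm):
    return camera_height_cm / EYE_HEIGHT_RATIO

print(round(matching_stature(182)))  # ~197 cm, as stated above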

Worth noting is that the participants commented on the height difference before being asked about it, or even being asked about their own height. The fact that they pointed it out indicates that camera height is an important part of a telepresence experience.

Another interesting note is that the test person who did not notice the height difference was not the tallest person in the test group. This indicates that tolerance to height differences in telepresence experiences varies from person to person.

6.4 Audio communication

In the first implementation of the audio communication, the microphone on the robot picked up the sound from the speakers and returned it to the inhabitor as an echo. When testing the implementation, this was found to be so distracting that echo cancellation had to be implemented for the audio solution to be acceptable.

In order to solve the problem, a webrtc element (see subsection 3.2) was added to the sound pipeline. The webrtc element uses a probe to analyze the outgoing sound on the speaker side in order to remove those frequencies from the audio transmitted from the microphone. For the webrtc element to work, the audio sample rate had to be 48 kHz rather than the ordinary 44.1 kHz, which means that the properties of the microphone may have to be changed for the pipeline to work.
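The thesis does not give the exact pipeline. A minimal sketch using GStreamer's webrtcdsp and webrtcechoprobe elements (the transport, ports, host address and codec are assumptions for illustration) could look like this:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Playback branch: webrtcechoprobe taps the audio sent to the speaker
# so the DSP knows what signal to subtract from the microphone input.
playback = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,media=audio,'
    'encoding-name=OPUS,payload=96" ! rtpjitterbuffer ! rtpopusdepay ! '
    'opusdec ! audioconvert ! audioresample ! audio/x-raw,rate=48000 ! '
    'webrtcechoprobe ! autoaudiosink'
)

# Capture branch: webrtcdsp performs the echo cancellation using the
# probe above; note the 48 kHz requirement mentioned in the text.
capture = Gst.parse_launch(
    'autoaudiosrc ! audioconvert ! audioresample ! '
    'audio/x-raw,rate=48000 ! webrtcdsp echo-cancel=true ! '
    'opusenc ! rtpopuspay ! udpsink host=192.168.1.10 port=5001'
)

playback.set_state(Gst.State.PLAYING)
capture.set_state(Gst.State.PLAYING)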

Chapter 7

Conclusion and future work

The conclusions drawn are summarized and discussed in this chapter. They are based on the user tests described in Chapter 4 and the research questions from section 1.4. Future work is also discussed in this chapter.

7.1 Navigation interfaces for telepresence

To the author's knowledge, there is no telepresence robot that can completely mimic a full human body in real time. Navigation interfaces for telepresence must therefore maintain a good balance between immersion and abstraction. With the types of robots available today, the abstraction is crucial to reduce motion sickness in telepresence. Visual cues about how the robot will move are also important to the navigation interface.

Point based navigation, as described in section 3.3.2.1, has a good balance between immersion and abstraction. It works well for manual navigation, especially combined with visual cues, but it does increase the cognitive load on the user. As discussed by Pang et al. [1], this interface would probably benefit from a complementary high-level navigation method.


7.2 Measurement method for communication

Based on the relevant literature discussed in section 2.5.2, the author was able to identify 7 communication variables that could be measured as described in Appendix A.

The 7 variables were:

• Efficiency

• Participation

• Process Satisfaction

• Solution Satisfaction

• Co-presence

• Cognitive load

• Negative Socio-Emotional Behavior

None of the 7 variables was extensive enough to evaluate the communication on its own. Together, however, they formed a good foundation for evaluating remote communication in a way that allows comparison between different communication methods.

7.3 Telepresence as a communication tool in comparison with video communication

By evaluating Table 5.4 combined with Table 5.1, the following conclusions can be drawn about the communication variables (summarized in Table 7.1):

By combining the extracted variables, the conclusion can be drawn that telepresence surpasses video as a communication tool. This can be viewed as an impressive feat, as the robot used in this project was very unpolished. However, as discussed in section 6.2.3.2, the telepresence communication was far more appreciated by the interactant than by the inhabitor.

Variable                              More optimal communication tool
Efficiency                            Telepresence communication
Participation                         –
Process Satisfaction                  Video communication
Solution Satisfaction                 –
Co-presence                           Telepresence communication
Cognitive load                        Telepresence communication
Negative Socio-Emotional Behavior     Telepresence communication

Table 7.1: The communication method that was preferable with regard to each measured variable. For the variables Participation and Solution Satisfaction, the two communication methods were equally good.

As of today, telepresence robots may still not be mature enough for general usage. However, given the results of this evaluation, telepresence robots will probably start competing with current communication tools in the not too distant future.

7.4 Future work

During the project, many ideas arose for how to improve the telepresence robot. Since the project was limited in time, not all ideas could be realized. The improvements the author would most like to see are semi-autonomous navigation and image stabilization through the use of an IMU.

7.4.1 Semi-autonomous navigation

One desirable navigation interface for the telepresence robot would be to point at a location to which the robot then travels automatically. As Pang et al. [1] pointed out (see section 2.5), this type of navigation could significantly reduce the cognitive load of the inhabitor. Unfortunately, there was not enough time to implement this type of navigation in this project.
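Purely as an illustration of what such navigation could build on (this is not the thesis implementation; the function name and gains are made up), a minimal go-to-point controller for a differential-drive robot might look like:

import math

def go_to_point(pose, goal, k_lin=0.5, k_ang=1.5):
    """One control step toward a goal point.
    pose = (x, y, theta) in world coordinates, goal = (gx, gy).
    Returns (v, omega): forward and angular velocity commands."""
    x, y, theta = pose
    gx, gy = goal
    heading_error = math.atan2(gy - y, gx - x) - theta
    # Wrap the error into [-pi, pi] so the robot takes the shorter turn.
    heading_error = math.atan2(math.sin(heading_error),
                               math.cos(heading_error))
    distance = math.hypot(gx - x, gy - y)
    v = k_lin * distance * math.cos(heading_error)  # slow down when misaligned
    omega = k_ang * heading_error
    return v, omega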

7.4.2 Image stabilization

As discussed in section 6.1, an IMU could significantly improve the telepresence robot by enabling it to rotate without the user noticing and by stabilizing the image feed as the robot moves. The telepresence robot is in fact equipped with an IMU, and some development toward integrating it into the telepresence solution was started. However, due to time constraints it was never fully implemented and had to be left for future work.
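The core idea, assuming the IMU yaw is available and the viewer's look direction is expressed as a yaw angle (names illustrative), is to counter-rotate the rendered 360 degree viewport by the robot's measured rotation:

def stabilized_view_yaw(user_yaw, imu_yaw, imu_yaw_at_start):
    """Counter-rotate the 360-degree viewport by how much the robot
    has turned since streaming started, so the scene stays fixed
    from the user's point of view while the robot rotates."""
    robot_rotation = imu_yaw - imu_yaw_at_start
    return user_yaw - robot_rotation

Because the 360 degree camera already covers all directions, this compensation needs no extra image data, only a yaw offset in the renderer.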

7.4.3 Global network

This project used a local network solution to test the concept of telepresence communication. However, in order to utilize telepresence for long-distance telecommunication, a global network solution is required. The use of a global network is expected to increase the latency of both video and sound. For telepresence robots based on two cameras that move according to the user's head movements, this kind of latency would cause motion sickness. But with the 360◦ camera solution used in this project, the tracking system is detached from the video feed, which should make it somewhat more latency tolerant. It would be interesting for future work to investigate how much latency would be acceptable in this kind of telepresence solution.

Appendix A

Communication measurement method

This appendix covers how the communication variables were measured.

A.1 Efficiency

Efficiency was measured with time and accuracy, which were noted by the test leader.

A.2 Participation

Participation was measured by 5 questions in the questionnaire. All questions were on a four-grade scale.

Made suggestions about doing the task (Not at all - Very much)
Gave information about the problem (Not at all - Very much)
Asked others for their thoughts or opinions (Not at all - Very much)
Showed attention and interest in the team's activities (Not at all - Very much)
Asked for suggestions from others on the team (Not at all - Very much)


A.3 Process Satisfaction

Process satisfaction was measured by 4 questions in the questionnaire. All questions were on a four-grade scale.

How would you describe your team's problem solving process? (Inefficient - Efficient)
How would you describe your team's problem solving process? (Uncoordinated - Coordinated)
How would you describe your team's problem solving process? (Confusing - Understanding)
How would you describe your team's problem solving process? (Dissatisfying - Satisfying)

A.4 Solution Satisfaction

Solution satisfaction was measured by 2 questions in the questionnaire. All questions were on a four-grade scale.

To what extent do you feel committed to your team's solution? (Not at all - Very much)
To what extent are you confident that your team's solution is correct? (Not at all - Very much)

A.5 Negative Socio-Emotional Behavior

The amount of negative socio-emotional behavior was measured by 3 questions in the questionnaire. All questions were on a four-grade scale.

Felt frustrated or tense about others' behavior (Not at all - Very much)
Rejected others' opinions or suggestions (Not at all - Very much)
Your opinions or suggestions were rejected (Not at all - Very much)

A.6 Co-presence

Co-presence was measured by 3 questions in the questionnaire. All questions were on a four-grade scale.

How good track did you feel that you had on your partner's activity? (Not at all - Very much)
How good track did you feel that your partner had on your activity? (Not at all - Very much)
How present did you feel your partner was at the scene? (Not at all - Very much)

A.7 Cognitive load

The amount of cognitive load was measured by 1 question in the questionnaire. The question was on a four-grade scale.

How much did you concentrate on the puzzle and how much did you concentrate on the robot? (Concentrate on the robot - Concentrate on the puzzle)

Appendix B

Evaluation Questionnaire

The following questionnaire was used in the evaluation of the steering interface. The first part was presented to the user before the test started, the second part after each test session, and the last part after all tests had been completed, as described in section 4.1.

User interface evaluation

Welcome and thanks for participating in this evaluation.

You will test 3 different steering experiences for remotely controlling a robot. Each test consists of 2 tasks for you to perform. The performance will be measured by time; however, it is not you who is being measured but the interaction method, so don't feel any pressure. After each test you will be asked to answer 5 questions regarding the experience.

First a few background questions for statistical reasons.

Age? ______

Gender? - Male - Female

What type of industry do you work in? ______

How much experience do you have of remotely controlling objects or virtual characters? For example, do you play first- or third-person computer games, or do you have a remotely controlled vehicle at home?

______

For each control model test: ------

How was it to move the robot in the desired direction? How much did you concentrate on how to drive the robot in comparison to performing the task you were given?

Very easy 1 2 3 4 Very hard

How immersed did you feel while driving the robot?

I felt like I was the robot 1 2 3 4 I felt totally disconnected from the place the robot was in

Did you experience any dizziness during the interaction?

None at all 1 2 3 4 A lot of dizziness

Did you experience any dizziness after the interaction?

None at all 1 2 3 4 A lot of dizziness

Do you have anything else to say about the experience? ______

How long did it take to complete the movement test? (Filled in by the administrator) ______

How long did it take to find the object? (Filled in by the administrator) ______

------

Final questions:

Which of the movement controls did you like the most?
- Point based movement
- Touch based remote control
- Gaze based movement
- Other: ______

What made that control model better than the others? ______

How tall are you? (in cm) ______

Did you notice any difference in height between you and your robot avatar? If so, how did that make you feel? Was it distracting when you performed your task?

______

How did you experience the movement speed? Was the speed easy or hard to regulate? Was the maximum speed too fast or too slow?

______

Do you have anything else you would like to add? Anything that comes to mind, maybe ideas for future control models?

______

Bibliography

[1] W.C. Pang, G. Seet, and X. Yao. “A Study on High-Level Autonomous Navigational Behaviors for Telepresence Applications.” In: Presence: Teleoperators and Virtual Environments 23.2 (2014), pp. 155–171. issn: 15313263. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&AuthType=ip,uid&db=edselc&AN=edselc.2-52.0-84905509642&lang=sv&site=eds-live&scope=site.

[2] Thomas D. Parsons, Andrea Gaggioli, and Giuseppe Riva. “Virtual Reality for Research in Social Neuroscience.” In: Brain Sciences (2076-3425) 7.4 (2017), pp. 1–21. issn: 20763425. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=a9h&AN=122753109&site=eds-live&scope=site.

[3] Giuseppe Riva. “Virtual reality as communication tool: A socio-cognitive analysis.” In: Communications through virtual technologies: Identity, community and technology in the communication age. Studies in new technologies and practices in communication. IOS Press, 2001, pp. 47–56. isbn: 1-58603-162-7. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=psyh&AN=2001-18701-002&site=eds-live&scope=site.

[4] Allen J. Fairchild et al. “A Telepresence System for Collaborative Space Operation.” In: IEEE Transactions on Circuits and Systems for Video Technology 27.4 (2017), pp. 814–827. issn: 10518215. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&db=buh&AN=122420456&site=eds-live&scope=site.


[5] HTC Corporation. INSIDE THE HEADSET. url: https://www.vive.com/eu/product/ (visited on 05/26/2017).

[6] Valve. OpenVR API documentation. url: https://github.com/ValveSoftware/openvr/wiki/API-Documentation (visited on 05/10/2017).

[7] Yunde Jia et al. “Telepresence Interaction by Touching Live Video Images”. In: CoRR abs/1512.04334 (2015). url: http://arxiv.org/abs/1512.04334.

[8] Carlos Pérez Mejías. “Design of a telepresence interface for direct teleoperation of robots: The synergy between Virtual Reality and FreeLook Control”. MA thesis. KTH, Computer Vision and Active Perception, CVAP, 2016, p. 53.

[9] Ralph Schroeder. “Social Interaction in Virtual Environments: Key Issues, Common Themes, and a Framework for Research”. In: The Social Life of Avatars: Presence and Interaction in Shared Virtual Environments. Ed. by Ralph Schroeder. London: Springer London, 2002, pp. 1–18. isbn: 978-1-4471-0277-9. doi: 10.1007/978-1-4471-0277-9_1. url: http://dx.doi.org/10.1007/978-1-4471-0277-9_1.

[10] Stephen G. Green and Thomas D. Taber. “The Effects of Three Social Decision Schemes on Decision Group Process.” In: Organizational Behavior and Human Performance 25.1 (1980), pp. 97–106. issn: 00305073. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&AuthType=ip,uid&db=buh&AN=6336736&lang=sv&site=eds-live&scope=site.

[11] Martin D. Hassell and John L. Cotton. “Some things are better left unseen: Toward more effective communication and team performance in video-mediated interactions.” In: Computers in Human Behavior 73 (2017), pp. 200–208. issn: 0747-5632. url: https://login.e.bibl.liu.se/login?url=https://search-ebscohost-com.e.bibl.liu.se/login.aspx?direct=true&AuthType=ip,uid&db=edselp&AN=S0747563217301966&lang=sv&site=eds-live&scope=site.

[12] Seeed Technology. Step Motor 42BYGH47-401A Data Sheet. 2016. url: ftp://download.epson-europe.com/pub/download/3785/epson378504eu.pdf (visited on 03/02/2017).

[13] Anton Bondesson and Fredrik Johansson. Modellering av mekatroniksystem för spjällstyrning. 2013.

[14] Adafruit. What is a Stepper Motor? 2014. url: https://learn.adafruit.com/all-about-stepper-motors/what-is-a-stepper-motor (visited on 03/02/2017).

[15] Torkel Danielsson. Full-360◦ VR live-streaming development kit. url: http://odenvr.com/devkit/ (visited on 05/04/2017).

[16] GStreamer. GStreamer open source multimedia framework. url: https://gstreamer.freedesktop.org (visited on 03/14/2017).

[17] GStreamer. What is GStreamer? url: https://gstreamer.freedesktop.org/documentation/application-development/introduction/gstreamer.html (visited on 03/15/2017).

[18] GStreamer. Foundations. url: https://gstreamer.freedesktop.org/documentation/application-development/introduction/basics.html (visited on 03/15/2017).