Linköping University | Department of Computer Science Master thesis, 30 ECTS | Datateknik 2020 | LIU-IDA/LITH-EX-A--20/058--SE

A Comparison of WebVR and Native VR – Impacts on Performance and User Experience

Matteus Hemström Anton Forsberg

Supervisor: Vengatanathan Krishnamoorthi
Examiner: Niklas Carlsson

Linköpings universitet, SE-581 83 Linköping, +46 13 28 10 00, www.liu.se

Upphovsrätt

Detta dokument hålls tillgängligt på Internet – eller dess framtida ersättare – under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår. Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns lösningar av teknisk och administrativ art. Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart. För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/.

Copyright

The publishers will keep this document online on the Internet – or its possible replacement – for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Matteus Hemström, Anton Forsberg

Abstract

The virtual reality (VR) market has grown considerably in recent years. It is a technology that requires high performance in order to provide a good sense of presence to users. Taking VR to the web would open up a lot of possibilities but also poses many challenges. This thesis aims to find out whether VR on the web is a real possibility by exploring the current state of performance and usability of VR on the web and comparing it to native VR. The thesis also aims to provide a basis for discussions on the future of VR on the web. Two identical VR applications were built to make the comparison, one built for the web and one built for native Android. Using these applications, a two-part study was conducted, with one part focusing on performance and the other on the user experience. The performance evaluation measured and compared performance for the two applications, and the user study used two separate questionnaires to measure the users' experienced presence and VR sickness. The performance study shows that the web application is clearly lagging behind the native application in terms of pure performance numbers, as was expected. On the other hand, the user study shows very similar results between the two applications, contradicting what would be expected based on performance. This hints at successful mitigation techniques in both the hardware and software used. The study also suggests some interesting further research, such as investigating the relationships between performance, VR sickness, and presence. Other possible further research would be to investigate the effect of prefetching and adaptive streaming of resources, and how it could impact VR on the web.

Acknowledgments

We would like to express our greatest thanks to Niklas Carlsson. Without your help, motivation and continued interest through these years, this thesis would never have been finished.

A big thank you to Johanna and Mimmi for being our greatest supporters and always believing in us.

Contents

Abstract iii

Acknowledgments iv

Contents v

List of Figures vii

List of Tables 1

1 Introduction 2
1.1 Motivation ...... 2
1.2 Aim ...... 2
1.3 Research questions ...... 2
1.4 Delimitations ...... 3

2 Background 4
2.1 Valtech ...... 4
2.2 Valtech Store ...... 4

3 Theory 5
3.1 What is Virtual Reality? ...... 5
3.2 History of VR ...... 5
3.3 Immersion ...... 7
3.4 Presence and Telepresence ...... 7
3.5 Native VR ...... 10
3.6 VR on the Web ...... 10
3.7 Web versus Native ...... 16
3.8 The Importance of Performance ...... 16
3.9 Real-time 3D ...... 19
3.10 Prefetching and Level of Detail ...... 21

4 Method 23
4.1 Research Method - A Multi-method Strategy ...... 23
4.2 Virtual Experience Implementations ...... 25
4.3 Hardware - Test Devices and HMD ...... 29
4.4 Performance Measurements ...... 29
4.5 Presence and Simulator Sickness Measurements ...... 30
4.6 Analysis Procedures ...... 31

5 VR Applications 32

6 Experimental Results 35
6.1 Performance Test Procedure ...... 35
6.2 Performance Test Results ...... 35
6.3 SSQ Test Results ...... 38
6.4 IPQ Test Results ...... 40
6.5 Correlation between SSQ and IPQ ...... 41

7 Discussion 42
7.1 Results ...... 42
7.2 Method ...... 45
7.3 The work in a wider context ...... 45

8 Conclusion 47
8.1 Future Work ...... 47

Bibliography 49

A User Study Test Protocol 56

B Performance Test Protocol 57

C Questionnaires 58
C.1 Demographics Questionnaire ...... 58
C.2 Post-session Questionnaire ...... 59

D IPQ results 64

E SSQ results 65

F WebVR Specification 66

List of Figures

2.1 Valtech Store ...... 4

3.1 WebVR Overview ...... 13
3.2 WebGL Overview ...... 15

4.1 Implementation Overview ...... 26
4.2 Overview of the 3D Scene Building Process ...... 27
4.3 Component and Services Architecture ...... 28

5.1 Start view - WebVR application ...... 33
5.2 Start view - Native application ...... 33
5.3 Teleport indicator - WebVR application ...... 33
5.4 Teleport indicator - Native application ...... 34
5.5 Information window - WebVR application ...... 34
5.6 Information window - Native application ...... 34
5.7 Wall material menu - WebVR application ...... 34
5.8 Wall material menu - Native application ...... 34

6.1 Performance Measurements ...... 36
6.2 Individual FPS Measurements ...... 37
6.3 SSQ score change between pre and post exposure (calculated as post score − pre score) ...... 39
6.4 SSQ total score change between pre and post exposure without outliers ...... 39
6.5 IPQ scores for native compared to web (zero means equal scores, positive means native scored higher than web) ...... 40

List of Tables

3.1 Standardized loadings of items on respective subscales ...... 10
3.2 SSQ categories ...... 17

6.1 Performance Data Description ...... 38
6.2 Native correlations ...... 41
6.3 Web correlations ...... 41

CHAPTER 1

Introduction

1.1 Motivation

In recent years, the popularity of VR in the consumer market has grown considerably. There are great challenges with VR technology since it is very sensitive to performance and other factors that may influence usability and presence. The portability and mobility that come with the combination of smartphones and the web have proven very successful in the consumer market. Great investments in mobile VR products such as the Samsung Gear VR show that there is a clear drive to take VR to the mobile web.
Taking new technology and products to the web is not always easy. History reveals that demanding applications are often built for native platforms to take advantage of the performance and development tools provided by the native platform. For example, there is still an ongoing battle of "native versus web" in the business of smartphone applications.
With the combination of WebGL and WebVR it is possible to build VR experiences using web technology only, allowing VR to utilize the success and accessibility of the web platform. But there is a concern about performance: can real-time rendered VR on the web meet the high performance requirements?

1.2 Aim

The aim of this thesis is to find out if VR is ready for the web and vice versa. The thesis will explore the current state of performance and user experience of real-time rendered VR on the web by comparing it with native VR. The result of the thesis could be used as a basis when choosing whether to use a native platform or the web platform to develop a VR application. It should be noted that the technologies used to develop VR, and especially VR on the web, are developing rapidly; thus the aim of this thesis is also to discuss and predict the future of VR on the web.

1.3 Research questions

1. How does a WebVR application compare to a native VR application in terms of user experience?

2. Can the web platform deliver the performance required for VR applications in terms of frame rate, latency and other performance metrics?


1.4 Delimitations

This thesis focuses on the comparison of native VR and WebVR in terms of performance and experienced presence. Although VR is available both as native and web applications on multiple platforms and hardware devices, the measurements and user study in this thesis are limited to the Android platform and the GearVR headset due to limited time and access to hardware. The performance and user experience could also differ depending on the development process and choice of tools. The applications used in this thesis are developed using the game engine Unity and the JavaScript framework three.js. Other engines, such as Unreal, will not be taken into consideration.

CHAPTER 2

Background

The comparative study of this thesis was done at the Stockholm office of Valtech. The applications developed for the comparative study were VR experiences of a physical concept store, called Valtech Store. This chapter gives a short background about the company and Valtech Store.

2.1 Valtech

Valtech is a global IT consulting firm, specializing in digital experiences. Founded in France in 1993, Valtech has grown to more than 2500 employees in over 15 countries. In Sweden, Valtech has 265 employees in two offices and aims to be a digital partner focusing on technology, digital strategy and design.

2.2 Valtech Store

To stay relevant in a digital world, Valtech's customers need to be able to adapt and evolve regardless of business. This is also true for Valtech, who strives to stay up to date with trends and movements in the digital world. As a way to gather insights and knowledge, and to show what Valtech can offer customers, a smart online store concept called Valtech Store was created. Valtech Store is a physical experiment store at the Stockholm office where a number of prototypes are shown. These prototypes demonstrate how new and emerging technologies such as Internet of Things, Big Data and Virtual Reality can be used in a physical retail space.

Figure 2.1: Valtech Store

CHAPTER 3

Theory

This chapter provides a theoretical background for the thesis. First, we present a brief explanation of the term Virtual Reality, followed by a historical walkthrough of VR and an introduction to some terms and methods for measuring and evaluating VR experiences. After this, technical descriptions of WebVR and the Unity game engine used for the native application are given. The chapter also describes why performance is a critical factor for VR and the side effects that poor performance might lead to, and presents some ways of mitigating performance issues.

3.1 What is Virtual Reality?

To the general public, Virtual Reality (VR) is known through consumer-available products such as the Oculus Rift and HTC Vive. The term is understood as a set of technologies that enables immersive experiences for its users, experiences that are even more captivating than high definition movies and games. But in the scientific community, Virtual Reality is often interpreted in terms of human experiences. Back in 1992, Jonathan Steuer introduced a definition of Virtual Reality that is not tightly coupled to technological hardware, but instead based his definition on "presence" and "telepresence". He defined presence as “the sense of being in an environment”, telepresence as “the experience of presence in an environment by means of a communication medium”, and Virtual Reality as “a real or simulated environment in which a perceiver experiences telepresence” [1].
Virtual Reality has other, interchangeable terms such as "Artificial Reality", "Cyberspace", "Microworlds", "Synthetic Experience", "Virtual Worlds", "Artificial Worlds", and "Virtual Environment" [2] [3] [4]. Michael A. Gigante stated in his book from 1993 [4] that "many researcher groups prefer to avoid the term VR because of the hype and the associated unrealistic expectations”. He also writes that telepresence “represents one of the main areas of research in the VR community”, which is also supported by the fact that the longest-established academic journal in the field is named "Presence: Teleoperators and Virtual Environments" [5].

3.2 History of VR

The ideas and concepts behind Virtual Reality far predate the term itself and can be argued to date back to the 1800s. In 1838, the first stereoscope was invented by Charles Wheatstone [6], using two images and mirrors to give the viewer a sense of depth. Technology has come a long way since then, but the concept of stereoscopic images is still used in today's virtual reality.


3.2.1 The Beginning

In 1965, American computer scientist Ivan Sutherland introduced the concept of "The Ultimate Display" [7]. This concept describes the ultimate display as “a room within which the computer can control the existence of matter”, a description that fits well when applied to Virtual Reality technology of today. A few years later, Sutherland and his team at MIT built what is regarded as the first VR head-mounted display (HMD) system, called “The Sword of Damocles” [8]. The display was a large, ceiling-suspended device that a user strapped to his or her head. It was capable of displaying simple “wire-frame” shapes according to the user's head movements. In the following years, plenty of research was done in the field improving both computer graphics and display capabilities, laying the foundation for what was yet to come.

3.2.2 The First Wave - 1990s

In 1982, an advanced flight simulator that used an HMD to present information to pilots, called the Visually Coupled Airborne Systems Simulator (VCASS), was developed at the US Air Force Medical Research Laboratories. Two years later, in 1984, a stereoscopic HMD called the Virtual Visual Environment Display (VIVED) was developed at the NASA Ames research center [4]. But it was not until the late 1980s that the actual term “Virtual Reality” was coined by Jaron Lanier, founder of the Visual Programming Lab (VPL) [9]. VPL Research was the first company to develop and sell commercial VR devices, most notably the DataGlove and the EyePhone HMD [3]. By the early 1990s, Virtual Reality had become very popular in both the scientific community and the general population. The potential of Virtual Reality started to show, and a number of companies started to incorporate VR in their production and daily operations [10].
In 1994, Frederick Brooks held a lecture at the University of North Carolina called “Is There Any Real Virtue in Virtual Reality?”. In this lecture Brooks assessed the current state of VR, concluded that “VR almost works”, and listed a number of areas that needed to improve in order to fulfill Sutherland's 1965 vision. In 1999, Brooks revisited VR to see how much progress had been made in the five years since his lecture at UNC [11]. He stated “VR that used to almost work now barely works. VR is now real.” and found that VR was now actually in use, not only in research labs but in actual production. He finished the report by presenting the main challenges that still needed to be solved: minimizing latency, rendering large models in real time, choosing the best type of display for each application, and improving haptic augmentation. The research and further development of VR technology carried on, but the consumer market and public interest collapsed and would not return until over ten years later.

3.2.3 The Rise of VR - 2010s

In 2012, a small company called OculusVR, started by a VR enthusiast called Palmer Luckey, reignited the VR hype. Being a VR enthusiast, Luckey collected old VR HMDs for his own use. Eventually he grew frustrated with the poor performance and immersion these '90s era HMDs provided and decided to build his own version, which he called the Rift. One of the prototypes Luckey built eventually landed in the hands of John Carmack, creator of games such as Doom and Quake. Carmack, in turn, showed this prototype at the Electronic Entertainment Expo (E3), which led to articles on tech sites, and suddenly the Rift was the hottest device of the gaming industry [12]. In August 2012, the recently founded OculusVR launched a crowd-funding campaign on Kickstarter for their VR HMD, the Oculus Rift. The goal was to raise $250,000 in order to produce 100 HMDs. After 24 hours, the campaign had raised over $600,000 and when it was over the number had reached almost $2.5 million [13].
After the success of the Oculus Kickstarter campaign and the developer kits that followed, VR was once again a hot topic and more HMDs soon followed. As of today, there are a number of HMDs available on the consumer market. Apart from the Oculus Rift, the main alternatives are the HTC Vive and the Playstation VR [14] [15]. There are also a number of HMDs that work by inserting a smartphone into the headset. Of these, the main players are the Samsung GearVR, Google Daydream and the very cheap Google Cardboard [16] [17] [18].

3.3 Immersion

Immersion describes to what extent a system can deliver an environment that shuts out the real world. A good computer display will increase immersion because it will be able to provide greater detail. A set of noise-canceling headphones will increase immersion because it will hide the sounds of the real world. A good head tracking device will improve immersion because the user will be less likely to notice the computer rendered frames lagging behind. Immersion is thus defined in technological terms, and was well described by M. Slater and S. Wilbur in 1997 [19].
In a paper written by B. Witmer and M. Singer in 1998, another interpretation of immersion is used. They define immersion as “a psychological state characterized by perceiving oneself to be enveloped by, included in, and interacting with an environment that provides a continuous stream of stimuli and experiences” [20]. Immersion as described by B. Witmer and M. Singer is a psychological state affected by the quality of the virtual reality experience. They recognize technology as one of the factors that enables immersion. In 1999, M. Slater writes in response to B. Witmer and M. Singer that he will use the term "system immersion" to denote his meaning, and "immersive response" to denote the meaning of B. Witmer and M. Singer. This distinction is also made explicit by Schubert et al. in a paper written in 2001, in which they denote B. Witmer and M. Singer's meaning of immersion as "psychological immersion" [21]. To avoid confusion, we will follow the distinction made by Schubert et al. and simply use "immersion" to denote M. Slater's meaning while using the term "psychological immersion" to denote B. Witmer and M. Singer's meaning.

3.4 Presence and Telepresence

Telepresence was first explained by Marvin Minsky in an article published in 1980, in which he wrote about telepresence as a remote presence in which you could feel, see and interact at a remote location through telecommunication and robotics [22]. Sheridan and Furness introduced a new journal, "Presence: Teleoperators and Virtual Environments", in 1992 and continued to use Minsky's wording of telepresence only when dealing with teleoperation and remote control. Over time the two terms, presence and telepresence, started to merge and are sometimes used interchangeably. The International Society of Presence Research states that presence is a shortened version of telepresence [23].
In 1997, Lombard and Ditton published "At the Heart of It All: The Concept of Presence" [24], in which they examined the concept of presence by identifying the previous takes on presence found in literature and explicating the concept as a whole. Some of the takes on presence are conceptualized from social perspectives while others focus on perceptual and psychological immersion. Lombard and Ditton summarize the concept as “the perceptual illusion of nonmediation”, which occurs when “a person fails to perceive or acknowledge the existence of a medium in his/her communication environment and responds as he/she would if the medium were not there”, or as Biocca and Levy put it, “the real world is invisible” [25].
The Lombard-Ditton definition of presence is based upon perception, mediation, medium and environment; terms that are well discussed and explained by J. Gibson in his book "The Ecological Approach to the Visual Perception of Pictures" written in 1978 [26]. The book challenges earlier work in perception and brings multiple fields (physics, optics, anatomy and visual physiology) into account as he suggests “an entirely new way of thinking about perception and behavior”.

He writes that the “environment of animals and men is what they perceive”, which is not the same as the physical environment in which we also exist.

3.4.1 Measuring Presence

One of the first well known and established methods of measuring presence is the use of the presence questionnaire (PQ) and the immersive tendencies questionnaire (ITQ), introduced by B. Witmer and M. Singer in 1998 [20]. They base the presence questionnaire on the factors thought to contribute to presence: Control Factors, Sensory Factors, Distraction Factors and Realism Factors. The factors in these groups were derived from earlier work and are described in the first issue of Presence: Teleoperators and Virtual Environments. They conclude that individuals who report a high PQ score tend to report less simulator sickness.
About one year later, M. Slater criticizes the Witmer-Singer questionnaire in "Measuring Presence: A Response to the Witmer and Singer Presence Questionnaire" [27]. He writes that “The purpose of this note is to explain why I would never use the W&S questionnaire for studying presence - even though I am sure that in itself it can lead to useful insights about the nature of VE experience”. His reasoning is that the questions measure differences in the individuals and not the immersive system, meaning that, for example, individual skill and experience will influence the result. He also criticizes the scale and summarization of scores by arguing that “A question that has high variability across subjects in the experiment will obviously correlate more highly with the sum than one with a lower variance”. Freeman et al. find in 1998 that “prior training given to observers significantly affects their presence ratings” and discuss that measuring the subjective construct of presence is potentially unstable [28]. In "Using Presence Questionnaires in Reality", published in 1999, M. Usoh et al. warn that presence questionnaires such as the Witmer-Singer questionnaire should not be used for comparisons across environments since the subjects will “"relativise" their responses to presence questions to the domain of their given experimental experiences only”.
A good property of questionnaires is that they can be completed after the experiment, and thus do not interfere with the subject's focus during the experiment. This does, however, make the results influenced by the memory of the subject; events in the later part of the experiment might thus have a greater impact on the questionnaire result.
There are many different versions of questionnaires to measure presence. In the summarization of presence measurement techniques by W. Ijsselsteijn and J. van Baren in 2004, 28 presence questionnaires are evaluated, which they report as the largest category of methods [29]. In 2014, C. Rosakranse et al. identify the five canonical questionnaires as SUS (Slater, Usoh, Steed), PQ (Witmer, Singer), IPQ (Schubert, Regenbrecht, Friedmann), ITC-SOPI (Lessiter, Freeman, Keogh, Davidoff) and the Lombard-Ditton questionnaire [30].
The IPQ is made up of items from previous presence questionnaires. As such, it contains items from SUS, PQ and other questionnaires. It was first presented in 2001 by Schubert et al. in an article published in Presence [21]. The article reports two exploratory factor analyses conducted as two sequential dependent studies, followed by a confirmatory factor analysis that resulted in 14 items. Schubert et al. report that three distinct presence components were identified: spatial presence (SP), involvement (INV) and experienced realism (REAL).
Out of the 14 items, 5 items load on spatial presence, 4 items load on involvement, and 4 items load on experienced realism. There is one additional item, a general “sense of being there” (G1), that loads on all three components. The IPQ is based on and supported by previous work such as the ITC-SOPI. It supports the critique of Slater against the PQ. Further, it is a short questionnaire with categorized items, allowing distinction between the presence components.
In 2000, M. Slater and A. Steed described a new method to measure presence in "A Virtual Presence Counter" [31]. The motivation for a new measurement method for presence is anchored in, among others, the work by Freeman et al. Because of the subjective nature of presence, the subjects can “clearly be influenced in their responses by the information that they gather during the course of the experiment” [31].

Any data accumulated and results derived thereof may be influenced by the expectations of the subjects. It is clear that both Freeman et al. and M. Slater and A. Steed aim towards a more objective measurement method. The Virtual Presence Counter method is based on the assumption that an individual only acts towards one environment in any given moment. Even though an individual perceives input from multiple environments, such as the virtual environment, the internal mental environment and the physical environment, each action performed by the individual will be a response to a single environment. The method presented in "A Virtual Presence Counter" counts each transition from the virtual environment (V) to the real world environment (R). A Markov chain can be constructed with the two states and used to calculate the probability of being in the virtual environment, a value that can be used as a measurement of presence. The transition V → R is called a "break in presence" (BIP). The Virtual Presence Counter is more objective than presence questionnaires, but it is not fully objective since individuals make subjective decisions to report a BIP. Another problem with the method is that it requires focus from the subject during the experiment, which in turn leads to less psychological immersion and presence.
There are strengths and weaknesses with both subjective and corroborative objective measurement methods. An underlying problem is that presence is a multifaceted term. Most measurement methods tend to be subjective because of the early statements by T. Sheridan that presence is “[...] a mental manifestation, not so amenable to objective physiological definition and measurement”, and therefore “subjective report is the essential basic measurement” [32]. In 2000, Ijsselsteijn et al. conclude that the most promising method for measuring presence is to use an aggregated combination of both subjective and objective methods [33].
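The probability calculation can be illustrated with a small sketch. The following is a minimal example (not the exact estimator used in [31]) of how the stationary probability of being in the virtual environment can be derived from observed transition counts between the two states of such a Markov chain; the function name and the idea of feeding it raw transition counts are our own illustration.

function presenceProbability(vToR, vToV, rToV, rToR) {
  // Estimated transition probabilities of the two-state Markov chain
  const p = vToR / (vToR + vToV); // P(V -> R), i.e. a "break in presence"
  const q = rToV / (rToV + rToR); // P(R -> V), returning to the virtual environment
  // Stationary probability of being in state V: pi_V = q / (p + q)
  return q / (p + q);
}

// Example: 3 BIPs out of 60 observed intervals in V, 3 returns out of 6 intervals in R
console.log(presenceProbability(3, 57, 3, 3)); // ~0.91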

3.4.2 IPQ - Igroup Presence Questionnaire

The Igroup Presence Questionnaire (IPQ) is a questionnaire and scale for measuring presence, first introduced in 2001 by Schubert et al. [21]. The authors argue that presence is a result of the user's interpretation and mental model of the virtual environment (VE) and identify two cognitive processes that combine to create a sense of presence: the construction of a mental model and attention allocation. The authors propose that the sense of being in and acting from within a virtual environment, together with the sense of concentrating on the virtual environment while ignoring the real environment, are two components included in the sense of presence. Based on this theory, also backed up by Witmer and Singer [20], the authors perform two exploratory factor analysis studies to identify items that make up the presence construct. The exploratory factor analyses used questionnaires from previous studies by Witmer and Singer [34], Ellis et al. [35], Carlin et al. [36], Hendrix [37], Slater, Usoh and Steed [38], Towell and Towell [39] and Regenbrecht et al. [40]. Following the exploratory factor analyses, a confirmatory factor analysis was performed to reach a good model for measuring presence. Table 3.1 shows how the items load on their respective category. The general item G1 is not included in the table, but it loads strongly on all three categories [41].


Table 3.1: Standardized loadings of items on respective subscales

Category   Item                                                        Study 1   Study 2

SP         Sense of VE continuing behind me                             0.583     0.623
SP         Sense of seeing only pictures                               −0.686    −0.643
SP         Sense of being in the virtual space                          0.756     0.741
SP         Sense of acting in the VE                                    0.847     0.727
SP         Felt present in the VE                                       0.821     0.789
INV        Awareness of real world stimuli                              0.740     0.652
INV        Awareness of real environment                                0.762     0.845
INV        Attention to the real environment                           −0.780    −0.783
INV        Captivated by the VE                                         0.724     0.695
REAL       How real VE seemed in comparison to the real world          −0.753    −0.633
REAL       Consistency of experiencing the VE and a real environment    0.708     0.728
REAL       How real VE seemed in comparison to an imaginary world       0.795     0.730
REAL       The VE seemed more realistic than the real world             0.564     0.767

3.5 Native VR

The majority of native 3D VR experiences are developed in the same way as traditional 3D material and games. The most common way is by using a game engine such as Unity or Unreal [42]. Both Unity and Unreal have SDK integrations for the major VR HMDs, making it easy to get started with VR development. In this thesis, the native application is developed in Unity.
In general, native applications have higher performance since the code is more platform specific and can make better use of the hardware, as explained further in Section 3.7. Native Android applications running on Android 4.3 or newer can use OpenGL ES 3.0 [43], which allows for a number of performance improvements over OpenGL ES 2.0 [44]. WebGL 1.0 (which is explained further in Subsection 3.6.3) only exposes the OpenGL ES 2.0 feature set.

3.5.1 Unity

Unity is a cross-platform game engine that is popular for developing games for PC, console and mobile devices. Launched in 2005 by Unity Technologies, it was initially only available for Mac OS X, but Windows support has since been added. The list of supported target platforms has also grown since launch and now includes a large number of desktop, mobile, console, TV and web platforms. Unity is also a common choice for VR development and currently has support for Oculus, OpenVR, GearVR and Playstation VR [45]. Building a native VR application in Unity lets developers access VR devices directly from Unity through a base API compatible with multiple VR platforms. A number of automatic changes are made when VR is enabled, such as setting the correct render mode and FOV for HMDs and enabling head tracking input to the camera of the scene [46].

3.6 VR on the Web

When the web emerged in the early 1990s, it was mainly a collection of interlinked documents. The typical document was a static web page written in HTML which was downloaded, parsed and rendered by a web browser. By the late 1990s, dynamic web pages gained popularity, which was achieved by embedding client-side scripts in web pages. The scripting language named JavaScript was created and popularized, as was the style sheet language called CSS.


The evolved set of technologies from the 1990s enabled the rich web applications of the 2000s. The web showed promise as a platform for software applications as they “look and act more like desktop applications” [47]. Web technology was no longer used only to display documents, but to create rich applications such as Google Maps. When introducing Google Maps via the official Google blog in 2005, Bret Taylor writes “Click and drag the map to view the adjacent area dynamically - there’s no wait for a new image to download.” [48].

3.6.1 Web Standards

The web provides an open platform that can be further developed by anyone. Thus, the web never seems to settle down; new technology, requirements and ideas are ever increasing. Open standards are an essential keystone to the success of the web. There are many groups working to create and establish standard specifications for emerging web technologies.
One of the most important standards organizations for the web is the W3C, short for World Wide Web Consortium. The design principles of W3C encourage a "Web for all" and a "Web on everything" [49]. W3C defines an umbrella of standards called the Open Web Platform that make up a major part of what makes the web.
WHATWG, short for Web Hypertext Application Technology Working Group, is a community that plays an important role for web standards. In 2000, W3C published the XHTML 1.0 specification, intended to "reproduce, subset, and extend HTML 4" [50]. The WHATWG was founded in 2004 as a reaction to the direction W3C had taken with the introduction of XHTML, showing a lack of interest in HTML [51]. Since then, WHATWG has provided continuously updated specification documents without versioning for HTML, DOM, URLs and other web technology.
Some of the most important web technologies with well established standards are:

HTML "HTML is the World Wide Web’s core markup language." [52] W3C provides a ver- sioned standard for HTML, such as HTML 5.1 [52], while WHATWG provides a living standard, dropping the version number from the name and simply calls it "HTML" [53].

CSS Describes the rendering of structured documents. W3C provides an umbrella of CSS module specifications that together defines CSS. In the W3C document titled "CSS Snap- shot 2015" [54], 24 specifications are listed to define what CSS is.

JavaScript Most modern web pages use JavaScript, and can be considered the programming language of the web [55, p .1]. In the late 1990s, JavaScript was standardized by ECMA (European Computer Manufacturer’s Association) and given the name ECMAScript. In general, the language is still referred to as JavaScript, even though the language standard is ECMAScript [56, Brief History] [55, p .265].

Document Object Model (DOM) An API that describes events, nodes and tree structure of structured documents [57]. In practice, the DOM is the API exposed to JavaScript that enables web applications to read and update the structure and style of a document. Most of the development of the DOM is done by WHATWG, while W3C provide stable snapshots of the standard.

JavaScript APIs Both W3C and WHATWG has published a number of standards for JavaScript APIs, such as the WebStorage standard. In general, W3C publishes JavaScript API standards separated from the HTML specifications [58], while WHATWG keeps the standards within the HTML specification document [53].

3.6.2 WebVR

With the release of the Oculus Rift and the newly ignited VR hype came the idea of bringing support for VR HMDs to the web. In 2014, work had begun at both Mozilla and Google to bring VR support to the web [59].

In March 2016, a proposal specification for the WebVR API 1.0 was announced and the W3C WebVR Community Group was launched [60] [61].

The Goal of the W3C WebVR Community Group is to help bring high- performance Virtual Reality (VR) to the open Web. Currently, access to VR de- vices is limited to specific native platforms. For example, Oculus Rift supports Windows and HTC Vive supports SteamVR platform only. With WebVR specs, VR experiences could be available on the Web across platforms. ([62])

The WebVR API aims to provide support for VR HMDs in web browsers by exposing access to VR displays and head tracking data. This enables developers to read and translate movement and position data from the headset into movement in a three-dimensional scene. A VR experience can be created by rendering a 3D scene using the perspectives provided by the headset, and presenting frames to the headset display. The typical steps to render a frame to a VR display, without error handling, are:

1. Get a VRDisplay object by using getVRDisplays (WebVR Specification line 198):

let vrDisplay;
navigator.getVRDisplays().then(displays => {
  vrDisplay = displays[0];
});

2. Get (or create) a canvas element and set its size so that it is large enough for both viewports (the left viewport and the right viewport). The recommended viewport sizes can be acquired through getEyeParameters() (WebVR Specification line 24):

const leftEye = vrDisplay.getEyeParameters('left');
const rightEye = vrDisplay.getEyeParameters('right');
const canvas = document.getElementById('target-canvas');
canvas.width = Math.max(leftEye.renderWidth, rightEye.renderWidth) * 2;
canvas.height = Math.max(leftEye.renderHeight, rightEye.renderHeight);

3. In response to a user gesture, pass the canvas to requestPresent() (WebVR Specification line 99) and start a render loop specific to the VRDisplay:

vrDisplay.requestPresent([{ source: canvas }]).then(() => {
  vrDisplay.requestAnimationFrame(onVRFrame);
});

4. In a render loop, get the current projection and view matrices along with the current head pose using getFrameData() (WebVR Specification line 41), render each eye, and submit the frame using submitFrame() (WebVR Specification line 117):

// A VRFrameData object receives the projection and view matrices and the pose each frame
let frameData = new VRFrameData();

function onVRFrame() {
  vrDisplay.requestAnimationFrame(onVRFrame);

  // Update the frameData object
  vrDisplay.getFrameData(frameData);

  // Render left eye
  render(
    frameData.leftProjectionMatrix,
    frameData.leftViewMatrix,
    frameData.pose.position,
    frameData.pose.orientation
  );

  // Render right eye
  render(
    frameData.rightProjectionMatrix,
    frameData.rightViewMatrix,
    frameData.pose.position,
    frameData.pose.orientation
  );

  // Capture the current state of the canvas and display it
  vrDisplay.submitFrame();
}

Figure 3.1: WebVR Overview

As a whole, the WebVR specification extends 4 existing interfaces with WebIDL partials: the Navigator interface, the Window interface, the Gamepad interface and the HTMLIFrameElement interface. In addition to the partials, the specification defines 9 other interfaces, one of which is an event that is tightly coupled with the Window interface. The architectural overview is visualized in Figure 3.1.
The VRDisplay interface (WebVR Specification line 1) is the main piece of WebVR. It includes generic information about a VR device along with the functions necessary to present content on its display. It is the core interface for accessing other WebVR interfaces associated with functionality or data related to the VR device, such as VRFrameData (WebVR Specification line 169) and VRDisplayCapabilities (WebVR Specification line 138).


The most important partial is the Navigator extension (WebVR Specification line 197), which provides access to the VRDisplay interface. It is a small extension, but acts as host of the functionality required to bootstrap a WebVR application because it can be used to check whether WebVR is enabled and to fetch the VRDisplay objects.
The Window interface extension (WebVR Specification line 221) defines a set of events that can be subscribed to; as such, it is not a crucial interface required to create a VR experience. It does, however, allow the application to react to events such as when a VR device is connected or disconnected. A VR application can do without these events and instead require that the VR display is connected when loading the web application.
The Gamepad interface extension (WebVR Specification line 235) contains one single attribute called displayId, which can be used to identify whether a Gamepad is associated with the specified VR display. It should be noted that the W3C Gamepad specification is still a working draft as of January 2017 [63]. As such, the Gamepad API is not stable, has not reached W3C recommended status and is not very well supported by web browsers.
The extension of HTMLIFrameElement (WebVR Specification line 231) adds the appropriate security requirements when combining IFrames and WebVR. An IFrame should not be allowed to access the WebVR API if not explicitly allowed by its ancestor.
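To make the roles of the Navigator and Window extensions concrete, the following is a minimal sketch of bootstrapping a WebVR application: feature detection and display lookup through the Navigator extension, and subscription to display connection events through the Window extension. The event names follow the WebVR 1.1 draft ('vrdisplayconnect', 'vrdisplaydisconnect'); browser support for them varied at the time of writing.

// Bootstrap: check for WebVR support and fetch the available VR displays
if (navigator.getVRDisplays) {
  navigator.getVRDisplays().then(displays => {
    if (displays.length > 0) {
      console.log('VR display found: ' + displays[0].displayName);
    }
  });
} else {
  console.log('WebVR is not supported in this browser.');
}

// React to displays being connected or disconnected (Window interface extension)
window.addEventListener('vrdisplayconnect', event => {
  console.log('VR display connected: ' + event.display.displayName);
});
window.addEventListener('vrdisplaydisconnect', () => {
  console.log('VR display disconnected.');
});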

3.6.3 WebGL

The idea of displaying 3D scenes on the web is as old as the web itself. Back in 1994, David Raggett wrote a paper, "Extending WWW to support Platform Independent Virtual Reality" [64], for the first WWW conference, describing his ideas of a platform independent 3D standard through a new declarative markup language called VRML (Virtual Reality Markup Language). The format became an ISO standard in 1997 and became known as VRML97. It enabled 3D scenes to be described in text, which could be transferred over the web and displayed in a VRML viewer. VRML was superseded by X3D in 2004, which was more powerful and considered closer to the web as it supported the XML format [65, p. 3]. To further close the gap between the web and 3D, X3DOM was developed, which defined an XML namespace with a tag allowing X3D to be embedded in an HTML or XHTML page. Because of the declarative nature of the web page, 3D web applications struggled to meet the practice of 3D programming, which was to use imperative APIs such as OpenGL and DirectX. To fill this gap, Google developed O3D while Mozilla developed Canvas3D as attempts to give JavaScript (which is imperative in nature) access to OpenGL or DirectX [66] [65, p. 6]. Through these independent attempts emerged WebGL, developed and standardized through the Khronos Group. The Khronos Group describes WebGL in the specification:

WebGL™ is an immediate mode 3D rendering API designed for the web. It is derived from OpenGL® ES 2.0, and provides similar rendering functionality, but in an HTML context. WebGL is designed as a rendering context for the HTML Canvas element. ([67])

As such, WebGL, like X3DOM, is not plugin-based but tightly coupled to the DOM. But in contrast to X3DOM, instead of manipulating the DOM tree to modify the 3D scene, the developer is in direct control of an OpenGL context through a JavaScript API. As shown in Figure 3.2, the WebGL API is exposed to the developer by the getContext method of the canvas element. When a WebGLRenderingContext is created, WebGL also creates a drawing buffer in the process, onto which later API calls are rendered. The drawing buffer is automatically presented to the HTML page. When exiting back into the event loop, thus allowing the web browser to render, the screen will be updated with the drawing buffer if changes have occurred.


Figure 3.2: WebGL Overview

WebGL 1.0 is derived from OpenGL ES 2.0, and as such rendering is done in a similar manner. This allows programmers familiar with OpenGL to ease into WebGL development without having to learn a lot of new concepts.
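As a minimal illustration of the rendering context described above, the following sketch obtains a WebGLRenderingContext from a canvas element and clears the drawing buffer; the canvas id 'gl-canvas' is a hypothetical example.

const canvas = document.getElementById('gl-canvas');
const gl = canvas.getContext('webgl');
if (!gl) {
  throw new Error('WebGL is not supported by this browser.');
}
gl.clearColor(0.0, 0.0, 0.0, 1.0); // color used when clearing the drawing buffer
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
// When control returns to the event loop, the browser presents the drawing buffer on the page.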

3.6.4 Gamepad API

The Gamepad API is a W3C specification that, as of January 2017, is in the Working Draft stage. The Gamepad API enables developers to directly access and utilize gamepads and other controllers in web applications. It does this by providing interfaces that describe and interact with gamepad data. For VR on the web, the Gamepad API is needed in order to use handheld controllers such as the Oculus Touch or HTC Vive controllers, or the buttons and touchpad on the Samsung GearVR. To use a gamepad, a Gamepad object is accessed through an extension of the Navigator interface in the following way:

var gamepads = navigator.getGamepads();

This will return an array with all connected gamepads that have been interacted with [68]. The state of the gamepads can then be queried to check for pressed buttons or position of analog sticks in the following way:

var gp = gamepads[0];

// True if the button with index 0 is currently pressed.
gp.buttons[0].pressed;

// Value between -1.0 and 1.0 representing the position of the axis with index 0.
gp.axes[0];
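Since the Gamepad API is polling-based, the state above has to be re-read every frame. A minimal sketch of such a polling loop is shown below; handleInput() is a hypothetical application-specific callback.

function pollGamepads() {
  const gamepads = navigator.getGamepads();
  for (const gp of gamepads) {
    if (!gp) continue; // unused controller slots are null
    if (gp.buttons.length > 0 && gp.buttons[0].pressed) {
      handleInput(gp); // hypothetical handler reacting to the pressed button
    }
  }
  requestAnimationFrame(pollGamepads);
}
requestAnimationFrame(pollGamepads);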


3.7 Web versus Native

The debate on whether to develop mobile applications for the web platform or for native platforms such as Android and iOS has been researched by A. Charland and B. LeRoux [69]. They list a number of factors that are important to consider when making the decision to develop a native or web application. These factors include cross-platform support, programming language, API access, platform conventions, performance and more. A. Charland and B. LeRoux write that “the performance argument that native apps are faster may apply to 3D games or image-processing apps, but there is a negligible or unnoticeable performance penalty in a well-built business application using Web technology”. VR applications require real-time 3D rendering and can thus be subject to the performance argument.
There has been research comparing the 3D performance of web applications to that of native 3D applications. M. Mobeen and L. Feng claim that a WebGL call is now comparable to a native call because JavaScript engine performance has improved significantly in recent years [70]. R. Hoetzlein finds in a performance comparison that the native OpenGL application in his tests is about 7 times faster than the WebGL application [71].
There is a vast difference between how web applications and native applications load their assets. A web application must load assets using HTTP network requests, while a native application can load assets from disk. A web application must be careful not to load too many resources, because HTTP requests have high latency in comparison to loading from disk. Prefetching and level of detail are techniques that can be used to mitigate these problems; they are described later in Section 3.10.
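As a simple illustration of prefetching in a web application, the following sketch starts the HTTP requests for a set of assets ahead of time and keeps the responses in memory, so that the request latency is paid during loading rather than during the VR session. The asset URLs are hypothetical examples.

const ASSET_URLS = ['models/store.glb', 'textures/wall_material.jpg'];

function prefetchAssets(urls) {
  // Start all requests in parallel and cache the raw buffers for later use
  return Promise.all(
    urls.map(url => fetch(url).then(response => response.arrayBuffer()))
  ).then(buffers => {
    const cache = new Map();
    urls.forEach((url, i) => cache.set(url, buffers[i]));
    return cache;
  });
}

prefetchAssets(ASSET_URLS).then(cache => {
  console.log('Prefetched ' + cache.size + ' assets.');
});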

3.8 The Importance of Performance

For a VR HMD to deliver a truly immersive experience and to actually convey a sense of presence, there are a lot of requirements. The fact that the display is right in front of you and takes up the entire field of vision raises the bar for performance and quality by a significant amount. To deliver a sense of presence through VR, the vision and mind of the viewer need to be convinced that the virtual images he or she is seeing are real. Michael Abrash names three broad areas that affect how we perceive virtual environments through an HMD: tracking, latency and the way that the display interacts with the user’s vision [72]. If these things are not done well it will impact immersion in a very negative way, resulting in loss of detail, graphical artifacts, or the form of motion sickness that, when talking about VR, is commonly referred to as VR sickness.

3.8.1 Virtual Reality Sickness

One big drawback of immersive VR is the fact that some users experience symptoms very similar to those of motion sickness, including nausea, eye strain, disorientation, headache and more. VR sickness is different from motion sickness in that the user does not actually move; the perceived motion is completely visually induced [73]. VR sickness is not a new phenomenon: a similar affliction known as simulator sickness has been known since at least the 1950s when flight simulators started to see use [74].
It is still not entirely known what causes VR sickness on a biological level, and a few theories exist. The most common theory is sensory mismatch, which occurs when visual and other outside stimuli are experienced differently by the human senses. For example, if a user experiences motion in a virtual environment while in reality sitting still, the visual system and brain will expect a motion that the vestibular system does not register [75]. There are also a number of technical factors that can contribute to VR sickness, such as errors in the positional tracking, latency, display refresh rate and image persistence [73] [76].


3.8.1.1 Measuring VR Sickness

There are three primary ways of measuring VR sickness: questionnaires, postural instability, and physiological state. Questionnaires are the most commonly used method and were made popular with the Pensacola Motion Sickness Questionnaire (MSQ) in 1965. Today, the most widely used questionnaire is the Simulator Sickness Questionnaire (SSQ), developed by Kennedy et al. in 1993 [75]. The SSQ was derived from the MSQ and made more suitable for what Kennedy et al. call simulator sickness by removing parts that only concerned motion sickness [77]. The SSQ consists of a number of questions asking the user to rate the severity of symptoms on a four point scale. These scores are then computed for three individually weighted categories: nausea (N), oculomotor (O), and disorientation (D). A total score is calculated as the sum of all category scores multiplied by a constant. It should be noted that the SSQ does not provide an absolute measurement of sickness, but should be used as an instrument for correlation analysis. Table 3.2 shows the scoring and weights for each category in the SSQ, as well as the total score weight.

Table 3.2: SSQ categories

SSQ symptom                 Nausea   Oculomotor   Disorientation
General discomfort            1          1             0
Fatigue                       0          1             0
Headache                      0          1             0
Eyestrain                     0          1             0
Difficulty focusing           0          1             1
Increased salivation          1          0             0
Sweating                      1          0             0
Nausea                        1          0             1
Difficulty concentrating      1          1             0
Fullness of head              0          0             1
Blurred vision                0          1             1
Dizzy (eyes open)             0          0             1
Dizzy (eyes closed)           0          0             1
Vertigo                       0          0             1
Stomach awareness             1          0             0
Burping                       1          0             0

Total                         N          O             D

Weighted category score:  Nw = N × 9.54,  Ow = O × 7.58,  Dw = D × 13.92

Total score = (Nw + Ow + Dw) × 3.74

Each symptom is scored between 0 and 3 by the user, according to how much they experience each symptom. The three category scores N, O and D are then calculated by adding the user scores for the symptoms that are weighted with a 1 for each category in Table 3.2. As an example, if a user gives the "General discomfort" question a score of 2, N and O will each get 2 added to their respective category score and D will not have anything added. This is done for all questions, giving a raw score for each category. This score is then multiplied by the weight for each respective category to provide a more stable and reliable score. The total score is the sum of all weighted category scores multiplied by the total score weight 3.74.

The weights used in the formula do not have any interpretive meaning, but are used to produce similarly varying scales to allow for easier comparisons [77].
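The scoring procedure can be summarized in a short sketch. The following is a minimal example of computing the weighted category scores and the total score as described above and in Table 3.2; the data structures and function name are our own illustration.

// Symptom weights from Table 3.2: [Nausea, Oculomotor, Disorientation]
const SSQ_WEIGHTS = {
  'General discomfort':       [1, 1, 0],
  'Fatigue':                  [0, 1, 0],
  'Headache':                 [0, 1, 0],
  'Eyestrain':                [0, 1, 0],
  'Difficulty focusing':      [0, 1, 1],
  'Increased salivation':     [1, 0, 0],
  'Sweating':                 [1, 0, 0],
  'Nausea':                   [1, 0, 1],
  'Difficulty concentrating': [1, 1, 0],
  'Fullness of head':         [0, 0, 1],
  'Blurred vision':           [0, 1, 1],
  'Dizzy (eyes open)':        [0, 0, 1],
  'Dizzy (eyes closed)':      [0, 0, 1],
  'Vertigo':                  [0, 0, 1],
  'Stomach awareness':        [1, 0, 0],
  'Burping':                  [1, 0, 0]
};

// responses maps each symptom name to the 0-3 score given by the user
function ssqScores(responses) {
  let n = 0, o = 0, d = 0;
  for (const symptom of Object.keys(SSQ_WEIGHTS)) {
    const score = responses[symptom] || 0;
    const [wn, wo, wd] = SSQ_WEIGHTS[symptom];
    n += score * wn;
    o += score * wo;
    d += score * wd;
  }
  const nausea = n * 9.54;
  const oculomotor = o * 7.58;
  const disorientation = d * 13.92;
  // Total score as described in the text: sum of weighted category scores times 3.74
  const total = (nausea + oculomotor + disorientation) * 3.74;
  return { nausea, oculomotor, disorientation, total };
}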

3.8.2 Frame Rate, Latency and Persistence

Since VR is supposed to mimic the real world in a believable way, the performance of what you see needs to be as high as possible. Of course, it will never be as good as reality, but hopefully technological advances will get us close enough for a good experience.

3.8.2.1 Frame Rate

Frame rate is the term for the frequency with which the image shown on a display is updated, and is often measured and expressed in frames per second (FPS). The frame rate of a 3D application depends on the graphical complexity as well as the graphical computation power of the computer running the application. Another limiting factor is the fact that displays have a limit to how fast they can update the shown image. This is commonly called the display's refresh rate and is different from frame rate in that it includes repeating identical frames if the display has not received an updated frame from the source. Most traditional LCD displays today have a refresh rate of 60 Hz, though displays capable of 120 or 144 Hz are becoming increasingly common and some models are capable of over 200 Hz. Most VR HMDs have a refresh rate of around 90 Hz, although truly immersive VR requires frame rates considerably higher than that. Michael Abrash theorizes that the refresh rate needed for the eye not to be able to differentiate VR from reality is between 300 and 1000 Hz for 1080p resolution at a 90 degree field of view [76].

3.8.2.2 Latency

Latency refers to the delay between an input to the system and the output produced and shown to the user. In the case of VR, the output is the updated visual image shown in the HMD. For VR, the delay between head movement and updated images is one of the most critical factors in delivering a good experience and is often called motion-to-photon (MTP) latency [78]. Too much MTP latency is one of the primary causes of VR sickness [79]. Generally, latencies lower than 20 ms tend to be unnoticeable by the human senses, although some put the threshold for latency in VR systems as low as 3 ms [80] [81].

3.8.2.3 Persistence

One visual artifact that is more commonly noticed in VR HMDs than in normal displays is smearing. This occurs when a user's head is rotating while at the same time the eyes are focused on an object in the virtual world. Since the eye can focus and see clear images even during very fast rotations, a display that is refreshing images at a limited rate will end up looking smeared. This phenomenon happens when displays are full-persistence, meaning the pixels are lit during the entirety of the frame that the display is showing. One way that smearing can be minimized is by using low-persistence displays. With low persistence, pixels are only lit for a short time during each frame but with higher intensity to compensate. This means that images are not displayed long enough to appear smeared during head rotation [76]. The most popular HMDs of today all incorporate low-persistence displays. Although low persistence seems to solve one problem it might make another problem more noticeable, namely strobing. Strobing is the perception of multiple copies of the same image at the same time [82] and is mostly hidden if smearing is apparent. Strobing occurs when the distance an image moves between frames is greater than some threshold, often around 4-5 arcmin, which would convert to 4-5 degrees/second of eye movement relative to the image at 60 frames per second [76] [83]. The clear solution to the strobing issue is increasing the frame rate.

A higher frame rate would reduce the time between frames and, in turn, the distance images move between frames.

3.8.3 Mitigation Techniques

Immersive VR obviously calls for high performance, both in terms of frame rate and MTP latency. Even though computers and smartphones have become more and more powerful each year, maintaining a high frame rate and low MTP latency is still a challenge and might not always be possible. In order to help achieve this, and to some extent mask the negative effects that occur when it is not possible, there are some techniques that can be used.

3.8.3.1 Time Warp / Reprojection

Time warp, also known as reprojection, is a technique used in all of the major VR HMDs today. Time warp reduces the perceived MTP latency by modifying the generated image before sending it to the display, to reflect movements that happened after rendering finished. Each rendered frame in VR is based on the positional data received from the HMD at the very start of the render loop. This means that at 60 FPS, where a new frame is displayed approximately every 17 ms, each image shown is based on positional data that is 17 ms old. Time warp reduces this latency by modifying the rendered image right before sending it to the display, using newly captured positional data.
When time warp is run separately from the rendering loop, it can also help to maintain a high frame rate. This technique is known as asynchronous time warp (ATW) and is the version most commonly used. Separating the rendering and warping processes makes it possible for time warp to intervene when a frame is taking too long to render. If a frame is not rendered in time for submission to the display, the last shown frame will be shown again, causing a noticeable judder. With ATW, time warp can be applied to the last finished frame to reflect new movement, masking the missed frame and smoothing out visual artifacts that might otherwise occur.

3.8.3.2 Space Warp
Another technique, currently used only in the Oculus Rift but under development for the HTC Vive, is what Oculus calls Asynchronous Space Warp (ASW). Where time warp only accounts for rotational movements of the user's head, space warp goes one step further and also accounts for animations and movements of objects within the virtual world, as well as movement of the first-person camera. When a VR application fails to maintain the set frame rate, ASW steps in just like ATW and modifies the last rendered frame to account for any movement and animations happening in the scene. Used together with ATW, it helps VR applications maintain low latency and high frame rates and allows VR to run on less powerful hardware. Since space warp performs prediction and extrapolation of movements, there is a risk of visual artifacts when it fails. Typical situations where ASW might fail are rapid changes in lighting and brightness, and rapid object movements where ASW needs to fill in the space left behind when the object moves.

3.9 Real-time 3D

A 3D scene is typically made up of a set of vertices. The process of rendering a 3D scene is a multi-step process. A simplified, high-level description of these steps is:

1. The vertices of the 3D scene are projected into 2D screen space.

2. The vertices are assembled into triangles, which in turn are transformed through the rasterization process into "fragments" that represent the area of the screen that each triangle occupies.


3. Each fragment is colored and can then be used to composite the final image by keeping the fragments closest to the camera for each pixel.

The process of rendering a 3D scene is called the graphics pipeline. In reality it is a more complex process than the steps described above and typically involves tasks such as lighting, texture lookups and shadow mapping [84]. Most of the steps above can be effectively parallelized. One such step is the fragment operation step; the color of each fragment can be calculated in parallel [85, pp. 880–881]. Because of this property, the graphics pipeline does not suit the CPU very well, since the CPU is better at executing a series of dependent operations. Thus, the graphics pipeline is normally executed on a GPU (Graphics Processing Unit), which in contrast to the CPU offers a much higher throughput (executed instructions per time unit) at the cost of higher latency [85]. Up until the early 2000s the GPU was a fixed-function processor. The 3D graphics pipeline was a single-purpose engine and it enforced a specific step-by-step rendering process [85]. The GPUs of today enable a flexible and general-purpose graphics pipeline, opening up new usage areas such as protein folding, while offering an enormous amount of computational power [84]. Software developers use graphics APIs such as OpenGL or DirectX to utilize GPUs. These APIs implement a graphics pipeline and enable a programmable pipeline through shader programming languages: HLSL (High Level Shading Language) for DirectX and GLSL (OpenGL Shading Language) for OpenGL [84] [85].
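A minimal sketch of the programmable pipeline, expressed with a three.js ShaderMaterial (three.js is the library used later in this thesis, and it supplies the projectionMatrix, modelViewMatrix and position names automatically); the colors and geometry are arbitrary example values, not part of the thesis implementation.

```javascript
import * as THREE from 'three';

// Vertex and fragment shaders corresponding to steps 1 and 3 of the list above;
// step 2 (rasterization) is fixed-function hardware.
const material = new THREE.ShaderMaterial({
  vertexShader: `
    void main() {
      // Step 1: project each vertex from model space into 2D clip/screen space.
      gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
    }
  `,
  fragmentShader: `
    void main() {
      // Step 3: color each fragment produced by the rasterizer.
      gl_FragColor = vec4(0.8, 0.2, 0.2, 1.0);
    }
  `,
});

const mesh = new THREE.Mesh(new THREE.BoxGeometry(1, 1, 1), material);
```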

3.9.1 Measuring Real-time 3D Applications
When B. Goldiez et al. published "Real-Time Visual Simulation on PCs" in 1999, they discussed the metrics and performance of 3D rendering applications and hardware by introducing a benchmark suite [86]. In regard to evaluating a PC graphics system, they write that "Average frame rate is typically the metric for such benchmarks, with triangle throughput and pixel fill rate not uncommon alternatives." Although 1999 may seem like ancient times in regard to 3D applications, average frame rate is still a popular metric [71] [70] [87]. Frame rate is a rate metric, calculated as the number of times an event occurs (in this case, a frame being produced) divided by a given time period. The frame rate is derived from the execution time required by a computer system to render a frame. D. J. Lilja writes in [88] that "program execution time is one of the best metrics to use when analyzing computer system performance" and that we can "[...] use them to derive appropriate rates". A problem with measuring programs, including 3D and VR applications, is that the investigated programs and the instrumentation programs execute on the same hardware. The result is that by observing the program, we change what we are trying to measure. As such, the measurement tools are not entirely predictable, as they introduce error and noise, and it is important to be aware of the accuracy and precision of any results. Further, because of the complex nature of the computer, there are many variables, such as cache misses, that are hard to account for. This can produce outliers and introduce errors into the result, making it important to measure several times and to report mean, median and variance [88]. Computer processors of today often use dynamic frequency scaling to decrease generated heat and to save power, which is especially important on mobile devices with limited battery capacity [89]. While conducting a test, it is important to account for frequency and thermal changes of the CPU and GPU, as they impact performance. For example, if one test is conducted directly after another, the processors may start with higher temperatures in the second test, causing lower clock rates, lower performance and a biased result.

3.9.1.1 Measuring Performance in WebGL
To measure rendering times and frame rate in WebGL applications, researchers in [90] and [87] force the WebGL canvas to redraw continuously while counting the number of times the scenes are rendered. Congote et al. report in [87] that they use the setTimeout() function to create the redraw loop. However, Mozilla recommends using the requestAnimationFrame() function when performing animations, because the callback interval will generally match the display refresh rate [91].
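A minimal sketch of such a measurement loop using requestAnimationFrame(); the renderer, scene and camera are assumed to be an existing three.js setup and are not taken from the thesis implementation.

```javascript
// Frame-rate counter of the kind described above: redraw the scene on every
// animation frame and report frames per second once per second.
// `renderer`, `scene` and `camera` are assumed to exist (three.js setup).
let frames = 0;
let lastSample = performance.now();

function renderLoop(now) {
  renderer.render(scene, camera);   // force a continuous redraw
  frames += 1;

  if (now - lastSample >= 1000) {
    console.log(`FPS: ${frames}`);
    frames = 0;
    lastSample = now;
  }
  requestAnimationFrame(renderLoop);
}
requestAnimationFrame(renderLoop);
```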

3.10 Prefetching and Level of Detail

A 3D web application requires many resources, such as JavaScript, style sheets, images, and models. To keep load times low, it is possible to defer resources that are not required initially, and to begin with resources of lower quality. There is a trade-off between visual detail and performance. The process of adapting rendering quality to the executing environment is called level of detail, often shortened to LOD.

3.10.1 Prefetching
Prefetching is the process of fetching resources that may be accessed in the near future. Prefetching techniques are widely used in many fields of computing, such as hardware [92], compilers [93], and the Web [94]. In early work on prefetching of memory references from 1978, A. Smith states that "[b]y prefetching these pages before they are actually needed, system efficiency can be significantly improved." [92] Smith's pages are memory pages, not web pages, but his statement arguably holds for web pages as well. Prefetching for the Web was researched as early as 1995, when Padmanabhan et al. in [94] suggested new HTTP methods (GETALL and GETLIST) to allow the server to send both the HTML document and its images in a single response. Their suggestions never caught on on the Web. In 2004, Fisher et al. presented other techniques in Link Prefetching in Mozilla [95]; these are the techniques used on the Web today, and they have also been standardized by the W3C as resource hints. A Web server can propose resources using an HTTP Link header or an HTML link element, which the browser can act upon:

Link: <URL>; rel="prefetch"

By deferring HTTP requests and keeping initial resources to a minimum, the page load time can be decreased. The motivation is, as Padmanabhan et al. put it: "People use the Web to access information from remote sites, but do not like to wait long for their results" [96]. Similarly, when it comes to prefetching and network optimization for 3D scenes and VR applications, there are several techniques available as well. The most obvious one is to only transfer the resources that are required initially, and defer other resources until they are needed, as mentioned above. For advanced applications where a user can move freely between different scenes, this poses the additional challenge of optimizing scene fetching by downloading only the scene that the user will move to next. This requires some sort of prediction or smart prefetching algorithm to avoid interruptions for loading resources. In [97], the authors present a player for branched video that uses prefetching policies, smart buffer management and parallel TCP connections for buffer workahead. The authors show that their implementation can provide seamless playback even when users wait with their branching decisions until the last second. A branched video is similar to a 3D application with multiple possible scene choices, where each scene can be seen as the equivalent of a possible video branch choice. As such, similar prefetching solutions could be used for a 3D web application.
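The same resource hint can also be injected from script, for example once the application can guess which scene the user will enter next. The URL below is a hypothetical scene asset used purely for illustration.

```javascript
// Programmatic equivalent of the <link rel="prefetch"> / Link header hint.
const hint = document.createElement('link');
hint.rel = 'prefetch';
hint.href = '/assets/next-scene.json';   // hypothetical asset URL
document.head.appendChild(hint);
```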


3.10.2 Level of Detail
Level of detail (LOD) techniques can be used to initially load low-quality textures and models, and then swap them for high-quality resources as they are downloaded. LOD is a well-researched topic, but its main motivator has been to fully utilize the hardware, not to speed up page load. Luebke et al. write in [98] that "[t]he complexity of our 3D models - measured most commonly by the number of polygons - seems to grow faster than the ability of our hardware to render them.", a statement that is magnified by the fact that VR applications have to render each polygon twice. While growing 3D models are a rendering problem, they are also a fetching problem for the Web platform. There are a number of other ways to reduce the visual complexity of a 3D scene. One solution is to calculate which parts of a 3D scene are hidden from view and then not render those objects at all. This is often done with Z-buffers [99] and occlusion culling [100]. Another approach that is highly applicable to VR applications builds on the fact that the human eye only senses fine detail within a circle of only a few degrees. This technique is called foveated rendering and works by tracking the user's gaze and adapting the rendered resolution and LOD based on this. It is especially relevant for displays with a high field of view (FOV), such as VR headsets, where big performance gains can be had by not rendering the entire image in high quality. Studies have shown significant performance savings on both VR headsets and desktop displays when using foveated rendering [101] [102].
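For the classic, distance-based variant of LOD, three.js ships a built-in helper that can serve as a small illustration; the geometry detail levels and switch distances below are arbitrary example values and assume an existing three.js scene.

```javascript
import * as THREE from 'three';

// Distance-based LOD: a detailed mesh up close, cheaper meshes further away.
const lod = new THREE.LOD();
lod.addLevel(new THREE.Mesh(new THREE.SphereGeometry(1, 64, 64),
                            new THREE.MeshBasicMaterial()), 0);   // high detail
lod.addLevel(new THREE.Mesh(new THREE.SphereGeometry(1, 16, 16),
                            new THREE.MeshBasicMaterial()), 10);  // medium detail
lod.addLevel(new THREE.Mesh(new THREE.SphereGeometry(1, 4, 4),
                            new THREE.MeshBasicMaterial()), 30);  // low detail

scene.add(lod);   // assumes an existing three.js scene
```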

CHAPTER 4

Method

To make the comparison between native VR and VR on the web, two VR applications were developed. One was a native Android application and the other a web application. The applications were made as identical as possible to allow for a fair comparison. The implementations of these applications are described in Section 4.2. A literature review was conducted to inform the choice of evaluation method. The evaluation and comparison consisted of two parts: one that measured application performance, and one that measured presence and VR sickness through a user study. The user study was conducted separately from the performance measurements, meaning that no comparisons can be made between individual user sessions and the application performance in those sessions. In the user study, users tried both versions of the application in separate sessions and answered both the SSQ and the IPQ to measure VR sickness and presence. Section 4.4 presents the performance measurement procedures, while Section 4.5 describes the presence and VR sickness measurement procedures along with the user study. The reasoning and theory behind this multi-method approach is described in Section 4.1.

4.1 Research Method - A Multi-method Strategy

Multi-method research as a distinctive research method was described in 1989 by Brewer and Hunter [103], although its origins can be traced back to 1959, when Campbell and Fiske [104] used a multi-method approach to study psychological traits. In 2006, Brewer and Hunter emphasized that multi-method research does not imply that "one must always employ a mix of qualitative and quantitative methods in each project" [105]. This is what differentiates multi-method from mixed-methods research, which implies a mix of both qualitative and quantitative data [106]. A multi-method strategy is more flexible than mixed methods, as it advocates choosing a combination of methods based on the particular problem instead of relying on a fixed set of methods.

4.1.1 Rationale for the Chosen Method
The literature review showed, as described in Subsection 3.8.1.1, that bad VR experiences can lead to VR sickness. As such, VR sickness was a construct of interest because it could potentially differ between the implementations. The literature review revealed three major factors to be measured and compared between the web and native VR applications: presence, VR sickness and application performance. As shown in Subsection 3.9.1, measuring software is not entirely predictable. In addition, presence is a highly subjective construct and is hard to measure. By combining multiple measurements, presence, VR sickness and application performance, the validity of the research is improved. Brewer and Hunter argue that using a multi-method approach "tests the validity

of measurements, hypotheses and theories by means of triangulated cross-method comparisons" [105]. Measuring presence, VR sickness and application performance is a type of triangulation, which is important "to increase the precision of empirical research" [107]. Triangulation is notably important when relying on subjective data. Robson also stresses the use of multiple sources of evidence [108].

4.1.2 Research Process Steps
Readers familiar with empirical studies will recognize this research process. It has been heavily influenced by research literature by Runeson and Höst [107], Wohlin et al. [109], and Robson [108]. The typical steps were executed: scoping, planning, operation, analysis, and reporting. The process steps have not been executed in a strictly sequential, waterfall fashion. A more flexible approach was required to cope with the unknowns of VR on the web. However, as stated by Runeson and Höst [107], a quantitative analysis assumes a fixed research design. This was taken into account by fixing the data collection procedures before any operational tests were conducted.

4.1.3 Validity
"Can we have been fooled so that we are mistaken about [our findings]? Unfortunately, yes - there is a wide range of possibilities for confusion and error." — Robson [108]

Validity refers to the extent to which the research method measures what is intended. It affects the trustworthiness of the results and how well the results represent answers to the research questions. Validity must be addressed before the analysis phase and must also be considered throughout the earlier research phases [109] [107]. Easterbrook et al. [106] state that "[v]alidity is a multifaceted concept" and that there are different conventions for different types of validity. Runeson and Höst also acknowledge that "[t]here are different ways to classify aspects of validity and threats to validity in the literature." Wohlin et al. [109] distinguish four types: conclusion validity, internal validity, construct validity and external validity. The categorization scheme by Robson [108] is very similar, with the difference that he uses the term reliability instead of conclusion validity. According to Wohlin [109], reliability is the counterpart to conclusion validity for qualitative analysis.

4.1.3.1 Construct validity
Construct validity concerns how well the measurements represent what is intended to be measured according to the research questions. The chosen research questions for this thesis imply two measurement attributes: user experience and application performance. To increase construct validity, a multi-method approach is used, which is a tactic suggested by Robson [108] to handle the shortcomings of single measurements. Another tactic mentioned by Robson is to stay at the level of what is observed and not draw conclusions beyond the observed data. Rosenberg [110] writes that "[t]he simpler and more direct the concept, the easier it is to establish construct validity", which highlights the problem with presence, a highly complex and debated construct.

4.1.3.2 Conclusion validity
Conclusion validity is closely related to replicability and reliability in measurement theory. It is the ability to draw correct conclusions about the statistical relationship between treatment and outcome. Threats to conclusion validity include statistical power, subject selection, error rate and reliability [109]. Robson [108] suggests that bias and error by subjects and observers are sources of error for reliability. Several actions have been taken to address and establish conclusion

validity. As Subsection 3.9.1 reveals, performance measurement in software is an acknowledged source of error. As such, the accuracy and precision of performance measurements will be accounted for and reported with corresponding mean, median, and variance. The clock rates and temperatures of the CPU and GPU will be observed during tests and reported to ensure consistency and replicability. Runeson and Höst [107] suggest using a case study protocol to assist researchers and prevent errors when collecting data. No separate case study protocol was developed and maintained; instead, the structure for a case study protocol suggested by Pervan and Maimbo in 2005 [111] has been incorporated into the thesis report and updated along the way. The thesis itself contains details of the research instrumentation and procedures used to collect data. Uniformity of data is ensured by keeping the thesis methodology chapter up to date, and data collection has not been conducted by anyone other than the authors, who worked side by side. Since the thesis report has been under version control in Git, the history log can provide a timeline of changes to the methodology. An initial problem description with research questions and an initial approach was defined in a thesis planning document.

4.1.3.3 Internal validity
Internal validity ensures that the treatment causes the outcome; the relationship between measurement and result is causal and not the effect of an unknown factor. Wohlin et al. [109] list thirteen threats to internal validity, which we used to identify threats to the internal validity of this thesis. The test must be conducted in two sessions to ensure isolation between the two applications. Isolation is required so that the VR sickness that might be induced by the first test does not influence the second test. This isolation leads to the threat of history and maturation; there is a risk that participants are exposed to other VR experiences between the two sessions, gaining tolerance to VR sickness. To address this concern, the two sessions are conducted only one day apart to keep the period between sessions short. Further, the second session starts with a short questionnaire addressing VR usage since the last session. Mortality, i.e. the effect of dropouts, is regarded as a threat, although not a severe one. The number of dropouts is observed and analyzed.

4.1.3.4 External validity
External validity is the ability to generalize the results to a larger audience and to other applications, beyond the VR experiences developed for this experiment. While generalizability is important to some extent, the targeted generalization for this thesis is delimited to business-oriented application usage and a population excluding senior groups. The target population is expected to have rather high exposure to new technology and to be well familiarized with computers. Age, gender, daily technology usage and prior VR usage are addressed by the demographics questionnaire (Section C.1). External validity is determined to be of lower priority than internal validity and conclusion validity. In practice this means that this study prioritizes a large number of participants, for the sake of statistical advantage, over the risk of having a homogeneous group that decreases generalizability. The conflict between different types of validity and the task of prioritizing them has been discussed by Wohlin et al. [109].

4.2 Virtual Experience Implementations

Two virtual experiences were developed: one built with web technologies and the other built using Unity, which outputs native Android code. This section is divided into four subsections. Subsection 4.2.1 describes the general strategy and implementation method. Subsection 4.2.2 describes the work common to both implementations: how the 3D scene was built. The last two subsections are detailed descriptions of the native implementation and the web implementation, respectively.


Figure 4.1: Implementation Overview

4.2.1 Implementation Overview
It is important to keep the two implementations at the same level of quality and under equal conditions, to make sure that the same constructs are measured. To address this, the virtual experience applications were developed with a common pipeline workflow, and the development workflow was split only when the technology and tools required it. Figure 4.1 shows a visualization of the implementation process. The scene design process is the common step for both implementations; it allows both applications to be based on the same 3D models, textures, positions, etc., thereby increasing the visual similarity between the implementations.

4.2.2 Building the 3D Scene
The 3D scene was built in multiple steps using a set of different applications. Figure 4.2 shows the different steps with their corresponding input and output. The physical Valtech Store was designed in Autodesk 3ds Max, and the existing 3D files were available in .max format. Using 3ds Max, the .max file was exported to .fbx, a file format supported by both Unity and Blender, a 3D graphics application. The original 3D scene file was extremely detailed and not suitable for high-performance real-time rendering. Its purpose was to render still images with high detail and realistic reflections. The 3D scene was too large in size and contained far too many polygons to be viable as a VR scene intended for mobile platforms. To address this, the 3D scene was processed in Blender to reduce scene complexity: vertices, normals, faces and materials. Using Blender, each 3D object was then split by material in order to avoid objects with multiple materials, since these were not supported by the export from Unity to three.js. Further modifications to the 3D scene were made with Unity. The scene was exported from Blender as a .fbx file and imported into a Unity project. Unity was used to map textures and materials, but most importantly it was used to bake lightmaps. The export to the three.js JSON format was done with a Unity plugin called UnitySceneWebExporter, originally found on GitHub [112].

4.2.3 Implementing the Native Application
The native application was developed with Unity 5.5, which has some support for virtual reality development. In addition to Unity as a game engine and editor, the third-party library LibOVR and Oculus Utilities for Unity (provided by Oculus VR) were used.


Figure 4.2: Overview of the 3D Scene Building Process

The library and utility package were downloaded from the Oculus Developer download page [113]. Virtual reality was enabled from the Unity player settings. With virtual reality enabled, the main camera is automatically rendered stereoscopically to the connected HMD and the main camera transform is overridden to track the head pose.

4.2.3.1 Unity Scripts
To implement the interaction, some additional scripts were added to the scene:

VRInput.cs Acts as a wrapper for input by translating raw boolean checks into high-level events. Supported events are: OnClick, OnDoubleClick, OnDown, OnUp, OnCancel and OnSwipe.

VREyeRaycaster.cs Combines VRInput and a ray caster to trigger events on objects the player is looking at. Objects with the VRInteractiveItem script are considered.

VRInteractiveItem.cs Acts as a layer between the VREyeRaycaster and event handler scripts. By placing this script onto an object, the VREyeRaycaster will trigger e.g. a click event if the player looks at the object and presses the HMD action button.

Reticle.cs Used to manage the cursor that is projected onto what the player is looking at. The script allows the cursor icon to be changed depending on what item is focused.

InfoGazeHandler.cs, FloorGazeHandler.cs, WallGazeHandler.cs Subscribe to and handle the events set up by VRInteractiveItem. InfoGazeHandler shows the information pop-up when the user clicks an object that can show information. FloorGazeHandler moves the player object when the user clicks on the floor. Finally, WallGazeHandler shows the menu for changing the wall texture.

WallChanger.cs Changes the texture of the walls in the scene to one selected by the user.

4.2.3.2 Optimizing for VR in Unity
A number of optimizations and techniques were applied to ensure that the VR experience would have high enough performance to reduce the risk of VR sickness. The largest impact on performance came from optimizing the project settings. The quality settings used were derived

from the Unity tutorial "Optimisation for VR in Unity" [114]. Static batching, dynamic batching, GPU skinning, multithreaded rendering and default orientation were set in the player settings as suggested in "Squeezing Performance out of your Unity Gear VR Game" [115]. Static batching is used to reduce the number of draw calls in the application. Unity allows this technique to be applied easily by marking objects as static. The static flag means that the object's transform will not be invalidated, i.e. the object will not move, rotate or scale. All static objects that share a material can be combined into one big mesh and rendered faster. The Unity scene was made almost entirely static, enabling heavy usage of static batching. The number of materials used in the scene was kept low by sharing materials between objects when possible. To further improve performance, real-time lighting was avoided entirely. This was possible since the whole scene was static. Instead of real-time lighting, the scene fully utilized baked lighting, where the effects of lights on objects are computed and baked into lightmaps. Both direct and indirect lighting was baked. The lighting in the scene was set up using four spotlights and one area light. All lights were set to only provide baked lighting. The lighting settings were carefully adjusted and experimentally determined. Multi-pass shaders and per-pixel lighting require pixels to be filled multiple times, which comes with a performance cost. This was avoided by using a modified version of the built-in Mobile Unlit shader in Unity. This shader is simply a texture with lightmap support, not affected by any lighting. The modification of the shader was required due to a bug making textures appear half as bright as they should on mobile platforms.

4.2.4 Implementing the Web Application
The web application was developed with three.js1 and webvr-polyfill2 as its main dependencies. As explained in Subsection 4.2.2, the 3D scene was processed and designed using Unity and then exported as a .json file, a format used by three.js. The exported scene file was loaded into a three.js scene using the ObjectLoader. To implement the interactive elements, a component system accompanied by a service container was created. Each component is associated with an entity (a three.js Object3D) and constructed using the service container, allowing the component to depend on and interact with the scene through the provided services. The architecture is visualized in Figure 4.3.

Figure 4.3: Component and Services Architecture

The Controls Manager and Reticle services act as the counterpart to the Unity implementation's VRInput, VREyeRaycaster, VRInteractiveItem and Reticle scripts. The Object Library service provides a cached game object lookup map. The scene and assets services are simple references to the three.js scene and assets container. Three components, called behaviours, were created, one for each interaction (a sketch of the gaze-based interaction follows the list below):

1https://github.com/mrdoob/three.js/ 2https://github.com/immersive-web/webvr-polyfill


ClickToTeleport Added to the floor allowing the player to gaze and click to move around.

InfoPopup Added to four objects to provide popup information on click.

MaterialChanger Added to the wall, allowing the player to open a menu of materials which can be selected.
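The gaze-based interaction behind these behaviours boils down to casting a ray from the centre of the view every frame. The sketch below illustrates the idea with the three.js Raycaster; the names interactiveObjects, clicked and onGazeClick are illustrative assumptions and not the actual thesis implementation.

```javascript
import * as THREE from 'three';

// Cast a ray from the centre of the camera and notify the first interactive
// object it hits when the user taps the headset touchpad.
const raycaster = new THREE.Raycaster();
const screenCentre = new THREE.Vector2(0, 0);   // centre of the view in NDC

function pollGaze(camera, interactiveObjects, clicked) {
  raycaster.setFromCamera(screenCentre, camera);
  const hits = raycaster.intersectObjects(interactiveObjects, true);
  if (hits.length > 0 && clicked && hits[0].object.userData.onGazeClick) {
    hits[0].object.userData.onGazeClick(hits[0]);   // e.g. teleport or open a popup
  }
}
```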

4.2.4.1 Web Optimizations
The built-in shadow maps feature in three.js was disabled, since it comes with a significant performance cost. Performance tests were performed both with anti-aliasing (AA) enabled and disabled, to see whether it made a discernible difference. Anti-aliasing is a technique to improve image quality by smoothing edges so that they do not appear as jagged lines.
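In three.js these two choices correspond to a renderer constructor flag and a renderer property; a minimal sketch (the anti-aliasing flag would be toggled between test runs):

```javascript
import * as THREE from 'three';

// Renderer-level choices discussed above: anti-aliasing is requested (or not)
// when the renderer is created, and the built-in shadow maps stay disabled.
const renderer = new THREE.WebGLRenderer({ antialias: true });  // false for the no-AA runs
renderer.shadowMap.enabled = false;
```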

4.3 Hardware - Test Devices and HMD

A Samsung Galaxy S6 and a Samsung GearVR headset were used to run the applications when conducting the performance measurements and user tests. The device is based on an Exynos 7420 chipset with four CPU cores at 2.1 GHz and another four CPU cores at 1.5 GHz. The chipset has a Mali-T760 MP8 GPU which runs at 772 MHz. In general, the bottleneck for 3D applications running on mobile devices is a slow bus and/or memory controller between the CPU and GPU [115] [116] [117].

4.4 Performance Measurements

Application performance was measured with the Oculus Remote Monitor (OVRMonitor) tool and the VrCapture library provided by Oculus. The VrCapture library is automatically supported in Unity. The remote devices host a Capture Server to which the monitor tool can connect and fetch the captured logging data. After the pairing mechanism between the monitor tool and the remote device has completed over UDP, the actual capture stream is transferred over a TCP connection. While the monitor tool provides live graphs and display of streamed data, it also saves the received data by streaming it to disk, which enables later inspection and analysis. Any network stalls that are long enough to cause filled buffers and lead to gaps in the captured data are easily discovered. Performance measurements with incomplete capture data were discarded. To ensure higher capture reliability, Frame Buffer Capturing was disabled, as recommended in the documentation for Oculus Remote Monitor [118]. Other capture settings not relevant for the study, such as detailed thermal data and head tracking info, were also disabled. The output capture file from Oculus Remote Monitor is a binary file of unknown structure. However, when opened with Oculus Remote Monitor it is possible to save the console logging messages as a tab-separated text file. The log messages include Android's logcat messages in addition to log messages from the Oculus library. Each line contains four columns: Thread, Priority, Time, and Message. The most important messages are from the OVR::Stats thread, which reports FPS, stale, tear, latency, memory usage, CPU/GPU clock rates, temperatures, and more.

FPS Is short for frames per second, and is a measurement of the frame rate (explained in Subsection 3.8.2.1).

Stale Is related to time warp, which is explained in Subsection 3.8.3.1. The Stale column shows the number of frames per second that did not complete in time for the predicted display time, causing time warp to re-project and display an old frame.

Tear Also related to time warp. The Tear column shows the number of frames per second in which the time warp did not complete in time. A value of zero is desired.

29 4.5. Presence and Simulator Sickness Measurements

Prd When the application begins to render a new frame, it polls the predicted head tracking pose at the predicted display time. If the prediction is correct, the frame will be correctly transformed and match the user's pose. Prd is the reported number of milliseconds between the latest sensor sampling and the anticipated display time.
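Because the exported console log is plain tab-separated text, the per-second statistics can be extracted with a small script. The Node.js sketch below assumes a hypothetical file name and an assumed "FPS=..." message format; the exact layout of the OVR::Stats message body is not documented here.

```javascript
const fs = require('fs');

// Pull the per-second stats lines out of the tab-separated console log exported
// from Oculus Remote Monitor. Columns: Thread, Priority, Time, Message.
const lines = fs.readFileSync('capture-log.txt', 'utf8').split('\n');

const stats = lines
  .map((line) => line.split('\t'))
  .filter((cols) => cols.length >= 4 && cols[0] === 'OVR::Stats')
  .map(([, , time, message]) => {
    const fps = message.match(/FPS=(\d+(\.\d+)?)/);   // hypothetical message format
    return { time, fps: fps ? Number(fps[1]) : null };
  });

console.log(stats.slice(0, 5));
```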

4.5 Presence and Simulator Sickness Measurements

A user study with questionnaires was conducted to collect data on presence and VR sickness, using the IPQ and the SSQ respectively.

4.5.1 Rationale for using IPQ and SSQ
As described in Subsection 4.1.1, both presence and VR sickness were constructs of interest. The SSQ was chosen to measure VR sickness because of its popularity and reputation (see Subsection 3.8.1.1). Presence was also of interest but presented a challenge: adding another questionnaire would increase the total questionnaire length, risking loss of motivation and focus. Keeping user study sessions short increases the willingness to participate. Robson discusses and explains the use of questionnaires and face-to-face interviews in his book Real World Research [108]. While Subsection 3.4.1 reveals that there are many alternatives for measuring presence, the IPQ was deemed a good fit for the test suite because it is a short and popular questionnaire.

4.5.2 Acquiring Participants
The number of participants in the user study was mostly limited by availability, location and setting. Getting the chance to experience virtual reality increased the motivation to participate. The internal communication channels within Valtech were used to gather voluntary participants for the user study. In addition to Valtech employees, some participants were acquired through friendship or other relationships with the authors. As such, the participants were volunteers, acquired through different channels and with different motivations. Each participant was assigned to one of two groups. The two groups were used to alternate the order in which the implementations were presented to the participants. The first group was exposed to the native implementation first and the web implementation last; the second group experienced the implementations in reverse order. The group assignment was not strictly randomized, but was instead chosen in an alternating fashion, keeping the two groups equal in size while the user study was conducted. This was done since the number of participants was not fixed in advance.

4.5.3 User Study
Each subject attended two sessions, one for the web implementation and another for the native implementation. A test protocol document (Appendix A) was used by the instructors to ensure that all tests were executed systematically and with consistency. The protocol was designed as a step-by-step guide. A pilot study was conducted to improve the test procedure. A number of improvement points were found and fixed: the test protocol was refined and additional explanatory text was added to questions that were hard to understand or interpret. The pilot study allowed us to find problems while letting the instructors gain some experience with the procedure. The first session started by letting the user fill in a demographics questionnaire (Section C.1) and a pre-exposure SSQ. The user was informed about the VR scene so that the user would not feel too unfamiliar with it. This was done to ensure that the first session would measure the same constructs as the second session, in which the user is already familiar with the scene. The user was also informed about the GearVR headset and its controls.


The user was asked to stand up during the test to match the standing position of the virtual character in the VR application. Upon entering the VR scene, the user was guided by the instructor through the first interactions (steps 7a-7c in Appendix A). The user was then asked to find the remaining scene interactions and explore freely. The instructor started a timer when the headset was put on and made sure the user was engaged in the scene for two to three minutes. After the exposure, the user filled out the post-exposure SSQ and the IPQ. The questionnaires were created using Google Forms, reducing the risk of parsing errors. The questionnaires were split into three different forms: demographics, pre-session and post-session. The pre-session form contained only a pre-exposure SSQ, while the post-session form contained both the post-exposure SSQ and the IPQ. All three forms contained a part filled in by the test instructor, which included a subject ID field to identify the subject across the three forms. The pre-session and post-session forms also contained a field called App to identify whether the web or native implementation was used in the session; this field was filled in by the test instructor and not shown to the subject. A test session consisted of 46 questionnaire items, with an additional 5 items for the demographics questionnaire used in the first session. A session took about 15 minutes to complete.

4.6 Analysis Procedures

Data analysis was done using Python, with the pandas library for data structures and data analysis utilities and matplotlib for plotting. The questionnaires from the user study were exported as three separate CSV files: Demographics.csv, PreSession.csv and PostSession.csv. The files were parsed and processed as pandas data frames using the Subject ID field and the App field as a multi-index. The Python code was contained and executed in Jupyter notebooks, with additional markdown cells for documentation.

CHAPTER 5

VR Applications

This chapter presents both VR applications built for the thesis. The applications were developed simultaneously and made to look as similar as possible by using the same 3D models as the starting point and by sharing textures and materials wherever possible. When an application is launched, the user is standing just inside the entrance to Valtech Store and is immediately able to look around by moving his or her head. Figure 5.1 shows the starting view of the WebVR application, and Figure 5.2 shows the same for the native application. A small reticle in the center of the user's view is used to interact with the scene, in a similar fashion to a mouse cursor. The user is able to move around in the scene by teleporting to desired locations in the room. When looking at the floor, a blue circle is shown to indicate that movement is possible, as shown in Figure 5.3 and Figure 5.4. To move to a new location, the user taps on the touchpad on the GearVR headset and is immediately teleported to the selected position. There are a number of interactable objects in the scene. To indicate to the user that these objects can be interacted with, the reticle changes when the user looks at one of them. When a user taps on one of the interactable objects, a window displaying information about the object opens up, as shown in Figure 5.5 and Figure 5.6. Finally, the user is also able to change the material of the walls in the scene by tapping on any of the walls and selecting one of three available materials in a popup menu. Figure 5.7 and Figure 5.8 show the material menu for the two applications. Although the applications are meant to look identical, there are some visible differences. The native application looked a little warmer and had a little more contrast between bright and dark areas, giving a slightly more natural feel to the lighting and shadows overall.

Figure 5.1: Start view - WebVR application

Figure 5.2: Start view - Native application

Figure 5.3: Teleport indicator - WebVR application

Figure 5.4: Teleport indicator - Native application

Figure 5.5: Information window - WebVR application

Figure 5.6: Information window - Native application

Figure 5.7: Wall material menu - WebVR application

Figure 5.8: Wall material menu - Native application

CHAPTER 6

Experimental Results

This chapter presents the results from the performance tests and the user study that evaluated presence and VR sickness, as described in Chapter 4. The chapter is divided into four sections. Section 6.1 outlines the procedure used for the tests. Section 6.2 presents the results from the performance measurements, followed by Section 6.3 presenting the SSQ results and Section 6.4 presenting the IPQ results.

6.1 Performance Test Procedure

The performance measurements were performed manually, following a pre-defined structure. The results were recorded using Oculus Remote Monitor (OVRMonitor). Each test started with the test driver placed in the starting position in the VR application, looking in the initial camera direction. 10 seconds into the test the test driver turned to look at the VR headset (within the virtual world). At 15 seconds the test driver teleported close to the VR headset and looked towards the clothes. At 25 seconds the test driver turned around and looked outside through the window. At 35 seconds the test driver teleported to the iPad and looked at it. At 45 seconds the test driver looked up and focused on the TV for another 15 seconds. These steps make the test 60 seconds long and include most of the actions and interactions possible in the application. The protocol for the performance tests can be found in Appendix B.

6.2 Performance Test Results

The performance measurements in Figure 6.1 show the differences in FPS, stale frames, and tear between the implementations: Native, WebVR with AA (anti-aliasing), and WebVR without AA. WebGL and WebVR do not allow the developer to choose which AA technique to use. As such, WebVR was tested both with and without anti-aliasing for validity reasons, to make sure that anti-aliasing did not have a big impact on performance. The vertical lines in Figure 6.1 represent events, the time markers at which the test driver performs a task. Note that the event times are not exact because of human error when executing these tasks manually. Figure 6.1a shows how the FPS (frames per second) for each application varies over time. Note that the figure shows the number of frames rendered per second, which is not to be confused with the number of frames displayed per second. This distinction is necessary because of ATW (asynchronous time warp), which mitigates the effect of dropped frames as explained in Subsection 3.8.3.1. FPS is closely related to stale frames, which are shown in Figure 6.1b. A stale frame is a frame that was produced by the ATW: for every frame missed by the application, the ATW will produce a frame in its place. Since the aim is to maintain 60 FPS, we can clearly see that adding the FPS and stale frames sums to approximately 60 FPS.


Figure 6.1: Performance Measurements. (a) FPS, (b) Stale, (c) Tear, (d) Prd

Figure 6.1a shows that the native application produces close to 60 FPS throughout the test. As a result, the number of stale frames per second shown in Figure 6.1b is close to zero. Looking at the same figures again, but focusing on WebVR, we can see that when the application renders approximately 45 FPS, the number of stale frames per second is approximately 20, which sums to just over 60 FPS. In short, the native implementation produces close to 60 FPS throughout the test, while the web implementations struggle to keep up. The native application is not affected by the tasks; all measurements stay steady throughout the whole test. There are, however, some spikes in the WebVR tests that can be connected to the specific timestamps of tasks carried out during the test. 15 seconds into the test, a teleportation and a camera swipe are carried out. Within the same time range in Figure 6.1, the WebVR application has an apparent spike in FPS, stale frames, and Prd. After the 25 second mark one can see that the FPS and stale frames begin to stabilize. At the 35 second mark all values are more or less back to their initial values. The camera swipe at the 45 second mark does not seem to affect either the WebVR or the native implementation. Between the 25 and 35 second marks, Figure 6.1d shows the lowest sensor delay, which in combination with the same time period in Figure 6.1c suggests that the ATW works well during that time frame. Further, Figure 6.1 gives no indication of a performance difference between WebVR with and without anti-aliasing. Figure 6.2 shows the FPS performance for each individual test. The figure shows that each test has the same characteristics as the other tests performed with that application. This suggests that even though the performance tests were not automated, the tests show consistency, which improves the validity. There are no obvious outliers.


Figure 6.2: Individual FPS Measurements. (a) Native, (b) WebVR no AA, (c) WebVR with AA

Table 6.1 shows a summary of the min, max, variance and mean values from the performance tests. Looking at the data in Table 6.1, we see that the native application performs very well through the entire test. In the graph in Figure 6.1a, the FPS appears to stay steady around 60 FPS. This is confirmed by the performance data, which shows that FPS for the native application stays between 58.2 and 59.8 with a mean value of almost 59.5 FPS and very low variance. On the other hand, it is clear that the WebVR applications struggle with performance and do not reach a frame rate above 51 FPS, either with or without anti-aliasing. They also show a very unstable frame rate, with very high variance. We can also again see that anti-aliasing does not make any noticeable performance impact; the mean values for all categories are almost identical for WebVR with AA and WebVR without AA. Looking at the other data points we see the same pattern, with the native application showing very good performance, with low numbers for both stale frames and tear and very stable numbers for Prd, and the WebVR applications showing very high variance and worse numbers than the native application.


Table 6.1: Performance Data Description

Native
        FPS      Stale    Prd      Tear
Mean    59.489   0.075    50.993   0.229
Var.    0.070    0.068    0.004    0.041
Min.    58.200   0.000    50.600   0.000
Max.    59.800   1.600    51.200   0.600

WebVR no AA
        FPS      Stale    Prd      Tear
Mean    39.243   27.664   55.911   0.925
Var.    71.780   246.432  84.073   0.351
Min.    21.600   8.800    39.600   0.000
Max.    51.000   60.400   65.400   2.400

WebVR with AA
        FPS      Stale    Prd      Tear
Mean    39.379   27.396   55.839   1.025
Var.    61.000   218.028  80.639   0.614
Min.    22.800   11.200   38.800   0.000
Max.    49.400   58.400   65.200   3.600

6.3 SSQ Test Results

The SSQ (Simulator Sickness Questionnaire) was used to measure to what extent the VR applications induced VR sickness. The SSQ scores each subject in three categories; see Table 3.2 for a detailed explanation of the SSQ calculations. The subjects filled out the SSQ twice for each application: before and after exposure to the VR application, referenced below as the pre and post measurements. During the VR exposure the subjects were instructed to perform a series of interactions: moving around in the virtual world, interacting with product information popups, and changing the wall texture using a menu. The amount of VR exposure was kept between two and three minutes, depending on how fast subjects completed all interactions. 18 subjects participated in the user study. The subjects were acquired through Valtech communication channels and through personal ties with the authors. The user study method is explained in detail in Section 4.5. These results risk being biased because of different amounts of head turning and different rotation speeds between subjects. Even though the subjects performed the same set of tasks, there was nothing preventing subjects from making high-speed head turns. For further information on biases and validity, see Subsection 4.1.3. Figure 6.3 shows how the score changed between the pre and post measurements. The majority of the SSQ results show increased sickness after use of both VR applications, which is expected. Figure 6.4 shows the change in the total score from the SSQ calculation, excluding two outliers that can be seen in Figure 6.3.


Figure 6.3: SSQ score change between pre and post exposure (calculated as post score − pre score). (a) Web, (b) Native. N = Nausea, O = Oculomotor, D = Disorientation, T = Total


Figure 6.4: SSQ total score change between pre and post exposure, without outliers

Again, a positive change value means that the SSQ score increased after exposure, or simply put, that subjects were experiencing more sickness after they had used the virtual reality applications. There is no evident difference between the applications, although the web application has a more scattered result.



Figure 6.5: IPQ scores for native compared to web (zero means equal scores, positive means native scored higher than web)

6.4 IPQ Test Results

The IPQ is the questionnaire that was used to measure presence. The concept of presence is explained in Section 3.4, and the IPQ is explained in Subsection 3.4.1. The output of the questionnaire is a score in four categories: G1 for the general "sense of being there", SP for spatial presence, INV for involvement, and REAL for realism. High IPQ scores indicate a high sense of presence. Figure 6.5 shows the native IPQ scores in relation to the web scores. Positive values indicate that the native implementation yielded higher scores than the web implementation. The results show that the native application's median score is somewhat higher in three out of four categories, and equal in G1. The native application scores 11% higher on average on SP. The IPQ results show a slightly lower standard deviation for the native application in all categories but INV. See Appendix D for the specific IPQ scores for both implementations.


6.5 Correlation between SSQ and IPQ

It has been argued that there is a relation between the SSQ and the IPQ; see Subsection 3.4.1. The tables below (Table 6.2 and Table 6.3) show the pairwise Pearson correlation coefficients between each SSQ category and IPQ category. For the native application, there is no correlation between the SSQ and IPQ scores. There are, however, some weak and strong correlations within the SSQ and IPQ categories. For example, G1 (general sense of being there) has a strong correlation with SP (spatial presence) in the native application, and a moderate correlation with the other IPQ categories in both applications. There is also a moderate correlation between the SSQ categories, such as between disorientation and nausea. The web application shows similar but stronger correlations. There are strong correlations between IPQ categories, and moderate correlations between SSQ categories. It is worth noting that there are low correlations between the SSQ and IPQ categories, which is not the case for the native application. Recall the categories. SSQ: N = Nausea, O = Oculomotor, D = Disorientation, T = Total. IPQ: G1 = General sense of being there, INV = Involvement, REAL = Realism, SP = Spatial Presence.
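For reference, the coefficients reported in Table 6.2 and Table 6.3 are standard Pearson correlation coefficients, computed for each pair of category scores x and y over the participating subjects:

\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\ \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}. \]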

Correlation coefficient color ranges: 0 ≤ r < 0.3, 0.3 ≤ r < 0.5, 0.5 ≤ r < 0.7, 0.7 ≤ r < 1.0, and 1.0

Table 6.2: Native correlations

        G1     INV    REAL   SP     N      O      D      T
G1      1.0    0.39   0.74   0.79  -0.26  -0.22  -0.17  -0.25
INV     0.39   1.0    0.2    0.27   0.16  -0.22   0.03  -0.05
REAL    0.74   0.2    1.0    0.61  -0.3   -0.04  -0.02  -0.1
SP      0.79   0.27   0.61   1.0    0.07   0.07  -0.12  -0.02
N      -0.26   0.16  -0.3    0.07   1.0    0.41   0.63   0.76
O      -0.22  -0.22  -0.04   0.07   0.41   1.0    0.42   0.78
D      -0.17   0.03  -0.02  -0.12   0.63   0.42   1.0    0.87
T      -0.25  -0.05  -0.1   -0.02   0.76   0.78   0.87   1.0

Table 6.3: Web correlations

        G1     INV    REAL   SP     N      O      D      T
G1      1.0    0.5    0.54   0.67   0.15   0.13   0.36   0.22
INV     0.5    1.0    0.61   0.74   0.32   0.15   0.33   0.27
REAL    0.54   0.61   1.0    0.67   0.41   0.35   0.51   0.45
SP      0.67   0.74   0.67   1.0    0.27   0.08   0.35   0.23
N       0.15   0.32   0.41   0.27   1.0    0.82   0.83   0.93
O       0.13   0.15   0.35   0.08   0.82   1.0    0.81   0.95
D       0.36   0.33   0.51   0.35   0.83   0.81   1.0    0.93
T       0.22   0.27   0.45   0.23   0.93   0.95   0.93   1.0

CHAPTER 7

Discussion

This chapter is divided into three sections. Section 7.1 discusses and explains the results. This is followed by a discussion of the method in Section 7.2. The last section of this chapter, Section 7.3, broadens the discussion to include ethical and social aspects.

7.1 Results

The performance metrics show a difference in all measurements reported in Section 6.2; FPS, Stale, Tear, and Prd all favor the native implementation. The theory chapter brings forward the importance of frame rate, both specifically for VR and for measuring computer programs in general. It could be argued that this answers one of the main goals of this thesis, which is to provide a basis for choosing between a web and a native implementation: the native implementation simply outperforms the web implementation on the measured performance metrics. However, the user study shows contradicting results. Neither the SSQ nor the IPQ shows any convincing results that favor one application over the other. As such, we argue that low FPS does not necessarily lead to a bad user experience. This may sound unlikely at first, mostly because it seems to contradict the common knowledge that applications with user input should respond quickly. But it is very important to consider the impact of efficient mitigation techniques such as ATW and ASW: the user does not notice low FPS in the same way as in traditional applications, which lack the mitigation techniques used in modern VR systems.

7.1.1 Development of VR Applications
Both the WebVR application and the native application were built based on pre-existing 3D models of Valtech Store. This saved a lot of time, since 3D modeling is very time-consuming and would not have been feasible within the given time frame. It also meant that both applications used the exact same 3D model for the scene, which helped in making them look as similar as possible. Since Unity is so widely used, it has a lot of documentation and tutorials as well as a large community. This is also true for VR development in Unity, even though it is fairly new. Having step-by-step tutorials, well-documented example projects and SDKs for the most popular VR platforms was a big help when the native development started. In general, the tooling for native VR development is a lot better than for WebVR development. A major part of that is the 3D side of development. Unity is built for 3D, and the fact that there is a drag-and-drop style editor for 3D scenes makes for a very different experience compared to Three.js, which requires a lot more hand-written code. It is possible to export scenes from Unity to WebVR using third-party tools, but we found that the performance was too poor for it to be usable. WebVR support will certainly improve with time, since it is still

an experimental API, and frameworks such as React 3601 and A-Frame2, both built on top of Three.js, make it easier to develop VR for the web.

7.1.2 Performance measurements
The performance results show that the WebVR application, when executed on mobile hardware, does not manage to maintain 60 FPS. This is a serious issue, as emphasized in Section 3.8. The performance struggle of the WebVR application demonstrates that real-time 3D rendering of a VR scene is a hard and performance-demanding task. The measured mean FPS of the native application is close to 60 FPS, while the WebVR application barely reaches 40 FPS. This is a substantial but very reasonable and expected result, supported by the theory presented in Section 3.7. Our work cannot show how much faster native is, because the FPS was capped at 60. We can confirm M. Mobeen and L. Feng's claim that a WebGL call is comparable to a native call, to the extent that it can compete with native GL for less demanding 3D scenes.

7.1.3 User study

There is a significant FPS drop in the performance results of WebVR that should be noticeable by the user. However, the user study does not show any tendency that WebVR is significantly worse in either SSQ or IPQ. While the performance results were recorded in separate sessions from the user sessions, and not for the same set of tasks, we believe that the performance data recorded (such as FPS) should be similar to what the users experienced during their sessions. As such, we hypothesise that it is the work of the mitigation techniques (ASW and ATW) that makes the WebVR experience on par with the native one. ASW and ATW seem to be surprisingly effective in masking low performance. There are many aspects that should improve the effectiveness of the mitigation in our scene. Our scene is static, without moving objects and particles. Further, the user navigates the scene using teleportation, meaning the only moving part is the user's head. The user is also not exposed to any stressful moments in the scene that could lead to fast head movements. All of this makes it easier to predict the next frame, since the visual difference between frames is small and depends only on head movement. ATW tracks and predicts head movements. ASW is concerned with tracking movements within the scene, but since there is no movement nor change of lighting, there is not much work for the ASW to do. We believe that this gives more time to the ATW, which is more important in our application.

7.1.3.1 SSQ

The SSQ results indicate an increase in VR sickness after exposure to the VR experiences. It is expected that the SSQ score in the post-measurement is equal to or higher than in the pre-measurement; otherwise, it would mean that the VR exposure somehow cured VR sickness. The SSQ result may be affected by bias and error of the subjects, because they understand what is being measured and may therefore report higher sickness after the VR experience. By comparing the increase in score between the applications, this potential error should be of less significance. The SSQ, much like the IPQ, should not be used for absolute measurements; instead, one should rely on comparative analysis. There are a few outliers that report considerably higher SSQ scores, especially disorientation, after their VR usage. That could be due to misinterpretation of the questionnaire, or the subject feeling disoriented before the test. The results show no apparent difference between native VR and VR on the Web, except that the Web application has more scattered results than the native one.


One could hypothesize that this is due to some subjects being more sensitive to varying FPS, that ATW works better for some subjects, or that some subjects performed hastier head movements.

We did expect low VR sickness in both applications, because the VR experience in neither application contains much action. The leading theory for VR sickness is sensory mismatch, which effectively occurs when the camera moves in the virtual world while the real-world body is at rest [75]. The teleportation method for navigation in VR causes little sensory mismatch and should lead to less VR sickness than other methods in which the character moves continuously [119]. The alternative method of navigating the VR scene with a smooth camera transition should induce more sensory mismatch. Think of the extreme case of a scene in which the user experiences a rollercoaster ride; that would most likely yield much higher VR sickness. In addition to gentle navigation, VR sickness is reduced by ATW. The SSQ result gives a solid basis to claim that VR on the Web stands a chance against native VR. It does not mean that VR on the Web is good enough, nor that it is as good as native VR, but it is an essential insight that native VR does not stand unmatched.

7.1.3.2 IPQ

The results of the IPQ questionnaire suggest that the native application delivers a slightly better experience in terms of presence. This could be due to the fact that the native application had better quality in shadows and lighting and looked a little more natural, as mentioned in Chapter 5. The native application scores higher than the WebVR application in three out of the four IPQ categories, with the fourth category having equal scores. This suggests that the native application delivers a higher sense of presence than the WebVR application. Interestingly, there are a number of outliers that strongly prefer the WebVR application. This could be due to personal preference or some coincidental occurrence that broke the sense of presence for the test subject. Another possible reason for the native application scoring higher in the IPQ could be its higher performance. The native application was capable of providing a higher frame rate throughout the test, which could contribute to a higher sense of presence.

We cannot give an absolute score for how good our VR applications are in terms of presence, since the IPQ questionnaire should not be used to make comparisons across environments, as mentioned in Subsection 3.4.1. What we can look at is the difference between our two applications and whether it matches our hypothesis. Because of this, we only present relative data comparing our two applications.

A big variation in score can be seen in the INV category, as seen in Figure 6.5. This could be due to a number of reasons, but one possible explanation is the fact that we conducted a guided test with spoken instructions during the VR experience. Speaking and listening during the VR experience prevents the user from disconnecting from the real world, leading to a lower score for involvement. When the second test was conducted, the user had already been through the steps once before and did not need as much guidance. This meant that they could focus more on the virtual world and felt a higher sense of involvement, leading to a large variation of scores between the first and second tests.

For the G1 category, the median is very close to zero. This indicates that the scores for the "general sense of being there" are very close between the native and WebVR applications. One explanation for this can be that the overall experience in the two applications is very similar: the scenes are close to identical, the tasks performed are the same, and the same VR headset is used in the same room. We were not expecting any big differences in presence, since the applications look very similar and were running on the same HMD and hardware. This was confirmed by the IPQ results, which showed a slight favor towards the native application, a difference that most likely is due to somewhat better quality in lighting and shadows and better performance.


7.1.3.3 Correlation

We can see correlations between the categories within the IPQ and the SSQ scores respectively. This is expected, since it is logical that nausea would follow disorientation for the SSQ scores, and that the feeling of spatial presence would correlate with the general sense of "being there" in the IPQ scores. We cannot find any strong or even moderate correlation between IPQ and SSQ scores. For the native version we find no correlation between the two, while for WebVR there is a low correlation between them. WebVR also shows stronger correlations within the categories of the SSQ and the IPQ. We do not have a good explanation for this, and it could be a coincidence in the scores making them seem to correlate more strongly.
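For reference, the correlations referred to here can be computed as plain Pearson coefficients between per-subject scores. The sketch below shows the computation with made-up example values; the thesis text does not prescribe this exact implementation.

// Pearson correlation between two equally long arrays of per-subject scores (illustrative).
function pearson(xs, ys) {
  const n = xs.length;
  const mean = (v) => v.reduce((a, b) => a + b, 0) / v.length;
  const mx = mean(xs);
  const my = mean(ys);
  let num = 0, dx2 = 0, dy2 = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    num += dx * dy;
    dx2 += dx * dx;
    dy2 += dy * dy;
  }
  return num / Math.sqrt(dx2 * dy2);
}

// Hypothetical per-subject values: total SSQ increase and total IPQ score.
const ssqIncrease = [12, 3, 0, 25, 7];
const ipqTotal = [45, 60, 52, 38, 55];
console.log(pearson(ssqIncrease, ipqTotal).toFixed(2));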

7.2 Method

The multi-method strategy was executed successfully; the most valuable results came from comparing the results of the user study with the performance measurements. If we had measured only hard data such as FPS, the underlying mechanisms would have been missed. It is very important to note that this would have been a very different thesis if the mitigation techniques had been disabled. By disabling ATW (asynchronous time warp), we hypothesise that the user study would have produced considerably worse SSQ and IPQ results, that is, higher reported sickness and lower presence. That could have given us results confirming that low FPS is very bad for VR applications, and with that a recommendation for developers to build native applications. But our results (with ATW enabled) show that not all VR applications must produce 60 FPS to achieve acceptable VR sickness and presence.

A limitation of our chosen method is that we cannot do any detailed comparisons between application performance and user experience. The user study was done using only two applications, which had varying FPS. To show a relationship between FPS and IPQ, we would need to perform more user studies and try each application with a set of fixed FPS limits. We suggest such a study in Section 8.1.

By comparing two applications with such differences in rendering process and performance, there is a severe hit to internal validity. The isolation of variables would be improved a lot by only having one application with a setting to limit certain properties, such as FPS. On the other hand, such an experiment would not give any general evidence for choosing between native and WebVR. This thesis does in fact provide guidelines for choosing between the two. We believe that the thorough research on methodology improved the thesis by avoiding mistakes that certainly could have hurt validity. This is even more important when researching a field such as VR, in which there are many unknowns and a tendency for unproven constructs.

7.3 The work in a wider context

As Oculus, Sony, and HTC top the number of VR devices shipped, we believe that it is safe to assume that the primary usage of virtual reality is consumer entertainment [120]. The leading companies promote their HMDs as devices to be used for gaming and viewing VR content. Nevertheless, there is much academic research covering many other disciplines and areas, such as educational use, psychological treatments, architecture, art, music, and sports.

The best VR experiences still rely on stationary hardware such as a PC or gaming console, but 2019 showed a shift towards standalone headsets according to the SuperData year in review [121]. With standalone HMDs, mobility is increased, which widens the space for other types of VR applications. In recent years, the gaming industry has seen a rise in cloud gaming platforms, a technology that makes it possible to utilize the hardware of cloud platforms to render and stream game scenes to less powerful hardware such as mobile devices [122]. Emerging technologies such as 5G and Wi-Fi 6 (IEEE 802.11ax) can further improve viable mobile VR use. With a cloud gaming experience that could further take advantage of 5G technology,

such as the novel framework proposed by Zhang et al., we might see more advanced mobile VR applications in the future [123].

Virtual Reality-based treatments for eating disorders have been studied since the early 1990s. In 2008, Gutiérrez-Maldonado et al. noted that "there are an increasing number of studies focused on other aspects, such as anxiety, craving, avoidance, grooming behaviors, self-esteem, and self-efficacy" [124]. Gutiérrez-Maldonado et al. discuss problems with VR that might hinder usage in clinical settings, mentioning economics, technological difficulties, and simulator sickness. As VR technology is refined, minimizing the drawbacks of the medium, we could see more widespread use of VR in clinical settings, making a positive impact on society.

VR has also been heavily studied in the educational area and examined as a tool for learning in classrooms. One of the selling points for using VR to learn is the possibility to simulate many situations, which has the benefit of being cheap in cases where it can substitute consumable materials needed in other learning environments. Further, the same set of equipment can be reused for many different simulations and learning experiences. In a review study from 2018, Jensen and Konradsen find that VR is a good fit for some situations, while in other situations there is no evident advantage of the immersive experiences that VR provides [125]. Jensen and Konradsen discuss two fundamental barriers to using VR in education: content availability and production, and the fact that the HMDs of today are primarily focused on entertainment. The first barrier, VR content, exemplifies the need for studies that teach us more about what makes good VR content. We believe that both of these barriers will be overcome and remain unsolved only because the technology is new.

There are of course negative sides to any widespread technology, and VR is no exception. When talking about risks and potential negative aspects of VR usage, the most obvious ones are physical. Simulator sickness, resulting in nausea and disorientation, is common and is discussed at length in this thesis. Another physical risk is the fact that the physical space around you is not visible while inside the VR experience. This creates a real possibility of tripping or knocking over something in the room that you simply could not see and forgot was there.

Other concerns are based on the fact that VR is highly immersive and might amplify effects that are already known from other forms of media such as movies and video games. Studies have shown that video games can lead to emotional desensitization [126], and it is not a far stretch to imagine that a VR game that increases immersion and presence also could increase these negative effects. Another example that shows the possible impact of highly immersive experiences is a study that examined the formation of false memories. The study showed that immersive imaging can contribute to forming false memories in young children [127]. Other concerns involve harassment and cyberbullying. In online VR applications, other people can appear to physically get close to another person, bringing a physical dimension to cyberbullying and harassment.

CHAPTER 8

Conclusion

In this thesis, we explored the performance and user experience of VR on the Web and native VR, with the aim of creating a basis for when, and if ever, to develop VR experiences using Web or native technology. We posed two research questions: how does a WebVR application compare to a native VR application in terms of user experience, and can the web platform deliver the performance required for VR applications in terms of frame rate, latency and other performance metrics? Two VR applications were developed to answer these questions, one with web technology and a second one with native technology. We compared the applications with a multi-method strategy, combining a comparison of performance metrics with a user study.

The results showed no apparent difference in user experience between the applications, yet there is still a big difference in performance metrics. This result contradicts the expectation that higher performance should yield a better user experience. We believe that the contradiction is due to the work of successful mitigation techniques used by the virtual reality hardware.

The presented results answer the research questions, but we found that the questions missed the aim to some extent. With mitigation techniques such as ATW and ASW, application performance does not have the last word. There is no clear-cut answer to whether to develop VR with Web or native technology, because it depends on the type of virtual experience, VR scene, audience, and expected VR equipment.

8.1 Future Work

This thesis does not try to explore nor prove any relationship between performance and VR sickness. The same goes for the relationship between performance and presence. Such research would be of great interest. Preferably, an attempt to measure these relationships should use only a single VR application, so as to minimize unrelated variables. The user study could use the same VR application but with different target FPS limits. Running the experiment at 10 FPS, 20 FPS, 30 FPS, and so on up to 60 FPS should provide insight into how VR sickness and presence are affected by each frame-rate limit. Likewise, it would be interesting to run the same experiments with different levels of latency, to explore the relationship between input delay and VR sickness and presence.
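A rough sketch of how such fixed frame-rate conditions could be imposed in a web build is shown below: the loop simply skips rendering until a per-condition frame budget has elapsed. The target values and function names are illustrative, and a native build would need an equivalent cap.

// Illustrative frame-rate cap for a requestAnimationFrame-driven render loop.
// TARGET_FPS would be set per experimental condition (10, 20, 30, ... 60).
const TARGET_FPS = 30;
const MIN_FRAME_MS = 1000 / TARGET_FPS;
let lastRenderTime = 0;

function onFrame(timestamp) {
  requestAnimationFrame(onFrame);               // or display.requestAnimationFrame on a VRDisplay
  if (timestamp - lastRenderTime < MIN_FRAME_MS) {
    return;                                     // skip this vsync to hold the effective FPS near the target
  }
  lastRenderTime = timestamp;
  // renderAndSubmitFrame();                    // hypothetical: render the scene and submit the frame
}
requestAnimationFrame(onFrame);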

Another interesting aspect to explore is the concept of prefetching and adaptively streaming resources such as 3D models and textures. This could help minimize the size of the installed application for native VR by only including the essential files in the installed package and downloading the rest in the background when needed. The drawback would be that the native application would then require an internet connection to work properly. Likewise, for WebVR it would minimize the size of the files needed to start the VR application, providing a better user experience. Work has been done regarding prefetching of 360° video by Almquist et al. [128]. The data-driven characterization approach suggested by Almquist et al. could be investigated as a way to optimize prefetching in 3D rendered scenes, in which textures and models are prefetched instead of video.

VR technology is constantly moving forward with a lot of innovation happening. Since some time has passed since our experiments, a lot of new and exciting products and technologies have been introduced. This includes new and more powerful smartphones, all-in-one VR headsets, and new techniques such as Fixed Foveated Rendering (FFR) for mobile devices and Asynchronous Space Warp 2.0 for the Oculus Rift on PC. It would be interesting to see how much these new developments would impact our experiments by re-creating and re-running them using the latest technologies.

Bibliography

[1] Jonathan Steuer. “Defining Virtual Reality: Dimensions Determining Telepresence”. In: Journal of Communication 42.4 (1992), pp. 73–93.
[2] S R Ellis. “What are virtual environments?” In: Computer Graphics and Applications, IEEE 14.1 (1994), pp. 17–22.
[3] Tomasz Mazuryk and Michael Gervautz. Virtual Reality - History, Applications, Technology and Future. Tech. rep. Institute of Computer Graphics and Algorithms, Vienna University of Technology, 1996.
[4] Michael A Gigante. “Virtual reality: definitions, history and applications”. In: Virtual Reality Systems (1993), pp. 3–14.
[5] Presence: Teleoperators and Virtual Environments. 2019. URL: http://www.mitpressjournals.org/loi/pres (visited on 02/09/2019).
[6] D.S. Brewster. The Stereoscope; its History, Theory, and Construction, with its Application to the fine and useful Arts and to Education: With fifty wood Engravings. John Murray, 1856.
[7] Ivan E Sutherland. “The Ultimate Display”. In: IFIP Congress (1965), pp. 506–508.
[8] Ivan E. Sutherland. “A head-mounted three dimensional display”. In: Proceedings of the AFIPS ’68 (Fall, part I) (1968), pp. 757–764.
[9] Howard Rheingold. Virtual Reality. New York, NY, USA: Simon & Schuster, Inc., 1991.
[10] John A. Adam. “Virtual Reality is for Real”. In: IEEE Spectrum 30.10 (1993), pp. 22–29.
[11] F.P. Brooks. “What’s Real About Virtual Reality?” In: Proceedings IEEE Virtual Reality (Cat. No. 99CB36316) December (1999), pp. 2–3.
[12] Peter Rubin. “The inside story of oculus rift and how virtual reality became reality”. In: Wired.com (2014).
[13] Oculus Rift. Kickstarter - Step Into the Game. 2012. URL: https://www.kickstarter.com/projects/1523379957/oculus-rift-step-into-the-game (visited on 11/22/2016).
[14] HTC. HTC Vive. 2016. URL: https://www.vive.com/ (visited on 11/22/2016).
[15] Sony. Playstation VR. 2016. URL: https://www.playstation.com/explore/playstation-vr/ (visited on 11/22/2016).
[16] Samsung. Samsung GearVR. 2016. URL: http://www.samsung.com/global/galaxy/gear-vr/ (visited on 11/22/2016).
[17] Google. Daydream View headset. 2016. URL: https://vr.google.com/daydream/ (visited on 11/22/2016).
[18] Google. Cardboard headset. 2016. URL: https://vr.google.com/cardboard/ (visited on 11/22/2016).


[19] Mel Slater and S Wilbur. “A Framework for Immersive Virtual Environments (FIVE): Speculations on the Role of Presence in Virtual Environments”. In: Presence: Teleoperators and Virtual Environments 6.6 (1997), pp. 603–616.
[20] Bob G Witmer and Michael J Singer. “Measuring Presence in Virtual Environments: A Presence Questionnaire”. In: Presence: Teleoper. Virtual Environ. 7.3 (1998), pp. 225–240.
[21] Thomas Schubert, Frank Friedmann, and Holger Regenbrecht. “The Experience of Presence: Factor Analytic Insights”. In: Presence 10.3 (2001), pp. 266–281.
[22] Marvin Minsky. “Telepresence”. In: Omni (June 1980), pp. 45–52.
[23] The Concept of Presence: Explication Statement. 2000. URL: https://smcsites.com/ispr/ (visited on 11/22/2016).
[24] Matthew Lombard and Theresa Ditton. “At the Heart of It All: The Concept of Presence”. In: Journal of Computer-Mediated Communication 3.2 (1997).
[25] Frank Biocca and Mark R Levy. “Communication Applications of Virtual Reality”. In: Mindlab.Org June 1995 (1993).
[26] James J. Gibson. “The Ecological Approach to the Visual Perception of Pictures”. In: Leonardo 11.3 (1978), p. 227.
[27] Mel Slater. “Measuring presence: A response to the Witmer and Singer presence questionnaire”. In: Presence: Teleoperators and Virtual Environments 8.5 (1999), pp. 1–13.
[28] Jonathan Freeman et al. “Effects of Sensory Information and Prior Experience on Direct Subjective Ratings of Presence”. In: Presence: Teleoperators and Virtual Environments 8.1 (1999), pp. 1–13.
[29] J van Baren and W IJsselsteijn. “Deliverable 5 Measuring Presence: A Guide to Current Measurement Approaches”. In: Measurement 0 (2004), pp. 1–86.
[30] Christine Rosakranse and Soo Youn Oh. “Presence Questionaire Use Trends Measuring Presence: The Use Trends of Five Canonical Presence Questionaires from 1998-2012”. In: International Society on Presence Research (2014), pp. 25–30.
[31] Mel Slater and Anthony Steed. “A Virtual Presence Counter”. In: Presence: Teleoperators and Virtual Environments 9.5 (Oct. 2000), pp. 413–434.
[32] T.B. Sheridan. “Musings on telepresence and virtual presence”. In: Presence: Teleoperators and Virtual Environments 1 (1992), pp. 120–126.
[33] Wijnand A. IJsselsteijn et al. “Presence: concept, determinants, and measurement”. In: ed. by Bernice E. Rogowitz and Thrasyvoulos N. Pappas. Vol. 31. 0. June 2000, pp. 520–529.
[34] Bob G Witmer and Michael J Singer. Measuring immersion in virtual environments. Tech. rep. (ARI Technical Report 1014). Alexandria, VA: US Army Research Institute for the Behavioral and Social Sciences, 1994.
[35] Stephen R Ellis et al. “In search of equivalence classes in subjective scales of reality”. In: Advances in human factors/ergonomics (1997), pp. 873–876.
[36] Albert S Carlin, Hunter G Hoffman, and Suzanne Weghorst. “Virtual reality and tactile augmentation in the treatment of spider phobia: a case report”. In: Behaviour research and therapy 35.2 (1997), pp. 153–158.
[37] Claudia Mary Hendrix. “Exploratory studies on the sense of presence in virtual environments as a function of visual and auditory display parameters”. MA thesis. University of Washington, 1994.
[38] Mel Slater, Martin Usoh, and Anthony Steed. “Depth of presence in virtual environments”. In: Presence: Teleoperators & Virtual Environments 3.2 (1994), pp. 130–144.


[39] John Towell and Elizabeth Towell. “Presence in text-based networked virtual environments or “MUDS””. In: Presence: Teleoperators & Virtual Environments 6.5 (1997), pp. 590–595.
[40] Holger T Regenbrecht, Thomas W Schubert, and Frank Friedmann. “Measuring the sense of presence and its relations to fear of heights in virtual environments”. In: International Journal of Human-Computer Interaction 10.3 (1998), pp. 233–249.
[41] IPQ Factor Analysis. 2018. URL: http://www.igroup.org/pq/ipq/factor.php (visited on 09/22/2018).
[42] Marcus Toftedahl. Which are the most commonly used Game Engines? 2019. URL: https://www.gamasutra.com/blogs/MarcusToftedahl/20190930/350830/Which_are_the_most_commonly_used_Game_Engines.php (visited on 10/11/2020).
[43] OpenGL ES | Android Developers. 2017. URL: https://developer.android.com/guide/topics/graphics/opengl.html (visited on 05/04/2017).
[44] Khronos Group. OpenGL ES Specification - Version 3.0.5. 2016. URL: https://www.khronos.org/registry/OpenGL/specs/es/3.0/es_spec_3.0.pdf (visited on 01/30/2017).
[45] Unity. Unity - Manual: Virtual Reality. 2017. URL: https://docs.unity3d.com/560/Documentation/Manual/VirtualReality.html (visited on 01/31/2017).
[46] Unity. Unity - Manual: VR overview. 2017. URL: https://docs.unity3d.com/560/Documentation/Manual/VROverview.html (visited on 02/01/2017).
[47] Linda Dailey Paulson. “Web Applications with Ajax”. In: IEEE Computer 38.10 (2005), pp. 14–17.
[48] Bret Taylor. Mapping your way. 2005. URL: https://googleblog.blogspot.se/2005/02/mapping-your-way.html (visited on 01/11/2017).
[49] World Wide Web Consortium. W3C Mission. 2017. URL: https://www.w3.org/Consortium/mission (visited on 02/09/2019).
[50] Steven Pemberton et al. XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition). 2002. URL: https://www.w3.org/TR/2002/REC-xhtml1-20020801/ (visited on 01/12/2017).
[51] WHATWG. FAQ - WHATWG Wiki. 2016. URL: https://whatwg.org/faq (visited on 02/09/2019).
[52] Steve Faulkner et al. HTML 5.1. 2016. URL: https://www.w3.org/TR/2016/REC-html51-20161101/ (visited on 01/12/2017).
[53] HTML Standard. 2017. URL: https://html.spec.whatwg.org (visited on 01/12/2017).
[54] Tab Atkins Jr., Elika J. Etemad, and Florian Rivoal. CSS Snapshot 2015. 2015. URL: https://www.w3.org/TR/2015/NOTE-css-2015-20151013/ (visited on 01/12/2017).
[55] David Flanagan. JavaScript: The Definitive Guide. O’Reilly Media, 2011, p. 1096.
[56] ECMA. “ECMAScript Language Specification, 3rd edition”. In: World Wide Web Internet And Web Information Systems December (1999), pp. 1–188.
[57] Anne van Kesteren et al. W3C DOM4. 2015. URL: https://www.w3.org/TR/2015/REC-dom-20151119/ (visited on 01/12/2017).
[58] World Wide Web Consortium. Web Design and Applications. 2019. URL: https://www.w3.org/standards/webdesign/ (visited on 02/09/2019).


[59] Brandon Jones. Bringing VR to Chrome. 2014. URL: http://blog.tojicode.com/2014/07/bringing-vr-to-chrome.html (visited on 11/23/2016).
[60] Casey Yee. Introducing the WebVR 1.0 API Proposal. 2016. URL: https://hacks.mozilla.org/2016/03/introducing-the-webvr-1-0-api-proposal/ (visited on 01/12/2017).
[61] W3C Team. Call for Participation in WebVR Community Group. 2016. URL: https://www.w3.org/community/webvr/2016/03/01/call-for-participation-in-webvr-community-group/ (visited on 01/12/2017).
[62] WebVR Community Group Charter. URL: 2016 (visited on 01/12/2017).
[63] Scott Graham, Ted Mielczarek, and Brandon Jones. W3C Gamepad. 2016. URL: https://www.w3.org/TR/2016/WD-gamepad-20161202/ (visited on 01/18/2017).
[64] David Raggett. Extending WWW to support - Platform Independent Virtual Reality. 1994. URL: https://www.w3.org/People/Raggett/vrml/vrml.html (visited on 01/30/2017).
[65] Alun Evans et al. “3D graphics on the web: A survey”. In: Computers & Graphics 41 (2014), pp. 43–61.
[66] Johannes Behr et al. “X3DOM – A DOM-based HTML5/X3D Integration Model”. In: Proceedings of the 14th International Conference on 3D Web Technology 1.212 (2009), pp. 127–137.
[67] Khronos Group. WebGL Specification - Version 1.0.3. 2014. URL: https://www.khronos.org/registry/webgl/specs/1.0.3/ (visited on 01/30/2017).
[68] Theodore Mielczarek, Brandon Jones, and Scott Graham. Gamepad. W3C Working Draft. W3C, Jan. 2017.
[69] Andre Charland and Brian Leroux. “Mobile application development: Web vs. native”. In: Communications of the ACM 54 (2011), pp. 1–5.
[70] Movania Muhammad Mobeen and Lin Feng. “High-performance volume rendering on the ubiquitous WebGL platform”. In: Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012 (2012), pp. 381–388.
[71] Rama C. Hoetzlein. “Graphics performance in rich internet applications”. In: IEEE Computer Graphics and Applications 32.5 (2012), pp. 98–104.
[72] Michael Abrash. Why virtual isn’t real to your brain. 2013. URL: http://blogs.valvesoftware.com/abrash/why-virtual-isnt-real-to-your-brain/ (visited on 12/08/2016).
[73] Joseph J LaViola Jr. “A discussion of cybersickness in virtual environments”. In: ACM SIGCHI Bulletin 32.1 (2000), pp. 47–56.
[74] Clare Regan. “An investigation into nausea and other side-effects of head-coupled immersive virtual reality”. In: Virtual Reality 1.1 (1995), pp. 17–31.
[75] Lisa Rebenitsch and Charles Owen. “Review on cybersickness in applications and visual displays”. In: Virtual Reality 20.2 (2016), pp. 101–125.
[76] Michael Abrash. Down the VR rabbit hole: Fixing judder. 2013. URL: http://blogs.valvesoftware.com/abrash/down-the-vr-rabbit-hole-fixing-judder/ (visited on 12/15/2016).
[77] Robert S Kennedy et al. “Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness”. In: The International Journal of Aviation Psychology 3.3 (1993), pp. 203–220.


[78] John Carmack. Latency Mitigation Strategies. 2013. URL: https://www.twentymilliseconds.com/post/latency-mitigation-strategies/ (visited on 01/12/2017).
[79] T. J. Buker, D. A. Vincenzi, and J. E. Deaton. “The Effect of Apparent Latency on Simulator Sickness While Using a See-Through Helmet-Mounted Display: Reducing Apparent Latency With Predictive Compensation”. In: Human Factors: The Journal of the Human Factors and Ergonomics Society 54.2 (2012), pp. 235–249.
[80] Jason J Jerald. “Scene-Motion- and Latency-Perception Thresholds for Head-Mounted Displays”. In: (2009).
[81] Michael Abrash. Latency, The sine qua non of AR and VR. 2012. URL: http://blogs.valvesoftware.com/abrash/latency-the-sine-qua-non-of-ar-and-vr/ (visited on 01/11/2017).
[82] Michael Abrash. Why virtual isn’t real to your brain: judder. 2013. URL: http://blogs.valvesoftware.com/abrash/why-virtual-isnt-real-to-your-brain-judder/ (visited on 01/12/2017).
[83] PJ Bex, GK Edgar, and AT Smith. Multiple images appear when motion energy detection fails. 1995.
[84] David Luebke and Greg Humphreys. “How GPUs work”. In: Computer 40.2 (2007), pp. 96–100.
[85] J D Owens et al. “GPU Computing”. In: Proceedings of the IEEE 96 (2008), pp. 879–899.
[86] Brian Goldiez, Rodney Rogers, and P. Woodward. “Real-time visual simulation on PCs”. In: IEEE Computer Graphics and Applications 19.1 (1999), pp. 11–15.
[87] John Congote et al. “Interactive visualization of volumetric data with WebGL in real-time”. In: Proceedings of the 16th International Conference on 3D Web Technology - Web3D ’11 May 2016 (2011), p. 137.
[88] David J. Lilja. Measuring Computer Performance - A Practitioner’s Guide. Cambridge University Press, 2000, p. 279.
[89] Oculus. Oculus Documentation - Power Management. 2017. URL: https://developer3.oculus.com/documentation/mobilesdk/latest/concepts/mobile-power-overview/ (visited on 02/20/2017).
[90] J Noguera and Juan-Roberto Jiménez. “Visualization of very large 3D volumes on mobile devices and WebGL”. In: WSCG Communication Proceedings August (2012), pp. 105–112.
[91] Mozilla Developer Network - window.requestAnimationFrame(). 2017. URL: https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame (visited on 02/20/2017).
[92] A.J. Smith. “Sequential Program Prefetching in Memory Hierarchies”. In: Computer 11.12 (Dec. 1978), pp. 7–21.
[93] Todd C. Mowry, Monica S. Lam, and Anoop Gupta. “Design and evaluation of a compiler algorithm for prefetching”. In: Proceedings of the fifth international conference on Architectural support for programming languages and operating systems - ASPLOS-V. New York, New York, USA: ACM Press, 1992, pp. 62–73.
[94] Venkata N. Padmanabhan and Jeffrey C. Mogul. “Improving HTTP latency”. In: Computer Networks and ISDN Systems 28.1-2 (1995), pp. 25–35.
[95] Darin Fisher and G. Saksena. “Link prefetching in Mozilla: A server-driven approach”. In: Web content caching and distribution (2004), pp. 283–291.


[96] Venkata N. Padmanabhan and Jeffrey C. Mogul. “Using predictive prefetching to improve World Wide Web latency”. In: ACM SIGCOMM Computer Communication Review 26.3 (July 1996), pp. 22–36.
[97] V Krishnamoorthi et al. “Quality-adaptive prefetching for interactive branched video using HTTP-based Adaptive Streaming”. In: Proceedings of the 2014 ACM Conference on Multimedia, MM 2014 (2014), pp. 317–326.
[98] David Luebke et al. Level of Detail for 3D Graphics. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2003.
[99] Ned Greene, Michael Kass, and Gavin Miller. “Hierarchical Z-buffer Visibility”. In: Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques. SIGGRAPH ’93. Anaheim, CA: ACM, 1993, pp. 231–238.
[100] Satyan Coorg and Seth Teller. “Real-time Occlusion Culling for Models with Large Occluders”. In: Proceedings of the 1997 Symposium on Interactive 3D Graphics. I3D ’97. Providence, Rhode Island, USA: ACM, 1997, 83–ff.
[101] Brian Guenter et al. “Foveated 3D Graphics”. In: ACM Trans. Graph. 31.6 (Nov. 2012), 164:1–164:10.
[102] Anjul Patney et al. “Towards Foveated Rendering for Gaze-tracked Virtual Reality”. In: ACM Trans. Graph. 35.6 (Nov. 2016), 179:1–179:12.
[103] John Brewer and Albert Hunter. Multimethod research: A synthesis of styles. Sage Publications, Inc, 1989.
[104] Donald T Campbell and Donald W Fiske. “Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix”. In: Psychological Bulletin 56.2 (1959), pp. 81–105.
[105] John Brewer and Albert Hunter. Foundations of multimethod research: Synthesizing styles. Sage, 2006.
[106] Steve Easterbrook et al. Selecting Empirical Methods for Software Engineering Research. Vol. 53. 9. 2008, pp. 296–297.
[107] Per Runeson and Martin Höst. “Guidelines for conducting and reporting case study research in software engineering”. In: Empirical Software Engineering 14.2 (2009), pp. 131–164.
[108] Colin Robson. Real world research: a resource for users of social research methods in applied settings. 3rd ed. Chichester, West Sussex: Wiley, Apr. 2011.
[109] Claes Wohlin et al. Experimentation in software engineering. Springer Science & Business Media, 2012.
[110] Jarrett Rosenberg. “Statistical methods and measurement”. In: Guide to Advanced Empirical Software Engineering. Springer, 2008, pp. 155–184.
[111] Graham Pervan and Hilangwa Maimbo. “Designing a Case Study Protocol for application in IS research”. In: Proceedings of 9th Pacific Asia Conference on Information Systems: I.T. and Value Creation, PACIS 2005. 2005, pp. 1281–1292.
[112] UnitySceneWebExporter. 2016. URL: https://github.com/if1live/unity-scene-web-exporter/ (visited on 04/04/2017).
[113] Oculus Unity Packages. 2017. URL: https://developer.oculus.com/downloads/unity/ (visited on 04/25/2017).
[114] Unity. Optimisation for VR in Unity. 2017. URL: https://unity3d.com/learn/tutorials/topics/virtual-reality/optimisation-vr-unity (visited on 02/06/2017).


[115] Chris Pruett. Squeezing Performance out of your Unity Gear VR Game. 2015. URL: https://developer3.oculus.com/blog/squeezing-performance-out-of-your-unity-gear-vr-game/ (visited on 02/06/2017).
[116] Samsung. Exynos 7 Octa (7420). 2015. URL: http://www.samsung.com/semiconductor/minisite/Exynos/w/solution/mobile_ap/7420/ (visited on 02/07/2017).
[117] Joshua Ho. Samsung Announces the Galaxy S 6 and S 6 Edge. 2015. URL: http://www.anandtech.com/show/8999/samsung-announces-the-galaxy-s6 (visited on 02/07/2017).
[118] Oculus Remote Monitor. 2017. URL: https://developer3.oculus.com/documentation/mobilesdk/latest/concepts/mobile-remote-monitor/ (visited on 04/24/2017).
[119] Costas Boletsis and Jarl Cedergren. “VR Locomotion in the New Era of Virtual Reality: An Empirical Comparison of Prevalent Techniques”. In: Advances in Human-Computer Interaction 2019 (Apr. 2019), pp. 1–15.
[120] Jason Tsai. TrendForce Forecasts Global VR Device Shipments at 6 Million Units in 2019, with Oculus’s Price Cut Boosting Sales. 2018. URL: https://www.trendforce.com/presscenter/news/20181210-10080.html (visited on 06/07/2020).
[121] SuperData - A Nielsen Company. 2019 Year In Review: Digital Games and Interactive Media. 2019. URL: https://www.superdataresearch.com/reports/2019-year-in-review.
[122] W. Cai et al. “A Survey on Cloud Gaming: Future of Computer Games”. In: IEEE Access 4 (2016), pp. 7605–7620.
[123] X. Zhang et al. “Improving Cloud Gaming Experience through Mobile Edge Computing”. In: IEEE Wireless Communications 26.4 (2019), pp. 178–183.
[124] José Gutiérrez-Maldonado et al. “Virtual Reality: Applications to Eating Disorders”. In: The Oxford Handbook of Eating Disorders. Feb. 2018, pp. 470–491.
[125] Lasse Jensen and Flemming Konradsen. “A review of the use of virtual reality head-mounted displays in education and training”. In: Education and Information Technologies 23.4 (July 2018), pp. 1515–1529.
[126] Matthew Grizzard et al. “Repeated Play Reduces Video Games’ Ability to Elicit Guilt: Evidence from a Longitudinal Experiment”. In: Media Psychology 20.2 (2017), pp. 267–290.
[127] Kathryn Y Segovia and Jeremy N Bailenson. “Virtually true: Children’s acquisition of false memories in virtual reality”. In: Media Psychology 12.4 (2009), pp. 371–393.
[128] Mathias Almquist et al. “The Prefetch Aggressiveness Tradeoff in 360° Video Streaming”. In: Proceedings of the ACM Multimedia Systems Conference 12 (2018), pp. 258–269.
[129] WebVR - Editor’s Draft, 12 December 2017. 2017. URL: https://immersive-web.github.io/webvr/spec/1.1/ (visited on 08/16/2020).

APPENDIX A

User Study Test Protocol

Test protocol

1. Open forms and calc:
   a. Session Data
   b. Demographics (If first session)
   c. Pre Session
   d. Post Session
2. Fill the administration part (the first section containing subject id, etc.) of each form
   a. ID
   b. Name/email
   c. Application version
   d. If second session, VR usage between sessions.
3. Fill in the Demographics form if first session.
4. Fill in the pre session SSQ.
5. Inform the subject about the VR room:
   a. It is a virtual representation of Valtech Store.
   b. There are four "objects/items" that can show some information about them when you gaze and click them.
   c. You can move by looking at the floor and clicking.
   d. You can change the texture of the walls by gazing and clicking somewhere on the wall.
6. Inform the subject about the GearVR headset.
   a. Tap/Click touchpad
   b. Focus wheel
   c. Make sure subject is standing up.
   d. Put it on
   e. Start timer, keep VR usage to around 2-3 min.
7. Instruct the subject on the first interactions.
   a. Look at the floor, click to move to the icon position
   b. Look at the shoe, notice the info reticle. Click to show info, click close icon to close.
   c. When standing by the entrance/door, look towards the TVs and click the wall beside them. Change the texture and find your favorite.
   d. Find all four info items (should be three remaining).
   e. Done, take the headset off.
8. Fill in SSQ and IPQ.
   a. The scales of the IPQ are a little inconsistent, keep that in mind.
9. Done, schedule next session if not already done or this was session 2.

APPENDIX B

Performance Test Protocol

Performance Test Protocol

1. Start OVRMonitor
2. Connect phone with USB and make sure Enable Capture is checked.
3. Prepare OVRMonitor Session Settings

4. Prepare a timer.
5. Open VR-application on phone.
6. Connect to VR-application using OVR Monitor.
7. Perform test tasks
   a. 0s Look at initial camera position
   b. 10s Look at VR-headset
   c. 15s Teleport to VR-headset and look towards the clothes
   d. 25s Turn around and look outside through the window and focus on the door
   e. 35s Teleport to iPad and look at it
   f. 45s Look up and focus at the TVs
   g. 60s Stop
8. Save log file

APPENDIX C

Questionnaires

C.1 Demographics Questionnaire


VR Store Test *Required

1. Subject ID *

Demographics

2. Gender * Mark only one oval.

Male Female Prefer not to say

Other:

3. Birth year *

4. I am susceptible to motion sickness. * Motion sickness is a common condition that occurs in some people who travel by car, train, airplane or boat. Dizziness, fatigue, and nausea are the most common symptoms of motion sickness. Mark only one oval.

1 2 3 4 5

not accurate at all entirely accurate

5. I have used a VR device... * Mark only one oval.

More than 20 times 10-20 times 5-10 times 1-5 times 0 times

6. I use a desktop computer, laptop or tablet every... * Mark only one oval.

Day Week Month Year



C.2 Post-session Questionnaire

VR Store Test *Required

1. Subject ID *

2. App * Mark only one oval.

web native

Simulation Sickness

3. General discomfort * Mark only one oval.

1 2 3 4

None Severe

4. Fatigue * Mark only one oval.

1 2 3 4

None Severe

5. Headache * Mark only one oval.

1 2 3 4

None Severe

6. Eyestrain * Mark only one oval.

1 2 3 4

None Severe

7. Difficulty focusing * Mark only one oval.

1 2 3 4

None Severe

8. Increased salivation * Mark only one oval.

1 2 3 4

None Severe

9. Sweating * Mark only one oval.

1 2 3 4

None Severe

10. Nausea * Mark only one oval.

1 2 3 4

None Severe

11. Difficulty concentrating * Mark only one oval.

1 2 3 4

None Severe

12. Fullness of head * Mark only one oval.

1 2 3 4

None Severe

13. Blurred vision * Mark only one oval.

1 2 3 4

None Severe

14. Dizzy (eyes open) * Mark only one oval.

1 2 3 4

None Severe

15. Dizzy (eyes closed) * Mark only one oval.

1 2 3 4

None Severe

16. Vertigo * Vertigo is experienced as loss of orientation with respect to vertical upright. Mark only one oval.

1 2 3 4

None Severe

17. Stomach awareness * Mark only one oval.

1 2 3 4

None Severe

18. Burping * Mark only one oval.

1 2 3 4

None Severe

Presence

Now you'll see some statements about experiences. Please indicate whether or not each statement applies to your experience. There are no right or wrong answers, only your opinion counts.

You will notice that some questions are very similar to each other. This is necessary for statistical reasons. And please remember: answer all these questions only referring to this one experience.

19. How aware were you of the real world surrounding while navigating in the virtual world (i.e. sounds, room temperature, other people, etc.)? * Mark only one oval.

1 2 3 4 5 6 7

extremely aware not aware at all

20. How real did the virtual world seem to you? * Mark only one oval.

1 2 3 4 5 6 7

completely real not real at all

21. I had a sense of acting in the virtual space, rather than operating something from outside. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

22. How much did your experience in the virtual environment seem consistent with your real world experience? * Consistent with the real world in general, not specific to the virtual representation of Valtech Store. Mark only one oval.

1 2 3 4 5 6 7

not consistent very consistent

23. How real did the virtual world seem to you? * Mark only one oval.

1 2 3 4 5 6 7

indistinguishable from the real world – about as real as an imagined world

24. I did not feel present in the virtual space. * Mark only one oval.

1 2 3 4 5 6 7

did not feel – felt present

25. I was not aware of my real environment. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

26. In the computer generated world I had a sense of "being there". * Mark only one oval.

1 2 3 4 5 6 7

not at all very much

27. Somehow I felt that the virtual world surrounded me. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

28. I felt present in the virtual space. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

29. I still paid attention to the real environment. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

30. The virtual world seemed more realistic than the real world. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

31. I felt like I was just perceiving pictures. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree

32. I was completely captivated by the virtual world. * Mark only one oval.

1 2 3 4 5 6 7

fully disagree fully agree


APPENDIX D

IPQ results

[Figure: box plots of IPQ scores per category (G1, SP, INV, REAL); y-axis: IPQ Score, 0-25. (a) Web, (b) Native.]

APPENDIX E

SSQ results

[Figure: box plots of SSQ scores per category; y-axis: SSQ Score, 0-120. (a) Web pre exposure, (b) Web post exposure, (c) Native pre exposure, (d) Native post exposure. N = Nausea, O = Oculomotor, D = Disorientation, T = Total. Shows the SSQ scores before and after exposure for both implementations.]

APPENDIX F

WebVR Specification

The WebVR IDL from the WebVR 1.1 Editor’s Draft, 12 December 2017 [129].

interface VRDisplay : EventTarget {
  readonly attribute boolean isConnected;
  readonly attribute boolean isPresenting;

  /**
   * Dictionary of capabilities describing the VRDisplay.
   */
  [SameObject] readonly attribute VRDisplayCapabilities capabilities;

  /**
   * If this VRDisplay supports room-scale experiences, the optional
   * stage attribute contains details on the room-scale parameters.
   * The stageParameters attribute can not change between null
   * and non-null once the VRDisplay is enumerated; however,
   * the values within VRStageParameters may change after
   * any call to VRDisplay.submitFrame as the user may re-configure
   * their environment at any time.
   */
  readonly attribute VRStageParameters? stageParameters;

  /**
   * Return the current VREyeParameters for the given eye.
   */
  VREyeParameters getEyeParameters(VREye whichEye);

  /**
   * An identifier for this distinct VRDisplay. Used as an
   * association point in the Gamepad API.
   */
  readonly attribute unsigned long displayId;

  /**
   * A display name, a user-readable name identifying it.
   */
  readonly attribute DOMString displayName;

  /**
   * Populates the passed VRFrameData with the information required to render
   * the current frame.
   */
  boolean getFrameData(VRFrameData frameData);

  /**
   * Return a VRPose containing the future predicted pose of the VRDisplay
   * when the current frame will be presented. The value returned will not
   * change until JavaScript has returned control to the browser.
   *
   * The VRPose will contain the position, orientation, velocity,
   * and acceleration of each of these properties.
   */
  [NewObject] VRPose getPose();

  /**
   * Reset the pose for this display, treating its current position and
   * orientation as the "origin/zero" values. VRPose.position,
   * VRPose.orientation, and VRStageParameters.sittingToStandingTransform may be
   * updated when calling resetPose(). This should be called in only
   * sitting-space experiences.
   */
  void resetPose();

  /**
   * z-depth defining the near plane of the eye view frustum
   * enables mapping of values in the render target depth
   * attachment to scene coordinates. Initially set to 0.01.
   */
  attribute double depthNear;

  /**
   * z-depth defining the far plane of the eye view frustum
   * enables mapping of values in the render target depth
   * attachment to scene coordinates. Initially set to 10000.0.
   */
  attribute double depthFar;

  /**
   * The callback passed to `requestAnimationFrame` will be called
   * any time a new frame should be rendered. When the VRDisplay is
   * presenting the callback will be called at the native refresh
   * rate of the HMD. When not presenting this function acts
   * identically to how window.requestAnimationFrame acts. Content should
   * make no assumptions of frame rate or vsync behavior as the HMD runs
   * asynchronously from other displays and at differing refresh rates.
   */
  unsigned long requestAnimationFrame(FrameRequestCallback callback);

  /**
   * Passing the value returned by `requestAnimationFrame` to
   * `cancelAnimationFrame` will unregister the callback.
   */
  void cancelAnimationFrame(unsigned long handle);

  /**
   * Begin presenting to the VRDisplay. Must be called in response to a user gesture.
   * Repeat calls while already presenting will update the VRLayers being displayed.
   * If the number of values in the leftBounds/rightBounds arrays is not 0 or 4 for any of the passed layers the promise is rejected.
   * If the source of any of the layers is not present (null), the promise is rejected.
   */
  Promise requestPresent(sequence layers);

  /**
   * Stops presenting to the VRDisplay.
   */
  Promise exitPresent();

  /**
   * Get the layers currently being presented.
   */
  sequence getLayers();

  /**
   * The VRLayer provided to the VRDisplay will be captured and presented
   * in the HMD. Calling this function has the same effect on the source
   * canvas as any other operation that uses its source image, and canvases
   * created without preserveDrawingBuffer set to true will be cleared.
   */
  void submitFrame();
};

typedef (HTMLCanvasElement or OffscreenCanvas) VRSource;

[Constructor(optional VRLayerInit layer)]
interface VRLayer {
  readonly attribute VRSource? source;

  readonly attribute sequence leftBounds;
  readonly attribute sequence rightBounds;
};

dictionary VRLayerInit {
  VRSource? source = null;

  sequence leftBounds = [ ];
  sequence rightBounds = [ ];
};

interface VRDisplayCapabilities {
  readonly attribute boolean hasPosition;
  readonly attribute boolean hasOrientation;
  readonly attribute boolean hasExternalDisplay;
  readonly attribute boolean canPresent;
  readonly attribute unsigned long maxLayers;
};

enum VREye {
  "left",
  "right"
};

interface VRFieldOfView {
  readonly attribute double upDegrees;
  readonly attribute double rightDegrees;
  readonly attribute double downDegrees;
  readonly attribute double leftDegrees;
};

interface VRPose {
  readonly attribute Float32Array? position;
  readonly attribute Float32Array? linearVelocity;
  readonly attribute Float32Array? linearAcceleration;

  readonly attribute Float32Array? orientation;
  readonly attribute Float32Array? angularVelocity;
  readonly attribute Float32Array? angularAcceleration;
};

[Constructor]
interface VRFrameData {
  readonly attribute DOMHighResTimeStamp timestamp;

  readonly attribute Float32Array leftProjectionMatrix;
  readonly attribute Float32Array leftViewMatrix;

  readonly attribute Float32Array rightProjectionMatrix;
  readonly attribute Float32Array rightViewMatrix;

  readonly attribute VRPose pose;
};

interface VREyeParameters {
  readonly attribute Float32Array offset;

  [SameObject] readonly attribute VRFieldOfView fieldOfView;

  readonly attribute unsigned long renderWidth;
  readonly attribute unsigned long renderHeight;
};

interface VRStageParameters {
  readonly attribute Float32Array sittingToStandingTransform;

  readonly attribute float sizeX;
  readonly attribute float sizeZ;
};

partial interface Navigator {
  Promise getVRDisplays();
  readonly attribute FrozenArray activeVRDisplays;
  readonly attribute boolean vrEnabled;
};

enum VRDisplayEventReason {
  "mounted",
  "navigation",
  "requested",
  "unmounted"
};

[Constructor(DOMString type, VRDisplayEventInit eventInitDict)]
interface VRDisplayEvent : Event {
  readonly attribute VRDisplay display;
  readonly attribute VRDisplayEventReason? reason;
};

dictionary VRDisplayEventInit : EventInit {
  required VRDisplay display;
  VRDisplayEventReason reason;
};

partial interface Window {
  attribute EventHandler onvrdisplayconnect;
  attribute EventHandler onvrdisplaydisconnect;
  attribute EventHandler onvrdisplayactivate;
  attribute EventHandler onvrdisplaydeactivate;
  attribute EventHandler onvrdisplayblur;
  attribute EventHandler onvrdisplayfocus;
  attribute EventHandler onvrdisplaypresentchange;
};

partial interface HTMLIFrameElement {
  attribute boolean allowvr;
};

partial interface Gamepad {
  readonly attribute unsigned long displayId;
};