Collaborative Acceleration for Mixed Reality

Kiron Lebeck, University of Washington* ([email protected])
Eduardo Cuervo, Microsoft Research ([email protected])
Matthai Philipose, Microsoft Research ([email protected])

* This work was conducted while the first author was an intern at Microsoft.

ABSTRACT

A new generation of augmented reality (AR) devices, such as the Microsoft HoloLens, promises a user experience known as mixed reality (MR) that is more seamless, immersive, and intelligent than earlier AR technologies. However, this new experience comes with high computational costs, including exceptionally low latency and high quality requirements. While this cost could be offset in part through offloading, we also observe an increasing availability of on-device, task-specific accelerators. In this paper, we propose collaborative acceleration, a technique that utilizes the unique hardware-accelerated capabilities of an MR device, in conjunction with an edge node, to partition an application's core workflow according to the specific strengths of each device. To better understand the workloads of next-generation MR applications, we implement a concrete MR app on the HoloLens: an assistive tool to visually aid users in manipulating physical objects. Through our prototype, we find that offloading a subset of the app's workload to an edge node, while also leveraging the strengths of the HoloLens, delivers accurate enough results at low latency. Our work provides an early glimpse into the system design challenges of MR, potentially the first "killer application" of edge offloading.

1 INTRODUCTION

A new generation of augmented reality (AR) devices, such as the Microsoft HoloLens [11], MagicLeap [10], and Meta 2, promises a user experience known as mixed reality (MR) that is more seamless, immersive, and intelligent than that provided by earlier AR technologies. Where past AR experiences overlaid text or 2D annotations over a small portion of the user's field of view, MR provides overlays with the illusion of physical presence by generating a semantic understanding (e.g., the identities of objects and people) and detailed 3D geometric maps of the user's environment, along with finely rendered graphics anchored within this 3D environment. These immersive capabilities enable new applications [1] such as home planning, pain and phobia treatment, and 3D task-assistance, or assistive tools to visually guide users through the manipulation of physical objects. However, these advancements come at high computational cost: the "maps, meshes and models" that comprise this workload are computationally expensive. In this paper, we seek to better understand the workload and hardware ecosystem surrounding MR technologies, and to cast some early light on their implications for MR systems design.

In order to better understand the workload, we implement a concrete application (a video of our prototype can be found at https://youtu.be/bWTNzg6y5hY), a 3D task-assistance app to assist users with cleaning rooms, on the HoloLens. We identify five tasks as fundamental to such applications: specify the target configuration of a room, detect the objects in this space and estimate their 3D positions, recognize their orientations, track these objects as they move, and render overlays to indicate how these objects should be manipulated.

Architecturally, we believe the HoloLens is representative of MR devices that will be available in coming years. In particular, the battery constraints introduced by mobility mean that on-board computation will include a mix of custom acceleration and low-power processing. For example, the HoloLens provides hardware acceleration for estimating the geometry of the space surrounding it, along with a modest integrated GPU for rendering. Further, recent announcements indicate that the device will soon see an accelerator for Deep Neural Networks added to it, as, for instance, in the Apple iPhone X. Finally, the device provides fast network connectivity that allows it to take advantage of resources in the cloud or nearby "edge" [18] devices, nodes with substantial amounts of computing resources placed in close proximity to mobile devices to augment their abilities.

Offloading to the cloud and edge has been a recurring theme in recent years when seeking to balance mobile constraints with application needs [2, 5, 8]. MR apps motivate yet another revisiting of this tradeoff for two reasons. First, they have exceptionally low latency and high quality requirements, rendering the latency to the cloud prohibitive (around 74ms [9]) and potentially straining both the latency limits of off-boarding even to nearby edge nodes, as well as the computational limits of on-board execution. Second, on-device task-specific accelerators have the potential to dramatically increase local computational capabilities, allowing the device to collaboratively share execution duties with the edge [4] rather than merely playing the role of "offload shaping" [6, 7]. Accelerators also raise the possibility that offloading to the edge may not be needed at all in the future.

Nevertheless, we posit that the edge still has a strong, albeit altered, role in the future of mixed reality. Although we believe that on-device localization and mapping will be feasible going forward, both AI workloads (e.g., executing Deep Neural Networks) and high quality graphics will benefit from edge assistance. In particular, even if certain AI primitives (e.g., hand pose estimation or object detection) run on board, each application could have its own custom AI requirements that may not all be accommodated locally. Similarly, although high fidelity graphics from a sufficiently powerful on-board GPU would allow edge-free operation, truly immersive or seamless graphics will likely require edge-class capabilities in the foreseeable future.

In this paper, we propose collaborative acceleration, a technique that utilizes the unique custom hardware-accelerated capabilities of a mixed reality device in collaboration with an edge device to partition an application's core workflow according to the specific strengths of each device. Through a prototype application on the HoloLens, we identify five key primitives for collaborative acceleration that leverage the strengths of the HoloLens's on-device capabilities. Our work provides an early glimpse into system design for mixed reality. Although the MR setting requires renegotiating the systems contract between device and edge significantly, the tangible benefits from edge acceleration may also make it the first "killer application" of edge offloading.

2 BACKGROUND

We begin with important background context on the HoloLens and the increasing prevalence of hardware accelerators before diving into the details of collaborative acceleration.

2.1 HoloLens

The HoloLens is an untethered head-mounted display that presents visual overlays on see-through LCOS waveguide displays. The device can track its pose within the user's environment, without external hardware, using computer vision algorithms, four grayscale cameras, and an inertial measurement unit (IMU). It also employs a depth sensor to perform spatial mapping, or the creation of a 3D mesh representing the user's surroundings, and it provides developers with access to a general-purpose 2.4 megapixel RGB camera and a four-microphone array. The headset features an Intel Cherry Trail CPU/GPU SoC [19], 2 GB of RAM, and a custom accelerator chip that we discuss further below.

2.2 Hardware Accelerators

Devices like the HoloLens must execute computationally intensive tasks while remaining lightweight, responsive, and power efficient. This tension has driven the development of custom hardware accelerators; for example, the HoloLens includes a set of 24 custom Tensilica DSP cores on a 28nm chip, called the Holographic Processing Unit (HPU) [19]. The HPU speeds up operations such as low-latency rendering and spatial mapping [12], and future versions will also accelerate neural networks [13]. Other organizations have begun employing hardware accelerators for MR as well. For example, Apple's iPhone X includes a neural network accelerator they call the Neural Engine, and Google's Tango phone platform runs atop devices with Qualcomm-built hardware support for 3D depth perception, analogous to HoloLens spatial mapping. While the latency requirements for traditional phone-based MR may not be as strict as they are on head-mounted displays, phones can be easily used as MR headsets by employing adapters like Merge Cube. Other major players such as Huawei [15] and Intel (Movidius) also have neural network hardware accelerators of their own, suggesting an industry-wide trend towards hardware support for MR.
3 DESIGN AND IMPLEMENTATION

We next explore the potential benefit of applying collaborative acceleration to MR applications, beginning with a motivating scenario to guide our inquiry.

Motivating Scenario: 3D Task Assistance. To more concretely explore the potential role of collaborative acceleration in MR, we focus on the domain of 3D task assistance, activities in which the user may benefit from precise visual guides to help them complete a task (e.g., assembling furniture or cooking). Specifically, we focus on meticulous room cleaning, commonly performed in hotels or on cruise ships. Looking to deliver the best experiences to their guests, interior designers carefully plan the layout and placement of items within guests' rooms. Before new guests arrive, cleaning staff must ensure that the rooms are appropriately configured according to the designer's specifications; for example, by wiping, mopping, or dusting surfaces, or by correctly positioning items of interest in specific locations. An AR cleaning assistance app could help cleaning staff complete these objectives by providing helpful visual cues. In our model, a room designer specifies a correct room configuration with visual markers a priori, demarcating surfaces that should be cleaned and target locations for individual items to be placed. The application then retrieves this configuration for cleaning staff, providing visual guidelines to identify out-of-place objects and their target locations, as well as uncleaned surfaces.

Prototype Overview. Our prototype, shown in Figure 1, addresses the object placement portion of the room cleaning scenario described above. The app consists of a designer mode and a cleaner mode. In designer mode, the user specifies target locations for objects of interest with a 3D cursor and voice commands. First, the user focuses their gaze and says 'target' followed by the name of an object (e.g., 'target mouse'), creating a blue-tinted overlay of the object at the cursor's location. The user repeats this procedure for every desired object. In cleaner mode, the app displays red-tinted overlays on top of the relevant real-world objects, in addition to the blue target overlays. As the user moves a physical object, the corresponding red overlay follows it. Once the object reaches its correct target location, the blue overlay disappears and the red overlay turns green.

Figure 1: Our prototype. The user is moving a mouse to the right towards a target.

Through our prototype development, we identify 5 operations critical to the core workflow, and that we believe generalize more broadly to 3D task assistance problems. While we have only implemented 3 so far, we find that all 5 benefit from collaborative acceleration by taking advantage of the HoloLens's specialized hardware, as we describe below.

3.1 Define Target Locations

To support designer mode as described above, the application must be able to place targets within the user's 3D environment in response to user requests. We first observe the following: target placement is a latency-tolerant operation. Thus, when the user executes a voice command (e.g., "target scissors"), the application can offload the audio to more powerful cloud or edge servers to perform speech recognition. (In the next version of HoloLens, the HPU will support on-device acceleration of speech recognition.) Once the command is processed and returned to the HoloLens, the next challenge is determining where to apply the result of the operation, i.e., where to place the target. Fortunately, the HoloLens comes equipped with hardware support for efficiently generating a 3D geometric understanding of the user's environment, referred to as spatial mapping [12]. We determine the position of the user's gaze by tracing a ray orthogonal to the device's camera perspective, taking the first 3D point of intersection between that ray and the spatial map and placing the target overlay at this location.
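To make the placement heuristic concrete, the sketch below implements it against a spatial map reduced to a plain list of world-space triangles. This is an illustrative stand-in, not our prototype code: the real app queries the HoloLens spatial-mapping and gaze APIs rather than these numpy helpers.

    # Sketch of Section 3.1's gaze-ray placement (illustrative, not prototype code).
    # The spatial map is modeled as a list of (v0, v1, v2) numpy vertex triples.
    import numpy as np

    def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
        """Moller-Trumbore intersection; returns distance t along the ray, or None."""
        e1, e2 = v1 - v0, v2 - v0
        p = np.cross(direction, e2)
        det = np.dot(e1, p)
        if abs(det) < eps:
            return None                      # ray is parallel to the triangle
        inv = 1.0 / det
        tvec = origin - v0
        u = np.dot(tvec, p) * inv
        if u < 0.0 or u > 1.0:
            return None
        q = np.cross(tvec, e1)
        v = np.dot(direction, q) * inv
        if v < 0.0 or u + v > 1.0:
            return None
        t = np.dot(e2, q) * inv
        return t if t > eps else None        # hit must be in front of the user

    def place_target(gaze_origin, gaze_dir, spatial_map):
        """Return the first gaze/spatial-map intersection point, or None."""
        gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
        best = None
        for v0, v1, v2 in spatial_map:
            t = ray_triangle(gaze_origin, gaze_dir, v0, v1, v2)
            if t is not None and (best is None or t < best):
                best = t
        if best is None:
            return None                       # gaze hits no mapped surface
        return gaze_origin + best * gaze_dir  # world-space anchor for the overlay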
3.2 Obtain 3D Positions of Objects

Once targets have been placed, the application must determine the positions of relevant real-world objects in 3D space. In our case, this includes the objects that the designer would like placed in specific locations. This operation can be divided into three sub-operations: (1) detecting and determining the 2D positions of all objects currently in the user's view, (2) obtaining a 3D position for each object, and (3) compensating for motion of the headset since the detection operation was performed. We implement (1) and (2).

1. 2D Object Detection. To identify the objects in the user's view, we obtain video frames using the HoloLens camera capture API and send them over Wi-Fi to our edge node to be processed. Our edge node uses YOLO [14], a state-of-the-art object detection framework (specifically, we modify the Darknet implementation of YOLO: https://pjreddie.com/darknet/yolo/), to process each frame and return information about detected objects to the HoloLens, including 2D bounding boxes for each object.

2. 2D-to-3D Position Mapping. We again leverage the HoloLens's spatial mapping capabilities, this time to determine the depth of detected objects, using a heuristic similar to our target placement method. We trace a ray orthogonal to the device's camera emanating from the 2D location of the detected object within the device's camera feed and take the intersection of this ray with the spatial map as our object's 3D position. Due to noisy results from YOLO, we take a sliding window of detection events across consecutive frames, determining that an object is present only when a sufficient proportion of the past N frames contain detections that map to a cluster of nearby 3D positions. As we show in Section 4, spatial map-based depth estimation is highly accurate.
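The sketch below isolates this temporal filter. The window length, hit threshold, and cluster radius shown are illustrative placeholders rather than our tuned values, and positions are assumed to arrive already mapped into 3D world space.

    # Sketch of Section 3.2's sliding-window detection filter (illustrative).
    from collections import deque
    import numpy as np

    class DetectionFilter:
        def __init__(self, window=10, min_hits=6, cluster_radius_m=0.15):
            self.positions = deque(maxlen=window)  # one entry per frame
            self.min_hits = min_hits               # required detections in window
            self.radius = cluster_radius_m         # cluster tightness (meters)

        def update(self, position_3d):
            """Record this frame's mapped 3D position (numpy array, or None if
            YOLO produced no detection) and return a filtered position or None."""
            self.positions.append(position_3d)
            hits = [p for p in self.positions if p is not None]
            if len(hits) < self.min_hits:
                return None                        # too few recent detections
            center = np.mean(hits, axis=0)
            clustered = [p for p in hits
                         if np.linalg.norm(p - center) <= self.radius]
            if len(clustered) < self.min_hits:
                return None                        # detections too scattered to trust
            return np.mean(clustered, axis=0)      # stable 3D position estimate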

3.3 Track Objects in Motion

Once an object is detected, it must be tracked to allow the app to update any assistive overlays. The position of an object relative to the user can change as a result of two actions: (1) movement of the user, or (2) movement of the object itself.

User Movement. To account for user movement, we utilize the sophisticated inside-out tracking [17] capabilities of the HoloLens using a technique we call object pinning. Once an overlay is placed, the HoloLens can compensate for user motion to render it such that it appears stationary, as though it is "pinned" to a real-world location. By performing this computation locally, the app can update overlays smoothly with low latency and reduce the server's load by avoiding the need for continuous object detection.

Object Movement. For our room cleaning scenario, we make a simplifying assumption that objects may only be repositioned by the user. Thus, we use the following procedure to update overlays in response to object movement: when the user's hand is detected near an object with an overlay atop it, the overlay becomes "unpinned", updating its position in real time as the user moves the corresponding physical object. When the user's hand moves away from the object, the overlay is pinned once more.

Our current prototype continuously streams video to the edge to perform both hand and other object detection. However, there is again an opportunity to partition work between the HoloLens and the edge. For example, the HoloLens could employ a lightweight DNN to perform hand detection locally, to reduce communication overhead to the edge when all objects are in a pinned state. Furthermore, YOLO does not leverage the temporal locality of video streams, treating each individual frame independently. Thus, once an object of interest has been detected, the HoloLens could leverage its video encoder accelerator to extract motion information as part of the video encoding process, tracking objects locally at low cost and reducing the frequency and number of objects that need to be detected by the edge.
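The pin/unpin procedure above reduces to a small state machine, sketched below under simplifying assumptions: per-frame hand and object positions arrive already in world space, and the unpin distance is an illustrative placeholder rather than a tuned value.

    # Sketch of Section 3.3's object pinning logic (illustrative).
    import numpy as np

    UNPIN_DISTANCE_M = 0.25  # hand-to-object distance that unpins an overlay

    class Overlay:
        def __init__(self, anchor_3d):
            self.anchor = np.asarray(anchor_3d)  # world-space overlay position
            self.pinned = True

        def update(self, hand_3d, detected_3d):
            """hand_3d / detected_3d may be None when nothing was detected;
            returns the pose used to render the overlay this frame."""
            near = (hand_3d is not None and
                    np.linalg.norm(np.asarray(hand_3d) - self.anchor)
                    < UNPIN_DISTANCE_M)
            if self.pinned and near:
                self.pinned = False              # user grabbed the object: unpin
            elif not self.pinned and not near:
                self.pinned = True               # hand moved away: re-pin
            if not self.pinned and detected_3d is not None:
                self.anchor = np.asarray(detected_3d)  # follow fresh detections
            # While pinned, inside-out tracking keeps the anchor world-stable
            # with no edge round trip at all.
            return self.anchor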
3.4 Determine the Orientation of Objects

In addition to an object's 3D position, it is often also necessary to know its 3D orientation. For example, in our scenario, the direction that a piece of furniture (e.g., a sofa) is facing matters. The same principle applies to other domains, such as woodworking or maintenance, in which instructions and overlays heavily depend on the orientation of physical objects. In increasing order of computational requirements (and precision of pose estimation), the options include two-dimensional matching by comparing to a set of template images of the object, using classifiers such as neural networks on RGB (and optionally depth) data to directly regress the pose of the object, and finally using a fine-grained 3D matching technique such as Iterated Closest Point (ICP) for fine adjustments [20]. While template matching can easily be done on (vector accelerated) device CPUs for coarse results, pose regression and ICP are best done off board.
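As one concrete instance of the cheapest option, the sketch below estimates a coarse in-plane orientation by scoring rotated copies of a reference template against a detected object's image crop with OpenCV. This illustrates the general 2D template-matching technique only; it is not code from our prototype, and the file names and angular step are hypothetical.

    # Sketch of coarse orientation estimation via 2D template matching.
    import cv2
    import numpy as np

    def estimate_orientation(crop_gray, template_gray, step_deg=15):
        """Return (angle_deg, score) of the best-matching rotated template.
        Assumes the crop is at least as large as the template; rotation
        clips the template's corners, which is acceptable for coarse results."""
        h, w = template_gray.shape
        center = (w / 2, h / 2)
        best_angle, best_score = 0, -1.0
        for angle in range(0, 360, step_deg):
            m = cv2.getRotationMatrix2D(center, angle, 1.0)
            rotated = cv2.warpAffine(template_gray, m, (w, h))
            result = cv2.matchTemplate(crop_gray, rotated,
                                       cv2.TM_CCOEFF_NORMED)
            _, score, _, _ = cv2.minMaxLoc(result)
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle, best_score

    # Usage (hypothetical files): compare a detected object's crop against a
    # reference image captured ahead of time in designer mode.
    # crop = cv2.imread("crop.png", cv2.IMREAD_GRAYSCALE)
    # ref = cv2.imread("sofa_template.png", cv2.IMREAD_GRAYSCALE)
    # angle, confidence = estimate_orientation(crop, ref)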
3.5 Render High-Quality Overlays

The final step of any MR app is presenting overlays that blend with a view of the physical world to provide assistance or entertainment. In our case, rendering a realistic representation of target objects increases immersion and improves the user experience. HoloLens already supports 3D experiences at 60 FPS, but due to the fundamental volume, thermal, and power gaps between desktop and mobile GPU components, an edge node will be able to deliver richer and more life-like graphics than the mobile device for the foreseeable future. Offloaded rendering has been extensively studied in recent years [3, 4, 8, 16], and we envision collaborative acceleration taking advantage of existing techniques to deliver high-quality visuals for critical objects, while performing local rendering on minor or background objects.

4 PRELIMINARY EVALUATION

We next present a preliminary evaluation of collaborative acceleration applied to our MR app through a series of microbenchmarks on the HoloLens, modeled after the primitive operations described above. Specifically, we ask:

(1) Is HoloLens spatial mapping accurate enough for estimating the depth of objects?
(2) How expensive is performing object detection on a device like the HoloLens vs. on an edge node?
(3) How effective is our pinning technique at reducing the latency of rendering updates when the user moves?

4.1 Methodology

We used a HoloLens running Windows Holographic 10 as our main client to explore questions (1) and (3). However, to our knowledge, there are no readily available DNN-based object detection implementations compatible with the HoloLens. Thus, to simulate the HoloLens's computational capabilities when profiling object detection (question (2)), we used a 1st-generation Intel Compute Stick (STCK1A32WFC) powered by an Intel Cherry Trail CPU (similar to the CPU powering the HoloLens), with 2GB of RAM and running Ubuntu Linux 16.04. For our edge node, we used an HP Z420 workstation with 16GB of RAM and an Nvidia GTX 980Ti GPU, also running Ubuntu Linux 16.04. The edge node uses Linux Network Emulation (netem) to insert queuing delays and control the RTT between the HoloLens and the edge node.
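For reference, a measurement harness can drive netem with a few tc invocations, as sketched below; the interface name is a placeholder. Installing the qdisc on the edge node delays its egress traffic, so a 10ms one-way delay adds roughly 10ms to the HoloLens-edge RTT.

    # Sketch of controlling RTT with Linux netem (requires root; illustrative).
    import subprocess

    IFACE = "eth0"  # hypothetical edge-node interface facing the HoloLens

    def set_delay(ms):
        """(Re)install a netem qdisc adding `ms` milliseconds of egress delay."""
        clear_delay()
        subprocess.run(["tc", "qdisc", "add", "dev", IFACE, "root",
                        "netem", "delay", f"{ms}ms"], check=True)

    def clear_delay():
        # Deleting the root qdisc fails harmlessly if none is installed.
        subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"],
                       check=False)

    # e.g., sweep the RTT conditions used in Figure 2c:
    # for ms in (0, 10, 20, 75):
    #     set_delay(ms)
    #     ...run the overlay-delay measurement...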
4.2 Depth Estimation Accuracy

Our first and second primitives (defining target locations and obtaining the 3D positions of detected objects) each require a depth reading to identify some position in 3D space. Fortunately, we observe that the spatial map generated by the HoloLens's HPU and depth sensor provides applications with an efficient mechanism for querying the geometry of the user's environment and estimating depth. However, if these depth estimates are not accurate enough, the application's overlays may be misplaced, leading to a poor user experience.

We profiled the accuracy of these depth estimates as follows. Using our room cleaning app in designer mode (Section 3), one researcher wearing a HoloLens stood in front of a table, looking down at it directly from above. They then defined the target locations of different objects (mouse, keyboard, and scissors), creating a hologram of each object at the points where their gaze intersected the spatial map. Since the HoloLens was oriented to face directly down, our depth estimation calculated the vertical distance between the HoloLens and the table. In this case, a perfect estimate would place the holograms exactly above the table. We physically marked the boundaries of the holograms on the table while still looking from above. We then oriented the HoloLens parallel to the surface of the table and, using the marked boundaries to position a ruler, we measured the distance between the hologram boundaries and the table. We repeated the experiment 10 times for each object.

Our results (Figure 2a) show that the estimation error was consistently under 1 cm — accurate enough for overlay placement in our room cleaning application and similar apps where sub-centimeter guidance is not required. Applications where such precision is necessary (e.g., carpentry or surgery) can either use refined versions of our pinning algorithm (e.g., tracing multiple rays) or offload the estimation of the objects requiring additional precision to the edge.

4.3 The Cost of Object Detection

We argue that real-time object detection is too computationally expensive to be handled by a mobile device like the HoloLens. To verify this claim, we modeled the time it would take for the HoloLens to run the YOLO object detection framework on each frame of a video stream by using an Intel Compute Stick, powered by a CPU similar to the one contained in the HoloLens. We focus on the HoloLens's CPU rather than its GPU under the assumption that the GPU is already devoted to the critical task of delivering a consistent stream of 3D overlays at 60 fps. We compared the performance of the Compute Stick to the performance of our Nvidia GPU-based edge node (Figure 2b). We used two different neural network model configurations for YOLO — YOLO and TinyYOLO, the latter being a more lightweight version for weaker devices, and we measured the performance on individual images as well as on a continuous video stream. Both the images and the video stream were recorded directly from the HoloLens using its native resolution (896x504) while looking at a table with the objects used in our app (mice, scissors, keyboards). We can easily observe that the performance of object detection using the Compute Stick is insufficient, taking several seconds per frame on TinyYOLO and almost a minute on full YOLO. We can also observe that the performance of YOLO on the edge node is several orders of magnitude faster, and indeed fast enough to deliver smooth animation updates (> 15 fps). While we would prefer detection rates of 30 fps or higher, more powerful GPUs or fine-tuned DNN models would make this possible.
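A minimal timing harness in the spirit of this comparison is sketched below. Note the substitution: it loads standard Darknet configuration and weight files through OpenCV's DNN module rather than our modified Darknet build, and the file names are placeholders.

    # Sketch of per-frame YOLO latency measurement (illustrative harness).
    import time
    import cv2

    def benchmark(cfg, weights, frames):
        """Return mean per-frame detection latency in milliseconds."""
        net = cv2.dnn.readNetFromDarknet(cfg, weights)
        out_layers = net.getUnconnectedOutLayersNames()
        total = 0.0
        for frame in frames:
            blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                         swapRB=True, crop=False)
            net.setInput(blob)
            start = time.perf_counter()
            net.forward(out_layers)           # time only the inference pass
            total += time.perf_counter() - start
        return 1000.0 * total / len(frames)

    # e.g., frames captured from the HoloLens at its 896x504 resolution:
    # frames = [cv2.imread(f) for f in ("f0.png", "f1.png")]
    # print(benchmark("tiny-yolo.cfg", "tiny-yolo.weights", frames))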

4.4 Pinning Effectiveness

Rather than continuously re-detecting an already-detected object as the user moves, pinning allows the HoloLens to internally track the user's movement, and thus to track the object's position relative to the user and display an accurate overlay with little delay. To understand the nature of this latency reduction, we generated a modified version of our room cleaning app with pinning disabled (no-pinning) for comparison against our original version (pinning). For each configuration, we measured the delay from the time a real-world object comes into the HoloLens's field of view to the time that an overlay is presented for that object. To perform these measurements, we used an iPhone 8 camera set to slow motion at 240 fps, with the camera lens placed directly behind the HoloLens's right eye lens, allowing it to capture both the overlays and the real world. (We opted for this setup because the HoloLens does not allow recording the mixed reality feed, containing video of the physical world and any holograms present, while the RGB camera is being simultaneously used by an application; in our case, to stream to the edge for processing.) We present our results in Figure 2c, which shows that when using pinning, the HoloLens is capable of displaying overlays as fast as its display permits (240 fps divided among the 4 color channels that compose the image), resulting in a constant 4ms for all of our measurements. The overlay latency of no-pinning, however, varied greatly depending on both the network latency and the number of frames that YOLO had to process before passing our confidence threshold (Section 3). While this is in part an implementation artifact, it is evident that for any offloaded non-pinning implementation there will always be additional latency on top of the display refresh rate caused by network latency and object detection.

[Figure 2 appears here as three charts: (a) depth estimation accuracy (mm) for Mouse, Scissors, Keyboard, and Overall; (b) YOLO detection latency (ms, log10 scale) for Full and Tiny YOLO on single images and video, on the GTX edge node and the Atom Compute Stick; (c) overlay update delay (ms) for pinning (Pin) and no-pinning at 0ms, 10ms, 20ms, and 75ms RTT.]

Figure 2: The results of our performance evaluation.

5 DISCUSSION AND FUTURE WORK

With an eye towards future hardware-accelerated MR ecosystems, we identify challenges and opportunities for future work.

5.1 Device Heterogeneity and Accelerators

AR devices are diverse, with each presenting its own balance of cost, power consumption, and on-board capabilities. This diversity is often reflected in a heterogeneous set of CPUs, GPUs, and specialized accelerators. Furthermore, as devices evolve, we expect to see an increasing variety of accelerator-enabled operations. This trend raises the following questions: (1) how should collaborative acceleration adapt to heterogeneous devices, and (2) once more and more operations become fully accelerated, what role will collaborative acceleration have? In the future, we plan to address both of these questions — the first, through an abstraction of the device's capabilities, along with application latency and accuracy requirements, that would allow a collaborative acceleration offload engine to determine which tasks or sub-tasks can be accelerated. With regards to the second question, we plan to build on top of our application to better understand the kinds of application-specific AI or rendering requirements that may not be fully implemented in hardware due to their less general applicability, and to characterize the opportunities they present to be collaboratively accelerated.

5.2 Variable Load and Availability

We have assumed implicitly that an application has the entire device and edge to itself. In practice, however, multiple applications will run in each location. Further, due to contention and environmental factors, the network between device and edge may vary in availability. Two fundamental questions therefore present themselves. First, how can the system help assure a graceful degradation in user experience as resources become scarce? For instance, although this is in the end an application-level concern, mechanisms for prioritization, feedback, and conditional execution of components could lessen the load on the application developer. Second, what abstractions can the system provide to ensure that accelerators and other resources are used optimally across applications? Efficiently virtualizing accelerators such as GPUs across applications is poorly understood.

6 CONCLUSION

In this paper, we present collaborative acceleration, a technique that utilizes the unique hardware-accelerated capabilities of MR devices in conjunction with the raw computational power of edge nodes. Through our prototype implementation of a 3D task-assistance app, we built three collaboratively accelerated techniques and proposed another two. We showed that for our evaluated primitives, it is possible to deliver sufficient accuracy with much lower latency than full offload or local execution. In future work, we intend to understand the power and performance characteristics of our primitives, especially in light of hardware acceleration trends and multi-application workloads.
REFERENCES

[1] Cuervo, E. Beyond reality: Head-mounted displays for mobile systems researchers. GetMobile: Mobile Comp. and Comm. 21, 2 (Aug. 2017).
[2] Cuervo, E., Balasubramanian, A., Cho, D.-k., Wolman, A., Saroiu, S., Chandra, R., and Bahl, P. MAUI: Making smartphones last longer with code offload. In MobiSys 2010 (2010).
[3] Cuervo, E., and Chu, D. Poster: Mobile virtual reality for head-mounted displays with interactive streaming video and likelihood-based foveation. In Proceedings of MobiSys 2016 (2016), pp. 130-130.
[4] Cuervo, E., Wolman, A., Cox, L. P., Lebeck, K., Razeen, A., Saroiu, S., and Musuvathi, M. Kahawai: High-quality mobile gaming using GPU offload. In MobiSys (May 2015).
[5] Gordon, M. S., Jamshidi, D. A., Mahlke, S., Mao, Z. M., and Chen, X. COMET: Code offload by migrating execution transparently. In OSDI '12 (Berkeley, CA, USA, 2012), USENIX Association.
[6] Ha, K., Chen, Z., Hu, W., Richter, W., Pillai, P., and Satyanarayanan, M. Towards wearable cognitive assistance. In MobiSys '14 (New York, NY, USA, 2014), ACM.
[7] Hu, W., Amos, B., Chen, Z., Ha, K., Richter, W., Pillai, P., Gilbert, B., Harkes, J., and Satyanarayanan, M. The case for offload shaping. In Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications (New York, NY, USA, 2015), HotMobile '15.
[8] Lee, K., Chu, D., Cuervo, E., Kopf, J., Degtyarev, Y., Grizan, S., Wolman, A., and Flinn, J. Outatime: Using speculation to enable low-latency continuous interaction for cloud gaming. In MobiSys '15.
[9] Li, A., Yang, X., Kandula, S., and Zhang, M. CloudCmp: Comparing public cloud providers. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement (2010), ACM, pp. 1-14.
[10] Magic Leap. https://www.magicleap.com/, 2017.
[11] Microsoft. Microsoft HoloLens. https://www.microsoft.com/microsoft-hololens/en-us, Apr. 2016.
[12] Microsoft. Spatial mapping. https://developer.microsoft.com/en-us/windows/mixed-reality/spatial_mapping, 2017.
[13] MSR Blog. Second version of HoloLens HPU will incorporate AI coprocessor for implementing DNNs.
[14] Redmon, J., and Farhadi, A. YOLO9000: Better, faster, stronger. arXiv preprint arXiv:1612.08242 (2016).
[15] Reichert, C. Huawei unveils Kirin 970 chipset with AI. http://www.zdnet.com/article/huawei-unveils-kirin-970-chipset-with-ai/, 2017.
[16] Reinert, B., Kopf, J., Ritschel, T., Cuervo, E., Chu, D., and Seidel, H.-P. Proxy-guided image-based rendering for mobile devices. Computer Graphics Forum 35, 7 (2016), 353-362.
[17] RoadToVR. Microsoft HoloLens inside out tracking is game changing for AR and VR.
[18] Satyanarayanan, M. The emergence of edge computing. Computer 50, 1 (2017), 30-39.
[19] Williams, C. Microsoft HoloLens secret sauce: A 28nm customized 24-core DSP engine built by TSMC. http://www.theregister.co.uk/2016/08/22/microsoft_hololens_hpu/, Oct. 2017.
[20] Wong, J. M., et al. SegICP: Integrated deep semantic segmentation and pose estimation. CoRR (2017).
