Large Interactive Laser Light-Field Installation (LILLI) by Tyler J. Schoeppner B.S., The Ohio State University (2014) M.S., Friedrich-Schiller Universität (2018)

Submitted to the Program in Media Arts and Sciences in partial fulfillment of the requirements for the degree of

Master of Science in Media Arts and Sciences at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 2020

© Massachusetts Institute of Technology 2020. All rights reserved.

Author...... Program in Media Arts and Sciences August 17th, 2020

Certified by...... Joseph Paradiso Associate Academic Head, Program in Media Arts and Sciences Thesis Supervisor

Accepted by...... Tod Machover Academic Head, Program in Media Arts and Sciences

Submitted to the Program in Media Arts and Sciences on August 17th, 2020, in partial fulfillment of the requirements for the degree of Master of Science in Media Arts and Sciences

Abstract

A large pseudo 3D display measuring 14x14x10 ft was built that used the pyramid Pepper's ghost to create the illusion of a floating 3D model. The Pepper's ghost was combined with laser projectors and anaglyph imaging to create more realistic 3D projections. Adding laser projectors worked well at removing the "ghostly", transparent characteristics of the projection often associated with the Pepper's ghost. The laser projectors were unable to display detailed images or textures and were limited to wireframes. A comparison of conventional and laser projectors found that only the laser projectors were capable of achieving good-quality binocular disparity for the anaglyph images, due to the high perceptual brightness of the laser images. A generalized framework for designing large 3D displays in public settings was proposed.

Thesis Supervisor: Joseph Paradiso Title: Associate Academic Head, Program in Media Arts and Sciences

Large Interactive Laser Light-Field Installation (LILLI)

by

Tyler J. Schoeppner

This thesis has been reviewed and approved by the following committee members

Professor Joe Paradiso ……………………………………………………………………………………………………………… Alexander W. Dreyfoos (1954) Professor in Media Arts and Sciences Director, Responsive Environments Group Associate Head, Media Arts & Sciences Department

Dr. Dan Novy ……………………………………………………………………………………………………………… Research Scientist MIT Media Lab

Dr. Dan Smalley ……………………………………………………………………………………………………………… Associate Professor Brigham Young University

Acknowledgments

I would like to express my sincere gratitude to my thesis readers Professor Joe Paradiso, Professor Dan Smalley, and especially to Dr. Dan Novy, who offered continual help and guidance during this tumultuous and unpredictable time. It was a great pleasure and privilege to be able to work under your supervision. I am extremely grateful for all the help and support from my parents and brothers. A special thanks to my mom and step-dad for tolerating me through all the late nights and letting me use their entire garage space for my monstrously sized display. Thanks and love to my girlfriend for giving me continual care and support. I'd like to give a heartfelt thanks to the Open Ocean Initiative Team, the Object-Based Media group, and 11th Hour Boat Racing for their expertise and support. A special thanks to Ocearch, the Sanibel-Captiva Conservation Foundation, the Estuary & Coastal Wetland Ecosystems Research Group at James Cook University, the School of Aquatic and Fishery Sciences at the University of Washington, and the Harte Research Institute for Gulf of Mexico Studies for sharing their data on sharks and loggerhead turtles to make this project come alive!

Contents

1 Introduction

2 Background
  2.1 Human-Computer Interaction
    2.1.1 Multi-user Interaction
    2.1.2 Immersive Experience
  2.2 Display Interaction
    2.2.1 Interaction with large displays
    2.2.2 Interaction with 3D displays
  2.3 3D Displays
    2.3.1 Types of 3D displays
    2.3.2 Comparison of 3D displays for scaling

3 LILLI Design
  3.1 Build Design
    3.1.1 Materials
    3.1.2 Hardware
    3.1.3 Software
  3.2 Animation and UX Design
    3.2.1 Water Scene
    3.2.2 Earth Scene
    3.2.3 Wind Scene
    3.2.4 Fire Scene

4 Final Build Observations
  4.1 Anaglyph Pepper's ghost
  4.2 Laser Projector Limitations
  4.3 View zones warping and mismatch
  4.4 Large 3D Display Interaction Framework

5 Conclusion

Chapter 1

Introduction

The movie Blade Runner takes place in a future dystopian Los Angeles filled with flying cars, bioengineered humans, and displays that fill the entire facades of skyscrapers. The story, set in November 2019, is a far cry from what current technology is capable of. Humans are not unfamiliar with making incorrect predictions of future technologies [1], but one technology that comes up time and time again is 3-dimensional (3D) displays. Figure 1-1 shows how movies and TV shows continuously expose viewers to large 3D displays littered throughout cities, all culminating in a unified idea: 3D displays are the pinnacle of future technology. Currently there is a large number of 3D displays on the market, but not many have caught the attention of the public. This is due to a number of reasons, but one of the main setbacks is the size limitations of 3D display technologies. Many 3D technologies work well at their respective sizes, but face upscaling challenges due to mechanical, cost, or memory limitations.

Figure 1-1: Holograms used in media. a) City-wide view of Ghost in the Shell with holograms on the sides of buildings. b) Holograms are used throughout the Marvel Universe movies

Pseudo 3D displays are one of the most successful 3D display types currently available and offer the most promising solution to large 3D displays in the near future. Although not true 3D, they still provide immersive experiences to users by making visuals real enough to suspend the participant's disbelief for a period of time [49]. One of the most important things to consider as large 3D displays gain popularity is how people will interact with them. Large 3D displays provide more perspectives, more information, and invite multi-user interactions, but interactions with regular-sized 3D displays do not translate well to large 3D displays. Figure 1-2 shows an example of how changing the scale of 3D models on a regular-sized 3D display does not translate well to a large 3D display. When people view a regular-sized 3D display, every person has a different perspective of the 3D model, but they are all seeing relatively the same information, so when a user scales up the model, all the other users intuitively understand what is happening. In the case of a large 3D display, the 3D model is large enough that people standing around the display all have unique perspectives on the 3D model. When a user scales up the 3D model, the other users can become 'surrounded' by the 3D model and the illusion is broken. A new interaction framework must be explored for large 3D displays as they become more prevalent.

This report covers the build and design of a large pseudo 3D display that uses the pyramid Pepper's ghost to create the illusion of a floating 3D model. The Pepper's ghost was combined with laser projectors and anaglyph imaging to create more realistic 3D projections. In the first section, a basic overview of human-computer interaction and 3D displays is covered to provide a better understanding of how these displays function. The second section covers the build, animation, and user experience design of the large pseudo 3D display. The third section discusses the results of using laser projectors and anaglyph images with the Pepper's ghost illusion, along with a proposed generalized framework for designing large 3D displays in public settings.

Figure 1-2: a) Users see a 3D model from shared perspectives. b) When a user scales up the 3D model, the 3D illusion is not broken. c) Users see a 3D model from unique perspectives. d) When a user scales up the 3D model, the illusion is broken for some of the users as they become 'surrounded' by the 3D model

Chapter 2

Background

2.1 Human-Computer Interaction

Human-computer interaction (HCI) studies the design and implementation of interactive computer systems with humans (users) [30]. Even though computers have been around since the early 1900s, HCI didn't gain popularity until the 1980s. At this time, computers were being introduced to the general public, such as in offices, and were no longer only meant to be used as specialized machines for scientists. A new market was available to computer companies, and the idea of seamless human-computer interaction became ever more important. Throughout the decades, as new technologies were developed, HCI often had a difficult time keeping up [29]. Figure 2-1 shows the research and commercial development timelines of various HCI technologies. The large gap in time between academic/industrial research and eventual commercialization of products is due to years of study and fine-tuning to see if a technology is practical. Bill Buxton [13] says that "any technology that is going to have significant impact in the next 10 years is at least 10 years old." It is often observed that when a new technology comes out, a new interaction framework is required and created [39]. This interaction framework is sometimes met with success, but more often fails to mature and survive the early stages of research.

Figure 2-1: Approximate research and commercialization timelines of HCI topics. Some technologies spend over a decade in research before becoming a commercialized product. Adapted from [29]

3D display technologies, such as virtual reality (VR), required a new interaction framework, since many functions on 2D displays, such as a mouse cursor (xy movement), did not translate to 3D displays (xyz movement). To simplify the interaction design, many of the basic interactions with 2D displays were adapted to 3D displays to accomplish the same results using different approaches; e.g. mouse cursor vs. raycasting [53]. Chris Hand [27] proposed an HCI framework that highlights three basic functionalities for 3D interaction: navigation (such as interaction with the user's viewpoint), interaction with the virtual objects of the virtual environment (such as object selection and object manipulation), and application control (such as interaction with 3D widgets in order to change some parameters of the virtual environment). Including these three interactions in a 3D display provides a user with complete control of a 3D scene.

3D interactions allow a more natural experience for users compared to 2D interactions. These experiences are becoming more common with advances in technology. With the prevalence of everyday devices with embedded computers that rely on gestural, body-tracking, and voice interaction (e.g. cars, lights, clocks), human-computer interaction is transitioning towards human-technology interaction [19]. Even posters and banners have QR codes that lead viewers to websites. This creates a web of interaction as technologies are able to work and talk with each other. Direct user interaction has decreased, replaced by indirect interactions, opening the avenue for multi-user experiences and more natural, immersive experiences.

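The mouse-cursor vs. raycasting contrast above can be made concrete: raycasting "unprojects" a 2D cursor position into a 3D pick ray that can then be intersected with the scene. The sketch below assumes a simple pinhole camera model; the function name and parameters are illustrative and not taken from any of the systems cited here.

```python
import numpy as np

def cursor_to_ray(px, py, width, height, fov_y_deg, cam_pos, cam_forward, cam_up):
    """Turn a 2D cursor position (pixels) into a 3D pick ray (origin, unit direction)."""
    aspect = width / height
    half_h = np.tan(np.radians(fov_y_deg) / 2.0)  # image-plane half-height at unit depth
    # Cursor in normalized device coordinates, [-1, 1] with y up
    ndc_x = 2.0 * px / width - 1.0
    ndc_y = 1.0 - 2.0 * py / height
    # Orthonormal camera basis
    fwd = np.asarray(cam_forward, dtype=float)
    fwd = fwd / np.linalg.norm(fwd)
    right = np.cross(fwd, np.asarray(cam_up, dtype=float))
    right = right / np.linalg.norm(right)
    up = np.cross(right, fwd)
    # Ray through the cursor on an image plane one unit in front of the camera
    direction = fwd + ndc_x * aspect * half_h * right + ndc_y * half_h * up
    return np.asarray(cam_pos, dtype=float), direction / np.linalg.norm(direction)

# A cursor at the screen center maps to a ray straight along the view direction
origin, direction = cursor_to_ray(400, 300, 800, 600, 60.0,
                                  cam_pos=[0, 0, 0],
                                  cam_forward=[0, 0, -1],
                                  cam_up=[0, 1, 0])
```

Intersecting the returned ray with scene geometry then selects the 3D object "under" the cursor, the 3D analogue of a 2D click.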
2.1.1 Multi-user Interaction

As interactive technologies became cheaper, they moved out of the lab and into public spaces such as museums, tourist information centers, and shopping malls [Fig. 2-2][9, 33]. HCI once again had to adapt to a new area of design: multi-user interaction. Traditional HCI tools, such as the computer mouse, were not feasible for multiple users. A new method that allowed multiple inputs was required. One of the most promising multi-user inputs is the multi-touch display, capable of detecting multiple fingers [Fig. 2-3]. These work by using capacitance to track the user's finger. Recently, technology has been shifting away from direct user inputs and towards indirect user inputs, such as the Kinect, which can sense multiple users' entire bodies.

Figure 2-2: An interactive window display at a shop in London [38]

Figure 2-3: One of the first multi-touch sensors developed at CERN [58]

As multi-user interactions became possible, display sizes and arrangements adapted for more users. Figure 2-4 shows the different arrangements often used for multi-user displays: vertical, horizontal, tilted, and floor. Each of these arrangements has different advantages depending on the intended use or intended audience. To gain a better understanding of their uses, multi-user technologies can be divided into two broad categories: collaborative and passive.

Collaborative Interaction

Collaborative multi-user displays are most applicable in offices or design studios. These are meant to be used by a few dedicated people working together for extended periods of time to complete a specific task. These displays are used for a specific purpose, and thus typically have specialized interaction frameworks.

It has been found that horizontal displays work best for collaborative multi-user interactions. This arrangement allows all the users to see each other's input at all times, which is important for collaborative work [61]. The downside of using horizontal displays is that some users view the displayed information upside down, but depending on the task this has also been considered an advantage, since understanding issues may occur to users whose viewpoints are different [24].

Not only knowing what tasks are being performed, but also who is performing the task, is important for collaborative work. Vogt et al. [61] suggest using lasers to keep track of users' work. Each user has a laser that blinks at a unique frequency. A camera tracks the laser pointer position and frequency on a display to identify the user [Fig. 2-5]. DiamondTouch is a horizontal touch display that identifies a user's touch through a capacitively coupled signal that runs through the users' chairs and up through their bodies [Fig. 2-6][18].

Figure 2-4: Different display arrangements: a) vertical b) horizontal c) tilted and d) floor [9]

Alternatively, some researchers suggest 3D collaborative virtual environments where several users can join to share collaborative interaction experiences, regardless of whether the users are in the same location [25, 42]. This offers an advantage over the horizontal table, as users are able to walk around a 3D environment and actively change their perspective.

Passive Interaction

Passive multi-user interactions are most applicable in public settings. These are meant to be used by many different people over short periods of time. Users may work together to complete tasks such as creating art or sharing information, but unlike collaborative interaction, the tasks and people using the interfaces are fleeting. These displays are meant to be used by a number of different people and thus require a more generalized interaction framework.

Figure 2-5: Identifying user interactions from laser position and frequency tracking [61]

Passive multi-user displays are typically found in public areas such as museums, malls, parks, and shop windows. Ardito et al. [9] found that vertical displays work best for public settings because they allow many people to view the screen at the same time and can be increased to a very large size while keeping a small footprint (e.g. wall displays). Figure 2-7 shows different arrangements of vertical displays: hexagonal, flat, and concave. A flat arrangement triggers the strongest "honey-pot" effect, attracting multiple people in a public space to interact with each other, sometimes even promoting stranger collaboration [9]. During the summer of 2007, CityWall was placed in a shop window next to a cafe in the center of Helsinki, Finland [Fig. 2-8]. The display allowed users to browse photos and videos downloaded from sources such as Flickr and YouTube. Researchers found that the display encouraged interaction among different groups, who could all perform tasks at the same time [48]. Groups of strangers had fun even if they started out using the display separately. The hexagonal arrangement prevents users in front of the display from seeing what other users are doing, resulting in low collaboration and sociability [9]. The concave arrangement was also found to not stimulate collaboration.

Figure 2-6: DiamondTouch horizontal display [18]

The difficulty with passive multi-user HCI is the large variety of users that public settings offer. Large amounts of research have focused on interactions for various displays and public settings, but these often don't generalize across all public settings. Passive displays in public settings pose new challenges for HCI designers, since they introduce the ideas of participation and engagement into the design [33]. The display location affects how people interact due to factors such as lighting, viewability, who uses the display, and the purpose of the application. Public settings are also unique in that they have a diverse audience who differ in age, interests, and experience with technology. People often engage in unpredictable ways, individually and in groups [33]. Colloquial norms of a location also affect the interaction framework when relying on gestures. This calls for a generalized interaction that is independent of specific public spaces. Some researchers have suggested cell phones as a generalized interaction tool [10, 14, 28]. Researchers such as Iftode et al. [35] indicate that smart phones are destined for universal acceptance due to their Bluetooth capabilities, internet connectivity, significant processing power, and combination of modality modes. Cell phones as interaction tools also appeal to those who don't want to have close interaction with a display. People can be reluctant to approach displays in public settings, dividing users into direct and distant viewers [12].

Figure 2-7: Different display arrangements: a) hexagonal b) flat and c) concave [9]

People have been found to be drawn to displays because of the new technologies that they offer [9]. EMDialog, a display set up at the Glenbow Museum in Calgary used for data visualization, was observed over a 15-day span [Fig. 2-9]. It was found that visitors were drawn to the display because of the display technology, the appealing visualizations, and seeing other people interact with it [12].

Figure 2-8: CityWall promoted interaction among strangers in a public setting [48]

Figure 2-9: EMDialog offered users a chance to interact with new technology

2.1.2 Immersive Experience

HCI design is working towards immersive experiences for users. This type of experience happens when the user feels completely surrounded by the display, often by including additional senses such as touch, smell, or sound to make the experience more realistic. Figure 2-10 shows an immersive tunnel, a method that amusement parks are using to create immersive experiences for riders. The immersive tunnel uses large screens, some measuring up to 30 m long and 23 m high, that cover a viewer's entire field of vision. Riders sit in a tram-like vehicle that shakes and rolls in coordination with the on-screen action. These rides, often labeled as 4D films, use 3D images and directional sound to enhance the experience.

Figure 2-10: Immersion tunnel being developed by Super78 [57]

Cave automatic virtual environments (CAVEs) are another example of immersive experiences. Images are projected onto three to six sides of a cube-shaped room [Fig. 2-11]. A user inside the room is tracked using motion capture, and the projected environment reacts to their movement. Viewers must turn their head/body to see the entire projection area and to follow directional sound, making the experience feel more like a natural open space [44]. Walking around the room makes users feel like they are part of the experience and not simply controlling the actions of a remote character. Users also use gestures or gaze to directly interact with the system. This imitates how people interact with objects in real life [44].

Figure 2-11: Rendering of a user in a CAVE [62]

Using gestures, as opposed to specific hand movements like ASL, is a more natural form of communication for interactive experiences. This is especially true for public displays that are only meant to be used for a short time by a user who has no prior knowledge of the system [9]. The interaction framework must depend on natural-feeling, intuitive movements that don't require any previous explanation, especially since gesture interfaces are not self-revealing. According to Kurtenbach and Hulteen [37], when interacting through gestures, users do not think in terms of manipulating an input device, but move parts of their body to execute the task.

However, caution must be taken, since everyday gestures do not always result in an optimal interaction modality. An ideal interaction modality includes both voice and gestures to make interaction more natural and efficient [60]. Voice interaction is becoming more common due to the popularity of technologies like Siri and Alexa [19]. Sharma and Radke [56] combined gestural and voice interfaces and found that users enjoyed the freedom of moving freely around the display and being able to focus on the screen instead of having to repeatedly glance at an interface to input commands. Combining voice and gesture in interaction frameworks leads towards natural-feeling human-human interaction instead of human-computer interaction; i.e. experiences are more like talking to a person than to a computer.

As technology progressively blends into the background of our lives, issues arise as users feel they are less in control. Weiser [63] said, "The most profound technologies are those that disappear." The technology itself is not invisible, but it blends into the background to the point that it may go unnoticed by the user. At worst, objects change around the user, with little understanding of why, and of the relationship between our activities and their effects [19].

2.2 Display Interaction

As mentioned before, different HCI frameworks are developed when new technologies are invented. This section discusses the HCI frameworks for two types of display technologies: large displays and 3D displays.

2.2.1 Interaction with large displays

Large displays can be considered any display larger than 3x4 ft. Due to their increase in resolution and reduction in cost in recent years, large displays have been growing in popularity and are considered one of the most promising technologies of the next 20 years, since they can augment everyday pieces of furniture such as tables, walls, and panels [55] [Fig. 2-12]. At first, large interactive displays were only used for entertainment, but now, as the technology becomes more advanced, they allow for more complex interactions [9].

Figure 2-12: Interactive wall display at the Google headquarters in New York [17]

Using traditional WIMP (windows, icons, menus, pointers) interaction still works for large displays, but it encounters issues such as losing track of the cursor, distal access to information, and task management problems [16]. Alternatively, Ardito et al. [9] suggest using one of these interaction frameworks when working with large displays: touchscreens, external devices, and body movements:

(1) Touchscreens are a good alternative to traditional interaction since they allow many people to use the display at one time. Multiple people can 'click and drag' objects on the display. Issues with this interaction tool arise if the display is too big. Users can no longer reach all parts of the display, and many people using the display up close often block other users' view.

(2) External devices such as cell phones or controllers are a useful interaction tool. Users can interact up close or far away, and there is seemingly no limit to the number of users interacting with the display. The downside is that using a cell phone or controller is a less natural interaction experience.

(3) Body tracking is a very useful tool. Users can be tracked anywhere relative to the display. This type of interaction is becoming more popular for gaming, such as with the Kinect [Fig. 2-13].

2.2.2 Interaction with 3D displays

3D displays offer depth cues to create the appearance of 3D objects. Although there is a large number of different types of 3D displays currently on the market, many have failed to gain much popularity due to high costs, the need for specialized equipment, small display sizes, and immature interaction frameworks.

Figure 2-13: User blob tracking using the Kinect [52]

A new interaction framework was needed for 3D displays, since most HCI design had been meant for 2D displays. Holodesk is a 3D display that allows users to interact with 3D projections using hand tracking [Fig. 2-14]. Virtual objects are spatially aligned with a real-world interaction space where users can move their hands freely. A Kinect creates a depth map of the user's hands and virtually places them in the 3D scene so the user can interact with the 3D projections [31]. Jadhav et al. [36] use gloves fitted with sensors and soft robotics for hand tracking that provide haptic feedback for the user. Using gloves allows users to interact with the display from anywhere, rather than being limited to a designated interaction space.

Figure 2-14: Using hand tracking to interact with the 3D models on the Holodesk [31]

Tangible user interfaces (TUIs) employ real-world objects to manipulate virtual scenes and objects. These are most commonly based on vision-based fiducial marker tracking, such as the one shown in Fig. 2-15 [23]. They allow control of multiple degrees of freedom simultaneously and have been found to be more effective for interaction with 3D content compared to touch interaction on a tablet [11, 43]. For example, Hinckley [32] explored tangible cutting planes that a user could move freely in space and whose rotation and position would be propagated to the 3D models on a display. General drawbacks of TUIs are fatigue and the need for extra physical objects [11], as well as the possible lack of coordination [64].

Figure 2-15: User interacting with a real-world object to manipulate a virtual object [11]

In recent years, VR has become one of the most successful 3D display technologies, largely bolstered by its inclusion in the entertainment and gaming industries. It uses a head-mounted display and controllers that allow users to interact with the virtual space. The head-mounted display responds to user movement and updates the displayed image accordingly, sometimes providing feedback via haptics, sound, and possibly even smell and taste [44]. The controllers allow users to point to, select, or drag and drop virtual objects [Fig. 2-16][44]. Lotte et al. [40] have looked into using a brain-computer interface (BCI) for interaction with virtual environments [Fig. 2-17]. Historically meant for patients with severe disabilities, BCI is now being tested as a means to interact with the virtual world. BCI can be used to navigate virtual worlds, although BCI control is still a new technology that is slow, error-prone, and has limited degrees of freedom [40].

Figure 2-16: Virtual reality head and hand tracking

2.3 3D Displays

Displays that offer depth cues to create the appearance of 3D objects are 3D displays. To be considered a 3D display, it must possess one of these types of depth cues to generate a 3D sensation for the user [Fig. 2-18][26]:

(1) Accommodation - Achieved by the eye lens changing shape to keep objects focused: the ability of the eye to change its focus from distant to near objects (and vice versa). The eye focuses on a 3D object to perceive its 3D depth.

(2) Convergence - The ability of the two eyes to turn inwards or outwards to focus on an object, depending on whether the object is close or far away.

(3) Motion - Offers depth cues by comparing the relative motion of different objects in a 3D scene. When a user moves their head, closer objects appear to move faster than distant objects.

(4) Binocular Disparity - The difference in the images seen by the left and right eye when viewing 3D objects. When objects are closer, the difference between the images is greater.

Figure 2-17: BCI navigation and control of a virtual environment [41]

Figure 2-18: 3D depth cues [26]

Some 3D displays provide all of these depth cues while others only achieve a few. For example, 3D movies use binocular disparity to create the illusion of a 3D scene, but this often causes eye fatigue for viewers due to the conflict between accommodation and convergence, since the displayed images are on the screen and not at the actual physical distance perceived in the 3D scene [34].

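The binocular disparity cue can be quantified with the standard stereo relation d = f·B/Z: for eye separation (baseline) B, effective focal length f, and object depth Z, disparity falls off inversely with depth. The numbers below are illustrative, not measurements from any display discussed in this thesis.

```python
def disparity_px(baseline_m, focal_px, depth_m):
    """Binocular disparity in pixels: d = f * B / Z.
    Closer objects (small Z) produce larger disparity."""
    return focal_px * baseline_m / depth_m

# Illustrative values: ~65 mm eye separation, 800 px effective focal length
near = disparity_px(0.065, 800, 1.0)    # object 1 m away  -> ~52 px
far = disparity_px(0.065, 800, 10.0)    # object 10 m away -> ~5.2 px
```

The tenfold drop in disparity between 1 m and 10 m is why depth from stereo is vivid up close but nearly flat for distant scenery.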
20 2.3.1 Types of 3D displays

3D displays can be divided into five groups: stereoscopic, autostereoscopic, volumetric, holographic, and pseudo 3D.

Stereoscopic Displays (Two-view)

Based on binocular disparity, these types of 3D displays present offset views of a scene separately to the left and right eye. Objects in the scene with differing levels of image disparity create the illusion of objects being farther from and closer to the viewer. The two views of the scene are superimposed onto the same display, and users wear glasses, such as anaglyph or polarized glasses, to filter the images to the left and right eye [Fig. 2-19].
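For the red-cyan anaglyph case, the two superimposed views can be composed channel-wise: the red channel carries the left-eye view and the green and blue channels carry the right-eye view, so the colored filters separate them again at the eyes. This is a minimal sketch of the simplest channel assignment, not necessarily the exact mixing used in any particular system.

```python
import numpy as np

def make_anaglyph(left_rgb, right_rgb):
    """Compose a red-cyan anaglyph from two offset views (H x W x 3 arrays).
    The red lens passes the left image; the cyan lens passes the right."""
    out = np.empty_like(left_rgb)
    out[..., 0] = left_rgb[..., 0]    # red   channel <- left-eye view
    out[..., 1] = right_rgb[..., 1]   # green channel <- right-eye view
    out[..., 2] = right_rgb[..., 2]   # blue  channel <- right-eye view
    return out

# A pure-red left view and pure-cyan right view pass through unchanged
left = np.zeros((2, 2, 3)); left[..., 0] = 1.0
right = np.zeros((2, 2, 3)); right[..., 1] = 1.0; right[..., 2] = 1.0
combined = make_anaglyph(left, right)
```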

Figure 2-19: a) Anaglyph and b) polarized glasses for two-view 3D displays

Autostereoscopic Display (Multi-view)

Similar to stereoscopic displays, autostereoscopic displays use binocular disparity to send separate views of a scene to a user's left and right eye. Instead of having only a single static view, though, these displays offer horizontal motion parallax (usually at a coarse level), so when a user moves their head from side to side, the scene view changes accordingly, showing discrete views. This is done by a number of methods, such as head tracking, lenticular lenses, and parallax barriers. Head tracking lets the system know from what perspective the user is viewing the display and updates the two views. The downside of this method is that it usually only works for a single user. Lenticular lenses work by discretizing an image into a large number of thin slices [Fig. 2-20a]. The lens focuses sections of these images in certain directions, allowing multiple views of the image at different angles. The parallax barrier works similarly to the lenticular lens, in that different angles show multiple views of the image, but instead of the light being directed in different directions, only certain views are visible at discrete angles [Fig. 2-20b].

Figure 2-20: a) Lenticular lens vs b) parallax barrier for autostereoscopic displays
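The geometry of a two-view parallax barrier follows from similar triangles: with pixel pitch p, eye separation e, and design viewing distance D, the barrier sits at a gap g = D·p/e in front of the pixel plane, and its slit pitch is just under two pixel pitches, b = 2pD/(D + g), so each eye sees alternating pixel columns. The formulas are the textbook first-order approximation and the numbers are illustrative, not the parameters of any display discussed here.

```python
def barrier_design(pixel_pitch_mm, eye_sep_mm, view_dist_mm):
    """Two-view parallax barrier geometry from similar triangles."""
    gap_mm = view_dist_mm * pixel_pitch_mm / eye_sep_mm      # barrier-to-pixel gap
    barrier_pitch_mm = (2 * pixel_pitch_mm * view_dist_mm
                        / (view_dist_mm + gap_mm))           # slit pitch, just under 2p
    return gap_mm, barrier_pitch_mm

# 0.1 mm pixels, 65 mm eye separation, 500 mm sweet-spot distance
gap, pitch = barrier_design(0.1, 65.0, 500.0)
# gap comes out near 0.77 mm; pitch just under 0.2 mm (two pixel pitches)
```

The slightly-under-2p slit pitch is what makes all the view zones converge at the design distance instead of drifting apart across the screen.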


Volumetric Display

Volumetric displays use mechanisms with precise timing and alignment to display light in a volume, most popularly via swept volumes or static volumes. Swept-volume displays use mechanisms that move faster than the persistence of vision, creating the illusion of floating points of light [Fig. 2-21a]. Static volumetric displays use a predistributed light volume to create 3D images. Lights in the volume turn off and on to create voxels (3D pixels) to draw out patterns [Fig. 2-21b]. Volumetric displays offer all the 3D depth cues since the 3D images are in real space. Ochiai et al. [46] use high-powered lasers focused in air to create small bursts of plasma. The plasma creates flashes of light (voxels) to draw images in air. The downsides of volumetric displays are the moving parts that prevent users from "touching" the 3D images. These displays also suffer from difficulties with occlusion, opacity, and large bandwidth requirements.
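The timing demands mentioned above can be quantified for a rotating swept-volume display: the surface must redraw the whole volume faster than the persistence of vision (roughly 24-30 Hz), and the projector must deliver one 2D slice per angular position per revolution. The numbers below are illustrative, not the specifications of any particular display.

```python
def swept_volume_rates(volume_refresh_hz, angular_slices):
    """Mechanical and projection rates for a rotating swept-volume display."""
    rpm = volume_refresh_hz * 60                    # surface revolutions per minute
    slice_fps = volume_refresh_hz * angular_slices  # 2D frames/s the projector must supply
    return rpm, slice_fps

# 30 Hz volume refresh with 200 angular slices per revolution
rpm, fps = swept_volume_rates(30, 200)
# -> 1800 RPM and 6000 2D frames per second
```

Both figures grow linearly with refresh rate and angular resolution, which is one reason swept-volume designs are hard to scale up: a larger rotating surface at 1800 RPM is a serious mechanical problem, and kilohertz-class projection is already specialized hardware.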

Holographic Display

Holographic Displays rely on light diffraction to create 3D images that provide all the depth cues of a real scene. These displays require no additional glasses or external hardware for a user to view the 3D image. Figure 2-22 shows how a traditional hologram plate works. Holographic plates record the light interference patterns from an object and a reference beam. After processing the plate, a user can shine the reference beam (reconstruction beam) onto the plate. The beam is diffracted by the interference pattern on the holographic plate to create a reconstructed wavefront that results in an image of the 3D object. The unique virtue of holograms is that they record and replay all the characteristics of light waves, including phase, amplitude, and wavelength, going through the recording medium [26]. As such, ideally there should be no difference between seeing a natural object or scene and seeing a hologram of it. Using devices such as spatial light modulators (SLMs), an artificial interference pattern can be created for an updatable holographic display.

Figure 2-21: a) Static and b) swept-volume volumetric displays

Figure 2-22: Diagram showing how a hologram plate works. For holographic displays, the plate can be replaced by a spatial light modulator or other updatable diffractive element

The disadvantage of holographic display technology is

that it requires tremendous amounts of data to create an image. Holographic plates use "pixels" smaller than 1 µm, which translates to a display requiring trillions of pixels on a reasonably sized display screen [26]. Such a huge amount of data presents seemingly unfathomable technical challenges to the entire chain of 3D imaging industries, including 3D image acquisition, processing, transmission, visualization, and display. Coarser pixel sizes can be used to reduce the data demand.
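The "trillions of pixels" claim follows directly from the sub-micron pixel pitch. A quick calculation, with the screen dimensions chosen as an illustrative assumption:

```python
def hologram_pixel_count(width_m, height_m, pitch_m=1e-6):
    """Number of 'pixels' on a holographic plate with the given pixel pitch."""
    # round() guards against floating-point truncation (e.g. 1/1e-6 -> 999999.99...)
    return int(round(width_m / pitch_m)) * int(round(height_m / pitch_m))

# A hypothetical 2 m x 1 m screen at 1 micron pitch:
n = hologram_pixel_count(2.0, 1.0)   # 2 trillion pixels
```

Relaxing the pitch to, say, 10 µm reduces the count a hundredfold, which is the trade-off behind using coarser pixel sizes.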

Pseudo 3D Display

Displays that don't fit into the traditional group of 3D displays, but often rely on illusions and 2D depth cues to create the appearance of 3D objects, are referred to as pseudo 3D displays. These displays are some of the most popular types to be attributed the name "hologram" in the media due to their accessible parts and their being easy for the public to understand. One example is fog screens, which project 2D scenes onto a mist of water to make the image appear as if it is floating in front of the user [Fig. 2-23]. The user can stick their hands and entire body through the projections and interact with them. The projections can be on both the back and front of the fog screen for a volumetric feel. Using a similar technique, a circus in Germany uses projections on netting to create the illusion of floating animals [Fig. 2-24]. Projection mapping is also considered a form of pseudo 3D display when projecting onto 3D surfaces. Figure 2-25 shows how projection mapping takes advantage of the 3D architecture of a building and augments it with projections to make it appear as if the structure is moving or changing colors.

Pepper's ghost is a technique, used as far back as the 1860s, that creates the illusion of floating objects. It uses a beamsplitter that reflects the image of an object that is hidden from the viewer. Viewers only see the reflected image, which has all the correct depth cues (essentially a mirror), giving the illusion of a floating ghostly image [Fig. 2-26a]. The amount of light that is reflected and refracted is governed by the Fresnel equations and depends on the angle of incidence, the polarization of the incoming light, and the reflecting materials [Fig. 2-26b] [15]. Pepper's ghost is still widely popular today and is used in theme parks such as Disney World and Universal Studios. Instead of real objects, these displays mostly rely on projectors to create visuals. Figure 2-27 shows an updated version of the Pepper's ghost using a four-sided pyramid that a user can walk around to make the projection look like it is floating in the pyramid.

Figure 2-23: Example projection on a fog screen

Figure 2-24: Circus Roncalli uses projections to replace live animals during shows

Figure 2-25: Projection mapping onto buildings

Figure 2-26: a) Old use of the Pepper's ghost, where an actor would hide under the stage and their reflected image would appear to the audience above. b) Diagram showing how the object's image is reflected towards the viewer [20]
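The Fresnel dependence on incidence angle can be sketched numerically. The snippet below computes the unpolarized reflectance of a single air-to-glass interface; n = 1.5 is an assumed, typical refractive index (a real beamsplitter film has two surfaces and its own index):

```python
from math import radians, sin, cos, asin

def fresnel_reflectance(theta_i_deg, n1=1.0, n2=1.5):
    """Unpolarized reflectance of one interface via the Fresnel equations."""
    ti = radians(theta_i_deg)
    s = n1 / n2 * sin(ti)
    if s >= 1.0:
        return 1.0                      # total internal reflection
    tt = asin(s)                        # refraction angle from Snell's law
    rs = ((n1*cos(ti) - n2*cos(tt)) / (n1*cos(ti) + n2*cos(tt))) ** 2
    rp = ((n1*cos(tt) - n2*cos(ti)) / (n1*cos(tt) + n2*cos(ti))) ** 2
    return 0.5 * (rs + rp)              # average of s- and p-polarization
```

At normal incidence this gives the familiar ~4% reflection for glass, rising toward grazing incidence, which is why the tilt angle of a Pepper's ghost beamsplitter affects how bright the reflected image appears.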

2.3.2 Comparison of 3D displays for scaling

In the coming decades, some 3D display technologies will face difficulties trying to scale up in size. Stereoscopic displays are the most successful. They are popularly used at movie theaters, with the only limit in size being the resolution of the projectors. Autostereoscopic displays have been used to create life-sized human displays, but are rarely made larger. Scaling up autostereoscopic displays requires more view angles; otherwise they suffer coarser view zones. Volumetric displays are hindered by moving mechanisms and noise. A larger display means a faster-moving mechanism, which would cause more strain on the display system. Holographic displays are a still-struggling technology that has yet to get off the ground; it is difficult to make small displays, let alone large ones. Similarly to stereoscopic displays, pseudo 3D displays are already a popular medium for large displays. They often require cheap components that are easily available to the public.

Figure 2-27: Pyramid Pepper's ghost displayed on a phone

Chapter 3

LILLI Design

The large interactive laser light-field installation (LILLI) uses a combination of multi-view, two-view, and projection techniques to create illusions of floating 3D objects. A rendering of LILLI can be seen in Figure 3-1. The installation consists of four large screens that show different sides of a 3D object (multi-view), with anaglyph effects used on each of the four screens (two-view), all projected using the Pepper's ghost illusion (pseudo 3D). To the author's knowledge, a pyramid Pepper's ghost has never been used with lasers or anaglyph imaging to enhance the visuals and 3D effect. LILLI utilizes laser projectors to overcome the ghostly effects commonly associated with Pepper's ghost. Not only do the lasers create stark, bold visuals, making the projected images more opaque, but they also ease the need for the stringent lighting control that the Pepper's ghost is usually associated with. The author also implemented anaglyph images to give depth to the Pepper's ghost projections. This chapter is separated into two sections. In the first section, the build and design process of LILLI is discussed along with hardware and software choices. In the second section, the animation and user experience design is discussed.

Figure 3-1: Rendering of LILLI in the Media Lab atrium

3.1 Build Design

The first consideration in the design of LILLI was that it be very large. LILLI is meant to be a multi-user experience that groups of people can interact with at the same time. This display may likely be many users' first interaction with 3D projections, so a "wow" factor was desired with regard to the size. The projection geometry of the pyramid Pepper's ghost scales linearly, so there is no limit to the size it can be made, aside from fitting inside the venue where it is displayed. The first limitation in size would be the resolution of the video projectors. The venue chosen for LILLI was the 3rd floor atrium in the MIT Media Lab. It is one of the largest spaces available in the lab, with an area of roughly 38x48 ft and a ceiling that extends up three floors. Users need sufficient space to stand back from the installation to interact with the visuals, so the expected installation footprint extends out an additional 10 ft. There also needs to be enough room for non-active users to pass by the display without interacting with the visuals, so an additional 8 ft is added to the footprint. Keeping the installation symmetric gives a maximum footprint size of 20x20 ft

Figure 3-3: CAD LILLI model with coverings

Figure 3-4: CAD LILLI model showing only the framing

in the Media Lab atrium space [Fig. 3-2].

Figure 3-2: Footprint in the Media Lab atrium

The original design of LILLI was a 20x20 ft inverted pyramid that was to be hung from the Media Lab atrium truss system, but due to quarantine and loss of access to campus, a 14x14 ft stand-alone non-inverted pyramid was designed. The non-inverted pyramid was the most structurally stable option without a truss system. CAD models of LILLI are shown in Figures 3-3 and 3-4. The installation features four 7x10 ft triangular beamsplitter screens that all tilt inward at an angle of 45° to resemble the shape of a pyramid. Diffuse screens are mounted at the top of the installation. Images are projected onto the underside of these diffuse screens. These projected images reflect off

the diffuse screen and then reflect off the angled beamsplitter screens out towards the users to create the illusion of a floating object. Figure 3-5 shows how the distance from the diffuse screen to the beamsplitter must match the distance from the beamsplitter to the expected 3D model location in the installation so that all view zones around the display match up to the same location. Four different views of a 3D model are projected onto the four different screens around the installation. When a user walks around the installation, they see the four sides of the model, creating the illusion of a floating 3D model within the pyramid. Panels cover the rim of the top of the display so that users cannot see the projected images, as this would ruin the illusion. Panels also cover the outside base frame to hide all of the electronics and projectors. Finally, the inside is covered by a black material so the 3D projected objects appear to be encased in a viewing zone. All of the blockers and coverings are black to give as much contrast as possible between the display frame and the projected images.

Figure 3-5: Diagram showing how the Pepper's pyramid works. All projected images must line up in the middle of the pyramid for the illusion to work.

The display is limited to four view zones since adding additional view zones reduces the maximum size of the projected 3D model. When keeping a constant display perimeter length, the view zone size is equal to P/S, where P is the perimeter length and S is the number of sides of the display. Figure 3-6 shows how the projected images change when adjusting the number of view zones. The additional barriers required between each viewing zone would obstruct the users' view of the 3D projection, further diminishing the quality of the 3D illusion.

Figure 3-6: The total 3D model projection size is reduced when more view zones are added to the installation

The panels around the top of the display hide the projections from the users. As such, the top of the beamsplitter screens is not viewable by the users, so this section of the pyramid is removed. Also, no images are planned to be projected to the far corners of the triangular base of the screen, so these sections are also shortened [Fig. 3-7].

Figure 3-7: Beamsplitter frame size

The beamsplitters are tilted at 45° relative to the viewer so the image is reflected out parallel to the ground. The beamsplitters are positioned so the center of the projected images is level with the height of an average person (5'6") [54]. This results in

Figure 3-8: User spacing dependent on blocker height

a beamsplitter height of 3 ft off the ground [Fig. 3-8]. At this height, the user must stand back a minimum of 6 ft in order for the blockers to hide the projected images at the top of the display. The user is also able to see the adjacent projected view zones, so additional blockers between the diffuse screens are used. These blockers not only help hide the Pepper's ghost illusion, but also protect the users from viewing the bright laser light reflections. The blockers are tilted at 15° to add dimension to the installation so it isn't a perfect cube shape.
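The view-zone relation stated earlier (zone size = P/S for a display of perimeter P and S sides) is simple enough to check in a few lines; the 56 ft perimeter below corresponds to the 14x14 ft build:

```python
def view_zone_width_ft(perimeter_ft, sides):
    """Width of each view zone for a display of given perimeter and side count."""
    return perimeter_ft / sides

# For the 14x14 ft footprint (perimeter 56 ft), more sides mean narrower
# zones, and hence a smaller maximum projected model width.
widths = {s: view_zone_width_ft(56.0, s) for s in (4, 6, 8)}
```

Going from four to eight sides halves the zone width from 14 ft to 7 ft, which is the size penalty that kept LILLI at four view zones.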

3.1.1 Materials

The build materials consist of framing, masking, beamsplitters, and diffuse screens.

Framing

T-slot structural framing was chosen as the build material for the installation since there is a large catalog of parts available online. LILLI was planned to first be assembled off campus, torn down after the quarantine lifted, and reassembled within the atrium, so the parts had to be able to be assembled and disassembled quickly and easily. T-slot framing requires no pre-fabrication and allows users to start building immediately. T-slot 40x40 mm rails are used for the majority of the structure framing, with T-slot 80x40 mm rails used at the four structure corners for extra strength. Hollow T-slot 30x30 mm rails were chosen to support the beam blockers around the edges of the top frame since these are not weight bearing. T-slot 20x20 mm rails were chosen for the beamsplitter frames as well as the inside framing since these sections need to be light and only support the tension of the beamsplitters. All of the T-slot framing that is viewable by the user is colored black to create greater contrast between the frame and the projected images. Some of the T-slot framing on the inside of the installation is also black to prevent unwanted reflections from the projectors or diffuse screens. This also helps to keep the structure hidden in the Pepper's ghost illusion. Some of the T-slot rails require connections at uncommon angles, e.g. 125° and 145°. Custom brackets were fabricated to attach these components. These were created using flat aluminum bars that were cut to size. Holes were then drilled at the required angles. The ends of the T-slot rails were cut for a more flush connection at these uncommon angles.

Masking

Black painted tempered hardboard is used to cover the base of the frame and to hide the diffuse screens from the users. Tempered hardboard has a smooth finish on one side that makes it easy to paint. The material is thin, allowing it to be cut into custom shapes and connected directly to the T-slot framing with no special components. It is also light enough not to overload the frame. Black Peachskin fabric is used to cover the inside of the display. The fabric is non-stretch, so it can be tensioned over a span of 9 ft without sagging. It also has a high knit count, so it isn't see-through when tensioned.

Beamsplitter

The ideal beamsplitter is perfectly flat and transparent. Table 3.1 shows the materials considered for the beamsplitter screen. The most straightforward material is glass panes, because they are flat and offer a selection of different transparencies, but at such a large size the glass panes would need to be custom made and would be incredibly heavy. Plastic lining was considered because it is easy to handle and cheap. The disadvantage is that many plastic linings were found to have a foggy appearance and would require an additional tensioning system to make them flat. Boat shrink wrap and window shrink wrap both have a built-in tensioning characteristic and good transparency. The boat shrink wrap wasn't offered in rolls wider than 72 in, which was too narrow to fit the beamsplitter framing, so multiple sheets would have had to be combined. Different materials were tested to connect the sheets, but all left a noticeable mark along the screen. In the end, window shrink wrap was chosen because this material fits the beamsplitter frame, has good transparency, and has a built-in tensioning method.

Table 3.1: Comparison of beamsplitter materials

Material             Cost   Size   Transparency
Glass                        X      X
GLIMM Show Lining            X      X
Plastic Lining        X      X
Boat Shrink Wrap      X             X
Window Shrink Wrap    X      X      X

Diffuse screen

Ideal diffuse screens reflect the majority of incident light on their surface to create bright Pepper's ghost projections. Any nonuniformity on the diffuse screens is noticeable in the final 3D projected image. Table 3.2 shows the materials considered for the diffuse screens. In the end, projector screens were chosen for the diffuse screen because the material reflects a large percentage of light and has no issues with uniformity. Projector screens are sold in rolls and can be cut to custom lengths and sizes.

3.1.2 Hardware

LILLI consists of four 7200 lm Panasonic video projectors, four 2800 lm short-throw video projectors, four Pangolin 6W

Table 3.2: Comparison of diffuse screen materials

Material            Cost   Uniformity   Reflectance
White Paint          X
Glass Beads                              X
Projector Screen     X      X            X

Figure 3-9: Diagram showing the placement of the hardware around LILLI

90 kpps laser projectors, and four Intel Realsense depth cameras. All of the hardware is run on a computer with two Nvidia Quadro K6000 graphics cards. Figure 3-9 shows the placement of the hardware around the installation and Fig. 3-10 shows the wiring diagram for the components.

Video Projectors

The PT-RZ770 Panasonic projectors were chosen because of their high lumen output (7200), which is ideal for use in ambient lighting. The distance from the projectors to the diffuse screens is less than 6 ft, so lenses are attached to the projectors to increase image size. The projectors are set at the maximum brightness before the blacks of the image start to appear. Short-throw projectors are placed around the top four sides of the installation. These project images down onto the floor around the display for users to interact with. The quality of the floor projections is not critical to the function of the display, so low-lumen (2800) projectors are used.

Figure 3-10: Diagram showing the wiring for all the hardware

Laser Projectors

The most important feature of the installation is the four laser projectors. The lasers draw out images using a built-in mirror galvanometer. Three laser diodes (1.3 W, 1.8 W, and 3.0 W) within the laser housing are aligned so they all combine at the mirror galvanometer. The mirror galvanometer quickly moves along two axes to write out images [Figure 3-11]. The power of the RGB laser diodes can vary and turn on/off in rapid succession to create a range of colors. The faster the mirror galvanometer moves, the faster the image is able to be drawn, which results in less flashing. When there are too many points in the laser image, the image begins to flash because the laser is unable to draw the entire image faster than the persistence of vision. The mirror galvanometer moves at a speed of 90 kpps, or 90,000 laser points per second. The higher the kpps, the smoother the resulting laser image appears.

Figure 3-11: The laser galvanometer has freedom along the x- and y-coordinates

Galvanometer, liquid crystal display (LCD), and spatial light modulator (SLM) laser projectors were all considered for the installation. Table 3.3 shows the comparison of the different laser technologies. LCD projectors work by using a lamp, a prism, and filters to create the image on the screen [59]. Using the LCD filter reduces the maximum brightness and makes the resulting image comparable to a high-lumen conventional projector [Fig. 3-12].

Figure 3-12: Diagram of an LCD projector [59]

Spatial light modulators are diffractive optical elements (DOEs) that modulate a spatial

Figure 3-13: Diagram of using a spatial light modulator to create a hologram [50]

(pixel) pattern to control the phase and amplitude of light passing through it [50]. By using an algorithm, an image can be encoded into a DOE pattern so the resulting laser image shows the same image. This can be used to make very complex laser images, and three separate DOEs can be combined for color mixing. The downside of using DOEs is that they reduce the laser brightness by spreading the laser beam across a viewing field. A more powerful laser would fix this, but there are threshold power limits for SLMs. Currently, SLMs are only meant for small applications. To make the beam wider, a lens would have to be used, which would severely distort the image and again reduce image brightness [Fig. 3-13]. They are also not ideal for real-time interaction, since the algorithm to compute the DOE patterns takes time and requires a special system to compute the necessary light modulator field. In the end, 6W galvanometer lasers were chosen because of their high power, which is most important to their function in this display. The LILLI prototype built for Fall Members Week used a laser galvanometer that had a max speed of 30 kpps, which was too slow for the images, causing significant flashing. A faster galvanometer is used for the final LILLI build, but even the fastest moving laser has difficulties drawing out objects without flashing. The use of lasers was scaled back and they were repurposed as a highlighter to draw out important parts

Laser Technology           Brightness   Image Complexity   Interaction   Image Size
Mirror Galvanometer            X                               X             X
LCD Laser Projector                           X                X             X
Spatial Light Modulator                       X

Table 3.3: Comparison of different laser projection technologies

in a scene.

The four lower video and laser projectors are projected onto the diffuse screens to create the Pepper's ghost illusion. In order to hide the projectors, three different locations were considered: in the upper section of the display, inside the display, and directly in front of the display. Placing the projectors in the upper section of the display would cause strong keystoning, and the projections would need to cover an area of 10x5 ft over a throw distance of <1 ft, which would require a strong lens, likely resulting in even more image distortion. The projectors were also considered for placement within the display, projecting straight up through the beamsplitters to the diffuse screens, but the brightness of the lasers combined with the imperfect transparency of the beamsplitters created a double laser image such as the one shown in Fig. 3-14. In the end, placing the projectors on the ground in front of each of the sides proved to be the simplest setup.
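The flashing constraint described above amounts to a simple point budget: at a fixed scan rate, a flicker-free frame rate caps how many points one frame may contain. A sketch, taking 30 Hz as an assumed flicker-fusion threshold:

```python
def max_points_per_frame(scan_rate_kpps, frame_rate_hz=30.0):
    """Most points the galvanometer can draw per frame without visible flashing.

    scan_rate_kpps: scan speed in thousands of points per second.
    frame_rate_hz: assumed minimum refresh rate for flicker-free viewing.
    """
    return int(scan_rate_kpps * 1000 / frame_rate_hz)

# The 30 kpps prototype could hold only 1000 points per flicker-free frame;
# the 90 kpps final build triples that budget to 3000 points.
```

Any model outline needing more points than this budget either flashes (drawn slowly) or warps (drawn too fast), which is the trade-off revisited in Chapter 4.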

User Interaction

Intel Realsense depth cameras are used to track users around the display and allow users to interact with the animations. This display is large and thus meant for multiple people to interact with at one time. For this reason, we tried to avoid direct user interaction and instead focused on group interaction, where no one person has more control than another.

Figure 3-14: Laser projected image appearing twice on the beamsplitter
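A minimal sketch of the kind of depth-based proximity trigger used here, assuming a depth image in meters from the camera. The thresholds and pixel count are hypothetical, and the actual system used a blob-finding plugin in Unity3D rather than this code:

```python
import numpy as np

def user_present(depth_m, near_m=0.5, far_m=2.5, min_pixels=500):
    """True if enough pixels fall inside the interaction depth band.

    Clamping the depth range discards the atrium background, the same
    noise-rejection idea as limiting the Realsense sensing range.
    """
    in_band = (depth_m > near_m) & (depth_m < far_m)
    return int(in_band.sum()) >= min_pixels

# Synthetic frame: background at 4 m with a person-sized patch at 1.5 m.
frame = np.full((240, 320), 4.0)
frame[100:130, 100:130] = 1.5
```

Requiring a minimum pixel count makes the trigger robust to single-pixel depth noise, so animations fire only when a person-sized region enters the zone.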

3.1.3 Software

Unity3D is used for the animations of the projected images. This software includes many tools and plugins for handling

and interacting with 3D models. The majority of the animations are sent directly to the video projectors. For laser animations, video frames are captured in Unity3D and sent to Openframeworks, which processes the images into a readable format for the digital-to-analog converter (DAC). For interaction, video feed from the Realsense cameras is captured and sent through a blob-finding plugin in Unity3D. The range of depth sensing can be controlled on the Realsense cameras to remove background noise from the user blob tracking. When users enter certain areas around the installation, they trigger animations.

Figure 3-15: Default home scene used to transition to other scenes

3.2 Animation and UX Design

This project is being developed in partnership with 11th Hour Racing and the MIT Media Lab Open Ocean Initiative. As a result, the display themes are focused on ocean science and promoting cleaner oceans. The animations are spread among four scenes in Unity3D, with each focusing on a different visualization technique: media, art, exploration, and data. The default home scene is used as a transition point between the four main scenes. It features a 5 ft diameter globe that slowly rotates in the middle of the display [Fig. 3-15]. When a user comes close to the installation, they are sensed by the Realsense depth cameras, and the default home scene triggers four spinning elemental rings (water, earth, wind, fire) that are projected onto the floor around the installation [Fig. 3-16].

3.2.1 Water Scene

The scene shows a small glowing earth located near the top of the display. Projected onto the ground around the installation are swimming sharks and other creatures [Fig. 3-17]. Users can select one of the creatures by stepping on it. Once selected, the swimming creature slowly dissolves away and reappears on the large main display. On all four view zones, the user can see the selected creature's name, gender, species, and size. Creatures' tracked paths provided by researchers [3, 4, 6, 5, 2] are projected onto the globe, such as the path of the great white shark Mary Lee shown in Fig. 3-18. Each of the four view zones has different information about the creature. One side shows the geographical terrain of the species, one shows a personal summary of the creature written by Ocearch scientists, one shows interesting facts about the creature's species, and the final side shows media of the creature such as Twitter posts, articles, or general movies and images [Fig. 3-19]. After one minute, the creature is removed from the large display screen and a new creature can be selected.

The creatures swimming around the floor of the installation area are used as an extension of the 3D scene into the real world. The users must focus not only on the large display, but also on the space around them, providing an immersive experience similar to that of CAVEs. This scene has different information available on each view zone, which motivates users to walk around the installation.

Figure 3-16: a) Water, b) earth, c) wind, and d) fire elemental rings that were projected onto the floor space around the installation. A user could step into one of these rings to choose a scene to transition to

Figure 3-17: A top-down view of swimming creatures is projected onto the floor space around the installation

Figure 3-18: Great white shark Mary Lee's tracked path on the projected globe

Figure 3-19: a) Water scene example showing species facts b) Ocearch personal summary c) species terrain and d) media about the great white shark Mary Lee

Figure 3-20: Bathymetric model of the earth's surface with tags that users can select to view and explore interesting terrains

New visualization technologies have been used to focus attention on environmental issues [45]. Ockwell et al. [47] found that reactions of understanding and emotion elicit change in public perspective. Ocean science and preservation are often overlooked in the public eye, so the issue of shark finning was brought up in the water scene. One of the creatures that a user can select, a tiger shark named Zuza, is dead due to shark finning. The user can select this creature and see that its tracker has stopped, along with a news article about the death of Zuza [51]. The user becomes temporarily attached to these creatures because they specifically chose that creature from the others in the installation. Finding out the creature is deceased can elicit the change in public perspective that is needed to address this issue in ocean preservation.

3.2.2 Earth Scene

This scene shows a bathymetric model of the earth, shown in Fig. 3-20. The earth slowly rotates and has highlighted areas with interesting terrains that a user can select. Selecting one of the locations causes the display to zoom in on that area, and then the user can navigate around the scene to explore the terrain. The terrain pans across the display so all users get the same feeling of the terrain moving in the same direction relative to the display. Markers across the terrains are used to highlight notable areas.

Figure 3-21: Four different views of a terrain map of Devil's Tower

Technologies allow people to visit places that are inaccessible, far away, do not exist anymore, or even never existed [22]. LILLI can be used as a looking glass into faraway locations that are normally inaccessible to people, such as the Mariana Trench or even the surface of . Figure 3-21 shows how LILLI offers four different bird's-eye views of a terrain.

3.2.3 Wind Scene

This scene uses data from ARGO floats to model wind patterns around the earth. ARGO floats are instruments distributed across the world's oceans that record temperature, salinity, and position. Their change in position can be used to estimate wind strength and direction. Figure 3-22a,b shows how the wind data is used to generate particles that spawn from the recorded ARGO locations, with the ARGO speed determining the color of the particles. This animation runs automatically for one minute and then the particles die down. At this point, users can move around the display to trigger their own wind particles. This visualization technique is a form of community art that multiple users can contribute to, creating a temporary 'painting' on the installation.
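The speed estimate described above can be sketched as follows: given two position fixes from a float, great-circle distance over elapsed time gives an average drift speed. The fixes in the comment are invented for illustration:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon fixes, in kilometers."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * r * asin(sqrt(a))

def drift_speed_kmh(fix_a, fix_b, hours):
    """Average drift speed between two (lat, lon) fixes taken `hours` apart."""
    return haversine_km(*fix_a, *fix_b) / hours

# e.g. a float drifting one degree of longitude along the equator in ten
# hours: drift_speed_kmh((0.0, 0.0), (0.0, 1.0), 10.0)
```

The heading of the displacement vector would supply the wind direction; both together drive the particle spawning described in the scene.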

3.2.4 Fire Scene

This scene uses data from ARGO floats to model temperature patterns around the earth. Each ARGO float is represented by a heat bar that varies its color based on a color gradient (blue=cold, red=hot) and its height. The animation shows the progression of ARGO temperature data from 1999 to 2020. Viewers can see the temperatures slowly rising over the years, but also see the large increase in the number of ARGO floats being deployed around the world [Fig. 3-22c,d]. Similar to the wind scene, this animation runs automatically for one minute. After that, the bars fade away and leave a blank earth model. At this point, when users approach the installation, the bars start to reappear; the more people that approach the installation and come closer, the hotter the earth appears. Only when the users back away does the earth cool down again.
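The blue-to-red gradient can be sketched as a linear interpolation in RGB. The temperature bounds below are hypothetical, chosen only to normalize ocean temperatures into [0, 1]:

```python
def heat_color(temp_c, t_min=-2.0, t_max=35.0):
    """Map a temperature to an (r, g, b) tuple: blue = cold, red = hot."""
    u = (temp_c - t_min) / (t_max - t_min)
    u = max(0.0, min(1.0, u))           # clamp outliers into range
    return (u, 0.0, 1.0 - u)            # linear blend from blue to red
```

The same normalized value u can also scale the bar height, so color and height encode the temperature redundantly, as in the scene.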

Figure 3-22: a) Initial view of the wind scene earth model. b) The wind scene once the ARGO float data spawns data particles. c) Initial view of the fire scene earth model. d) The fire scene when the heat bars protrude from the earth's surface

This scene enables viewers to visualize something that is normally invisible to them. Global warming is a rising issue that especially concerns ocean scientists because of its dire effects on ocean temperatures. LILLI gives users a direct interaction with the world temperature, engaging them with environmental issues in a more specific way and creating a sense of realness about the issue [8, 21].

Chapter 4

Final Build Observations

LILLI was constructed in the Media Lab atrium and fully operational for a week. During that time, tests on anaglyph images, laser projection limitations, and 3D image quality were conducted. One of the goals of this study was to improve the perceived brightness of the Pepper's ghost illusion in ambient lighting. It is important to differentiate lumens (a standard measure of brightness) from perceived brightness (a subjective measurement of brightness) [7]. The video projectors have a defined lumen output, but the perceived brightness of the Pepper's ghost projections varies significantly throughout the day depending on the ambient lighting. This chapter uses perceived brightness when comparing projections since this is a more appropriate metric for comparing the 3D projections. Pictures and videos of the final build can be found at https://www.media.mit.edu/projects/large-interactive-laser-light-field-display/overview/.

4.1 Anaglyph Pepper’s ghost

When projecting onto a flat diffuse screen, the Pepper's ghost projection also appears flat. Even so, the image floating in midair in a real-life volume helps add the appearance of depth. Anaglyph images were used with the Pepper's ghost projections to see how well the binocular disparity would

Figure 4-1: Comparison of anaglyph 3D images for laser and video projectors

work and to see if this would make the projected images appear more realistically 3D. It was found that the video projector anaglyph images did not work, while the laser anaglyph images worked very well. Figure 4-1 shows anaglyph images from both video and laser projectors. Video projectors do not work well with the anaglyph Pepper's ghost because the low projection brightness results in little to no disparity of the image when viewed through the anaglyph glasses. The projected image reflects off the diffuse screen, reflects off the beamsplitter, and is filtered through the anaglyph glasses before arriving at the viewer's eye, resulting in a loss of over 50% of the initial image brightness. On the other hand, the anaglyph Pepper's ghost using laser projectors worked very well because of the significant boost in brightness from the laser diodes. It has proper depth cues that make it appear as if it is floating directly in the center of the display. Figure 4-2 shows a direct comparison of laser- and video-projected colors. The staircase behind the video-projected color bars is easily seen, whereas the majority of the laser color lines are opaque and hide the staircase. The white, yellow, cyan, and green laser lines are perceptibly brighter than the purple, red, and blue laser lines. This is because the human eye is most sensitive to light at 555 nm, i.e., green light. White, yellow, cyan, and all other mixtures containing green laser light therefore appear brighter than the red and blue laser light mixtures. This suggests that red/cyan anaglyphs may not be the best option for the Pepper's ghost projections. Anaglyph displays are filtered using near-complementary colors (red and green, red and cyan, or green and magenta) [26]. There were difficulties getting the cyan laser light to properly filter through the anaglyph glasses, so a red and green anaglyph would likely work better for the Pepper's ghost projections. This would allow finer control over filtering, only requiring the green and red diodes as opposed to using cyan, which is a mixture of blue and green.

Figure 4-2: Direct comparison of brightness for the laser projector and video projector

Figure 4-3: a) Laser drawing images with a slow movement speed provides a more accurate outline of the 3D model, but causes image flashing. b) Increasing the galvanometer movement speed reduces flashing, but causes image warping
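The compounding brightness loss along the optical path can be made concrete with a quick back-of-the-envelope calculation. The stage efficiencies below are assumptions chosen for illustration, not measured values from the installation:

```python
def path_brightness(initial_lumens, stages):
    """Attenuate a source brightness through successive optical stages.

    Each stage is a (name, efficiency) pair; efficiencies multiply.
    """
    brightness = initial_lumens
    for _name, efficiency in stages:
        brightness *= efficiency
    return brightness

# Assumed stage efficiencies (illustrative only):
stages = [
    ("diffuse screen reflection", 0.80),
    ("beamsplitter reflection", 0.50),      # a 50/50 beamsplitter
    ("anaglyph filter pass-through", 0.45),
]

final = path_brightness(1000.0, stages)  # 1000 lm source -> 180 lm
loss = 1.0 - final / 1000.0              # 82% of the source light is lost
```

Even with generous assumptions, well over half of the source brightness never reaches the viewer's eye, which is consistent with why only the much brighter laser diodes produced usable disparity.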

4.2 Laser Projector Limitations

The laser projectors were limited in the amount of image detail they could provide compared to video projectors. The laser galvanometer must draw out every detail individually, whereas the video projector uses a light source and filter to create an entire scene at once. As a result, the lasers were limited to projections of wireframes and outlines of 3D models. The laser galvanometers also had difficulty creating quality outlines of complex models like the creatures from the water scene. Detailed 3D models required more points for the laser to draw. This creates longer drawing times, resulting in image flashing since the image took longer to draw than the persistence of vision [Fig. 4-3a]. Increasing the galvanometer speed prevents flashing of the laser, but causes curving of the lines as shown in Figure 4-3b. A balance between outline quality and laser flashing had to be struck.
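The trade-off between outline detail and flicker can be sketched numerically. The point rate and flicker threshold below are assumed values for illustration, not specifications of the projectors used in the installation:

```python
POINT_RATE_PPS = 30_000  # assumed galvanometer drawing speed, points/second
FLICKER_HZ = 25          # assumed persistence-of-vision threshold

def max_points_per_frame(point_rate=POINT_RATE_PPS, min_refresh=FLICKER_HZ):
    """Largest point count a frame can hold before redraws fall below
    the flicker threshold and the image starts to flash."""
    return point_rate // min_refresh

def refresh_rate(points, point_rate=POINT_RATE_PPS):
    """How often a wireframe containing `points` points is redrawn (Hz)."""
    return point_rate / points

budget = max_points_per_frame()  # 1,200-point budget under these assumptions
slow = refresh_rate(3_000)       # a 3,000-point outline redraws at only 10 Hz,
                                 # below the flicker threshold, so it flashes
```

A detailed model needing more points than this budget must either be simplified or drawn faster, which is exactly the quality-versus-warping balance described above.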

4.3 View-Zone Warping and Mismatch

Figure 4-4a shows the typical setup for a pyramid Pepper’s ghost. The same view of a model is shown on all four sides

so when a user walks around the display, these images seamlessly blend together [Fig. 4-4b]. In order to make the projection appear as a floating 3D model, the images often spin in a circle; otherwise, viewers would never see the other sides of the model. Instead of using the same image on all the screens, LILLI showed four different views of a 3D model on the four screens [Fig. 4-4c]. Figure 4-4d shows how this created image continuity issues when viewing the installation at the area between adjacent screens, because the 90° offset of the images causes a break in the 3D illusion. To fix this issue, more sides could be added to the installation. This creates more perspectives of the 3D model, and thus a smaller offset between adjacent projections. This solution reduces the issue, but does not fix it. For simplicity, the traditional method of showing the same side of the model on all screens can be used. In order to still offer unique perspectives on the four sides of the installation, information and pop-ups can be used. It was observed that the projections appeared 3D when viewing the screens straight on, but appeared flat when viewing the screens at an angle. As the user walked towards the edge of the screen, the projections became distorted. To reduce the issue, the images can be receded deeper into the display by increasing the distance between the diffuse screen and beamsplitter. The image would then be hidden when viewed at high angles, preventing image distortion.
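The geometric reason why adding sides reduces, but never eliminates, the mismatch can be shown directly: with N screens each showing its own perspective, adjacent views differ by 360/N degrees, which shrinks as N grows but never reaches zero. A minimal sketch:

```python
def adjacent_view_offset(n_sides):
    """Angular offset (degrees) between the perspectives shown on
    adjacent screens when each of n_sides screens shows its own view."""
    return 360.0 / n_sides

# Four sides (as in LILLI): a 90-degree jump between adjacent screens.
# Eight sides would halve the jump but not eliminate it.
four = adjacent_view_offset(4)   # 90.0
eight = adjacent_view_offset(8)  # 45.0
```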

4.4 Large 3D Display Interaction Framework

As with other new display technologies, a new HCI framework must be explored for large 3D displays. There are not many direct examples to draw from for large 3D display interactions since most 3D displays are smaller than 2x2x2 ft. Regardless, some interactions for regularly sized 3D displays can be used because they translate well to large 3D displays. Some interaction modalities from VR can also be used since the virtual scenes can rely on long-distance interaction or interactions with large virtual models.

Figure 4-4: a) Four views on a pyramid Pepper's ghost if all the sides have the same view of a 3D model. b) How two adjacent sides of a pyramid Pepper's ghost appear before the images are combined. When the images are combined, they blend together perfectly, making a seamless transition between different sides of the pyramid Pepper's ghost. c) Four views on a pyramid Pepper's ghost when each side shows a different view of a 3D model. d) How two adjacent sides appear before the images are combined. When the images are combined, they do not match up because they are 90° out of sync

Referring to Section 2.2.1 on large display interaction, it is important to distinguish the type of interaction being sought, whether collaborative or passive, since different HCI frameworks serve different purposes. For example, ray casting may be a good interaction modality for a small collaborative team, but it is not ideal for a large number of users all trying to control a single object at the same time. The installation in this study focused on passive multi-user interactions, and as such, that framework is the focus of this discussion. This section of the chapter is not meant to argue for a specific interaction framework. It is intended to start a discussion on how to approach these types of displays as they become more ubiquitous throughout our lives. The discussion relies on informal and subjective arguments to make its case. Three important tasks to consider when designing the framework for a large passive 3D display are the following:
(1) Avoid consolidating power - No one person should have ultimate power over the display. Although this feature is useful in collaborative settings, giving one person control of a passive display reduces multi-user interaction. Spreading interaction across all users prevents people from queueing and allows fluid public interaction with the display.
(2) Avoid consolidating information - Encourage users to walk around the display by presenting different information throughout the display. For an immersive experience, the user must walk around and observe details of the 3D projection, and this can be done by spreading out interesting features of the display. This also prevents crowding when designing for large groups.
(3) Balance local and mass feedback - For regularly sized displays, interactions are centered on designated areas or objects from which viewers receive the same feedback, i.e., visual, auditory, or haptic feedback.
For large displays, the interactions are spread out across multiple people at multiple locations, and so viewer feedback must be distributed accordingly. For example, if a large scene interaction occurs, all users should receive feedback, whereas a single user moving

a small virtual object should receive localized feedback and other nearby users should be unaffected.
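Rule (3) amounts to a small routing policy: feedback for scene-wide events goes to everyone, while feedback for a localized manipulation goes only to the acting user. The event structure and names below are hypothetical, for illustration only:

```python
def route_feedback(event, all_users):
    """Decide which users receive feedback for an interaction event.

    `event` is a dict with a 'scope' ('mass' or 'local') and an 'actor';
    both field names are hypothetical, chosen for this sketch.
    """
    if event["scope"] == "mass":
        # Large scene interaction: every viewer gets the same feedback.
        return set(all_users)
    # Localized interaction: only the acting user is affected.
    return {event["actor"]}

users = {"alice", "bob", "carol"}
route_feedback({"scope": "mass", "actor": "alice"}, users)   # all three users
route_feedback({"scope": "local", "actor": "alice"}, users)  # only alice
```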

Chapter 5

Conclusion

In this study, a large pseudo-3D display measuring 14x14x10 ft was built that used the pyramid Pepper's ghost to create the illusion of a floating 3D model. The Pepper's ghost was combined with laser projectors and anaglyph imaging to create more realistic 3D projections. Adding laser projectors worked well at removing the "ghostly", transparent characteristics of the projection often associated with the Pepper's ghost. The laser projectors were unable to display detailed images or textures and were limited to wireframes. A comparison of video and laser projectors found that only the laser projectors were capable of achieving good-quality binocular disparity for the anaglyph images due to the high perceptual brightness of the laser images. For a continuous 3D effect around the display, it is suggested that the same side of the 3D model be shown on all screens around the installation. Different sides of the display should still show different information to prompt users to walk around the area. A generalized framework for designing large 3D displays in public settings was proposed to help promote this type of technology for future uses.

Bibliography

[1] A 19th-Century Vision of the Year 2000.
[2] Harte Research Institute.
[3] Ocearch.
[4] SCCF: Sanibel-Captiva Conservation Foundation.
[5] School of Aquatic and Fishery Sciences, University of Washington.
[6] SICEM: Science Integrated Coastal Ecosystem Management, James Cook University, Australia.
[7] 4 Things Architects Should Know About Lumens vs. Perceived Brightness, January 2019. Section: Architecture & Design.
[8] Sun Joo (Grace) Ahn, Joshua Bostick, Elise Ogle, Kristine L. Nowak, Kara T. McGillicuddy, and Jeremy N. Bailenson. Experiencing Nature: Embodying Animals in Immersive Virtual Environments Increases Inclusion of Nature in Self and Involvement With Nature. Journal of Computer-Mediated Communication, 21(6):399–419, 2016.
[9] Carmelo Ardito, Paolo Buono, Maria Francesca Costabile, and Giuseppe Desolda. Interaction with Large Displays: A Survey. ACM Computing Surveys, 47(3):1–38, April 2015.
[10] Rafael Ballagas, Michael Rohs, and Jennifer G. Sheridan. Sweep and point and shoot: phonecam-based interactions for large public displays. In CHI '05 Extended Abstracts on Human Factors in Computing Systems, page 1200, Portland, OR, USA, 2005. ACM Press.
[11] Lonni Besançon, Paul Issartel, Mehdi Ammi, and Tobias Isenberg. Mouse, Tactile, and Tangible Input for 3D Manipulation. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pages 4727–4740, Denver, Colorado, USA, May 2017. ACM.
[12] Harry Brignull and Yvonne Rogers. Enticing People to Interact with Large Public Displays in Public Spaces. page 8.
[13] Bill Buxton. The bulk of innovation is low-amplitude and takes place over a long period. Companies should focus on refining existing technologies as much as on creation. page 4.
[14] Keith Cheverst, Alan Dix, Daniel Fitton, Chris Kray, Mark Rouncefield, Corina Sas, George Saslis-Lagoudakis, and Jennifer G. Sheridan. Exploring Bluetooth based mobile phone interaction with the Hermes photo display. In Proceedings of the 7th International Conference on Human Computer Interaction with Mobile Devices & Services - MobileHCI '05, page 47, Salzburg, Austria, 2005. ACM Press.
[15] Brianne Christopher. Explaining the Pepper's Ghost Illusion with Ray Optics.
[16] Mary Czerwinski, George Robertson, Brian Meyers, Greg Smith, Daniel Robbins, and Desney Tan. Large display research overview. In CHI '06 Extended Abstracts on Human Factors in Computing Systems, page 69, Montréal, Québec, Canada, 2006. ACM Press.
[17] Michele Debczak. An Interactive Wall at Google Is Made From Thousands of Arcade Buttons. Mental Floss.
[18] Paul Dietz and Darren Leigh. DiamondTouch: A Multi-User Touch Technology. page 11.
[19] Alan Dix. Human–computer interaction, foundations and new paradigms. Journal of Visual Languages & Computing, 42:122–134, October 2017.
[20] Jason England. Science of Pepper's Ghost illusion, August 2018.
[21] Géraldine Fauville. Digital technologies as support for learning about the marine environment: steps toward ocean literacy. Number 408 in Gothenburg Studies in Educational Sciences. University of Gothenburg, Acta Universitatis Gothoburgensis, Göteborg, 2017.
[22] Géraldine Fauville, Anna Carolina Muller Queiroz, and Jeremy N. Bailenson. Virtual reality as a promising tool to promote climate change awareness. In Technology and Health, pages 91–108. Elsevier, 2020.
[23] Feng Zhou, Henry Been-Lirn Duh, and Mark Billinghurst. Trends in augmented reality tracking, interaction and display: A review of ten years of ISMAR. In 2008 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, pages 193–202, Cambridge, UK, September 2008. IEEE.
[24] Mike Fraser and Steve Benford. Supporting awareness and interaction through collaborative virtual interfaces. In Proceedings of ACM UIST '99, pages 27–36. ACM Press, 1999.
[25] Peter Galambos and Peter Baranyi. VirCA as Virtual Intelligent Space for RT-Middleware. In 2011 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), pages 140–145, Budapest, Hungary, July 2011. IEEE.
[26] Jason Geng. Three-dimensional display technologies. Advances in Optics and Photonics, 5(4):456, December 2013.
[27] Chris Hand. A Survey of 3D Interaction Techniques. Computer Graphics Forum, 16(5):269–281, December 1997.
[28] Robert Hardy and Enrico Rukzio. Touch & interact: touch-based interaction of mobile phones with displays. In Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services - MobileHCI '08, page 245, Amsterdam, The Netherlands, 2008. ACM Press.
[29] Chris Harrison. The HCI innovator's dilemma. Interactions, 25(6):26–33, October 2018.
[30] Hewett, Baecker, Card, Carey, Gasen, Mantei, Perlman, Strong, and Verplank. ACM SIGCHI Curricula for Human-Computer Interaction: 2. Definition and Overview of Human-Computer Interaction, August 2014.
[31] Otmar Hilliges, David Kim, Shahram Izadi, Malte Weiss, and Andrew Wilson. HoloDesk: direct 3D interactions with a situated see-through display. In Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems - CHI '12, page 2421, Austin, Texas, USA, 2012. ACM Press.
[32] Ken Hinckley, Randy Pausch, John C. Goble, and Neal F. Kassell. Passive real-world interface props for neurosurgical visualization. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '94, pages 452–458, New York, NY, USA, 1994. Association for Computing Machinery.
[33] Uta Hinrichs, Sheelagh Carpendale, Nina Valkanova, Kai Kuikkaniemi, Giulio Jacucci, and Andrew Vande Moere. Interactive Public Displays. IEEE Computer Graphics and Applications, 33(2):25–27, March 2013.
[34] David M. Hoffman, Ahna R. Girshick, Kurt Akeley, and Martin S. Banks. Vergence–accommodation conflicts hinder visual performance and cause visual fatigue. Journal of Vision, 8(3):33, March 2008.
[35] Liviu Iftode, Cristian Borcea, Nishkam Ravi, Porlin Kang, and Peng Zhou. Smart Phone: An Embedded System for Universal Interactions. In 10th IEEE International Workshop on Future Trends of Distributed Computing Systems (FTDCS '04), pages 88–94, 2004.
[36] Saurabh Jadhav, Vikas Kannanda, Bocheng Kang, Michael T. Tolley, and Jurgen P. Schulze. Soft robotic glove for kinesthetic haptic feedback in virtual reality environments. Electronic Imaging, 2017(3):19–24, January 2017.
[37] G. Kurtenbach and G. Fitzmaurice. Guest Editors' Introduction: Applications of Large Displays. IEEE Computer Graphics and Applications, 25(4):22–23, July 2005.
[38] Scott Lee. Conductive Paint Turns Window Into Interactive Touch Screen, October 2014. Section: News.
[39] Yong Liu, Jorge Goncalves, Denzil Ferreira, Bei Xiao, Simo Hosio, and Vassilis Kostakos. CHI 1994–2013: mapping two decades of intellectual progress through co-word analysis. In Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems - CHI '14, pages 3553–3562, Toronto, Ontario, Canada, 2014. ACM Press.
[40] Fabien Lotte, Josef Faller, Christoph Guger, Yann Renard, Gert Pfurtscheller, Anatole Lécuyer, and Robert Leeb. Combining BCI with Virtual Reality: Towards New Applications and Improved BCI. In Brendan Z. Allison, Stephen Dunne, Robert Leeb, José Del R. Millán, and Anton Nijholt, editors, Towards Practical Brain-Computer Interfaces, pages 197–220. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
[41] Fabien Lotte, Aurélien van Langhenhove, Fabrice Lamarche, Thomas Ernest, Yann Renard, Bruno Arnaldi, and Anatole Lécuyer. Exploring Large Virtual Environments by Thoughts Using a Brain–Computer Interface Based on Motor Imagery and High-Level Commands. Presence: Teleoperators and Virtual Environments, 19(1):54–70, February 2010.
[42] David Margery, Bruno Arnaldi, and Noël Plouzeau. A General Framework for Cooperative Manipulation in Virtual Environments. In W. Hansmann, W. T. Hewitt, W. Purgathofer, Michael Gervautz, Dieter Schmalstieg, and Axel Hildebrand, editors, Virtual Environments '99, pages 169–178. Springer Vienna, Vienna, 1999.
[43] Ryan P. McMahan, Doug Gorton, Joe Gresock, Will McConnell, and Doug A. Bowman. Separating the effects of level of immersion and 3D interaction techniques. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology - VRST '06, page 108, Limassol, Cyprus, 2006. ACM Press.
[44] Muhanna A. Muhanna. Virtual reality and the CAVE: Taxonomy, interaction challenges and research directions. Elsevier.
[45] Anna Carolina Muller Queiroz, Amy Kamarainen, Nicholas Preston, and Maria Isabel Da Silva Leme. Immersive Virtual Environments and Climate Change Engagement. Verlag der Technischen Universität Graz. ISBN: 9783851256093.
[46] Yoichi Ochiai, Kota Kumagai, Takayuki Hoshi, Jun Rekimoto, Satoshi Hasegawa, and Yoshio Hayasaki. Fairy Lights in Femtoseconds: Aerial and Volumetric Graphics Rendered by Focused Femtosecond Laser Combined with Computational Holographic Fields. arXiv:1506.06668 [physics], June 2015.
[47] David Ockwell, Lorraine Whitmarsh, and Saffron O'Neill. Reorienting Climate Change Communication for Effective Mitigation: Forcing People to be Green or Fostering Grass-Roots Engagement? Science Communication, 30(3):305–327, March 2009.
[48] Peter Peltonen, Esko Kurvinen, Antti Salovaara, Giulio Jacucci, Tommi Ilmonen, John Evans, Antti Oulasvirta, and Petri Saarikko. It's Mine, Don't Touch!: interactions at a large multi-touch display in a city centre. In Proceedings of the Twenty-Sixth Annual CHI Conference on Human Factors in Computing Systems - CHI '08, page 1285, Florence, Italy, 2008. ACM Press.
[49] Ken Pimentel and Kevin Teixeira. Virtual reality through the new looking glass. 1993.
[50] Cátia Pinho, Isiaka Alimi, Mário Lima, Paulo Monteiro, and António Teixeira. Spatial Light Modulation as a Flexible Platform for Optical Systems. Telecommunication Systems - Principles and Applications of Wireless-Optical Technologies, September 2019. IntechOpen.
[51] Mark Price. Satellite tag reveals research shark was caught and killed. The Sacramento Bee.
[52] qm13. Azure Kinect body index map.
[53] Somaiieh Rokhsaritalemi, Abolghasem Sadeghi-Niaraki, and Soo-Mi Choi. A Review on Mixed Reality: Current Trends, Challenges and Prospects. Applied Sciences, 10(2):636, January 2020.
[54] Max Roser, Cameron Appel, and Hannah Ritchie. Human Height. Our World in Data, October 2013.
[55] A. Schmidt, B. Pfleging, F. Alt, A. Sahami, and G. Fitzpatrick. Interacting with 21st-Century Computers. IEEE Pervasive Computing, 11(1):22–31, 2012.
[56] Gyanendra Sharma and Richard J. Radke. Multi-person Spatial Interaction in a Large Immersive Display Using Smartphones as Touchpads. arXiv:1911.11751 [cs], November 2019.
[57] Super 78 Studios. Immersion Tunnels.
[58] Bent Stumpe and Christine Sutton. The first capacitative touch screens at CERN, March 2010. Section: High-energy physics.
[59] Tech-faq. How LCD Projectors Work.
[60] Edward Tse, Chia Shen, Saul Greenberg, and Clifton Forlines. Enabling interaction with single user applications through speech and gestures on a multi-user tabletop. In Proceedings of the Working Conference on Advanced Visual Interfaces - AVI '06, page 336, Venezia, Italy, 2006. ACM Press.
[61] Florian Vogt, Justin Wong, Sidney Fels, and Duncan Cavens. Tracking Multiple Laser Pointers for Large Screen Interaction. page 3.
[62] Votanic. Immersive cave system.
[63] Mark Weiser. The Computer for the 21st Century. page 8.
[64] Shumin Zhai and Paul Milgram. Quantifying coordination in multiple DOF movement and its application to evaluating 6 DOF input devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems - CHI '98, pages 320–327, Los Angeles, California, United States, 1998. ACM Press.