EAI Endorsed Transactions on Creative Technologies
Research Article

Using Natural User Interfaces for Previsualization

Rainer Malaka, Tanja Döring∗, Thomas Fröhlich∗, Thomas Muender∗, Georg Volkmar∗, Dirk Wenig∗, Nima Zargham∗

Digital Media Lab, TZI, Bremen, Germany
[email protected]

∗ Authors are listed alphabetically

Abstract

INTRODUCTION: An important phase in the process of visual design for narrative media is previsualization (previs). Professionals use complicated 3D software applications that are not specifically designed for the purpose of previs, which makes it difficult for artists and other non-technical users to create previs content. OBJECTIVES: The aim is to empower artists to express and visualize their ideas and creative capabilities in an optimal way. METHODS: We suggest using natural user interfaces (NUIs) and discuss suitable interactions for different previs tasks. We developed and evaluated a series of individual prototypes as well as a central overarching prototype following our NUI concepts. RESULTS: The results show that our NUI-based interaction methods were perceived as highly positive, and experts found them valuable for their work. CONCLUSION: With only a brief familiarization phase, NUIs can provide a convenient substitute for traditional design tools that require long training sessions.

Received on 04 March 2021; accepted on 15 March 2021; published on 16 March 2021

Keywords: Previsualization, Natural User Interfaces, Virtual Reality, Film, Theatre, Visual Effects

Copyright © 2020 Rainer Malaka et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution license, which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.

doi:10.4108/eai.16-3-2021.169030

1. Introduction

Previs is known as an indispensable and critical process that could increase the production value of a project [1, 2]. It has been adopted for decades in disciplines such as film, product design, and architecture and is growing in popularity because of technological improvements [3]. Most of today's previs work is handled digitally, and despite all the potential advantages for production, it is time consuming and requires trained personnel. The tools used for the purpose of previs are frequently complex 3D tools such as Maya¹, AutoCAD², and Blender³, and recently game engines such as Unity⁴ and Unreal⁵ [4]. The use of such tools is time consuming and requires proficient technical experts, excluding the non-technical, imaginative users, namely artists, designers, and directors, from the previs process. This requires various design iterations and constant communication between the technical experts and the creative staff [5]. This can be problematic because most creatives are used to being in control of the design process by drawing, writing, or sketching.

¹ https://www.autodesk.com/products/maya
² https://www.autodesk.com/products/autocad
³ https://www.blender.org/
⁴ https://unity3d.com/
⁵ https://www.unrealengine.com

Current 3D software programs used by experts for previs purposes provide an extensive set of features and functionalities, many of which serve highly specific use cases that are not essential for previs. The collection of all these features within a 2D interface makes these tools complex and can frustrate users who intend to use them only for the purpose of previs. Furthermore, several specific functions required for previs may not be supported by standard tools, and a great deal of extra effort might be needed to find workarounds.

In order to design a previs software that particularly provides the necessary functions for the previs process, investigating the tasks that are specifically carried out in digital previs is crucial.

In a survey study amongst practitioners [5], we identified and analyzed a persistent collection of previs tasks and the main functionalities that a reliable previs software should support. The user base of previs applications also comprises a broad range of users in different roles who have different workflows and stories: what is wanted for animation is not necessarily what is wanted for theatre [5].

We believe that users should have the possibility to choose the interaction method they use for different previs tasks and select the one best fitted to their creative needs. This way, artists can express their creativity and focus on their previs task without having to face an inflexible interface that does not specifically consider their needs. To involve non-technical people more in the previs cycle and empower artists to use a digital previs software, we propose using natural user interfaces (NUIs) for previs tasks to empower users in how they can express themselves in an optimal way depending on the task they want to perform in the previs cycle. NUIs have the potential to minimize the effort needed to translate a user's actions and provide increased direct natural expressiveness [6].

In the pages that follow, we examine the potential of NUIs to support intuitive interfaces for the purpose of previs. Moreover, we provide a comprehensive analysis of digital previs in practice within the application domains of film, animation and theatre. We then propose recommendations to support such tasks with NUIs and create the natural interaction concept by mapping both users' needs and the previs task characteristics to meaningful and natural interaction techniques. In order to examine our interaction concept, we developed and evaluated a series of individual prototypes which we refer to as 'vertical prototypes', as well as a central overarching prototype which we call the 'horizontal prototype'. The prototypes were designed within the framework of the EU project "first.stage". First.stage is a previsualization project that researches and designs natural user interfaces and tools for previs in order to make it accessible to non-technical users.

Using these prototypes, we aim to find out how to integrate NUIs into the previs process of creative personnel and how NUIs affect the production process in comparison to common 3D software.

The main contribution of this work is to provide a natural user interface concept that is built around a set of interaction methods which are chosen specifically for the nature of the diverse previs tasks. The concept is based on the requirements of the application areas and incorporates a detailed review of the literature.

2. Related Work

2.1. Natural User Interfaces

The term natural user interfaces refers to a broad collection of interactive technologies that offer rich ways of interacting with the digital world using existing human capabilities for communication and the human capacity to manipulate the physical world [7]. The term still lacks a common definition, since its understanding is subject to the constant development of the technology [8]. However, in the literature there are recurring aspects of NUIs that appear to establish a shared view on natural user interfaces. NUIs utilize the capabilities of human beings to express themselves with their body movement, voice and gestures, and are thus described as intuitive [9, 10]. The term NUI encapsulates any technology that allows users to interact in a more natural or humanistic way with computer systems [11].

Early research on NUIs dates back to the 1980s, when the "Put that There" system, an interface controlled by voice and gesture, was introduced [12]. In the beginning, the term NUI was mainly used for (multi-)touch gestural interaction. Eventually, researchers and designers worked with different modalities such as speech, gesture, 3D manipulation, and virtual and augmented reality (AR), expanding the terminology. Some popular consumer products that have frequently been mentioned with reference to natural user interfaces are the Wii⁶ and the Kinect for Xbox⁷, both vision-based devices that indirectly or directly support full-body movements for system input [13].

⁶ http://wii.com/
⁷ https://www.xbox.com

The premise of NUIs is to let computers understand the innate human means of interaction and not force humans to learn the language of computers [14]. They aim to provide a smooth user experience where the technology is almost invisible [15] and to let users learn the interactions as quickly as possible. Liu describes the characteristics of NUIs as being user-centred, multi-channel, inexact, high-bandwidth, voice-based, image-based, and behaviour-based [16].

A very similar concept to natural user interfaces is Reality-based Interaction (RBI) by Jacob et al. [17]. They proposed RBI as a unifying concept that ties together a large subset of the evolving human-computer interaction techniques. The RBI framework makes it possible to discuss aspects of the multitude of current user interfaces "beyond the desktop" in a structured way by identifying and analyzing aspects of reality and computational power that are useful for interaction. The authors argue that the new interaction styles make use of users' pre-existing knowledge of the everyday, non-digital world.

Their assumption is that "basing interaction on pre-existing real world knowledge and skills may reduce the mental effort required to operate a system because users already possess the skills needed" [17]. Although the use of humans' experiences from the physical world can be seen to a stronger degree in most recent user interfaces, computer tools and applications would not be as powerful if systems were based purely on reality-based themes. The "computational powers" of computers, such as Expressive Power, Efficiency, Versatility, Ergonomics, Accessibility and Practicality, are undoubtedly advantageous [17]. This makes it challenging for current user interface design to combine and find a balance between computational power and reality. Jacob et al. propose that "the goal is to give up reality only explicitly and only in return for other desired qualities" [17].

2.2. NUIs for Performance Animation and Modelling

Common animation systems are too complex for the high demand in animated content today. To improve the accessibility and efficiency of such systems, researchers suggest an HCI perspective on computer animation [18]. An interaction perspective on computer animation can help to construct a design space of user interfaces for spatiotemporal media. Many researchers have investigated the use of NUIs for performance animation and modelling. Lee et al. presented implementations of full-body input as a natural interface to character animation by extracting user silhouettes from camera images [19]. Chai and Hodgins [20] introduced an approach for performance animation based on optical marker tracking.

An affordable approach to performance animation is presented by Walther-Franks et al. [21] with the "Animation Loop Station", allowing users to create character animations layer by layer by capturing users' movements with Kinect sensors. In their proposed system, a speech interface is included so that users can fluently work on their animation without the need for a graphical interface. In another work, Walther-Franks et al. [22] introduce the Dragimation technique, which allows users to control timing in performance-based animation on 2D touch interfaces, where they can directly interact with the characters instead of a timeline. The authors found that Dragimation performs better with regard to learnability, ease of use, mental load, and overall preference compared to timeline scrubbing and a sketch-based approach. This system was inspired by the work of Moscovich et al., who introduced a rigid body deformation algorithm for multi-touch character animation [23].

An interesting approach to augment one's own character animations with rich secondary animations is the combination of performance animation with physics simulation. However, the combination of both technologies is not trivial. An approach that combines motion capture using the Kinect sensor with physics simulation for character animation is presented by Liu et al. [24]. They introduce a framework that combines both technologies to prevent conflicting inputs from the users' movements and the physics engine. Their approach and framework have been extended by Shum and Ho (2012), who present a more flexible solution to the problem of combining physical and motion capture information [25].

Another area of complex 3D interaction is the field of digital modelling and sculpting. There is a strong need for tools that allow for natural expression in digital content creation, especially for previs, as many productions start with zero assets. This means that assets, objects, and props have to be created from scratch most of the time. For instance, Herrlich et al. investigated interface metaphors for 3D modelling and virtual sculpting [26]. The first implementation of virtual sculpting was presented by Galyean and Hughes (1991) [27]. They used a custom force-feedback system that translated absolute positions into a 3D mesh. The approach was further extended by Chen and Sun (2002) and Galoppo et al., who implemented virtual sculpting using a stylus device and a polygonal mesh instead of a voxel approach [28, 29]. Wesson and Wilkinson implement a more natural approach by using a Kinect sensor for deformation of a virtual mesh, while also integrating speech commands for a more fluent user experience [30].

Natural animation can also be approached by using Virtual Reality (VR) technology. For instance, Vogel et al. designed a VR system for animation where users work with a puppeteering metaphor for character animation [31]. They evaluated their tools with animation experts and found that the approach improves the speed of the workflow and supports fast idea implementation.

3. Previs Task Analysis

Typical previs tasks include scene layout, camera work, animation, and special effects [32]. To better develop a software that is specifically suitable for the purpose of previs and covers the fundamental functions needed for this process, we examined the tasks that are specifically carried out in digital previs [5]. In order to do this, we first collected user requirements by interviewing experts from the domains of film (n=5), animation (n=3) and theater (n=8), using scenarios, personas, workflow generation, prioritization, and categorization, and extracted the everyday tasks and workflows carried out in digital previs. After we gathered the requirements from the experts in the interviews, we arranged a workshop with the experts of the domain.

We used the MoSCoW analysis [33] to obtain the functional requirements which were not identified during the interviews. We outlined a set consisting of the requirements which were collected within the interviews and the workshop. We produced index cards and employed color coding to mark the origin of every collected requirement for reasons of clarity and organization.

Overall, we identified a set of 118 requirements. Based on the collected information, we identified 20 functionalities as core features that an efficient previs software should support. However, one cannot directly translate all these core features into previs tasks, since a few of them are general functions such as the ability to support multi-user work and visual effects (VFX). Hence, we reduced the number of core functionalities to those that are essential for previs and can be applied to all application areas.

This reduction resulted in the 9 core previs tasks: Sketching (Modelling), Project Structure, Shot Management, Import/Export, Camera Control, Assets and Layout, Animation, Lighting and Posing, and Visual Effects. For further information regarding this procedure, please refer to Muender et al. [5].

4. Natural Interaction Concept

In this section, we present our natural interaction concept that builds the foundation for the implementation of our vertical and horizontal prototypes. We designed our core interaction techniques based on the core previs tasks that we identified, each one considering the natural aspects of the task and its context of use.

In Figure 1 we provide a graphical overview of our overall interaction approach based on the previs tasks mentioned earlier. We open up a 2D/3D space where we fit the task affordances to 2D and 3D interaction techniques and select the hardware correspondingly. Our ideas and concept development are driven by the variety of feedback on naturalness during the requirements elicitation that was performed and by our first-hand experiences with users who tested our prototypes. We further present additional interaction methods that complement our concept generation.

4.1. Interaction Techniques

It is crucial for a previs software to include the core functionalities of a previs task, "those functionalities of the product without which, the product is not useful for the users" [33]. Based on the requirement analysis, we suggest direct manipulation via touch on 2D interfaces, 3D direct manipulation, object-oriented user interfaces, spatially-aware displays, tangible interaction, AR, full-body and embodied interaction, hand-based interaction, and speech as interaction techniques that are suitable for a NUI-based previs software. Here we discuss these techniques in more detail and show how they fit the respective tasks.

Direct Manipulation via Touch. 2D interfaces that offer direct manipulation are intuitive and provide high accuracy. With the high penetration of iOS and Android touch devices, these interfaces have proven to be very intuitive for a large user base and naturally capture user intent on 2D screens via direct manipulation metaphors and 2D gestures. For tasks such as import/export, shot management, and project structure, a 2D touch interface is suitable and natural as it benefits from the current mental model that users employ for 2D content manipulation.

3D Direct Manipulation. Direct manipulation is well suited for the interaction with 3D content as it is presented in a model-world interface which reflects the real world. Having real-world metaphors for objects and actions can make it easier for a user to learn and use an interface. On top of that, rapid, incremental feedback allows a user to make fewer errors and complete tasks in less time, as they can see the results of an action before completing it. The user of a well-designed model-world interface can wilfully suspend belief that the objects depicted are artifacts of some program and can thereby directly engage with the world of the objects. This is the essence of the "first-person feeling" of direct engagement. This interaction technique is natural and well suited for all tasks that require object manipulation in 3D space using a stereoscopic view, such as modelling, assets and layout, camera control and motion, animation and posing, and lighting.

In practice, there can be situations when it is not possible for users to interact with certain objects directly. This is especially relevant for groups of objects which can be spread across the scene, objects that are far away from the user or too small for direct interaction, and objects that are completely or partially occluded by other objects. In order to overcome such issues, we implement an addition to the direct manipulation concept: surrogate objects. A surrogate object is a reification of one or more domain objects that the user intends to interact with [34]. Unlike the object within the scene, a surrogate object can always be presented in the field of view of the user and within the interaction space, reachable by touch or VR controllers. By extending the direct manipulation concept in such a way, the limitations of direct manipulation can be overcome [34].
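As an illustration of the surrogate-object idea (a minimal sketch of our own, not code from the first.stage implementation; class and method names are assumptions), a surrogate simply forwards manipulations to the scene objects it stands in for:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class SceneObject:
    """An object in the 3D scene that may be distant, tiny, or occluded."""
    name: str
    position: Vec3 = (0.0, 0.0, 0.0)

@dataclass
class SurrogateObject:
    """Proxy shown within arm's reach; forwards edits to its domain objects."""
    targets: List[SceneObject] = field(default_factory=list)

    def translate(self, offset: Vec3) -> None:
        # A manipulation applied to the surrogate is applied to every target,
        # so groups or out-of-reach objects can still be edited directly.
        for obj in self.targets:
            obj.position = tuple(p + o for p, o in zip(obj.position, offset))

# Usage: manipulate a group of far-away trees through one nearby surrogate.
trees = [SceneObject(f"tree_{i}", (50.0 + i, 0.0, 80.0)) for i in range(3)]
surrogate = SurrogateObject(targets=trees)
surrogate.translate((0.0, 0.0, -5.0))  # move the whole group towards the camera
```

The same pattern applies whether the surrogate is touched on a tablet or grabbed with a VR controller, which is what keeps the interaction consistent across devices.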


Figure 1. From previs tasks via interaction techniques to technologies

Object-oriented User Interfaces. State-of-the-art 3D tools often come with overloaded Windows, Icons, Menus and Pointer (WIMP) interfaces. Such interfaces are not suitable for typical users of previs software. Previs tasks require specialized training and are often time consuming. As different input and output devices often require an entirely different interface, the user may be required to learn three different interface variants. These arguments contradict the requirement that the software should be suitable for non-technical persons and quick to learn. A common practice in standard WIMP interfaces is to use buttons for all possible actions. If an object is selected, the actions that are not supported for this object are deactivated (greyed out) but still presented to the user. This leads to cluttered interfaces, which can easily overwhelm users. In addition, buttons are commonly spread across the interface, leading to a loss of association with the object that the user wants to manipulate.

Based on these observations, we implement object-oriented user interfaces (OOUI). These are types of user interfaces in which the user interacts explicitly with objects that represent entities in the domain of the application. They can be seen as the counter-approach to function-oriented interfaces that are normally used in 3D applications. We propose a system in which objects have a dedicated interface displaying only the actions available for this specific object. The interface is positioned relative to the object it corresponds to, rather than at fixed button positions in a static WIMP interface. Object-oriented user interfaces have proven more user friendly compared to other interface paradigms and provide several advantages in terms of usability [35]. In addition, the relation to the object which will be affected by an action is easier to understand. Users can also classify objects based on how they are presented and behave. In the context of what users are trying to do, all the user interface objects fit together into a coherent overall representation. OOUIs can reduce the learning curve for new users as only relevant actions and options are displayed. It is possible to display an object-related interface with a similar visual representation on different output devices such as tablet or VR. This can be rendered as part of the application and does not need a WIMP GUI. Furthermore, the interaction with such an interface can be designed to be similar across different input devices.

Spatially-aware Displays. These are displays that have information about their position and orientation in the room, either from relative differences (gyroscope and compass data) or from access to absolute position and rotation data using tracking devices. Using such displays, which are mainly tablet devices, one can create direct access to a virtual scene by putting the control over the virtual camera directly into the hands of the users. They can then move this device across the room in order to change the position and orientation of the virtual camera. This approach can be complemented with basic 2D gestures in order to extend the interaction space. Previs tasks that make the most of this interaction technique are camera work and camera motion. Using spatially-aware displays, users can position cameras in a virtual scene and navigate in 3D while at the same time having the resulting shot visible through the display at all times for high efficiency. Thus, it is very natural to frame a shot in a virtual scene using real-world references by walking in a real room and orienting the device as needed.
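As a rough sketch of the spatially-aware display idea (our own simplified illustration, not the first.stage code; the tracker interface and the optional world-scale factor are assumptions), the virtual camera mirrors the tracked pose of the tablet:

```python
import numpy as np

def virtual_camera_pose(tablet_position, tablet_rotation, world_scale=1.0):
    """Map the tracked tablet pose to a virtual camera pose.

    tablet_position: (3,) array in room coordinates, e.g. from an external tracker.
    tablet_rotation: 3x3 rotation matrix of the device.
    world_scale:     factor to cover virtual scenes larger than the real room.
    """
    camera_position = world_scale * np.asarray(tablet_position, dtype=float)
    camera_rotation = tablet_rotation  # 1:1 orientation mapping to the device
    return camera_position, camera_rotation

# Walking one metre forward with the tablet moves the virtual camera two metres
# when the scene is navigated at double scale.
pos, rot = virtual_camera_pose([0.0, 1.5, 1.0], np.eye(3), world_scale=2.0)
```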

Marker Positioning (Mobile AR). With augmented reality, real images and projections of real scenes on screens such as mobile phones and tablet devices can be augmented with digital information. This bridges the real world with the virtual world, allowing for an immersive experience. Marker-based AR could be used for layout-related tasks that require multiple users to work together in a virtual scene. The markers connect physical objects with digital media and enable interaction with the data via interaction with the objects. These interfaces are intuitive and easy to learn since users can directly observe the actions on the digital data, which map directly to their own actions.

Full-body Interaction and Embodiment. Embodied interaction relies on the integration of interaction between humans and computers into the material and social environment. The recording of motion capture data allows for full-body interaction and embodied performance animation through inertial sensors that can be worn in the form of a suit. This makes it easy and natural to record character animations since users can benefit from using their own body as input, directly transferring their motion data to virtual characters. In several previs cases, and especially for character animation, this can make complex and expensive keyframe animation techniques superfluous. Such interaction techniques are natural as they do not rely on a translation between user intent and action. This approach has a low learning curve regarding the animation, and almost no graphical interface is needed.

Hand-based Interaction. With hand-based interaction, users can perform manipulations of digital objects using their own hands without digital tools, by tracking the hand and finger motion and applying the data to the manipulation of the virtual object. We employ hand-based interaction for rapid prototyping and modelling of 3D content and objects in the modelling and layout tasks. Here we focus on the workflow of modelling and layout, where we support users in "getting their ideas quickly out of their head" by removing the graphical user interface (noUI), allowing them to concentrate on implementing their vision in either a 3D model or a scene layout.

Speech. Speech commands are used as a supportive layer that can be integrated as a secondary interaction to complement a primary interaction. For instance, in layout, users can filter and select from the asset library using speech commands. Generic commands include deletion, menu interaction, and basic manipulation tasks that are easy to express by speech but hard to express through manual interaction, such as flipping an object, switching to a specific camera view, or switching between different views and zoom factors.
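As a minimal sketch of such a supportive speech layer (our own illustration; the command set, patterns, and action names are assumptions rather than the prototypes' actual vocabulary), recognized phrases are mapped to generic scene actions that complement the primary interaction:

```python
import re

# Hypothetical generic commands complementing touch/controller interaction.
COMMANDS = [
    (re.compile(r"\bdelete\b"),             "delete_selection"),
    (re.compile(r"\bflip\b"),               "flip_selection"),
    (re.compile(r"\bcamera (?P<id>\d+)\b"), "switch_camera"),
    (re.compile(r"\bshow (?P<tag>\w+)\b"),  "filter_assets"),   # e.g. "show chairs"
]

def interpret(utterance: str):
    """Return (action, parameters) for a recognized speech command, else None."""
    text = utterance.lower()
    for pattern, action in COMMANDS:
        match = pattern.search(text)
        if match:
            return action, match.groupdict()
    return None  # unrecognized input is ignored; the primary interaction is unaffected

print(interpret("Show chairs"))         # ('filter_assets', {'tag': 'chairs'})
print(interpret("Switch to camera 2"))  # ('switch_camera', {'id': '2'})
```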

4.2. Interaction Principles

Having introduced our core interaction techniques, we now present interaction principles that enhance the interaction by implementing the notion of "how" the interaction in our tools should be. We organize these into general principles that should apply to all prototypes and specific principles that only apply to certain, focused prototypes presented in this paper.

General Interaction Principles.

Task Context. Depending on what users are doing, it is important to find a natural interaction for each specific scenario. An interaction is natural depending on its context of use. Organizing a digital project, such as files in Windows Explorer or Finder, is more natural on a 2D interface as it stems from the digital domain of graphical user interfaces that most computer users are familiar with. As Jakob Nielsen suggests, it may be difficult to control a 3D space with the interaction techniques that are currently in common use for 2D manipulation⁸. It would be very different when trying to organize, copy, duplicate or move files in VR using other than 2D metaphors. An example for this is the "Minority Report vision" that has often been used to exemplify the use of gesture and 3D interfaces for desktop tasks. Studies have shown that these kinds of tasks are slower to perform in 3D and are cognitively more complex to achieve than with 2D GUIs [36]. We pick up on this notion and motivate a natural use depending on the context. For instance, arranging 3D objects in space as well as exploring spaces, getting a sense of scale, and picking shots for cameras are most naturally done in VR. On the other hand, project organization is best done on 2D interfaces. Transforming and working with digital content for organization is, as previously stated, most natural using GUIs. Another example is motion capture. Animating humanoid characters can be done in different ways. It can be done by key-framing a 3D character or by drawing frame by frame. Both have advantages and disadvantages, but looking for a natural way of interaction, using one's own body is the only way of having a direct 1-to-1 relationship between user intent and desired outcome. Using the motion suit, users can work with their own body without having to understand complex key-framing or drawing techniques, making it more natural to create animation content and providing a low learning curve.

⁸ https://www.nngroup.com/articles/2d-is-better-than-3d/

Easy to Use. In order to make a previs software accessible to users with little technical knowledge, special aspects should be considered to make the software easy to use. Interactions should not be designed to be as fast as possible but plausible and intuitive to the user. The system should provide feedback for actions and help users to follow the intent. This is especially important when an action involves multiple sequential interactions. In order not to overwhelm the user with options, only a minimal amount of possible options should be shown, which can be achieved by using interaction scaffolding and nested interactions.

Consistency. The interactions should be consistent within the application. Ideally, users only have to learn a minimal set of interactions. It should be avoided to use different interaction schemes for the same tasks in a different context. For example, positioning an object by "grab and place" should work the same way in layout mode and in animation mode. The interaction should be as consistent as possible across different input devices. This will help the user to seamlessly switch between devices and not have to learn or remember special interactions for each input method. In some cases, this will not be possible as the different input devices provide different input modalities and degrees of freedom. But a primary interaction (left click, tap or trigger button) should perform the same action on all devices. Furthermore, the interaction should be consistent with other applications from the field, so that a user who has learned to interact with another tool does not get confused by a completely different interaction scheme. This means the application should not break with interaction standards from the field, e.g. it should support drag and drop.

Feedback. The system should provide feedback for every action performed by the user. Visual feedback should be provided on all hardware platforms. If the object the user is interacting with is currently not in the field of view, and visual feedback is not visible, assisting indicators at the edges should indicate where the interaction is happening. Haptic feedback should be utilized when interacting in virtual reality, using the vibration functions of the controllers. Audio feedback is also helpful in certain situations, especially when something is happening outside the field of view.

Nested Interaction. Rather than putting all actions onto different buttons, the system should utilize nested interactions, which chain the selection of an action, parameter finding and performing the action into a series of small, lightweight interactions. Sequenced actions should be designed so that experienced users can perform them very quickly and do not perceive them as impairing. The buttons in a sequenced action should be positioned at close distance, following the direction of motion.

Specific Interaction Principles.

Reality-based. The interaction in 3D should orient itself towards the interaction with objects in the real world. As interaction in 3D space is still novel and not known to many users, it should utilize their everyday knowledge [17]. This applies to positioning and rotating objects: small objects like a bottle can be grabbed, rotated and placed with one hand. Bigger objects like a table or a house, on the other hand, have to be pushed and rotated using two hands. Falling objects stop when in contact with the ground or other objects. This behaviour should be supported as it is perceived as natural by the user. Another aspect that should be reality-based is the interaction with buttons, knobs and slider elements of the object-oriented interface. A clear 3D representation of the interactable object should be provided, comparable to real-life light switches, volume knobs and radio controls. The visual representation should present feedback of the current state of the control.

Playful and Fun. Playing is intrinsically motivated and autotelic [37]. When users are provided with a user interface that supports playful expression, they can explore, be creative, try different solutions, and find joy and amusement even in productive contexts. This increases long-term motivation and lowers the frustration barrier [38], which is achieved by more pleasurable interaction rather than optimizing for speed and being goal oriented. Playful interaction invites users to discover features rather than frustrating them with an overloaded interface. Therefore, it is suitable to support novice users and users with little technical knowledge.

Creativity Support. In order to support the creative process of the user creating a virtual set, animation or camera shot, the interaction with the software can be designed accordingly. The system should invite users to interact with objects in a natural manner rather than telling them what to do. This can be achieved using affordances [39]. Designing subtle affordances invites the user to discover through exploration. Presenting users with different viewpoints on a scene or with the sequence of their actions can support them in their iterative and evaluative process.

Rapidness, Accuracy and Precision. The interaction used should be rapid, avoiding time-based interaction and large motions. Time-based interaction disrupts the workflow, especially for experienced users. For instance, "look and hold" or "point and hold" interactions can easily frustrate users. In contrast to the general schema of rapid interaction, it should be possible to improve precision with additional effort, for instance by scaling up in order to work on detailed structures.


Multi-User. Handing over a project to others is complicated and requires a lot of management. However, working together in the same physical context, where a common understanding is shared in the physical space through observation and communication of intentions and tasks, is natural. This natural interaction can be achieved by providing input to the system that is not bound to one device. For instance, in a multi-user scenario, different devices can be used in a shared context in order to achieve a common goal.

5. Concept Implementation and Evaluation

In this section, we present research prototypes developed and evaluated during the project. These prototypes were built to evaluate individual interaction concepts and principles.

Figure 2. Camera prototype with yellow virtual camera (right). View finder is attached to left controller (left).

5.1. Virtual Reality for Previsualization

In order to support non-technical professionals in working with 3D content and to allow for more expressive interaction, we suggest using virtual reality for previs. In contrast to traditional 2D interfaces, VR offers immersion, the illusion of embodiment, and the illusion of physical interaction for a more natural interaction (reality-based). We developed two prototypes in VR and conducted a user study with non-technical domain experts to investigate how they experience the use of VR for the purpose of previs.

Our first prototype was a tool that focused on camera work for taking shots. The second one was meant for exploring stage designs and experimenting with stage equipment in a theater. These two scenarios were chosen as they interest professionals from all relevant domains such as film, animation, and theater by covering important aspects of previs such as exploring a virtual scene, layout, and shot finding.

Our results showed that the participants were predominantly positive towards VR for previs and rated it as a useful technology. All users could perform typical 3D previs tasks even after a very short learning time. Using the prototypes also convinced participants of the practical use of VR for previs and removed certain doubts [3].

5.2. Tangible Scene Design

Using physical objects and models to visualize scenes is a common practice among professionals. Technologies such as VR have the ability to provide immersive experiences with an accurate depth and scale perception. Nonetheless, such technologies usually lack the tangibility aspect. In this work, we combined VR technology with tangibles to join the immersive experience with the intuitive controls of miniature models, following the concepts of 3D direct manipulation and full-body interaction and embodiment. We aimed at providing an interface where non-technical users can intuitively create previs content (creativity support) using natural forms of interaction to create and delve into the 3D scenes.

Figure 3. Left: The tangible scene design concept. Right: The 3D-printed tangibles with the highest fidelity

The prototype was used to evaluate how tangibles with distinct haptic fidelity can affect the performance, immersion, and intuitive interaction for creating a 3D scene in VR. In a user study (N = 24), as well as an interview with eight experts in previs, tangibles with distinct production processes were compared (uniform objects without resemblance to the digital object, and 3D-printed and Lego-built tangibles similar to the digital object).

Our results showed that Lego was preferred as it provided fast assembly and adequate fidelity. No significant differences were observed in terms of haptics, grasping precision and perceived performance. Professionals pointed out that digitally planning and adapting scenes is a major benefit that could save a lot of time (rapidness) [40].

5.3. Collaborative Scene Design across Virtual Reality and Tablet Devices

Through the use of head-mounted displays (HMDs), users can perceive 3D content with a believable depth sensation, being able to walk around and look around virtual objects. Further, room-scale tracking allows for natural movement (in a restricted area) and object manipulation in a natural and direct way.


VR controllers are used as extensions of one's own body, enabling easy and intuitive ways of interaction by grabbing, holding, rotating and placing objects in VR. However, there are aspects of VR that limit the usefulness of the approach. Specifically in collaborative design, where several users work together on a common design task, VR may restrict the collaboration. For instance, in VR users are often isolated and are not at the same location as other users. To share a common working space and work together, they are commonly connected with other users in VR. This interaction is usually done remotely rather than in a shared office space, as most places do not host more than one or two VR sets. This may have effects on the collaboration and work distribution between users.

In order to address these concerns, we aimed at combining the advantages of VR and tablet interaction to build a system where users can collaborate using both devices in a shared environment, utilizing direct manipulation via touch and 3D direct manipulation. We built a prototype where two users can work together simultaneously (multi-user) using VR and a tablet device in an attempt to improve work efficiency and rapidness (see Fig. 4). In a lab study with 18 participants (9 teams) we investigated the impact of the device-dependent interactions on user behavior. Results showed that there are device-dependent differences in the interaction style that also influence user behavior. For instance, tablets were primarily used for overview and rough positioning, while VR was mainly used for smaller object manipulation. However, we did not observe a device-dependent "lead" role, as the tasks were mainly distributed and worked on in parallel [41].

Figure 4. Witch house scene from the tale Hansel and Gretel created during the study

5.4. Tablet Camera Prototype

Following the concept of spatially-aware displays, the tablet camera prototype provides the view through a virtual camera into a scene. The camera can be controlled by physically moving and turning the tablet. In addition, the camera can be moved forward and backward in the view direction using a slider. The camera motion can be recorded and replayed. This prototype is intended to give a more natural feeling (reality-based) while controlling virtual cameras by giving the user the tablet as a physical device that acts as a viewfinder for the camera. The prototype was evaluated by three experts in the context of film productions. The participants got an introduction to the tool and afterwards got the task to record a defined camera motion. In the end they were free to continue using the tool on their own. Semi-structured interviews were conducted with the participants to find out about their experience using the tool.

The results showed that the tool was perceived very positively by the experts, and they found it intuitive as they are used to having something physical in their hands as a viewfinder. All participants were able to record the defined camera motion in under five minutes.

Figure 5. The tablet camera prototype
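A minimal sketch of the record-and-replay idea behind this prototype (our own illustration, not the actual implementation; orientation handling is omitted for brevity): the tablet position is sampled while recording and later interpolated for playback.

```python
import numpy as np

class CameraRecorder:
    """Record time-stamped camera positions and replay them by interpolation."""

    def __init__(self):
        self.times, self.positions = [], []

    def record(self, t: float, position) -> None:
        # Called every frame while the user moves the tablet through the room.
        self.times.append(t)
        self.positions.append(np.asarray(position, dtype=float))

    def replay(self, t: float):
        # Linear interpolation between the nearest recorded samples per axis.
        ts = np.array(self.times)
        ps = np.stack(self.positions)
        return np.array([np.interp(t, ts, ps[:, k]) for k in range(ps.shape[1])])

rec = CameraRecorder()
for t, x in [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0)]:
    rec.record(t, [x, 1.6, 0.0])   # dolly forward at eye height
print(rec.replay(1.5))             # position between the last two samples
```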

5.5. Embodied Interaction for Animation Capture

Another important part of previs is character animation. Only by providing believable and interesting animations does previs become a convincing tool for production planning. However, the creation of character animation is a task for expert animators using complex tools that require years of experience. Motion capturing could be a possible solution for such problems. However, there are still several issues with motion capture. One of the biggest is that individuals who work alone have problems animating a scene completely without context. When wanting to create a scene with multiple people interacting with each other, the different animations of each character have to be recorded with the other movements in mind so that the complete scene makes sense after editing. In order to overcome this issue, we developed a VR user interface for capturing embodied animations.

Embodied interactions resemble bodily experiences that every human is familiar with (easy to use, reality-based). In contrast to traditional interfaces for motion capture, this system enables users to record animations from the perspective of their own body, and to slip into any other body (human or not) and perform animations from its perspective (see Fig. 6). We performed a preliminary user study with 16 participants. Most users quickly became familiar with the VR user interface and interaction styles. We observed that even older participants could sufficiently work in VR after a short familiarization phase. Nonetheless, we observed the need for another interface when it comes to integrating the tools into a larger working context.

Figure 6. Users had to perform two basic and two advanced tasks using a motion capture suit. Basic tasks included interaction with the two objects, a green sphere and a red box.

5.6. Anticipation Cues for Motion Capture

To further develop the embodied interaction for animation capture, we integrated anticipation cues in the form of target indicators, progress bars, and audio feedback to help the user record animations in sync with previously recorded animations or other animations in the scene. To design a prototype integrating the anticipation cues, we interviewed three experts from the animation domain. These cues offer multi-sensory indicators for where and when to start the next animation. In a first step, the anticipation cues can be set by the user at the desired time and location. When going into record mode, the cues are displayed in the field of view of the user at the target location and are audible as sounds with increasing frequency, as well as perceivable through the vibration of the controller.

The prototype was evaluated with 20 students to identify differences between multi-sensory anticipation cues and only visual cues. The results showed a clear preference for multi-sensory anticipation cues. Participants stated that they would not have been able to record the more complex animations without the cues.

Figure 7. Anticipation cues: When recording an animation, the cues are displayed in the field of view of the user at the target location.

5.7. Playful Shooter for Fast and Easy Scene Design

Generating self-made content, e.g. for video games, allows players to creatively extend their gaming experience with additional maps or mods. Besides the player enjoyment of the additional content, production studios can benefit from user-generated content since long-term motivation and playability are supported while, at the same time, production costs for studios are minimal. For this, games that support user content are often accompanied by tools for map or mod making like Valve's Hammer Editor⁹. However, these world or map builders can be cumbersome to use because of complex user interfaces with a high learning curve and limited UX. In order to overcome this, we introduce a world builder game where the process of creating a 3D scene is play itself. Users are able to navigate directly in the 3D scene, placing objects in a playful manner by shooting them into the scene for placement and manipulation. We implemented four weapons: a gun for physics-enabled placement, a laser gun for more precise positioning and manipulation, a sniper rifle for far-distance interaction, and a hand grenade for spawning multiple objects at the same time using an explosion, all allowing for playful interaction with the 3D content. We assumed that users would find the design of 3D scenes easier and more fun, feel more creative, have less difficulty positioning objects in 3D, and have a lower learning curve compared to standard editors.

⁹ https://developer.valvesoftware.com/wiki/Valve_Hammer_Editor
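As a rough sketch of the "placement by shooting" idea (our own simplified illustration; the flat-ground assumption and function names are not taken from the prototype), an object is spawned where the shot ray hits the ground, either with physics enabled or snapped precisely:

```python
import numpy as np

def shoot_place(origin, direction, ground_height=0.0, physics=True):
    """Spawn an object where a shot ray hits the ground plane.

    origin, direction: ray of the virtual gun in world coordinates.
    physics:           True  -> 'gun' style, the object settles via physics
                       False -> 'laser' style, the object snaps to the hit point
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    if d[1] >= 0:                        # ray points upwards: nothing to hit
        return None
    t = (ground_height - o[1]) / d[1]    # ray/plane intersection parameter
    hit = o + t * d
    return {"position": hit.tolist(), "physics_enabled": physics}

# Example: firing slightly downwards from shoulder height places the object ahead.
print(shoot_place(origin=[0, 1.6, 0], direction=[0, -0.4, 1.0]))
```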

We conducted an experiment with 17 participants where we compared our game to a simplified version of the Unity editor, representing the usability and workflow of standard 3D tools like Hammer. We were interested in how users perform both in a free building task and in a replication task where they had to rebuild a 3D scene from a printed template. Results show that users would like to use the game more frequently, found it less complex, would not need technical support, and appreciated its learnability and ease of use. Additionally, we found that the hedonic and, interestingly, also the pragmatic quality of the game is rated higher than for the editor. This came as a surprise since the editor is highly task oriented and pragmatic in its design, functionality, and user experience [42].

Figure 8. Overview of the implemented guns (clockwise): pistol, laser, sniper, grenade.

5.8. Rapid No-UI Modeling in VR

Creating 3D content, and especially the aspect of modeling in 3D, can be a challenging task for novice users. Available applications such as Blender or Maya are complex tools that have a high learning curve for beginners. As much as these applications are ideal for their purpose, that is, the creation of complex models including shaders, lighting, animation, etc., beginners can feel overwhelmed by the user interface and the manifold options.

To address these problems, we looked into how to make the creation of 3D models and prototypes more natural and approachable for novice users in order to create a pleasurable experience that motivates diving deeper into the world of 3D modeling. Rather than creating detailed and complex models, the focus was on creating 3D models that represent a rough version of the models that users have in mind, by offering rapid prototyping with natural ways of interaction. Therefore, we explored the use of hand-based interaction in combination with drawing gestures. Different mid-air gestures can be used to create primitive objects, e.g. cubes, spheres, and pyramids. These objects can be grabbed through direct manipulation and assembled into more complex structures.

The prototype was evaluated by eight participants through structured interviews. The results show that the prototype was successfully used to create a variety of scenes in a short time. However, we could also show that gestures that are too similar quite negatively influenced the mental workload and the user experience in general due to higher rates of mis-detection. However, this was mainly due to inaccurate and unstable tracking, and the qualitative feedback was promising.

Figure 9. Demonstration of drawing a sphere gesture in mid-air using the starting gesture

5.9. Playful VR Sandbox for Creativity and Exploration

Playing with sand is a nostalgic activity that reminds one of childhood days, when it opened possibilities for observation, creativity and cooperation. Sand play provides different developmental benefits such as proprioceptive sensing, body awareness, space awareness, and social skills. Such a pleasant activity is not only limited to children; many adults also enjoy sand playing and rediscovering the sensations of this activity. Interactive and augmented sandboxes take this experience to a new level by adding another interactive element to it, increasing the interaction and application possibilities. Simultaneously, people can play with the sand and experience interactive visual feedback projected onto it. We developed VRBox, an interactive sandbox for playful and immersive terraforming. Our prototype combines the approach of augmented sandboxes with modern Virtual Reality technology and mid-air gestures. By exploiting 3D direct manipulation and embodiment, this system extends current approaches by adding a new interaction and visualization layer to interact with the sand and virtual objects. Furthermore, to provide better creativity support, users have the possibility to switch between two perspectives: a tabletop mode for the creation phase, and a first-person mode reached by teleporting directly into their own creation, which allows for a more immersive experience of the landscape in full size.
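A minimal sketch of the underlying terraforming loop (our own illustration under the assumption that a depth camera above the sandbox delivers a height grid; the constants and names are not taken from VRBox): the sampled sand surface is scaled to virtual terrain heights, and the user can switch between the tabletop and first-person scale.

```python
import numpy as np

SAND_MIN, SAND_MAX = 0.02, 0.12   # assumed measurable sand height range in metres
TERRAIN_MAX_HEIGHT = 40.0         # corresponding height in the virtual landscape

def sand_to_terrain(height_grid: np.ndarray) -> np.ndarray:
    """Normalize a sampled sand height grid and scale it to virtual terrain heights."""
    clipped = np.clip(height_grid, SAND_MIN, SAND_MAX)
    normalized = (clipped - SAND_MIN) / (SAND_MAX - SAND_MIN)
    return normalized * TERRAIN_MAX_HEIGHT

def user_scale(mode: str) -> float:
    """World scale factor: miniature 'tabletop' view vs. full-size 'first_person'."""
    return 0.01 if mode == "tabletop" else 1.0

heights = sand_to_terrain(np.random.uniform(0.02, 0.12, size=(64, 64)))
print(heights.max() <= TERRAIN_MAX_HEIGHT, user_scale("first_person"))
```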

To evaluate the VRBox, a qualitative user study was conducted with nine experts from domains including education and game design. Our focus was on the user experience as well as the technical aspects and the possible use cases. Our results indicated highly positive attitudes towards the VRBox, which highlights the immersive and creative experience the system offers [43].

Figure 10. The landscape can be shaped by interacting with real sand (Right). VRBox from first person perspective (Left).

5.10. Grasping Objects in VR

With this prototype, we propose emulating human grip abilities for virtual reality to enable an interaction with virtual objects (3D direct manipulation) that corresponds better to object manipulation in reality. We devised different interaction designs to allow the user to dynamically set the firmness of the grip and thus be able to hold an object firmly or loosely using conventional controllers.

A qualitative pilot study and a quantitative main study were conducted to evaluate the various interaction modes and to compare them to the current status quo of an invariable grip. Pivotal design properties were identified and evaluated in the qualitative pilot study. Many test persons appreciated the suggested interaction's similarity to real object handling. The study participants especially used and valued the variability of their grip in vertical tasks in which the angle between object and hand typically needs to be altered. Two revised interaction designs with variable grip were compared to the status quo of invariable grip in the quantitative study. The users performed placing actions with all interaction modes.

Results showed that the variable grip has the potential to enhance usability, improve realism, reduce frustration, and better approximate real-life behavior depending on tasks, goals, and user preference. The research conducted within the scope of this work contributed valuable insights regarding the manipulation of objects in a VR environment. To be precise, we gained an understanding of how to enrich objects with realistic rotational features in order to enhance overall usability and decrease mental effort as much as possible [44].

Figure 11. Illustration of grip adjustment. The glass is held with a firm grip (a). Then, the user loosens the grip by releasing the trigger half-way: the glass swings downwards according to gravity (b). The position of the controller does not change.

5.11. Drawing for Asset Selection

This prototype examines the feasibility of novel 3D object retrieval user interfaces in the context of VR and previs. Three prototypes were developed: sketch interaction, virtual keyboard interaction and speech interaction. All prototypes shared a common back-end in which assets were stored and associated with tags in a one-to-many configuration. Every retrieval type maps user input to asset tags.

In a within-subjects user study (n=15), we examined the usability of each prototype. The results showed that applications of multimodal queries in an asset retrieval environment are promising. Overall, exploring the design space in a more natural direction proved more successful with the voice interface. Though not in all cases statistically significant, its usability metrics came within closer reach of the keyboard interface. Moreover, the keyboard interface exhibited significantly less task load. A rudimentary object placement system was devised in order to enable testing with complete tasks found in previs workflows.

Figure 12. Top: The user draws a sketch of the lamp. Bottom: Based on the sketch the user can choose between different 3D objects.

5.12. Natural Language for Scene Interaction

This prototype investigates the possibility to arrange objects in a virtual scene using speech, in an attempt to provide fast and easy-to-use interaction. It focuses on understanding relative spatial relations and ambiguous spatial descriptions, e.g., "next to". The prototype was developed using the speech API Wit.ai¹⁰ to process the audio and to identify spatial keywords. A model of people's common understanding of ambiguous spatial descriptions was developed to translate the words into executable actions for the system. This model was developed based on a user study where participants had to place objects according to ambiguous spatial descriptions.

¹⁰ https://wit.ai/
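As a minimal sketch of what such a translation step can look like (our own illustration; the offsets and relation names are assumptions, not the empirically derived model from the study), a recognized spatial keyword is mapped to a position relative to a reference object:

```python
import numpy as np

# Hypothetical interpretations of ambiguous spatial relations as offsets (metres),
# expressed in the reference object's local frame (x = right, z = front).
RELATION_OFFSETS = {
    "next to":     np.array([1.0, 0.0, 0.0]),
    "in front of": np.array([0.0, 0.0, 1.0]),
    "behind":      np.array([0.0, 0.0, -1.0]),
    "on":          np.array([0.0, 1.0, 0.0]),
}

def place(relation: str, reference_position, reference_yaw: float = 0.0):
    """Return a world position for the target object relative to the reference."""
    offset = RELATION_OFFSETS[relation]
    c, s = np.cos(reference_yaw), np.sin(reference_yaw)
    rotated = np.array([c * offset[0] + s * offset[2],
                        offset[1],
                        -s * offset[0] + c * offset[2]])
    return np.asarray(reference_position, dtype=float) + rotated

# "Put the lamp next to the table"
print(place("next to", reference_position=[2.0, 0.0, 3.0]))  # -> [3. 0. 3.]
```

This also illustrates why the annotation mentioned below matters: the reference object's "front" and "up" directions must be known before such offsets can be applied.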

12 EAI Endorsed Transactions on Creative Technologies 12 2020 - 03 2021 | Volume 8 | Issue 26 | e5 Using Natural User Interfaces for Previsualization

to make the environment look more “exciting”. A CUI would be capable of understanding this intention and have an underlying AI that knows what parameters need to be changed in order to get the desired outcome. For this purpose, we imitated an AI with the option to adjust a scene between the states of “cozy” and “scary” and integrated this approach into a simple slider interface. As a first evaluation of this approach, we implemented a prototype and conducted a comparative user study (n=31). We found that CUIs can offer a significantly higher usability and better user experience than traditional interfaces from the domain of virtual content creation.

5.13. Cognitively Enabled Scene Design

This prototype investigates a way to combine the power of modern Artificial Intelligence (AI) with the demands of professional designers to fine-tune their content. It incorporates a novel concept called Cognitively Enabled User Interfaces (CUI). These interfaces provide an innovative and easy-to-use interaction to support creatives by taking the user's cognitive world into account. For example, instead of adjusting the appearance of a 3D scene by tinkering with various menus and options, a scene designer might simply want to make the environment look more "exciting". A CUI would be capable of understanding this intention and would have an underlying AI that knows which parameters need to be changed in order to achieve the desired outcome. For this purpose, we imitated an AI with the option to adjust a scene between the states of "cozy" and "scary" and integrated this approach into a simple slider interface. As a first evaluation of this approach, we implemented a prototype and conducted a comparative user study (n=31). We found that CUIs can offer significantly higher usability and a better user experience than traditional interfaces from the domain of virtual content creation.

Figure 13. Example of a CUI capable of switching the scene between the states "cozy" and "scary".
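One simple way to imitate such an underlying AI is to map the single slider value onto a bundle of low-level scene parameters. The sketch below shows this idea; the chosen parameters and their endpoint values are hypothetical and not taken from the prototype.

```python
# Hypothetical sketch of the "cozy" <-> "scary" slider: one high-level value
# t in [0, 1] is mapped onto several low-level scene parameters, imitating
# an AI that knows which knobs realise the designer's intention. The
# parameters and endpoint values are invented for illustration.

COZY = {"light_intensity": 0.8, "light_temperature_k": 2700,
        "fog_density": 0.05, "ambient_volume": 0.6}
SCARY = {"light_intensity": 0.2, "light_temperature_k": 9000,
         "fog_density": 0.60, "ambient_volume": 0.2}

def scene_parameters(t: float) -> dict:
    """Interpolate all low-level parameters from one slider value t (0 = cozy, 1 = scary)."""
    t = max(0.0, min(1.0, t))
    return {key: COZY[key] + t * (SCARY[key] - COZY[key]) for key in COZY}

print(scene_parameters(0.25))  # still mostly cozy: bright, warm light, little fog
```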

5.14. ScenARy: Augmented Reality Scene Design

This mobile AR prototype is an application that can be used to lay out a scene by augmenting a table or room with the virtual objects of the scene. Virtual objects can be chosen from an asset library or created by the user by drawing on the screen. In this marker-based prototype, regular playing cards are recognised by the application and virtual objects can be assigned to the cards. This enables the user to arrange the virtual objects by interacting with the tangible cards instead of the screen that displays the scene. The prototype was evaluated with professionals from the theater domain. After an introduction to the application, each participant had to perform a defined task using the app.

The professionals rated the prototype with a high usability score. The interview results demonstrate that this prototype can be a valuable tool for the initial planning phase of stage productions. The participants mentioned that they liked being able to physically lay out the scene using the playing cards and did not have to work with complicated on-screen controls.


The selection of objects to assign to the cards was perceived as easy. All participants mentioned that it is valuable that the tool encourages collaboration and that multiple persons can use it together, as the layout phase is a collaborative process. The professionals also mentioned that they miss some features, such as lighting, in order to use the tool in their professional workflow.

Figure 14. The AR scene design prototype running on a smartphone. The cards represent assets.
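Conceptually, the card-based interaction amounts to maintaining a mapping from recognised card IDs to assigned assets and copying each detected card pose onto its virtual object every frame. The sketch below illustrates this with a stubbed-out card detector standing in for the AR tracking framework; all identifiers are invented.

```python
# Hypothetical sketch of the card-based AR layout: each recognised playing
# card carries an assigned asset, and every frame the detected card poses
# are copied onto the corresponding virtual objects. The detector is a
# stub; in the prototype it would be the marker-tracking framework.

card_to_asset = {}      # card id -> asset id
scene_objects = {}      # asset id -> pose (position, rotation in degrees)

def assign_asset_to_card(card_id: str, asset_id: str) -> None:
    card_to_asset[card_id] = asset_id

def detect_cards() -> dict:
    # Stub standing in for marker tracking: returns card ids with poses.
    return {"ace_of_spades": ((0.10, 0.0, 0.25), 90.0),
            "queen_of_hearts": ((-0.20, 0.0, 0.40), 15.0)}

def update_scene() -> None:
    # Moving a physical card on the table moves its virtual object.
    for card_id, pose in detect_cards().items():
        asset_id = card_to_asset.get(card_id)
        if asset_id is not None:
            scene_objects[asset_id] = pose

assign_asset_to_card("ace_of_spades", "throne_prop")
assign_asset_to_card("queen_of_hearts", "virtual_actor_01")
update_scene()
print(scene_objects["throne_prop"])  # ((0.1, 0.0, 0.25), 90.0)
```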

6. Main Prototype

Figure 15. Possible previs outcome: a 3D representation of a theater stage with different lights, props and virtual actors.

In order to best support the various tasks and diverse users of a previs software, we developed a fully integrated software prototype based on different interaction metaphors derived from our NUI concept. It provides functionality for all relevant previs tasks defined earlier, e.g., import, sketching/modelling, set layout, lighting, animation, camerawork, motion capture and visual effects. We integrated the individual research prototypes, or parts of them, into the main prototype based on positive evaluation, technical feasibility and positive review by the application partners, in order to create a direct impact of the research results.

This prototype is a VR-based application that provides a common space for performing previs tasks: the user can move freely within the virtual space and directly manipulate objects in 3D using the VR hand controllers. It provides the basis for a natural user interface where assets can be grabbed and moved directly, characters can be posed by grabbing and moving parts of their body, paths can be manipulated by grabbing nodes, and special tools can be wielded to perform tasks such as sketching and painting. Additional controls are provided by a tablet-like interface on the back of the user's wrist. Touch gestures and speech commands are also used to augment the interface. To support realistic work sessions involving many tasks as part of a larger workflow, the basic platform also provides an online asset repository and a shared repository for saving and reloading stage designs and scene scripts, along with undo and redo and multi-user online collaboration between team members. Scenes are organized into projects and each project has its own team. Theatre, animation, film, and visual effects each have different requirements from their respective user stories. This prototype provides a common core of functionality across all disciplines as well as specifically targeted functionalities for individual domains.

6.1. Scene Design

A fundamental part of previs is set design and asset layout. One can start with an empty stage and begin building on it, or simply import a model of an environment. The core functions for dressing the set and laying out assets are searching the repository for assets and placing them within the scene, precisely translating, rotating or scaling assets, selecting multiple objects to be moved or deleted as a group, and attaching objects to others as parent/child, such as cups on a table or lights on a rig. For the selection of assets, on top of the primary mode of selecting an asset using the VR controllers, a speech interface was developed to retrieve the spoken objects. Object placement is carried out using direct manipulation.

In addition to regular static assets like a table, special assets that provide some functionality are contained in the repository and can be placed like any other object. These special objects are lights, visual effects, characters and cameras.

For lighting, several different styles of light are included, such as a light with visible beams. Once placed in the set, users can adjust the color, brightness, range, and opening angle of the lighting.

Visual effects provide dynamic objects which can represent fire, water, smoke and more. They are based on particle systems which can be activated and looped by the user.
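The parent/child attachment used for set dressing (cups on a table, lights on a rig) is essentially a scene-graph operation in which the child keeps its place in the world while its coordinates become relative to the parent. The sketch below shows this for translation only and is an illustration under our own simplifying assumptions, not the prototype's code.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of parent/child attachment in the set-dressing step:
# when a cup is attached to a table, its stored position becomes relative
# to the table, so moving the table moves the cup with it. Rotation and
# scale are omitted to keep the illustration short.

@dataclass
class SceneNode:
    name: str
    local_position: tuple = (0.0, 0.0, 0.0)
    parent: "SceneNode | None" = None
    children: list = field(default_factory=list)

    def world_position(self) -> tuple:
        if self.parent is None:
            return self.local_position
        px, py, pz = self.parent.world_position()
        lx, ly, lz = self.local_position
        return (px + lx, py + ly, pz + lz)

    def attach_to(self, new_parent: "SceneNode") -> None:
        # Keep the child's world position while re-expressing it
        # relative to the new parent.
        wx, wy, wz = self.world_position()
        nx, ny, nz = new_parent.world_position()
        self.local_position = (wx - nx, wy - ny, wz - nz)
        self.parent = new_parent
        new_parent.children.append(self)

table = SceneNode("table", (2.0, 0.0, 3.0))
cup = SceneNode("cup", (2.0, 0.8, 3.1))
cup.attach_to(table)                  # the cup now moves with the table
table.local_position = (4.0, 0.0, 3.0)
print(cup.world_position())           # (4.0, 0.8, 3.1)
```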


Figure 16. To place assets in the scene, the artist can pick an item from the asset library.

Figure 18. Humanoid puppets which fall into the ‘Characters’ category.

Figure 17. Light settings can be adjusted in terms of color, intensity, spot angle and other options.

Characters are most typically humanoid puppets but also include vehicles. These assets can be animated in detail and are the foreground of previs and storytelling.

In addition to the assets from the repository, users can create their own objects through Digital Crafting. Using the Sketch Tool, one can create a shape by drawing an outline of it and modify the shape by grabbing and moving faces, edges or vertices of the shape. Once a shape has been created, it can be painted with an airbrush tool.

6.2. Animation

A major part of the previs process is animating characters and other objects. The main prototype provides different animation possibilities for characters and objects to enable this. All animations are recorded on a timeline which holds every activity that is happening in the scene.

Character Animation. One of the most common needs in animation is a walking animation. In this prototype, one can simply sweep out the desired path for the character to follow from one mark to another. This path can then be modified by grabbing and moving it. Humanoid characters have automatic behaviors such as walk, run, jump, fall and get up. Vehicles also follow paths automatically, applying steering, acceleration and braking as required.

Humanoid characters can also be animated in more detail through posing. Each character has handles attached to its body parts, e.g., shoulders, elbows and hands, allowing the puppet to be posed through direct manipulation (see Figure 19). Poses set keyframes on the timeline and produce a path in space that allows the pose to adjust over time.

One of the easiest and most intuitive forms of animating a character is through motion capture. The prototype supports the use of the Rokoko Smartsuit (https://www.rokoko.com/), allowing the user to directly take over control of a character. Through the embodiment of the character, the user can perform an animation in a natural way and is not restricted by complex controls (see Figure 20).

Object Animation. Objects of a scene that do not fall into the category of characters can be animated through rigid animation. This provides a simple and intuitive way to animate by acting out the motion: a user can grab and move an object while recording the performed motion. This form of animation also refers to the concept of embodiment and reality-based interaction, as it follows the idea of how kids play out scenes with toys. Once an animation path has been created, the path can be further manipulated by grabbing it at points and moving it as desired.
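Acting out a rigid animation boils down to sampling the grabbed object's transform onto the timeline while the grab is held and replaying the recording by interpolating between samples afterwards. The sketch below records positions only and uses invented names; it is a simplified illustration rather than the prototype's implementation.

```python
from bisect import bisect_left

# Hypothetical sketch of rigid animation by acting out the motion: while
# the user holds an object, its position is sampled onto the timeline;
# afterwards the recording is played back by interpolating between the
# recorded keyframes. Rotation is omitted for brevity.

class RigidRecording:
    def __init__(self):
        self.times = []       # seconds since the recording started
        self.positions = []   # (x, y, z) samples

    def record_sample(self, t: float, position: tuple) -> None:
        self.times.append(t)
        self.positions.append(position)

    def sample(self, t: float) -> tuple:
        """Return the interpolated position at playback time t."""
        if t <= self.times[0]:
            return self.positions[0]
        if t >= self.times[-1]:
            return self.positions[-1]
        i = bisect_left(self.times, t)
        t0, t1 = self.times[i - 1], self.times[i]
        p0, p1 = self.positions[i - 1], self.positions[i]
        w = (t - t0) / (t1 - t0)
        return tuple(a + w * (b - a) for a, b in zip(p0, p1))

clip = RigidRecording()
clip.record_sample(0.0, (0.0, 1.0, 0.0))   # the user grabs the mug...
clip.record_sample(0.5, (0.5, 1.5, 0.0))   # ...moves it while recording...
clip.record_sample(1.0, (1.0, 1.0, 0.0))   # ...and sets it down again.
print(clip.sample(0.75))                    # (0.75, 1.25, 0.0)
```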


Objects in the scene can also be animated through physics interaction. This can be used as a quick and easy way to add natural secondary motion to assets. For instance, a chair might be knocked over as a character walks past it, or a character might kick a ball that then bounces off. Physics behaviors can be enabled individually for each object and are recorded together with all other forms of animation.

6.3. Camerawork

Cameras can be placed on set as an asset or created with a special tool: the Viewfinder can be used to frame a shot and create a new camera. Camera views can be previewed from the Wrist Pad. By clicking on a camera, the user can enter the "Through the Lens" mode, which effectively puts them in the position of a camera operator. Camera motion is animated to move between keyframes, which emulates dolly tracks, cranes and other staples of camerawork.
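Such camera moves can be thought of as interpolating the camera pose between keyframes on the timeline, which is what produces the impression of a dolly or crane move. The sketch below interpolates position and yaw only; a full implementation would interpolate complete orientations, and all names here are hypothetical.

```python
# Hypothetical sketch of keyframed camera motion: the camera's position and
# yaw are interpolated between two keyframes on the timeline, emulating a
# dolly or crane move. Real camera animation would interpolate full
# orientations (e.g. with quaternion slerp); yaw alone keeps it short.

def lerp(a: float, b: float, w: float) -> float:
    return a + w * (b - a)

def camera_at(t: float, key_a: dict, key_b: dict) -> dict:
    """Camera pose at time t, given two keyframes with 'time', 'position', 'yaw'."""
    span = key_b["time"] - key_a["time"]
    w = min(1.0, max(0.0, (t - key_a["time"]) / span))
    pos = tuple(lerp(a, b, w) for a, b in zip(key_a["position"], key_b["position"]))
    return {"position": pos, "yaw": lerp(key_a["yaw"], key_b["yaw"], w)}

start = {"time": 0.0, "position": (0.0, 1.7, -5.0), "yaw": 0.0}    # wide shot
end   = {"time": 4.0, "position": (2.0, 1.7, -1.0), "yaw": 25.0}   # pushed in
print(camera_at(2.0, start, end))  # halfway along the emulated dolly track
```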

Figure 19. Puppets can be posed by adjusting the handles attached to their body parts.

Figure 22. View from the "Through the Lens" mode.

Figure 20. A performer directly animating a character using the Rokoko Smartsuit.

6.4. Shot Sequencer

One of the key outputs of the main prototype is the shot sequence. The shot sequencer is a desktop tool for assembling recorded shots. After recording a shot, one can select and preview cameras, cut to the chosen camera on the master track, trim and reveal camera clips, and export the final project as a video file.

Figure 23. After a shot has been recorded, various editing functions can be accessed to produce the final result.
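The master track can be modelled as a simple edit decision list: an ordered series of cuts, each stating which camera is live from a given time, which can be trimmed and finally exported in order. The following sketch is an invented, minimal model of that idea, not the sequencer's actual data structures.

```python
# Hypothetical sketch of the shot sequencer's master track as an edit
# decision list: each cut states which camera is live from a given time.
# Trimming moves a cut point; export walks the cuts in order.

class MasterTrack:
    def __init__(self):
        self.cuts = []  # list of (start_time, camera_name), kept sorted

    def cut_to(self, time: float, camera: str) -> None:
        self.cuts.append((time, camera))
        self.cuts.sort(key=lambda c: c[0])

    def trim(self, index: int, new_start: float) -> None:
        _, camera = self.cuts[index]
        self.cuts[index] = (new_start, camera)
        self.cuts.sort(key=lambda c: c[0])

    def export_order(self, end_time: float) -> list:
        # Clip list of (camera, start, end), ready to be rendered to video.
        clips = []
        for i, (start, camera) in enumerate(self.cuts):
            end = self.cuts[i + 1][0] if i + 1 < len(self.cuts) else end_time
            clips.append((camera, start, end))
        return clips

track = MasterTrack()
track.cut_to(0.0, "wide_master")
track.cut_to(6.0, "close_up_A")
track.cut_to(11.5, "crane_move")
track.trim(1, 5.2)                       # cut to the close-up a little earlier
print(track.export_order(end_time=15.0))
# [('wide_master', 0.0, 5.2), ('close_up_A', 5.2, 11.5), ('crane_move', 11.5, 15.0)]
```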

Figure 21. Rigid animation of a mug, which can be manipulated by grabbing it at points and moving it as desired.

6.5. Collaboration

While many tasks only directly involve one practitioner, one of the main purposes of previs is to communicate ideas amongst the team and manage any issues and risks that arise. The prototype supports multiple users working on a scene at the same time, each in their own location. Team members can see avatars of each other and speak to one another. When a team member makes an edit to the scene, the other team members see the effect of that edit, as the objects are synced among all members.
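At its core, this kind of synchronisation reduces to broadcasting every edit to the other connected clients and applying it to their local copies of the scene. The sketch below models that with in-memory sessions and invented names; the prototype's real networking and conflict handling are not shown.

```python
# Hypothetical sketch of multi-user scene synchronisation: every edit a
# team member makes is broadcast to the other connected clients and applied
# to their local copies of the scene. Networking is modelled with plain
# method calls to keep the illustration self-contained.

class Session:
    def __init__(self):
        self.clients = []

    def broadcast(self, sender, object_id: str, properties: dict) -> None:
        for client in self.clients:
            if client is not sender:
                client.apply(object_id, properties)

class Client:
    def __init__(self, name: str, session: Session):
        self.name = name
        self.session = session
        self.scene = {}              # object id -> properties (local copy)
        session.clients.append(self)

    def edit(self, object_id: str, **properties) -> None:
        self.apply(object_id, properties)                     # apply locally...
        self.session.broadcast(self, object_id, properties)   # ...then sync

    def apply(self, object_id: str, properties: dict) -> None:
        self.scene.setdefault(object_id, {}).update(properties)

session = Session()
director = Client("director", session)
set_designer = Client("set designer", session)
set_designer.edit("throne_prop", position=(1.0, 0.0, 2.0))
print(director.scene["throne_prop"])   # the director sees the edit immediately
```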


6.6. Evaluation of the Main Prototype

We aimed to evaluate the main NUI prototype in terms of its applicability to realistic projects from the creative industries of film, animation and theater. For that purpose, three project scenarios were defined with professionals from the respective creative domains. For each project, the scope and overall objective were defined based on actual finished or ongoing projects that had been developed in the experts' companies. For the realization of these projects, the studios used the NUI prototype and conducted the development process autonomously. Once the projects were completed successfully, we evaluated the procedure with the help of semi-structured interviews. Here, we focused on collecting qualitative data to gain insight into the overall user experience of using NUIs for previs. In total, we presented nine questions to the professionals, for instance asking whether they were able to achieve their goals using the prototype or how the workflow compared to their traditional previs procedure. Across all examined application domains, we gathered highly positive feedback. The industry experts involved in the interviews emphasized that they would benefit greatly from using a software based on NUIs in their working routines. As the feedback expressed by the experts differed quite substantially depending on the respective domain, we will now provide a short overview for each project individually. For detailed information regarding the responses from the interviews, please refer to the corresponding case study [6].

Animation. For the animation project, the studio in charge assembled a team of five people with varying backgrounds: a director, a production manager, a layout artist, a motion-capture (MoCap) performer and a pipeline supervisor. Their objective was to shoot a trailer for an animated movie using the NUI prototype. In the interviews conducted after project completion, the experts pointed out that the software gave them new ways to express themselves visually while offering an intuitive and natural form of interaction. On the downside, interviewees stated that they were not familiar with the prototype and thus needed some time to adjust to the new working environment. In contrast to that, however, they also rated the overall learnability to be higher compared to any traditional software alternative available on the market. Asked about the reasoning for this assessment, the experts stated that it normally takes a vast amount of time to become proficient with any standard 3D content creation software; with the NUI software, however, they were able to internalize all necessary functions within hours. In addition to learnability, another advantage of NUIs was found in the option to create animations in real time and review the results on the spot instead of separating these two steps. By using the NUI software, artists and directors could communicate more directly, exchanging ideas and feedback right there in the animation phase. In contrast to these benefits, the interviewees also mentioned that wearing the head-mounted display for the duration of the project was rather tiring and exhausted them after a couple of hours. Even for a small-scale project, previs is a lengthy process, and thus the professionals had to take additional breaks due to the cumbersome nature of the hardware.

Film. The film team included a lighting artist, a set designer, a stage designer and a camera operator, who worked together to create previs for a short commercial video for one of the studio's customers. A large variety of positive feedback was expressed by the experts. First, in accordance with the animation team, learnability of the software was emphasized to be one of its major advantages: instead of lengthy training, the NUI software only required some familiarization and was rather easy to use. Asked about specific use cases where they would benefit most from using NUIs, the experts mentioned the planning phase of placing props and choosing the right equipment for the production. Furthermore, they pointed out that previs in the film domain usually requires large sets being built and moved around. Making use of VR, however, allowed them to freely try out different settings, camera placements and light setups. On top of that, financial advantages were mentioned as well, since time spent on a film set is rather costly and digital previs would thus reduce costs immensely. As a final note, the interviewees concluded that the database structure of the prototype would greatly enhance collaboration among team members, as it allowed people involved in the production to work from different places around the globe simultaneously. As an improvement, the camera operator suggested having a physical replica of an actual camera in the virtual scene for better haptic feedback.

Theater. For this project, the theater involved a stage master, a lighting artist, a CTO, a project manager, two stage technicians and an event technology trainee. Their objective was to create a collection of scenes to previsualize an opera performance. Previs in theater productions is usually based on analogue elements such as cardboard props. Therefore, using VR and NUIs to work on a production felt new and exciting from the team's perspective. Although the experts were unfamiliar with this approach to previs, they found the software to be usable, clearly structured, easy to learn and overall a convincing alternative to traditional methods, offering a "great experience". One major feature they found particularly engaging was the ability to be fully immersed in a virtual scene instead of working with real props in the outside world. Just like the film production team, the experts pointed out financial advantages of using such a software, since time on stage is rather expensive. Additionally, they benefited strongly from using VR to plan out the lighting setup for corridors and areas on stage that were difficult to illuminate. As a suggestion for further development, some interviewees expressed the need for more assets to be used in the stage preparation.

7. Discussion

In this work, we developed and evaluated prototypes based on our proposed concept of using natural user interfaces (NUIs) built around a set of interaction techniques and concepts. Overall, the feedback collected in the course of the project evaluations was highly positive and supportive. The professionals from the creative industries expressed affirmative appraisal of the prototypes, pointing out how they would benefit from them in their daily workflows.

To develop a previs software that supports the needs of individual creative personnel, we identified the necessary functionalities for a dedicated previs software. The accessibility of a previs software can be improved if the identified tasks are supported. Moreover, in order to achieve high usability, these tasks need to have intuitive controls and must not rely heavily on prior technical knowledge.

The individual prototypes provided us with valuable insights regarding our NUI concept. For each prototype, we scientifically evaluated individual interaction concepts and principles. These prototypes showed great potential to support creatives in different stages of the previs process. We used the feedback from these studies and integrated the prototypes partially or entirely into the main prototype.

Regarding the evaluation of the main prototype, a major aspect expressed by the experts was the ability to collaborate easily with other project members. Participants stated that animating or sketching in real time and receiving feedback from colleagues instantly is an advantage in comparison to traditional 3D software. The collaboration aspect is further improved by the ability to save projects in a database and access the same scene from anywhere in the world. In the modern industry, it is often the case that team members are located in different places; having such collaborative features can be extremely practical and helpful. Furthermore, participants found the planning of the placement of lights and superstructures highly beneficial for their workflow. As this process is usually cumbersome and expensive when conducted on a physical set or stage, the VR previs tool helps to overcome these challenges and provides artists with an opportunity to try different constellations without worrying about the costs.

All participants were able to get to the desired outcome in a reasonable amount of time, even though they had to complete a rather complex task without prior knowledge about the prototype, and some even without being introduced to VR at all. Obtaining similar results using professional software without long training sessions is very unlikely. We can therefore argue that our NUI-based prototype has the potential to offer a usable and more intuitive interaction style than the common software after a short familiarization period or tutorial has been completed. NUIs and VR are not just limited to the playful applications they are mostly used in today; our findings indicate that they are applicable to professional workflows and can provide great advantages to creative work.

Taking these results into account, we will address some of our interaction principles that were presented in the early stages of the project. Within the course of the project evaluations, we could demonstrate that professional users from the creative industries were able to create digital previs in a fast and easy way, generating accurate and precise end results. The artists and creatives were able to directly interact with the software and exhibit their ideas. In our main prototype evaluations, all participants were capable of completing the test cases they were given and thus generated convincing previs in the course of minutes. Additionally, expert users stated the potential of the software and pointed out its ease of use. Participants also addressed the entertaining factor of using the prototype and how much they enjoyed the time they spent interacting with the software. Therefore, we conclude that the "Easy to Use", "Rapidness, Accuracy and Precision", "Playful and Fun" and "Creativity Support" interaction principles have been successfully integrated into the prototype. The main prototype also supports a "Multi-User" environment by providing the possibility to collaborate easily with other project members from various places around the world. Multiple users being able to access the same scene from different locations and make changes to a project at the same time was highlighted by the experts.


7.1. Limitations

Due to the large number of features presented in the main prototype, some participants felt overwhelmed to an extent that led to a high mental workload and a medium assessment of usability. However, we should keep in mind that the subjects had to perform a set of complex tasks without knowing the software and without having any prior experience in VR. For future iterations of the software, we intend to include a tutorial mode that slowly guides users through each functionality of the prototype. This mode could help to educate untrained personnel, which would result in a more efficient, less time-consuming usage of the software overall.

Fatigue after long sessions was mentioned as another point of criticism. Especially wearing an HMD for longer periods of time was perceived to be rather cumbersome. Overcoming this problem poses quite a challenge for software developers, since the hardware dictates wearing comfort and thus how physically demanding the interaction becomes. However, one potential solution could be to offer desktop alternatives to some of the functionalities. This way, users could take breaks from VR in between long work sessions while still being able to work on their product. Certainly, this would take away the naturalness of the interaction for these periods. Nevertheless, it could be a way to prevent physical exhaustion and, on top of that, empower users further by providing different options to use the software.

For future developments, we plan to include other modalities that might improve the naturalness of the interaction and to further utilize the feedback from the experts to refine the software prototype. As stated by the professionals, haptic feedback plays a major role in certain contexts. For this purpose, we intend to identify where and how to include a more tangible interaction technique. Moreover, we plan to investigate the usefulness of speech and other input modalities for specific tasks.

8. Conclusion

In this article, we introduced a NUI concept for previsualization in an attempt to empower artists and practitioners to intuitively visualize their ideas and show their capabilities depending on the task they want to perform in the previs cycle. Initially, requirements for a previs software were collected from experts in the application areas and the core functionalities were defined based on those requirements. In addition, a concept for the interaction with the system based on natural user interfaces was introduced to guide the development of the user interface for the main prototype as well as for exploratory research prototypes. Our findings suggest that NUIs can provide an applicable alternative to traditional design tools that only needs a brief familiarization phase instead of the lengthy training that professional software demands. For non-technical personnel, this approach might not only offer an alternative, but a more usable and empowering tool than current software solutions. Overall, the application of VR for previs seems to be beneficial for creative personnel with little technical knowledge due to its easy and intuitive use.

Acknowledgement. This project has received funding from the European Union's Horizon 2020 research and innovation programme (No 688244). We thank all partners who contributed to this research.

References

[1] Yamaguchi, S., Chen, C., Yoon, D. and Gregoire, D. (2015) Previsualization: How to develop previs in Asia? In SIGGRAPH Asia 2015 Symposium on Education, SA '15 (New York, NY, USA: ACM): 16:1–16:3. doi:10.1145/2818498.2818515.
[2] Wong, H.H. (2012) Previsualization: Assisting filmmakers in realizing their vision. In SIGGRAPH Asia 2012 Courses, SA '12 (New York, NY, USA: ACM): 9:1–9:20. doi:10.1145/2407783.2407792.
[3] Muender, T., Fröhlich, T. and Malaka, R. (2018) Empowering creative people: Virtual reality for previsualization. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, CHI EA '18 (New York, NY, USA: ACM): LBW630:1–LBW630:6. doi:10.1145/3170427.3188612.
[4] Nitsche, M. (2008) Experiments in the use of game technology for pre-visualization. In Proceedings of the 2008 Conference on Future Play: Research, Play, Share, Future Play '08 (New York, NY, USA: ACM): 160–165. doi:10.1145/1496984.1497011.
[5] Muender, T., Volkmar, G., Wenig, D. and Malaka, R. (2019) Analysis of previsualization tasks for animation, film and theater. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, CHI EA '19 (New York, NY, USA: ACM): LBW1121:1–LBW1121:6. doi:10.1145/3290607.3312953.
[6] Volkmar, G., Muender, T., Wenig, D. and Malaka, R. (2020) Evaluation of natural user interfaces in the creative industries. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems: 1–8.
[7] Vetere, F., O'Hara, K., Paay, J., Ploderer, B., Harper, R. and Sellen, A. (2014) Social nui: Social perspectives in natural user interfaces. In Proceedings of the 2014 Companion Publication on Designing Interactive Systems, DIS Companion '14 (New York, NY, USA: ACM): 215–218. doi:10.1145/2598784.2598802.


[8] Fu, L.P., Landay, J., Nebeling, M., Xu, Y. and Zhao, C. (2018) Redefining natural user interface. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, CHI EA '18 (New York, NY, USA: ACM): SIG19:1–SIG19:3. doi:10.1145/3170427.3190649.
[9] García-Peñalvo, F.J. and Moreno, L. (2019) Special issue on exploring new natural user experiences. Universal Access in the Information Society 18(1): 1–2. doi:10.1007/s10209-017-0578-0.
[10] Naumann, A., Hurtienne, J., Israel, J., Mohs, C., Kindsmüller, M., Meyer, H. and Husslein, S. (2007) Intuitive use of user interfaces: Defining a vague concept: 128–136. doi:10.1007/978-3-540-73331-7_14.
[11] Clifford, R.M.S. and Billinghurst, M. (2013) Designing a nui workstation for courier dispatcher command and control task management. In Proceedings of the 14th Annual ACM SIGCHI-NZ Conference on Computer-Human Interaction, CHINZ '13 (New York, NY, USA: ACM): 9:1–9:8. doi:10.1145/2542242.2542252.
[12] Bolt, R.A. (1980) "Put-that-there": Voice and gesture at the graphics interface. In Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques: 262–270.
[13] Francese, R., Passero, I. and Tortora, G. (2012) Wiimote and kinect: Gestural user interfaces add a natural third dimension to hci. doi:10.1145/2254556.2254580.
[14] Delimarschi, D., Swartzendruber, G. and Kagdi, H. (2014) Enabling integrated development environments with natural user interface interactions. In Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014 (New York, NY, USA: ACM): 126–129. doi:10.1145/2597008.2597791.
[15] Jain, J., Lund, A. and Wixon, D. (2011) The future of natural user interfaces. In CHI'11 Extended Abstracts on Human Factors in Computing Systems: 211–214.
[16] Weiyuan Liu (2010) Natural user interface - next mainstream product user interface. In 2010 IEEE 11th International Conference on Computer-Aided Industrial Design & Conceptual Design 1, 1: 203–205. doi:10.1109/CAIDCD.2010.5681374.
[17] Jacob, R.J., Girouard, A., Hirshfield, L.M., Horn, M.S., Shaer, O., Solovey, E.T. and Zigelbaum, J. (2008) Reality-based interaction: A framework for post-wimp interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '08 (New York, NY, USA: ACM): 201–210. doi:10.1145/1357054.1357089.
[18] Walther-Franks, B. and Malaka, R. (2014) An interaction approach to computer animation. Entertainment Computing 5(4): 271–283.
[19] Lee, J., Chai, J., Reitsma, P.S.A., Hodgins, J.K. and Pollard, N.S. (2002) Interactive control of avatars animated with human motion data. In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '02 (New York, NY, USA: ACM): 491–500. doi:10.1145/566570.566607.
[20] Chai, J. and Hodgins, J.K. (2005) Performance animation from low-dimensional control signals. In ACM SIGGRAPH 2005 Papers: 686–696.
[21] Walther-Franks, B., Biermann, F., Steenbergen, N. and Malaka, R. (2012) The animation loop station: Near real-time animation production. In Proceedings of the 11th International Conference on Entertainment Computing, ICEC'12 (Berlin, Heidelberg: Springer-Verlag): 469–472. doi:10.1007/978-3-642-33542-6_55.
[22] Walther-Franks, B., Herrlich, M., Karrer, T., Wittenhagen, M., Schröder-Kroll, R., Malaka, R. and Borchers, J. (2012) Dragimation: Direct manipulation keyframe timing for performance-based animation. In Proceedings of Graphics Interface 2012, GI '12 (Toronto, Ont., Canada: Canadian Information Processing Society): 101–108. URL http://dl.acm.org/citation.cfm?id=2305276.2305294.
[23] Igarashi, T., Moscovich, T. and Hughes, J.F. (2005) Spatial keyframing for performance-driven animation. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA '05 (New York, NY, USA: ACM): 107–115. doi:10.1145/1073368.1073383.
[24] Liu, C.K. and Zordan, V.B. (2011) Natural user interface for physics-based character animation. In Proceedings of the 4th International Conference on Motion in Games, MIG'11 (Berlin, Heidelberg: Springer-Verlag): 1–14. doi:10.1007/978-3-642-25090-3_1.
[25] Shum, H. and Ho, E.S. (2012) Real-time physical modelling of character movements with microsoft kinect. In Proceedings of the 18th ACM Symposium on Virtual Reality Software and Technology, VRST '12 (New York, NY, USA: ACM): 17–24. doi:10.1145/2407336.2407340.
[26] Herrlich, M., Braun, A. and Malaka, R. (2012) Towards bimanual control for virtual sculpting. In Reiterer, H. and Deussen, O. [eds.] Mensch & Computer 2012: interaktiv informiert – allgegenwärtig und allumfassend!? (München: Oldenbourg Verlag): 173–182.
[27] Galyean, T.A. and Hughes, J.F. (1991) Sculpting: An interactive volumetric modeling technique. In Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '91 (New York, NY, USA: ACM): 267–274. doi:10.1145/122718.122747.
[28] Chen, H. and Sun, H. (2002) Real-time haptic sculpting in virtual volume space. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST '02 (New York, NY, USA: ACM): 81–88. doi:10.1145/585740.585755.
[29] Galoppo, N., Otaduy, M.A., Tekin, S., Gross, M. and Lin, M.C. (2007) Soft articulated characters with fast contact handling. Computer Graphics Forum 26(3): 243–253. doi:10.1111/j.1467-8659.2007.01046.x.


[30] Wesson, B. and Wilkinson, B. (2013) Evaluating organic 3d sculpting using natural user interfaces with the kinect. In Proceedings of the 25th Australian Computer-Human Interaction Conference: Augmentation, Application, Innovation, Collaboration, OzCHI '13 (New York, NY, USA: ACM): 163–166. doi:10.1145/2541016.2541084.
[31] Vogel, D., Lubos, P. and Steinicke, F. (2018) AnimationVR - interactive controller-based animating in virtual reality. In Proceedings of the 1st Workshop on Animation in Virtual and Augmented Environments (IEEE). URL http://basilic.informatik.uni-hamburg.de/Publications/2018/VLS18b.
[32] Okun, J., Zwerman, S. and Society, V.E. (2010) The VES Handbook of Visual Effects: Industry Standard VFX Practices and Procedures, Media Technology (Focal Press/Elsevier). URL https://books.google.de/books?id=G_08wSzR4EIC.
[33] Chemuturi, M. (2012) Requirements Engineering and Management for Software Development Projects (Springer Publishing Company, Incorporated).
[34] Kwon, B.c., Javed, W., Elmqvist, N. and Yi, J.S. (2011) Direct manipulation through surrogate objects. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '11 (New York, NY, USA: ACM): 627–636. doi:10.1145/1978942.1979033.
[35] Raskin, J. (2000) The Humane Interface: New Directions for Designing Interactive Systems (New York, NY, USA: ACM Press/Addison-Wesley Publishing Co.).
[36] Bérard, F., Ip, J., Benovoy, M., El-Shimy, D., Blum, J.R. and Cooperstock, J.R. (2009) Did "minority report" get it wrong? Superiority of the mouse over 3d input devices in a 3d placement task. In INTERACT.
[37] Ryan, R.M. and Deci, E.L. (2000) Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist 55(1): 68.
[38] Deterding, S., Dixon, D., Khaled, R. and Nacke, L. (2011) From game design elements to gamefulness: Defining gamification. 11: 9–15. doi:10.1145/2181037.2181040.
[39] Norman, D. (2013) The Design of Everyday Things: Revised and Expanded Edition (Constellation).
[40] Muender, T., Reinschluessel, A.V., Drewes, S., Wenig, D., Döring, T. and Malaka, R. (2019) Does it feel real?: Using tangibles with different fidelities to build and explore scenes in virtual reality. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI '19 (New York, NY, USA: ACM): 673:1–673:12. doi:10.1145/3290605.3300903.
[41] Fröhlich, T. (2020) Natural and playful interaction for 3d digital content creation.
[42] Fröhlich, T., von Oehsen, J. and Malaka, R. (2017) Productivity & play: A first-person shooter for fast and easy scene design. In Extended Abstracts Publication of the Annual Symposium on Computer-Human Interaction in Play, CHI PLAY '17 Extended Abstracts (New York, NY, USA: Association for Computing Machinery): 353–359. doi:10.1145/3130859.3131319.
[43] Fröhlich, T., Alexandrovsky, D., Stabbert, T., Döring, T. and Malaka, R. (2018) Vrbox: A virtual reality augmented sandbox for immersive playfulness, creativity and exploration. In Proceedings of the 2018 Annual Symposium on Computer-Human Interaction in Play, CHI PLAY '18 (New York, NY, USA: ACM): 153–162. doi:10.1145/3242671.3242697.
[44] Bonfert, M., Porzel, R. and Malaka, R. (2019) Get a grip! Introducing variable grip for controller-based vr systems. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (IEEE): 604–612.
