
23rd ICCRTS Info. Central, Pensacola, Florida 6 - 9 November 2018

Assessing the quantitative and qualitative effects of using mixed reality for operational decision making

Mark Dennison¹, Jerald Thomas², Theron Trout¹,³ and Evan Suma Rosenberg²
¹U.S. Army Research Laboratory West, Playa Vista, CA
²University of Minnesota, Minneapolis, MN
³Stormfish Scientific, Chevy Chase, MD

Abstract

The emergence of next generation VR and AR devices like the Oculus Rift and Microsoft HoloLens has increased interest in using mixed reality (MR) for simulated training, enhancing command and control, and augmenting operator effectiveness at the tactical edge. It is thought that virtualizing mission relevant battlefield data, such as satellite imagery or body-worn sensor information, will allow commanders and analysts to retrieve, collaborate, and make decisions about such information more effectively than traditional methods, which may have cognitive and spatial constraints. However, there is currently little evidence in the scientific literature that using modern MR equipment provides any qualitative or quantitative benefits, such as increased task engagement or improved decision accuracy. There are also no validated metrics in the field for making comparisons across display devices and tasks. In this paper, we survey potential metrics for assessing the usefulness of MR technologies, discuss how these data might be acquired in experimental and tactical scenarios, and pose issues in multi-user communication and collaboration. We also introduce the Mixed Reality Tactical Analysis Kit (MRTAK), which functions as an experimental platform to perform these assessments during collaborative mission planning and execution.

1 Background

The modern battlefield and Army operational environment are becoming more varied and dynamic, with a greater reliance on the integration of information from intelligent things/devices, agents, and systems. […] caused by multitasking and mission execution at standoff remain significant challenges in C3I scenarios. The integration of information for decision making and other mission command tasks is often still done using discrete and disparate systems (physical objects, computers, paper documents, etc.) that require a significant amount of resources and effort to bring into a unified space. Human interactions require shared cognitive models where interaction with systems must support and maintain this shared representation, stored information must persist the representation at a fundamental data level, and the underlying network must allow these data and information to flow without hindrance between human collaborators and non-human agents (1; 2).

The emergence of Mixed Reality (MR) technologies has provided the potential for new methods for the Warfighter to access, consume, and interact with battlefield information. MR may serve as a unified platform for data ingestion, analysis, collaboration, and execution, and also has the benefit of being customizable based on the mission needs and requirements of each operator (see Figure 1). MR lies in the Reality-Virtuality Continuum between the physical and the digital world (3). Augmented reality (AR) and virtual reality (VR) exist at the extremes of the MR spectrum, as shown in Figure 2. Where AR superimposes generated content over the real world, VR occludes the real world entirely to present something entirely fabricated. Each of these immersive technologies has benefits and drawbacks, but there has been limited research exploring what these are beyond speculation. Importantly, MR ensures a visual connection to the physical world while utilizing elements that may be superimposed on reality or may occlude it completely with a purely virtual rendering of information and objects. Thus, MR may serve as a medium to integrate data from sensors monitoring the real world, with the ability to perceive and act on this information without many of the spatial and physical constraints of currently used C3I systems.

Figure 1: Information flow between forward operators and analysts using immersive display devices.

The recent increase in the ease of access to modern high-fidelity head-mounted displays has caused a resurgence in interest for using immersive technologies in the private and public sectors. Consequently, the "cool factor" associated with VR and AR technologies has become a common reason for their adoption, with little empirical data backing this up. Similar issues have been found in adding gamified elements to training, which can be completely ineffective or simply less effective than less engaging traditional methods. With respect to command and control, prior work has shown that novel interfaces that support decision making have traditionally been challenging for users to understand and interpret (4). Additionally, there is little evidence in the literature of the quantitative benefits of using immersive technologies in operational decision-making, nor is there a set of tasks and validated measures for assessing optimality. Here, we review the limited current literature that has attempted to address this issue with respect to evaluation and communication in MR, and discuss how the MRTAK project seeks to build upon this work as a sandbox for immersive C3I research.

1.1 Evaluation of Immersive Technologies

Recently, this area of research has been referred to as "Immersive Analytics" (5). Chandler and colleagues suggest five major topic areas: 1) What paradigms are enabled by immersive technologies and how do we evaluate them over other traditional mediums and each other? 2) Do these technologies provide a more holistic way of looking at data that contains 3D spatial and abstract information? 3) What are the best interface "tricks" and affordances that change a user's perspective from an allocentric to egocentric view of the data? 4) Do these technologies invalidate the literature on 2D data interaction? 5) What is the typical work-flow for examining data across domains and how do we develop generic platforms to support immersive analytics? Although each of these questions is important, here we focus on items one and two, which pose the more general question of how we should evaluate the effectiveness of immersive technologies and what data are necessary to perform this assessment.

Figure 2: Mixed reality technology spectrum. Adapted from Milgram (3).

Uses for MR Technology

One area where immersive technologies have been used extensively is for simulation and training on real-world tasks. For example, a study by Donalek and colleagues (6) reported that in a way-point drawing task, subjects who viewed the environment in an Oculus Rift HMD performed with smaller distance and angle errors than those who viewed the environment on a 2D desktop monitor. Moran and colleagues (7) created an immersive virtual environment where Twitter data was overlaid atop real geography to improve the experience for analysts. The authors claimed that this MR environment enhanced situational awareness and cognition, and that pattern and visual analytics were more efficient than on traditional 2D displays. A study by Dan & Reiner (8) measured performance differences among subjects who had to complete a paper folding task after viewing information on a 2D desktop monitor or through an augmented environment. Subjects showed a higher cognitive load index when in 2D vs 3D, as measured by the EEG ratio of frontal theta power over parietal alpha power. This indicated that information transfer was significantly easier when the data was viewed in an MR environment. Other work has shown that the presence of one's virtual body and hands is also a critical feature when performing cognitively demanding tasks, such as memorization, in a virtual environment (9). This decreased cognitive load may be related to the fact that humans are "biologically optimized" to perceive in 3D (6). McIntire and colleagues (10; 11) reported that use of a 3D stereoscopic display increased task performance by roughly 60% on average. Recently, immersive AR was reported to be better for manipulating data that required spatial perception and interactions with a high degree of freedom (such as tangible user interfaces), but users were generally faster on the desktop if the task was familiar (12). Generally, though, these studies provide limited empirical evidence for which immersive mediums (VR, AR, MR) are best for improving user decision making across content domains, and many do not use similar or easily comparable metrics.

Evaluation Metrics

The issue of evaluation metrics is of critical importance. In a seminal study by Borsci and colleagues (13), the authors conducted a review of all existing studies that performed an assessment between an immersive technology, such as AR and VR, and a traditional display, such as a desktop monitor, or between different immersive devices. They list nine evaluation criteria used previously in the field: 1) cognitive skills, 2) visuo-spatial abilities, 3) levels of trust/acceptance of VR/MR tools and motivation in use, 4) participants' attitude, 5) previous experience, 6) motion sickness, 7) physiological reactions ([…] shift, cognitive load, stress), 8) level of presence and engagement, and 9) technical aspects and tools. Studies also reported using pre-training assessments and demographic measures, task performance assessments, varying experimental conditions, and assessing post-training criteria, typically through questionnaires. Across the literature surveyed, the degree of overlap varied significantly, and the authors felt that researchers focused so heavily on assessing time and task errors that they failed to measure critical effects such as motion sickness, decay and recall of skills after immersion, the level of trust or acceptance of the device, and the users' prior attitude, skills, and experience with similar technologies.

Some studies have shown the importance of the issues brought up by Borsci. For example, it was found that reported VR system usability was correlated strongly with a user's level of trust in that system (14; 15). These criteria were assessed through validated metrics such as the System Usability Scale (16) and Trust in Technology questionnaires (17). Neuropsychological measures have also been shown to correlate strongly with performance in immersive environments. Davison and colleagues (18) showed that performance on the Trail-Making Task A (TMT-A) (19), a task considered to assess motor speed, was significantly related to other measures which also assessed speed, such as the time taken to complete parking simulator levels and the time taken to place virtual objects around a room. Measures of executive function, such as TMT-B performance, were found to be significantly related to performance on both of these spatial location tasks. Dennison and colleagues found that motion sickness caused by immersion in a virtual environment (VE) greatly impacted the duration for which participants elected to remain in the VE and complete decision making tasks (20; 21; 22). Collectively, these studies demonstrate the need to assess not only the psycho-physiological profiles of intended users (as measured through questionnaires and pre-task assessments) but also the potential benefit of monitoring these states during real-time use of immersive technologies, when possible.

1.2 Evaluation of Multi-User Interaction

The majority of scenarios in which MR can be applied have multiple users. These users can be operating with the same or different immersive technology, can be colocated or connected remotely over a network, and may have access to the same information or only pieces of it (information symmetry). A task incorporating multiple participants will be affected, to some degree, by the communication behaviors among participants. Consequently, researchers must consider these dynamics when determining metrics that assess the effectiveness of an immersive technology for an entire scenario or for component tasks. We examined literature from the field of computer mediated communication (CMC) because interpersonal communication, as mediated by virtual environments, is still a new concept within the VR, AR, and MR fields.

It is first important to consider the research regarding the efficacy of various CMC modalities. How does an observing analyst best share important information with a squad at the tactical edge? When bandwidth on a tactical network is limited, or the likelihood of unwanted third-party observation, interception, or tampering of […] is high, what is the level of fidelity required to effectively execute command and control to complete the mission? These questions, which are not tied to any specific technology, are critical factors when assessing how an immersive medium might help or hinder individuals in a decision-making scenario.

Currently, there are many theories and models regarding CMC (23). Despite this, it is difficult to find metrics or evaluation frameworks that provide empirical evidence on the efficacy of these CMC modalities. To the best of our knowledge, existing studies examine only specific CMC modalities and compare them strictly against face-to-face communication methods. Across these studies, task performance has been used as the key metric for determining CMC efficacy (24; 25). It is also important to note that, generally, the only independent variable in these studies is the CMC modality. Furthermore, obtaining empirical evidence about task performance becomes increasingly difficult when multiple people are performing the task together (23). Factors such as the participants' relationship (26), amount of trust in their teammate (25), and the ability of each individual to perform their designated portion of the task (27) can individually and collectively be extremely hard to control. These factors must be considered both when designing experimental measures and when evaluating immersive systems. If a participant does not trust their counterpart as a valid source of information, or does not believe that the counterpart can competently complete the task, it will likely have a significant effect on task performance.

After examining the CMC literature, we have compiled a list of suggestions for conducting MR user studies with multiple local or remote users. First, it is important to create a rigorous study design that controls for confounding factors. If the experiment is using task performance to measure both technology efficacy and communication efficacy, make sure to include an appropriate number of permutations within the study design to control for order effects. Second, consider the participants' relationship as a factor. A group of friends and a group of strangers will behave and communicate differently, affecting the way in which they execute the task and how they value different performance outcomes. Third, include appropriate questionnaires to tap into specific measures of interest, rather than asking overly general or vague questions, such as "How did you like the communication system?". As examples, a questionnaire determining a participant's level of trust in another participant can be adapted from Rotter (28), and a questionnaire determining participants' level of rapport with one another can be adapted from Puccinelli and Tickle-Degnen (29).

2 The Mixed Reality Tactical Assessment Kit

The U.S. Army Research Lab and industry partner Stormfish Scientific have built a collaborative mixed reality infrastructure, called the Mixed Reality Tactical Assessment Kit (MRTAK). The goal of MRTAK is to allow researchers to perform controlled studies evaluating how immersive technologies compare against traditional systems in single and multi-user operational decision making tasks. MRTAK will also allow researchers to test and evaluate different network and data management control frameworks. Currently, we are collaborating with academic partners at the University of Minnesota, USC Institute for Creative Technologies, and University of California Irvine.

The DICE Network

One of the key components of MRTAK is the Defense Integrated Collaborative Environment (DICE) network (30), developed at ARL with Stormfish Scientific. DICE provides a confidential private network on which collaborative MR services are hosted, and local and remote clients can connect to these services through a secure VPN. A diagram of DICE is shown in Figure 3. This network was designed to meet rigorous Department of Defense and Army standards and uses policy-based security so that access can be controlled at multiple levels of granularity. Thus, DICE allows for controlled experimentation on how normal and degraded network conditions, where bandwidth may be extremely limited, affect different aspects of multi-user collaboration. For example, consider a situation where one user is providing navigation to another using complex spatial markers rendered over an AR display. If the network were to be strained to the point that image data could no longer be sent from the edge, researchers could test how teams could communicate that critical information over alternative channels until bandwidth was restored.
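As a rough illustration of the degraded-network scenario described above, the sketch below picks the richest communication modality that still fits within the available bandwidth and falls back to lighter channels as the link degrades. The modality names, the bandwidth thresholds, and the functions `select_modality` and `simulate_trace` are hypothetical and chosen only for illustration; they are not part of DICE or MRTAK.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Modality:
    """A communication channel and the bandwidth it is assumed to need."""
    name: str
    min_kbps: float  # hypothetical minimum bandwidth, in kilobits per second


# Ordered from richest to leanest. All thresholds are illustrative assumptions.
MODALITIES = (
    Modality("ar_spatial_markers", 2000.0),
    Modality("voice", 64.0),
    Modality("text_symbology", 2.0),
)


def select_modality(available_kbps: float) -> str:
    """Return the richest modality whose requirement fits the current link."""
    for modality in MODALITIES:
        if available_kbps >= modality.min_kbps:
            return modality.name
    return "none"  # link too degraded for any listed channel


def simulate_trace(bandwidth_trace: Iterable[float]) -> List[str]:
    """Map a sequence of bandwidth samples to the modality chosen at each step."""
    return [select_modality(kbps) for kbps in bandwidth_trace]


if __name__ == "__main__":
    # A made-up trace: healthy link, strained link, near outage, recovery.
    for kbps, chosen in zip(
        [3000.0, 50.0, 1.0, 3000.0],
        simulate_trace([3000.0, 50.0, 1.0, 3000.0]),
    ):
        print(f"{kbps:>7.1f} kbps -> {chosen}")
```

In an experiment of the kind described, the chosen modality at each step could be logged alongside task performance, so that the effect of each fallback channel on team outcomes can be compared once bandwidth is restored.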

Figure 3: Overview of the DICE Network structure.

Sensor Connectivity and […]

MRTAK is also equipped with a fully synchronized data source management system. This system allows for ingestion of sensor data from local users or from external devices such as Internet of Battlefield Things (IoBT) sensors, and is integrated through the DICE network. With this system, researchers are able to record data from all aspects of collaborative decision making in real time. Moreover, these data can even be viewed from within the immersive environment as a form of training or feedback. MRTAK allows key data to be recorded at each step of the decision making process, and allows experimenters to freely choose which display devices are used and whether the participants are local or remote. Similarly, communication among users via any modality (voice, symbology, tracks) can be recorded and used for later analysis. The underlying data framework also makes it easy to run machine learning applications on data generated from participants in the environment or on incoming information from external programs or sensors. Thus, models for value of information (31), information availability (32), or uncertainty (33) can be integrated and tested with respect to which display platform or tactical decision they are optimal for.

3 Conclusion

In conclusion, the current literature suggests that immersive technologies may provide a means of improving certain aspects of operational decision-making. Future work should aim to report more objective and precise measurements of task outcomes when comparing different immersive interfaces and, when possible, include comprehensive assessments of a user's background and experience with similar systems. Decision making tasks should be broken down into key processing steps, and performance increases or decreases should be discussed with respect to these elements and with respect to the overall mission. Physiological sensors can also be used to track information that may not be readily accessible through surveys or behavior, such as cognitive load or task engagement. Finally, future work should take special care when designing studies involving multiple participants and rigorously control for communication styles, prior relationships, and even cultural differences.

References

[1] J. Galegher, R. E. Kraut, and C. Egido, Intellectual teamwork: Social and technological foundations of cooperative work. Psychology Press, 2014.

[2] S.-Y. Tu and A. H. Sayed, "Distributed decision-making over adaptive networks," IEEE Trans. Signal Processing, vol. 62, no. 5, pp. 1054–1069, 2014.

[3] P. Milgram, H. Takemura, A. Utsumi, and F. Kishino, "Augmented reality: A class of displays on the reality-virtuality continuum," in Telemanipulator and telepresence technologies, vol. 2351, pp. 282–293, International Society for Optics and Photonics, 1995.

[4] M. A. Livingston, S. Russell, J. W. Decker, E. Leadbetter, and A. Gilliam, "Cedars: Combined exploratory data analysis recommender system," in Large Data Analysis and Visualization (LDAV), 2015 IEEE 5th Symposium on, pp. 139–140, IEEE, 2015.

[5] T. Chandler, M. Cordeil, T. Czauderna, T. Dwyer, J. Glowacki, C. Goncu, M. Klapperstueck, K. Klein, K. Marriott, F. Schreiber, and E. Wilson, "Immersive Analytics," in 2015 Big Data Visual Analytics, BDVA 2015, pp. 1–8, IEEE, sep 2015.

[6] C. Donalek, S. G. Djorgovski, A. Cioc, A. Wang, J. Zhang, E. Lawler, S. Yeh, A. Mahabal, M. Graham, A. Drake, et al., "Immersive and collaborative data visualization using virtual reality platforms," in Big Data (Big Data), 2014 IEEE International Conference on, pp. 609–614, IEEE, 2014.

[7] A. Moran, V. Gadepally, M. Hubbell, and J. Kepner, "Improving big data visual analytics with interactive virtual reality," in High Performance Extreme Computing Conference (HPEC), 2015 IEEE, pp. 1–6, IEEE, 2015.

[8] A. Dan and M. Reiner, "EEG-based cognitive load of processing events in 3D virtual worlds is lower than processing events in 2D displays," International Journal of Psychophysiology, vol. 122, pp. 75–84, 2017.

[9] A. Steed, Y. Pan, F. Zisch, and W. Steptoe, "The impact of a self-avatar on cognitive load in immersive virtual reality," in Virtual Reality (VR), 2016 IEEE, pp. 67–76, IEEE, 2016.

[10] J. P. McIntire, P. R. Havig, and E. E. Geiselman, "What is 3D good for? A review of human performance on stereoscopic 3D displays," p. 83830X, 2012.

[11] J. P. McIntire and K. K. Liggett, "The (possible) utility of stereoscopic 3D displays for information visualization: The good, the bad, and the ugly," 2014 IEEE VIS International Workshop on 3DVis (3DVis), 2014.

[12] B. Bach, R. Sicat, J. Beyer, M. Cordeil, and H. Pfister, "The Hologram in My Hand: How Effective is Interactive Exploration of 3D Visualizations in Immersive Tangible Augmented Reality?," vol. 24, pp. 457–467, jan 2017.

[13] S. Borsci, G. Lawson, and S. Broome, "Empirical evidence, evaluation criteria and challenges for the effectiveness of virtual and mixed reality tools for training operators of car service maintenance," vol. 67, pp. 17–26, feb 2015.

[14] D. Salanitri, C. Hare, S. Borsci, G. Lawson, S. Sharples, and B. Waterfield, "Relationship between trust and usability in virtual environments: An ongoing study," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9169, pp. 49–59, Springer, Cham, 2015.

[15] S. Borsci, G. Lawson, B. Jha, M. Burges, and D. Salanitri, "Effectiveness of a multidevice 3D virtual environment application to train car service maintenance procedures," Virtual Reality, vol. 20, pp. 41–55, mar 2016.

[16] J. Brooke et al., "SUS: a quick and dirty usability scale," Usability evaluation in industry, vol. 189, no. 194, pp. 4–7, 1996.

[17] D. H. Mcknight, M. Carter, J. B. Thatcher, and P. F. Clay, "Trust in a specific technology: An investigation of its components and measures," ACM Transactions on Management Information Systems (TMIS), vol. 2, no. 2, p. 12, 2011.

[18] S. M. C. Davison, C. Deeprose, and S. Terbeck, "A comparison of immersive virtual reality with traditional neuropsychological measures in the assessment of executive functions," vol. 30, pp. 1–11, apr 2017.

[19] R. M. Reitan and D. Wolfson, The Halstead-Reitan neuropsychological test battery: Theory and clinical interpretation, vol. 4. Reitan Neuropsychology, 1985.

[20] M. S. Dennison, A. Z. Wisti, and M. D'Zmura, "Use of physiological signals to predict cybersickness," Displays, vol. 44, pp. 42–52, 2016.

[21] M. S. Dennison and M. D'Zmura, "Cybersickness without the wobble: Experimental results speak against postural instability theory," Applied ergonomics, vol. 58, pp. 215–223, 2017.

[22] M. Dennison and M. D'Zmura, "Effects of unexpected visual motion on postural sway and motion sickness," Applied ergonomics, vol. 71, pp. 9–16, 2018.

[23] J. B. Walther, "Theories of computer-mediated communication and interpersonal relations," The handbook of interpersonal communication, vol. 4, pp. 443–479, 2011.

[24] M. Alavi, "Computer-mediated collaborative learning: An empirical evaluation," MIS quarterly, pp. 159–174, 1994.

[25] N. Bos, J. Olson, D. Gergle, G. Olson, and Z. Wright, "Effects of four computer-mediated communications channels on trust development," in Proceedings of the SIGCHI conference on human factors in computing systems, pp. 135–140, ACM, 2002.

[26] J. B. Walther, "Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction," Communication research, vol. 23, no. 1, pp. 3–43, 1996.

[27] G. Bubaš, "Computer mediated communication theories and phenomena: Factors that influence collaboration over the internet," in 3rd CARNet users conference, Zagreb, Croatia, Citeseer, 2001.

[28] J. B. Rotter, "A new scale for the measurement of interpersonal trust," Journal of personality, vol. 35, no. 4, pp. 651–665, 1967.

[29] N. M. Puccinelli and L. Tickle-Degnen, "Knowing too much about others: Moderators of the relationship between eavesdropping and rapport in social interaction," Journal of Nonverbal Behavior, vol. 28, no. 4, pp. 223–243, 2004.

[30] T. T. Trout, S. Russell, A. Harrison, M. Dennison, R. Spicer, E. S. Rosenberg, and J. Thomas, "Collaborative mixed reality (MxR) and networked decision making," in Next-Generation Analyst VI, vol. 10653, p. 106530N, International Society for Optics and Photonics, 2018.

[31] J. R. Michaelis, "Value of information driven content management in mixed reality infrastructures," in Next-Generation Analyst VI, vol. 10653, p. 106530O, International Society for Optics and Photonics, 2018.

[32] L. R. Marusich, J. Z. Bakdash, E. Onal, M. S. Yu, J. Schaffer, J. O'Donovan, T. Höllerer, N. Buchler, and C. Gonzalez, "Effects of information availability on command-and-control decision making: performance, trust, and situation awareness," Human factors, vol. 58, no. 2, pp. 301–321, 2016.

[33] A. Raglin and A. Harrison, "Reasoning with an uncertainty of information measure: decision making for military and non-military applications," in Next-Generation Analyst VI, vol. 10653, p. 1065308, International Society for Optics and Photonics, 2018.
