OmniWalker: An Omni-directional Walking Platform for Navigating Immersive Computer Based Mine Simulations

By Minghadi Suryajaya

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

The University of New South Wales School of Mining Engineering Sydney, Australia

July 2010

ORIGINALITY STATEMENT

I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgment is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.

Signed ...... Date ......

Abstract

Many computer-based simulations rely on conventional navigation methods such as a joystick and keyboard to enable the user to navigate around the virtual environment. In most cases, this constrains the users and prevents them from actually walking around in the synthesized environment as they would in a real environment. In some instances of safety-oriented simulations this may create a false impression of the difficulty of the tasks to be undertaken and the complexity of the environment. This thesis project presents a new method for omni-directional locomotion in the virtual environment. The method involved the invention of a mechanical walking platform and the development of a software tracking mechanism that runs in real time on commodity hardware. At the end of the project, a formal usability assessment of the system was conducted to evaluate the users' acceptance of the system.

The mechanical walking platform is relatively inexpensive and very robust for regular usage. The geometry of the platform allows the user to walk in a natural fashion without the need for active motors, increasing the user's safety while allowing much greater immersiveness compared to handheld devices such as a joystick or a mouse.

The software component of the system comprises the tracking module and the visualisation module. Camera-based tracking was chosen as the research direction for the tracking module in this project due to its low cost and robustness. Three different approaches were developed and assessed: markerless tracking, marker-based parallel-camera tracking and marker-based multi-camera tracking. The visualisation module is composed of custom-built visualisation software incorporating visual widgets that give real-time feedback to the user. The combined features of the tracking module and the visualisation module allow the user to learn to use the system quickly without significant formal training. Many open-source software libraries were used to accelerate the software development. However, significant software development effort was still necessary because significant parts of the needed functionality are not available off-the-shelf.

Acknowledgments

This thesis project has been an enormous undertaking with far too many ups and downs. I thank God for all the strength and faith. I would like to express sincere thanks to the following persons for providing assistance and guidance during the process of research and writing of this thesis:

Dr. Chris Fowler, my supervisor, for his advice, guidance and support during my research. The degree would not have been completed without his virtuous guidance.

Dr. Tim Lambert, for his insightful technical feedback and continuous encouragement. His assistance over these last few years has helped tremendously in getting this thesis project completed.

Dr. Phillip Stothard and Prof. Jim Galvin, my co-supervisors, for their valuable comments and suggestions on my work.

Dr. Serkan Saydam, the postgraduate coordinator for the School of Mining Engineering. His feedback and organisational assistance have helped me through this long journey of research.

All the staff members of the School of Mining Engineering for their individual help, encouragement and provision of both resources and friendship during my time with the school.

Finally, to my beloved Gladys and my family for their constant love and understanding during my long-awaited thesis completion. Special thanks to Gladys for forcing me to work on a 16/7 schedule on this thesis project. Her understanding and discipline have sustained me throughout my study.

To my parents

Contents

Abstract ii

Acknowledgments iv

Contents vi

List of Figures xi

List of Tables xix

Nomenclature xxii

1 Introduction 1
1.1 Background and Motivation ...... 1
1.2 Thesis Goal and Contribution ...... 3
1.3 Thesis Structure ...... 4
1.4 Produced Papers ...... 5

2 Locomotion Interfaces: An Overview 6
2.1 Walking-based Locomotion Interfaces ...... 7
2.1.1 Motor-based Locomotion Interfaces ...... 7
2.1.2 Sliding-based Locomotion Interfaces ...... 15
2.2 Leg-based Locomotion Interfaces ...... 17
2.2.1 Pedal-based Locomotion Interfaces ...... 18
2.2.2 Step-in-Place Locomotion Interfaces ...... 19
2.2.3 Other Novel Techniques ...... 21


2.3 Hand-based Locomotion Interfaces ...... 25
2.3.1 Joystick-based Locomotion Interfaces ...... 25
2.3.2 The Magic Barrier Tape ...... 26
2.4 Concluding Remarks ...... 26

3 Mechanical Hardware Design for the OmniWalker 30
3.1 Laboratory Layout ...... 30
3.2 Walking Platform ...... 30
3.2.1 Ball Transfer Conveyor ...... 30
3.2.2 Omni-directional Stroller (Early Designs) ...... 33
3.2.3 The Dish Platform ...... 34
3.2.4 Rubber Enclosure ...... 36
3.2.5 Holding Frame ...... 37
3.3 Safety Peripherals ...... 37
3.3.1 Safety Harness Support Structure ...... 37
3.3.2 Safety Harness ...... 40
3.4 Conclusion ...... 40

4 System Architecture and System Hardware 42
4.1 System Architecture ...... 42
4.2 Computer and Tracking Hardware ...... 43
4.2.1 Camera ...... 43
4.2.2 Controller and Visualisation Hardware ...... 45
4.2.3 Markers ...... 46
4.2.4 Computer Hardware ...... 50

5 Tracking Module 52
5.1 Literature Review on Motion Capture ...... 52
5.1.1 Mechanical Trackers ...... 52
5.1.2 Magnetic Trackers ...... 53
5.1.3 Inertial Trackers ...... 54

5.1.4 Optical Trackers ...... 55
5.2 Attempts at Markerless Tracking ...... 62
5.2.1 Motivation ...... 62
5.2.2 System Design ...... 63
5.2.3 Experiment Results ...... 64
5.2.4 Discussion ...... 69
5.3 Marker-based Tracking using a Parallel Camera System ...... 71
5.3.1 System Design and Overview ...... 71
5.3.2 Camera Calibration ...... 72
5.3.3 2D Tracking ...... 73
5.3.4 3D Tracking ...... 80
5.3.5 Kalman Filter ...... 81
5.3.6 Avatar Control ...... 82
5.3.7 Tracking GUI Design and Implementation ...... 85
5.3.8 Experimental Results and Discussion ...... 88
5.4 Marker-based Tracking using a Multi Camera System ...... 89
5.4.1 System Design and Overview ...... 89
5.4.2 2D Tracking ...... 90
5.4.3 3D Tracking and Camera Calibration ...... 95
5.4.4 Avatar Control ...... 99
5.4.5 The Tracking GUI Design and Implementation ...... 99
5.4.6 Experimental Results ...... 102

6 Visualisation Module 104
6.1 Literature Review ...... 104
6.1.1 Overview of Graphics APIs, Graphics Rendering Engines and Game Engines ...... 104
6.1.2 Physics Engines ...... 108
6.1.3 Character Animation Library ...... 109
6.1.4 GUI Library ...... 110

6.1.5 Input Devices Library ...... 110
6.1.6 Modelling Software ...... 110
6.2 Components of Choice ...... 111
6.3 Modelling the Avatar ...... 112
6.4 Modelling the Virtual Environment ...... 115
6.5 Software System ...... 117
6.5.1 Class Design and Implementation ...... 117
6.5.2 System Prototype ...... 120

7 User Evaluation 122
7.1 Literature Review ...... 122
7.1.1 Usability Evaluation of Locomotion Interface ...... 122
7.1.2 Distance Perception in Virtual Reality ...... 126
7.1.3 Measuring Presence in Virtual Reality ...... 128
7.2 Usability Test ...... 130
7.2.1 Distance Estimation in Forward Walking (Experiment 1) ...... 130
7.2.2 Assessing Presence and Distance Estimation in Free Exploration Mode (Experiment 2) ...... 134
7.3 Discussion ...... 137

8 Conclusion and Future Work 140
8.1 Conclusion ...... 140
8.2 Future Work ...... 143

References 146

Appendices 162

A The HSV colour Model 163

B Results from Forward Walk Test (Experiment 1) 165

C Distance Estimation Results in Free Exploration Mode (Experiment 2) 167

D The Presence Questionnaire 169

E Presence Questionnaire Results 171

List of Figures

2.1 The Sarcos Treadport's CAVE display and simulated tilted floor ...... 8
2.2 Omni-directional treadmills: (a) Early design by VSD Inc. (b) Torus Treadmill (c) Cyberwalk ...... 9
2.3 (a) The Gait Master 1 (b) The Gait Master 2 (Iwata et al, 2001), (c) Mobility Simulator (Boian et al, 2004) ...... 10
2.4 The Cybersphere (Fernandes et al, 2003) ...... 12
2.5 The Virtusphere (Virtusphere Inc.) ...... 12
2.6 The Circula Floor (Iwata et al, 2005) ...... 13
2.7 Powered Shoes (Iwata et al, 2006) ...... 14
2.8 The String Walker (Iwata et al, 2007) ...... 15
2.9 The Cybercarpet (Schwaiger et al, 2007) ...... 15
2.10 The Virtual Perambulator (Iwata and Fujii, 1996) ...... 16
2.11 Omni-directional Ball-bearing Platform (Huang, 2003): (a) side view, (b) surface of the OBDP, (c) cross-sectional view of the position sensor, (d) snapshot of the raw tracking data ...... 17
2.12 The Sarcos Uniport (Darken et al, 1997) ...... 18
2.13 Walking Pad (Bouguila et al, 2004): (a) first implementation and (b) second implementation ...... 19
2.14 The Gaiter system (Templeman et al, 1999) ...... 20
2.15 Using the Wiimote accelerometer in the leg interfaces (Shiratori and Hodgins, 2008): (a) walking, (b) running, (c) jumping and (d) turning ...... 21


2.16 Redirected walking (Engel et al, 2008): (a) virtual environment (b) real walking path ...... 22
2.17 The Step WIM (LaViola et al, 2001): (a) the scaled map, (b) the foot-based interface and (c) user leaning in local navigation mode ...... 23
2.18 Game console controllers: (a) Sony PS3 Six axis (b) XBox gamepad (c) Nintendo Wiimote and Nunchuk (Amazon Store, 2009) ...... 26
2.19 The Magic Barrier Tape: (a) User with markers attached on the HMD and the hand (b) The barrier tape got pushed by the user (c) The warning tape (red coloured) showing the boundary of the physical walking space (Cirio et al, 2009) ...... 27

3.1 Lab room layout ...... 31
3.2 Perspective view of the lab room design ...... 31
3.3 The custom-made ball transfer unit ...... 32
3.4 Lemcol ball transfer: (a) Schema diagram, (b) Original form, (c) Trimmed form ...... 33
3.5 The first prototype of the omni-directional stroller: (a) top-view (b) bottom-view ...... 33
3.6 The second prototype of the omni-directional stroller: (a) bottom-view showing the Lemcol units (b) top-view showing the shoe strings ...... 34
3.7 Dish platform: (a) 1.5m diameter, (b) 0.75m diameter ...... 35
3.8 Inclination test platform ...... 36
3.9 Platform - work in progress: (a) the cardboard prototype, (b) the cut-out plywood dish ...... 36
3.10 Rubber cover: (a) First layer (b) Second layer cover ...... 37
3.11 Holding frame: (a) height adjustment mechanism shown (b) opening method ...... 38
3.12 Draft design of the safety platform – using sliding mechanism ...... 38
3.13 Wall-mounted support structure design for the safety platform ...... 39

3.14 Support structure with the beams connector and its base magnified: (a) magnified beams connector (b) the hanging beam, complete view (c) magnified base with rotational joint ...... 39
3.15 Safety harness: (a) confined space type (b) fall arrest type (MSA Incorporated) (c) Dee Shackles ...... 40
3.16 Safety strap: (a) stainless steel chain (b) and (c) adjustable cam-buckle ...... 41

4.1 System architecture ...... 43
4.2 The camera positions. Note the camera above the screen ...... 46
4.3 Visualisation peripherals: (a) Vuzix VR920 HMD (b) projector ...... 47
4.4 Wiimote Nunchuk ...... 47
4.5 Plain coloured markers ...... 48
4.6 LED markers ...... 48
4.7 (a) UV light bulb (b) UV illuminated markers ...... 49
4.8 Chessboard calibration object: (a) A4 size (b) A2 size ...... 49
4.9 Calibration markers: (a) LED torch, (b) LED markers ...... 50
4.10 PC Workstations: Workstation 1 (right) and Workstation 2 (top-left) attached to the monitor (bottom-left) ...... 50

5.1 Gypsy 7 (Animazoo Ltd.) ...... 53
5.2 Flock of Birds magnetic tracker (Ascension Tech Co.) ...... 54
5.3 IGS-190 (Animazoo Ltd.) and Inertia Cube (i-glassesstore.com) ...... 55
5.4 Optotrak System: (a) Vertus cameras (b) 6DOF rigid body markers (c) 7mm and 11mm IRED markers ...... 56
5.5 Vicon tracking system (UC Merced's School of Engineering, 2008) ...... 57
5.6 HiBall-3100 Tracker (3rdTech Inc.) ...... 57
5.7 Pfinder results (Wren, 1996) ...... 58
5.8 Images from the chamfer matching method (Gavrila and Davis, 1996): (a) raw image (b) scene edge image (c) filtered edge image (d) chamfer image ...... 59

5.9 Images from the method of chamfer matching results (Gavrila and Davis, 1996) ...... 59
5.10 Images from the method of motion estimation using exponential maps and twist motion (Bregler and Malik, 1998) ...... 60
5.11 Images from the method of articulated body motion capture by annealed particle filtering (Deutscher et al, 2000) ...... 60
5.12 Multi camera tracking results from Dockstader and Tekalp (2001) ...... 60
5.13 Tracking Human Motion from Silhouettes (Agarwal and Triggs, 2004) ...... 61
5.14 Tracking Human Motion using Inverse Kinematic constraints (Boulic et al, 2005): (a) position of user against plain background (b) palms tracked by skin colour segmentation (c) search result for elbows and limbs ...... 62
5.15 Markerless tracking module design ...... 63
5.16 Chroma keying result using blue coloured background in which the walking platform is not able to be segmented out: (a) raw image (b) foreground image (c) foreground mask ...... 64
5.17 Absolute difference technique: (a) background model (b) real-time capture (c) absolute difference result (d) binary thresholding (intensity threshold = 120) ...... 65
5.18 Codebook foreground segmentation method: (a) Background model (b) Real time capture (c) Foreground mask (d) Segmented foreground ...... 65
5.19 Top level flow diagram of the markerless foreground segmentation ...... 66
5.20 Subflow process diagram of the upper body segmentation ...... 67
5.21 Subflow process diagram of the lower body segmentation ...... 68
5.22 Foreground segmentation results of user in various poses: (a) arms apart in T-stand shape (b) right forward walking gait (c) left forward walking gait (d) side stepping gait ...... 70
5.23 Parallel camera setups and the generated views ...... 71

5.24 System design of the parallel camera tracking system ...... 72
5.25 The stereo coordinate system used in OpenCV for undistorted rectified cameras (Bradski and Kaehler, 2008) ...... 73
5.26 Chessboard samples for calibration ...... 74
5.27 2D Tracking flow diagram in parallel camera system ...... 75
5.28 Histogram example: given a set of coloured regions (a), a histogram with bin size of 5 represents the count for each category of bin (b) ...... 75
5.29 Colour models: (a) CIE 1931 (b) RGB (c) CMY (d) YUV (U-V colour plane for Y = 0.5) (e) HSV (Rolf G. Kuehni, 2003) ...... 76
5.30 Parallel camera system: histogram backprojection on green colour, user in T-stand posture ...... 77
5.31 SV thresholding: (a) enumerated positions of markers (b) position of markers as shown by thresholding ...... 77
5.32 Mean-shift algorithm at work (Bradski and Kaehler, 2008). An initial window is placed over an array of data points at the top right corner and is successively recentered over the local peak of its data distribution until convergence ...... 79
5.33 2D Tracking initialisation with parallel cameras: (a) rectified image (b) post-S&V thresholding (c) contour count ...... 79
5.34 The recursive cycle of the Kalman filter (Welch and Bishop, 2006) ...... 82
5.35 Kalman filter output of the raw z-position data ...... 83
5.36 Walking animation states ...... 84
5.37 GUI design of the tracking module ...... 86
5.38 Tracking from three acquired images into 3D tracking points using a multi-camera system (3 cameras) and LED orbs (actual images are darker than shown here) ...... 89
5.39 System flow diagram ...... 90
5.40 2D Tracking flow diagram. Dashed lines represent one-off processes that are run during initialisation ...... 91
5.41 Intensity thresholding examples ...... 92

5.42 Multi camera system: histogram backprojection on a magenta colour ...... 92
5.43 Mean-shift tracking inputs generated from intensity thresholding and histogram backprojection ...... 93
5.44 2D Tracking initialisation on multi camera system. All feature points are found at the threshold value of 50 ...... 94
5.45 Multicam process flow diagram ...... 96
5.46 Cameras placement ...... 96
5.47 Results of multi camera calibration using Svoboda et al's calibration toolkit: (a) the track of the marker's wand (b) mean and standard deviation of 2D error ...... 97
5.48 LED markers placed on the rig to generate the ground projection matrix ...... 98
5.49 GUI design of the tracking module ...... 101
5.50 Kalman filter output from the current prototype (top) and the earlier prototype (bottom). Dashed line represents the original 3D position trajectory ...... 102
5.51 X-axis and Z-axis position of the right foot during forward walking gait (a) and during left rotation gait (b). The X-axis represents the image acquisition rate of 2ms in-between ...... 103

6.1 Character animation: (a) mesh-based (b) skeleton-based ...... 109
6.2 The first import of a ninja avatar model ...... 113
6.3 Completed mine worker avatar model ...... 115
6.4 Exporting avatar model into OgreMesh file ...... 116
6.5 Importing 3DSMax file to Blender3D and Ogre3D ...... 117
6.6 Class diagram of the visualisation module ...... 118
6.7 OmniWalker widgets consist of 7 images; one for the (transparent) overlay and six images for each foot tile ...... 119
6.8 The comparison between the coordinate system given by the tracking module and the coordinate system of the Ogre3D overlay. Violet arrows indicate the forward facing direction ...... 120

6.9 Visualisation module implementation with all the available input devices ...... 121
6.10 Screenshot of the user's first-person view and third-person view ...... 121

7.1 WalkingPad: (a) simulated environment labyrinth (b) experiment result for 2 walking trials (Bouguila et al, 2004) ...... 123
7.2 Virtual Perambulator: (a) Straight path walk (b) Closed square path walk (Iwata and Matsuda, 1992) ...... 124
7.3 Accelerometer-based locomotion interface test tracks (Shiratori and Hodgins, 2008) ...... 124
7.4 Mean error distances for the three locomotion modes (Iwata, 1999) ...... 125
7.5 Distance judgments comparisons from different graphical rendering methods (Thompson et al, 2004) ...... 127
7.6 Judged distance in direct walking, comparison between real HMD in a virtual world and mock HMD in the real world (Willemsen et al, 2004) ...... 128
7.7 Distance judgment results comparing the use of an avatar with no avatar (Mohler et al, 2008) ...... 129
7.8 Comparison of the difference in participants' average relative error in the virtual and real world (Ries et al, 2009) ...... 130
7.9 Screenshots of the user's first person view in the first experiment: (a) 6m (b) 18m (c) 30m (d) 42m ...... 131
7.10 Third person view of the walking track (18m distance) ...... 132
7.11 Usability test participant: (a) using the OmniWalker (b) using the Nunchuk joystick ...... 133
7.12 Distance estimation absolute error results from Experiment 1. The error bars represent standard error of the mean of the estimated distances ...... 134

7.13 Distance estimation relative error result from Experiment 1. The error bars represent standard error of the mean of the relative errors. Negative value means the user underestimated the actual distance walked ...... 135
7.14 Plain scene: (left) top view, (right) perspective view ...... 136
7.15 Underground coal mine scene: (a) top view, (b) first-person view showing personnel transport vehicle and lifeline ...... 137
7.16 Distance estimation relative error result (PS: plain scene, CM: coal mine, Red Width: width of red block, B&Y dist: distance between blue and yellow blocks) ...... 138
7.17 Presence Questionnaire result showing 95% confidence intervals for the mean result ...... 139

A.1 The HSV hex cone ...... 163

List of Tables

2.1 Number of game consoles sold worldwide. The Nintendo Wii figure is quoted for April 2009 to March 2010 ...... 25
2.2 Summary of the walking-based locomotion interfaces ...... 29

4.1 Camera comparisons ...... 45
4.2 Workstation specifications ...... 51

5.1 Parallel camera tracking module configuration file parameters ...... 85
5.2 Initialisation parameters in multi camera system. The NUMTRACKPOINTS variable refers to the number of markers that need to be tracked ...... 95
5.3 Tracking module configuration file parameters for multi-camera system. This configuration mechanism allowed the system to use different input modes: live camera feed, video camera, or live camera with recording capability ...... 100

6.1 Common plug-ins ...... 107
6.2 Ogre3D common plug-ins ...... 107
6.3 Delta3D underlying components ...... 108
6.4 Components of choice ...... 111
6.5 Avatar key frames animation (Psionic3D) ...... 114
6.6 OgreMeshExport script parameters ...... 116

B.1 Distance estimation results from experiment 1...... 165

C.1 Distance estimation results from experiment 2...... 168


D.1 Presence questionnaire used in experiment 2 ...... 169

E.1 Presence questionnaire results using a joystick in experiment 2. The results are on a 7 point scale ...... 172
E.2 Presence questionnaire results using the OmniWalker in experiment 2. The results are on a 7 point scale ...... 173

Nomenclature

API      Application Programming Interface
ATLAS    ATR Locomotion Interface for Active Self Motion
CAVE     Cave Automatic Virtual Environment
CIE      International Commission on Illumination
CMY      Cyan Magenta Yellow
CUDA     Compute Unified Device Architecture
DLT      Direct Linear Triangulation
DOF      Degrees of Freedom
fps      frames per second
GUI      Graphical User Interface
HMD      Head Mounted Display
HSV      Hue Saturation Value
HVA      Horizontal Viewing Angle
KMJ      Keyboard and/or Mouse and/or Joystick (referring to the use of any combination of these devices)
LED      Light Emitting Diode
OBDP     Omni-directional Ball-bearing Disc Platform
ODT      Omni-directional Treadmill
OpenCV   Open Source Computer Vision library
QVGA     Quarter Video Graphics Array – 320 x 240 pixels resolution
RGB      Red Green Blue
SDK      Software Development Kit
VE       Virtual Environment
VR       Virtual Reality
VRPN     Virtual Reality Peripheral Network
WIM      World in Miniature

Chapter 1

Introduction

1.1 Background and Motivation

Mining is a high-risk industry that requires highly trained and experienced personnel to maintain safety standards [116]. The safety performance report of the Australian minerals industry [83] reported 1,484 injuries and 4 deaths in mining accidents in 2008. The same report quoted 3,037 injuries and 25 fatalities in the US, and 1,944 injuries and 85 fatalities in South Africa.

Many virtual training simulations have been developed over the past years for the mining industry. Denby et al [35] and Squelch [115] at the University of Nottingham pioneered the work using low-cost simulator software. Stothard [116] at the University of New South Wales has developed immersive simulations for underground coal mines using a 360 degree projection screen. Kizil [70] and Kizil et al [71] at the University of Queensland Minerals Industry Safety and Health Centre (MISHC) have developed VR applications to address mining industry issues such as safety training and accident reconstruction. Schofield [109] and Schofield et al [108] have conducted research on the effects of viewing and interacting with VR environments. Recently, the Australian Mining Industry Skills Centre has established Project Canary [84] to develop serious games-based training tools for the resources industry.


Immersiveness is one of the key factors that influences the quality of the virtual reality experience [23]. Some virtual reality training simulators, such as driving and flight simulators, can be built in a cabin-like form to immerse a user in the virtual environment. However, there are other simulations in which such an arrangement is not suitable; for example, in a scenario known as unaided Self Escape [116] from an underground coalmine, the user in the real scenario would not be constrained within a cabin-like environment.

The keyboard, mouse and joystick (KMJ) have been the most prevalent interaction methods in virtual simulations. The KMJ were developed in the 1960s and have matured over the years. These devices are the easiest and the most robust interaction devices available to this day. Because of this, most of the currently developed training simulations that require users to walk around the virtual environment use combinations of these (KMJ) devices. Although many learning objectives can be achieved using the KMJ, there are some factors that cannot be sufficiently represented, such as fatigue and environmental awareness. Some training scenarios, such as Unaided Self Escape, require a user to walk from a few hundred metres to a few kilometres using a breathing apparatus. The use of handheld devices such as a joystick cannot simulate this experience.

Usoh et al [128] and Zanbaka et al [144] have found that real walking, rather than "fly-at-ease" joystick-controlled motion, significantly improves the subject's learning rate and understanding of the virtual environment. Suma et al [117] studied the difference in users' behaviour when exploring a virtual maze using a handheld controller or real walking techniques. They found that real walking techniques facilitated quicker exploration and that users tended to better remember the locations of objects in the environment. In some training scenarios, such as underground coal mine Map Reading Training [90], these factors could greatly enhance the user's learning rate.

A device that allows the user to perform actual walking is called a locomotion system. Most of these devices have issues involving safety and are very expensive to build and maintain. One of the best known locomotion systems is the omni-directional treadmill [33, 66]. It consists of two perpendicular treadmills woven together into a mechanical fabric. There are other locomotion systems where motorized platforms are used to compensate for the subject's foot movement and hence keep the subject stationary [4, 64, 63, 68, 112]. These locomotion systems are often very speed-limited for safety reasons. Sudden acceleration and deceleration while walking is something we commonly do, and active motors are inherently incapable of responding rapidly to such unpredictable changes of velocity [19]. Added to the problems of locomotion systems is the cost of building and maintaining these devices.

Hollerbach [55] has found that concrete implementations of locomotion interfaces are not as developed as haptic interfaces. This is due to the safety, cost and complexity of these sophisticated locomotion devices. Hence, in this thesis project we aim to use commonly available hardware, reusable software components and a flexible system architecture.

1.2 Thesis Goal and Contribution

The goal of this thesis project is to develop a locomotion system for virtual environments that is relatively inexpensive and robust for regular usage yet still capable of delivering superior immersiveness to the user. This thesis project presents the following contributions:

• A design of the OmniWalker platform that is relatively inexpensive to build and robust for regular usage. The platform is built from the ground up using heavy duty industry components.

• A real-time marker-based tracking system using inexpensive webcams and multi-coloured markers. The 3D coordinates obtained are sufficiently accurate to be used for full-body motion capture.

• A real-time walking recognition system for the OmniWalker. The OmniWalker platform allows the user to perform a real walk; however, the walk cannot be mapped directly to the avatar due to the sliding motion of the foot. A separate gesture classifier was developed to produce a natural-looking walking motion.

• A more extensive evaluation method for a locomotion interface than is currently available. It is common practice in locomotion interface research to evaluate users' performance based on specific metrics. We present a more complete evaluation method incorporating a broader range of usability aspects.

1.3 Thesis Structure

This thesis is divided into eight chapters. This first chapter presents the motivation for the work in the remainder of the thesis.

Chapter 2 is a review of the literature associated with different locomotion interfaces. The major aspects discussed are the advantages and limitations of different approaches.

Chapter 3 describes the construction of the OmniWalker hardware platform and the safety measures built into the device.

Chapter 4 describes the system architecture in general. It sets the scene for the separate components described in the following chapters and shows how they fit together into an integrated system.

Chapter 5 describes the tracking mechanism used in the OmniWalker. It describes the earlier prototype, which used a parallel camera system, and compares the performance of both approaches.

Chapter 6 describes the development of the visualisation module. The visualisation module is responsible for the rendering and pooling of all the input interfaces.

Chapter 7 contains the user evaluation of the OmniWalker in distance estimation and an assessment of user performance in different conditions.

Chapter 8 contains the conclusion and possible future work regarding the OmniWalker. Limitations of the present system and suggestions and recommendations for further study are also presented.

1.4 Produced Papers

The following publications were produced by the author, or in conjunction with the author, during his doctoral candidacy.

1. M. Suryajaya, T. Lambert and C. Fowler, "Camera-based OBDP Locomotion System," Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology (VRST 2009), Kyoto, Japan, pp. 31-34, Nov. 2009.

2. M. Suryajaya, T. Lambert, C. Fowler, P. Stothard, D. Laurence and C. Daly, "OmniWalker: Omni-directional Stroller-based Walking Platform," Proceedings of the 12th International Conference on Virtual Reality (Laval Virtual VRIC 2010), Laval, France, pp. 181-182.

3. M. Suryajaya, C. Fowler, T. Lambert, P. Stothard, D. Laurence, and C. Daly, "Development and Evaluation of OmniWalker for Navigating Immersive Computer Based Mine Simulations," Proceedings of the SimTecT 2010 Conference, Brisbane, Australia, 2010, pp. 209-214. (Received the Best Paper Award, following which the paper was presented at the Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) in Orlando, Florida, in December 2010.)

4. M. Suryajaya, T. Lambert, C. Fowler and P. Stothard, "Assessing user's egocentric distance estimation in OmniWalker," Proc. Virtual Reality Software and Technology (VRST 2010), Hong Kong, China, under review.

Chapter 2

Locomotion Interfaces: An Overview

In Virtual Reality, walking is the most intuitive way for people to move about [68, 128, 55]. It provides a better sense of distance and direction compared to driving or riding a vehicle. This chapter gives an overview of related work in the area of locomotion interfaces that have been developed over the past decades.

The criteria for defining a locomotion interface vary significantly from one source to another. The Locomotion Interface Research Lab (LIRL) at the University of Utah [111] defined a locomotion interface as an energy-extractive device that allows the user, within a confined space, to experience human mobility such as walking and running over an unrestrained distance in a virtual environment. The LIRL definition explicitly excludes joysticks or other handheld devices from consideration as locomotion interfaces. The LIRL definition is centred on the hypothesis that the user experiences increased immersion with the ability to explore a virtual environment on foot. The LIRL definition is also supported by other researchers [66, 56]. On the other hand, Burdea and Coiffet [23] defined a locomotion interface as a device that is able to measure the motion of the user's head, hands or limbs for the purpose of navigating around the virtual environment. This definition is considerably broader in scope than the LIRL definition. It includes all possible means that enable the user to move around the virtual environment.

Many hybrid approaches combine the use of hands and legs, for example [130, 27]. To add further confusion to this, Bowman et al [19] define the use of locomotion devices where the user does not move from his real-world position as passive locomotion. Comparatively, active locomotion refers to a real walk where the virtual environment is a direct map of the real environment. The active locomotion definition does not take into consideration recent novel approaches such as redirected walking [44, 103]. In the redirected-walking approach, the user walks in a straight line in the virtual world without being aware that the view is distorted in such a way that the user is actually walking along a curved path in the real world.

In this chapter we discuss different approaches categorized by their type of interface. In the first section, we describe walking-based locomotion interfaces. These locomotion interfaces fit the LIRL's definition of what constitutes a locomotion interface. They feature energy extraction and unrestrained mobility within a confined space. In the second section, we describe leg-based locomotion interfaces. These interfaces feature energy extraction but do not simulate walking or running activity. The locomotion devices that fit into this category are pedal-based and step-in-place locomotion interfaces. The third section describes a range of novel hand-based locomotion interfaces. These devices are aimed at providing greater usability to the user. In the last section we provide a discussion of the advantages and disadvantages of these devices, which lays the foundation for the later chapters.

2.1 Walking-based Locomotion Interfaces

2.1.1 Motor-based Locomotion Interfaces

Linear Treadmills

Linear treadmills have been used since the early research into locomotion interfaces. Researchers have used a variety of mechanisms to improve the basic linear treadmill. Brooks [22] utilizes a steering bar for changing direction. Noma and Miyasato [91] employ a series of linear actuators underneath the treadmill belt for simulating the slope of a virtual terrain. Christensen et al [26] use a large manipulator connected to the walker to provide a sense of gravitation while the walker is passing on a slope. Hollerbach et al [4] developed the Sarcos Treadport which enabled the simulation of a tilted floor (Figure 2.1). The system employs a large tilting treadmill, an active mechanical tether, and a CAVE-like visual display. The CAVE-like visual display consists of three rear-projection screens surrounding the user (Figure 2.1 (left)).

One advantage of using a linear treadmill is the robustness of the hardware, since treadmills have been around for decades. However there is an issue with controlling the turning. Noma and Miyasato [91] developed the ATLAS system, which is a linear treadmill suspended on a spherical joint that can pitch, yaw and roll. Turning in the ATLAS system is achieved by turning the treadmill in the direction that the user is stepping.

Figure 2.1: The Sarcos Treadport’s CAVE display and simulated tilted floor.

Omni-directional Treadmills

Virtual Space Devices (VSD), Inc. [33] built the first Omni-directional Treadmill (ODT) (Figure 2.2(a)). It employs two perpendicular treadmills, one inside the other. Each belt is made of approximately 3400 separate rollers woven together into a mechanical fabric. Separate servo motors control each belt. The motion of the lower belt is transmitted by the rollers to the walker.

Iwata [66] improved this further by developing the Torus Treadmill. It addresses some of the problems with the omni-directional treadmill, among them the severe noise level generated and the low level of safety when the user crawled on the surface. The Torus Treadmill also has the potential to present an uneven surface using an array of linear actuators. The Technical University of Munich built a similar system called Cyberwalk [89] (Figure 2.2(c)).

Figure 2.2: Omni-directional treadmills: (a) Early design by VSD Inc. (b) Torus Treadmill (c) Cyberwalk

As reported by Darken et al [33], there were major problems with the early ODT's tracking systems and machine control mechanisms for centring the user on the treads. There was a tendency for the user to stumble when the user intended to stop but the system had not yet settled to rest. The worst cases of stumbling occurred when the treads over-rotated in centring the user; the user would reflexively place a foot either in front of or behind the body to stop the fall. In turn, this brief motion was sensed by the tracker, which led the ODT to respond by pulling the treads even more in the direction of the fall. Consequently, this triggered the kill switch and shut down the system. These major limitations have also not been addressed in other omni-directional treadmills [66, 89].

The Gait Master

The Gait Master (Figure 2.3(a) & (b)) is a locomotion interface that creates a sense of walking on an uneven surface [64]. There were two implementations for this device.

Figure 2.3: (a) The Gait Master 1 (b) The Gait Master 2 (Iwata et al, 2001), (c) Mobility Simulator (Boian et al, 2004)

The Gait Master 1 consisted of a two degrees of freedom (2 DOF) motion platform that moved back-and-forth as well as up-and-down (Figure 2.3(a)). It employed a pantograph mechanism, where the base joints were actuated by two AC motors and chains. It was able to simulate forward walking on an uneven terrain. The horizontal working area of the platform was 80cm. The vertical working area was 20cm. The maximum horizontal speed of the foot pad was 1.5m/s and the maximum payload was 80kg. Initially the system used a Polhemus FASTRACK motion tracker to track the user's foot positions. However, due to the magnetic interference caused by the metals in the motion platform, the FASTRACK was replaced by a custom-made string sensor. The string sensor consisted of three sets of tensioned strings connected between the foot pad and the sandals of the user. The system was able to recognise the user's walking gait by measuring the distance between the foot and the foot pad.

The Gait Master 2 is the second prototype. It employs a turntable to enable omni-directional walking (Figure 2.3(b)). Compared to the first implementation, the Gait Master 2 uses motion platforms reassembled from a 6 DOF Stewart platform [81] mounted on a turntable. A three DOF goniometer is connected to each foot to measure back-and-forth and up-and-down motion as well as yaw angle. The horizontal working area is 32cm (front-back) x 28cm (left-right) x 20cm (up-down). The maximum payload is 150kg. Compared to the Gait Master 1, the Gait Master 2 allows the user to change walking direction by up to 90 degrees in a single step. However, due to the length limitation of the linear actuators, the working area of the Gait Master 2 is very limited. The user can only walk in small steps (32cm).

Boian et al [14] developed a locomotion system similar to the Gait Master 2. Their system uses two Stewart platform robots for the mobility simulator (Figure 2.3(c)). The system was developed to allow training on various surfaces within a safe clinical environment. Because it is used for therapy purposes, it uses smaller actuators than the Gait Master 2.

Safety is a potential concern in Gait Master-type locomotion interfaces [55]. Because of the weights and forces that need to be supported, these types of locomotion interfaces can be quite powerful. Programmable foot platforms are essentially powerful robot arms attached to the feet, and any malfunction can cause serious injury to the user.

The Cybersphere

The Cybersphere [45] consists of a large, hollow, translucent sphere, 3.5 metres in diameter (Figure 2.4). It is supported by a low-pressure cushion of air to ease the sliding motion of the sphere. Below the sphere there is a smaller secondary sphere equipped with rotation sensors. The movement of the user inside the sphere is tracked using these rotation sensors. Images are projected directly onto the sphere from five projectors. One projector is mounted on the ceiling directly above the Cybersphere and four projectors are mounted on the surrounding walls.

Figure 2.4: The Cybersphere (Fernandes et al, 2003)

The Virtusphere

The Virtusphere [131] is a simulation platform similar to the Cybersphere, but with different tracking and moving mechanisms (Figure 2.5). The Virtusphere uses active rollers below the large sphere. These rollers are motorised to rotate the sphere in the opposite direction to the user's movement. Instead of projecting the image onto the sphere's surface, the Virtusphere uses a wireless head mounted display (HMD) to generate the user's view. The HMD also generates tracking data for the active rollers placed below the sphere. The user's experience in using the Virtusphere is difficult to quantify because there is no usability study. There is a potential risk when the sphere picks up too much speed, causing the user to somersault inside and potentially be injured [11].

Figure 2.5: The Virtusphere (Virtusphere Inc.)

The Circula Floor

The Circula Floor (Figure 2.6) simulates an infinite omni-directional surface using a set of omni-directional movable tiles [63]. Each tile is equipped with a holonomic mechanism that achieves omni-directional motion. An infinite surface is simulated by the circulation of the movable tiles. Position sensors measure the motion of the feet. The tiles move in the opposite direction to the walker’s measured direction, cancelling the motion of the step. This computer-controlled motion of the tiles fixes the walker’s position. The circulation of the tiles can cancel the walker’s displacement in an arbitrary direction. Hence the walker can freely change direction while walking. It has been reported that the major limitation with the Circula Floor approach is the insufficient accuracy of the tiles in tracing the walker’s foot. The walker has to be careful not to miss the tile [63].

Figure 2.6: The Circula Floor (Iwata et al, 2005)

Powered Shoes

Powered Shoes (Figure 2.7) employ active roller skates using rollers connected by a flexible shaft to a backpack unit containing motors and batteries [65]. They have the advantage of being a lightweight wearable device, but they severely limit the walking gait and have poor durability [68]. The roller skates can only move in one direction, hence limiting the gait of the user to walking straight ahead. The flexible shafts are custom made and are not durable enough for frequent use.

Figure 2.7: Powered Shoes (Iwata et al, 2006)

The String Walker

Iwata et al [68] developed the String Walker (Figure 2.8) using motor-driven strings to maintain the user's foot position. With eight strings in total, four strings are connected to each shoe and actuated by a motor-pulley mechanism. Each motor is capable of measuring the position and orientation of the shoe via an in-built rotary encoder. The strings pull the shoe in the opposite direction to walking to cancel the walker's movement. The device allows the user to change the direction of walking via a motor-driven turntable to which all the motors are mounted. While this mechanism allows the direction of a walk to change by a small angle with each step, it does not allow the user to change the look-at direction while standing still.

Figure 2.8: The String Walker (Iwata et al, 2007)

The Cybercarpet

Schwaiger et al [112] developed the Cybercarpet (Figure 2.9) as a trial unit for an active ball bearing platform. The unit consists of ball bearings suspended on top of a rotatable treadmill. The device enabled the user to walk at high speeds similar to a treadmill. It is also capable of changing the walking direction of the user using the rotatable treadmill. The Cybercarpet remains in its early prototype phase; it has not been developed as a locomotion device because it is still lacking the necessary tracking mechanism. There are also reports of the soles of the shoes deforming due to the ball sliding mechanism.

Figure 2.9: The Cybercarpet (Schwaiger et al, 2007)

2.1.2 Sliding-based Locomotion Interfaces

The Virtual Perambulator

Iwata and Fujii [67] developed one of the earliest locomotion interfaces, called the Virtual Perambulator (Figure 2.10). It employs a strolling mechanism to simulate the walking motion. The walker wears omni-directional sliding devices on the feet, and a hoop is set around the walker's waist by which the position of the walker is limited. The sliding device consists of a roller skate equipped with four castors which facilitates two-dimensional movement. A brake pad was put at the toe to generate a backward friction force at the foot. A Polhemus FASTRACK sensor was used for tracking the walker's motion.

Figure 2.10: The Virtual Perambulator (Iwata and Fujii, 1996)

Gait-sensing Disc / Omni-directional Ball-bearing Disc Platform (OBDP)

Huang [56] recently developed the Omni-directional Ball-bearing Disc Platform (OBDP) (Figure 2.11) in the shape of a dish, allowing a natural way for the user to walk inside the virtual environment. The OBDP uses 975 custom-made ball bearing sensors embedded on the surface of the dish to detect the user's feet. The curvature of the dish and the ball-bearings are said to permit the user's foot to slip back to the centre of the dish.

Each ball bearing sensor (Figure 2.11(c)) is composed of steel balls, an isolation axle, a spring and a stainless steel shim. Additionally, six small steel balls of 0.25-in diameter are inserted and capped with an isolation mandrel to smooth their rolling.

The disadvantages of the OBDP are its expensive construction and maintenance costs due to the custom-made ball bearing sensors. There are a large number of these and each is relatively fragile with respect to the user's weight and the force exerted by the foot. The initial prototype of the device [56] only allowed forward walking; however, there is good potential for the device to recognise a greater range of walking gaits.

Figure 2.11: Omni-directional Ball-bearing Platform (Huang, 2003): (a) side view, (b) surface of the OBDP, (c) cross-sectional view of the position sensor, (d) snapshot of the raw tracking data.

2.2 Leg-based Locomotion Interfaces

The following subsections describe devices that feature energy extraction but do not simulate walking or running activity. They emulate the walking action through other means that involve leg(s) movement.

2.2.1 Pedal-based Locomotion Interfaces

The Uniport

The Uniport [33] was the first device built for lower body locomotion and exertion (Figure 2.12). The system was developed for the U.S. Army Dismounted Infantry Training Program and was displayed in 1994. The user pedals to simulate walking or running.

Figure 2.12: The Sarcos Uniport (Darken et al, 1997)

The metaphor of the Uniport is that of cycling rather than natural bipedal locomotion. It allows the user to move forward or backward, turn left or right, and experience force feedback when going up or down inclines. The direction of motion is specified by the twist of the Uniport's seat, which is controlled by the user's waist and thighs. While it maps user exertion to movement in the environment, it is cumbersome in its methods of manoeuvring over short distances [33]. Small motions such as side-stepping or small rotations of the body are difficult if not impossible to perform.

2.2.2 Step-in-Place Locomotion Interfaces

Walking Pad

Bouguila et al [16, 17] developed a system that enables walking recognition using a walking pad.

Figure 2.13: Walking Pad (Bouguila et al, 2004): (a) first implementation and (b) second implementation

The first implementation (Figure 2.13(a)) employs a turntable as a walking platform. The user stands on top of this platform and interacts with the virtual environment projected on the large screen. The user can then engage in a virtual walking experience by stepping in place. To change the moving direction or to explore the surrounding environment, the user is required to turn their body about their vertical axis. The turntable will then smoothly and passively rotate in the opposite direction to keep the user continuously oriented toward the screen. The user orientation is tracked via an infrared camera and two coloured markers. Sideward and backward displacement is tracked via four pressure sensors located on the bottom sides of the turntable. To move sideways, for instance, the user needs to move one of their feet outside the inner ring to one of the zones where a sensor is located.

The second implementation (Figure 2.13(b)) was developed at the University of Fribourg, Switzerland [17]. Compared to the first one, the second implementation has significantly simpler functionality. It consists of a flat platform embedded with a grid of switch sensors that detect footfall pressure. Based on the data collected, the system can compute different variables that represent the user's walking behaviour, such as direction, speed, standstill, jumping and walking. The dimensions of the flat platform are 45cm x 45cm. This device is incapable of other locomotion features such as sidestepping and backward movement.

The Gaiter

Very similar to the previous system (Step-in-Place), the Gaiter [123] (Figure 2.14) allows the user to walk in place by using magnetic trackers attached to the thighs, waist and head, and a hand grip. Force sensors are attached to the foot pads. The waist sensor gauges the position and orientation of the body, the head sensor records the gaze direction, and the handgrip sensor is for auxiliary purposes. Walking forward, backward, or side stepping is controlled using gestural knee actions.

Figure 2.14: The Gaiter system (Templeman et al, 1999) 2.2. Leg-based Locomotion Interfaces 21

Accelerometer-based Locomotion Interface

Shiratori and Hodgins [113] developed an accelerometer-based locomotion interface using the Wiimote controller, which contains a three-axis accelerometer (Figure 2.15). They tested the use of the Wiimote in three different positions: the wrist, the arm, and the leg. The position of most interest is the use of the Wiimote as a leg interface. A Wiimote is attached to each leg and an additional Wiimote is attached to the user's head. Movement of the avatar is controlled by stepping in place and head inclination is used to control the avatar's turning.

Figure 2.15: Using the Wiimote accelerometer in the leg interfaces (Shiratori and Hodgins, 2008): (a) walking, (b) running, (c) jumping and (d) turning
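To make this mapping concrete, the sketch below shows one naive way a leg-mounted three-axis accelerometer could drive an avatar: footfalls are detected as threshold crossings of the acceleration magnitude, and head inclination is converted into a turning rate. This is only an illustrative sketch under my own assumptions (the threshold values, decay rate and class name are invented here); it is not Shiratori and Hodgins' actual gait-analysis method.

```python
# Illustrative sketch only: a naive step-in-place detector driving avatar motion
# from a leg-mounted three-axis accelerometer. NOT the authors' algorithm;
# names, thresholds and the update scheme are assumptions for illustration.
import math

STEP_THRESHOLD = 1.3   # assumed: acceleration magnitude (in g) marking a footfall
STEP_SPEED = 1.4       # assumed: forward speed in m/s applied per detected step
TURN_GAIN = 0.5        # assumed: turning rate (rad/s) per radian of head inclination

class StepInPlaceController:
    def __init__(self):
        self.above_threshold = False
        self.forward_speed = 0.0

    def update(self, leg_accel, head_pitch, dt):
        """leg_accel: (ax, ay, az) in g from the leg-mounted unit,
        head_pitch: head inclination in radians, dt: frame time in seconds."""
        magnitude = math.sqrt(sum(a * a for a in leg_accel))
        # A rising edge through the threshold is treated as one step.
        if magnitude > STEP_THRESHOLD and not self.above_threshold:
            self.forward_speed = STEP_SPEED
        self.above_threshold = magnitude > STEP_THRESHOLD
        # Decay the speed so the avatar stops when stepping stops.
        self.forward_speed = max(0.0, self.forward_speed - 2.0 * dt)
        turn_rate = TURN_GAIN * head_pitch
        return self.forward_speed, turn_rate
```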

2.2.3 Other Novel Techniques

Redirected Walking in Place

Redirected walking guides the user along a particular path within the real world by subtly rotating the user's representation within the virtual environment (VE) [46]. The original idea was conceptualised by Razzaque et al [103] to allow people to move around a CAVE system without ever having to turn to face the missing back wall. Engel et al [44] improved this further by using an optimization technique to derive the optimal rotational gains (Figure 2.16).

Figure 2.16: Redirected walking (Engel et al, 2008): (a) virtual environment (b) real walking path
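The core of the approach is a per-frame rotational gain: the user's tracked physical rotation and translation are mapped into virtual heading changes that differ slightly from reality, so a straight virtual path corresponds to a curved physical path. The sketch below illustrates the idea only; the gain values, function name and update scheme are assumptions for illustration and do not reproduce Razzaque et al's or Engel et al's implementations.

```python
# Illustrative sketch of the rotational-gain idea behind redirected walking.
# Gain values are assumptions; real systems tune them below perceptual thresholds.

ROTATION_GAIN = 1.15   # assumed: virtual rotation = 1.15 x physical rotation
CURVATURE_GAIN = 0.05  # assumed: extra virtual yaw (radians) injected per metre walked

def redirect(virtual_yaw, physical_yaw_delta, distance_walked_delta):
    """Accumulate the user's virtual heading from tracked physical motion.

    physical_yaw_delta: change in the user's real heading this frame (radians).
    distance_walked_delta: real distance covered this frame (metres).
    """
    # Scale real turns and inject a small curvature while the user walks, so
    # walking straight in the virtual world bends the real-world path.
    virtual_yaw += ROTATION_GAIN * physical_yaw_delta
    virtual_yaw += CURVATURE_GAIN * distance_walked_delta
    return virtual_yaw
```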

Freeze-Backup, Freeze-Turn, and 2:1-Turn

Williams et al [136] developed three novel locomotion methods that allow users to walk within a restricted space. In the Freeze-Backup method, the computer indicates to the user when they have reached the boundaries of the tracking system and it needs to reset. The user then takes steps backwards in the physical space while the user's position in virtual space remains frozen. When sufficient backward steps have been taken, the system indicates for the user to stop, the displays are unfrozen, and the user is allowed to walk normally again. In the Freeze-Turn method, the initial response is the same: the computer indicates to the user when they have reached the boundaries of the tracking system and it needs to reset. But instead of walking backward, the user is instructed to turn 180 degrees. Once the reset action is done, the display is unfrozen and the user can continue walking. The 2:1-Turn method is similar to the Freeze-Turn method, but the rotational gain of the yaw angle during this turn is scaled by two.

Williams et al's [136] study of these three methods indicated that the lowest errors in rotation occurred with the Freeze-Backup method. The worst performance was recorded with the Freeze-Turn method. They also found that the majority of the test participants preferred the Freeze-Backup technique over the others.

The Step WIM

The Step WIM [75] is a locomotion technique for moving around the virtual environment based on the World-In-Miniature (WIM) technique. The WIM method was originally developed by Pausch et al [97]. It provides a hand-held miniature of the scene in which the user can move a pointer around to explore the virtual environment. The Step WIM (Figure 2.17) takes a similar approach, but instead of making the miniature a handheld object, it displays the scaled map on the floor. The system uses three different types of interface: a foot-based interface, a gaze direction tracker and a body position magnetic tracker. The foot-based interface (Figure 2.17(b)) has two switches embedded in the right slipper, placed on the toe and the heel.

Figure 2.17: The Step WIM (LaViola et al, 2001). (a) the scaled map, (b) the foot-based interface and (c) user leaning in local navigation mode

The Step WIM system has three operation modes: local, step WIM and map scaling. Initially the system is in local navigation mode. In this mode the user leans or makes relatively small movements. Small leaning movements allow the user to look in a direction that is different from the one in which he or she is moving. From this mode the user can tap on the floor using the toe switch to change to the "step WIM" mode. In the "step WIM" mode a map is displayed on the ground, centred on the user's current standing position within the map. When the user is looking away from the map, another tap on the toe switch will dismiss the map. When the user's head direction is 25 degrees below horizontal, another tap on the toe switch will move the avatar position to the corresponding location on the map. Within the "step WIM" mode the user can switch in or out of the "map scaling" mode by clicking on the heel switch. In the "map scaling" mode, the user can walk forward to scale up the map, or walk backward to scale it down. The map scaling feature enables the user to control the navigation with greater precision.
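The mode switching just described can be summarised as a small state machine driven by the toe and heel switches. The sketch below paraphrases that description; the state names, method names and the exact transition conditions are my own assumptions rather than LaViola et al's implementation.

```python
# A compact sketch of the Step WIM mode switching described above, driven by
# the toe and heel switches of the foot interface. Names and thresholds are
# assumptions made for illustration.
import math

LOCAL, STEP_WIM, MAP_SCALING = "local", "step_wim", "map_scaling"
LOOK_DOWN_THRESHOLD = math.radians(-25)  # head pitch below which a toe tap teleports

class StepWimModes:
    def __init__(self):
        self.mode = LOCAL

    def on_toe_tap(self, head_pitch, looking_at_map):
        if self.mode == LOCAL:
            self.mode = STEP_WIM                 # display the scaled map on the floor
        elif self.mode == STEP_WIM:
            if not looking_at_map:
                self.mode = LOCAL                # dismiss the map
            elif head_pitch < LOOK_DOWN_THRESHOLD:
                self.teleport_to_map_point()     # jump to the gazed-at map location
        return self.mode

    def on_heel_click(self):
        # The heel switch toggles map scaling while the map is displayed.
        if self.mode == STEP_WIM:
            self.mode = MAP_SCALING
        elif self.mode == MAP_SCALING:
            self.mode = STEP_WIM
        return self.mode

    def teleport_to_map_point(self):
        pass  # placeholder: move the avatar to the location under the user's gaze
```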

The Seven League Boots

Interrante et al [60] developed the Seven League Boots technique for travelling long distances. The technique involves determining the user's intended direction of travel and proportionally scaling the walking speed aligned with that direction. In this system the direction of travel is determined as a weighted combination of gaze direction and the direction of previous displacement integrated over a period of time. The method was found to be useful for long distance travel but does not address short distance travelling needs.

The Nintendo Balance Board

The Nintendo Balance Board was introduced in the second half of 2007. It contains multiple pressure sensors that are able to measure the user's centre of balance and weight. On the Nintendo Wii platform the balance board has been used in different metaphors such as surfing, magic carpet riding and 2D transportation. In Virtual Reality research it has also been used as a locomotion interface. De Haan et al [34] used the Wii Balance Board as a 3 DOF navigation controller for rotational control of objects and for navigation in a virtual environment. Schöning et al [110] combined the use of the Nintendo Balance Board with a multi-touch display. The system is used to navigate through geospatial domain data. Schöning et al's [110] user study revealed the users' preference for the combined use of hands and feet in terms of comfort, smoothness and learnability. Valkov et al [130] developed another combined approach using a multi-touch surface and the Nintendo Balance Board for navigating through a virtual environment.

2.3 Hand-based Locomotion Interfaces

Some virtual reality applications measure the motion of the user's hands for the purpose of locomotion [23]. These interfaces usually map the relative movement of the tracker to the virtual environment at a much larger scale. Since the tracking is done over a much smaller area, greater tracking accuracy can be achieved.

2.3.1 Joystick-based Locomotion Interfaces

During 2009 the video game console market was dominated by three big players: Sony with its PS3, Microsoft with its XBox, and Nintendo with its Wii console. The total numbers of units sold worldwide in 2009 [57, 125, 142] are presented in Table 2.1.

Table 2.1: Number of game consoles sold worldwide. The Nintendo Wii figure is quoted for April 2009 to March 2010.

Console           Units sold in 2009
Sony PS3          13 million
Microsoft XBox    11.2 million
Nintendo Wii      20.53 million

The Sony PS3 is shipped together with a six axis controller (Figure 2.18(a)) by default. This gamepad has two joysticks and eight buttons on its surface.

The XBox game console is also shipped with a default gamepad which has similar functionality to the Sony PS3's gamepad (Figure 2.18(b)).

Figure 2.18: Game console controllers: (a)Sony PS3 Six axis (b)XBox gamepad (c)Nintendo Wiimote and Nunchuk (Amazon Store, 2009)

Nintendo Wii created a breakthrough by releasing the Wiimote controller (Figure 2.18(c)). In addition to the buttons, the Wiimote comes with motion sensors and optical sensors. Its motion sensing is capable of measuring vertical and horizontal axis movement and horizontal rotation. The optical sensor is an infra red camera looking at a sensor bar placed above the television set. Additionally the Nintendo Wii comes with the Nunchuk attachment, which is a wand shaped controller. The Nunchuk consists of only a single joystick and two buttons.

2.3.2 The Magic Barrier Tape

The Magic Barrier Tape uses a handheld marker for pushing a virtual barrier tape in the virtual environment (Figure 2.19(a)) [27]. In this technique, the user can physically walk within a restricted space marked by a yellow barrier tape in the virtual environment (Figure 2.19(b)). When the user’s position is close to the barrier tape, a red warning tape appears giving an indication to the user to start “pushing” (Figure 2.19(c)).

2.4 Concluding Remarks

This chapter has presented an overview of locomotion interfaces. It has introduced the key elements of a good locomotion interface and discussed various walking-type locomotion interfaces.

Figure 2.19: The Magic Barrier Tape: (a) User with markers attached to the HMD and the hand (b) The barrier tape being pushed by the user (c) The warning tape (red coloured) showing the boundary of the physical walking space. (Cirio et al, 2009)

Interfaces other than the walking-based locomotion interfaces are interesting approaches; however, they lack energy-extracting gaits and sensory-motor integration. The lack of these two factors reduces the ability of the user to learn and appreciate the virtual environment's dimensions and distances. For this reason I focus my efforts on walking-based locomotion interfaces.

Table 2.2 summarises the advantages and disadvantages of the walking-based locomotion interfaces. Very limited information is available for most of these devices in terms of cost and speed performance, so the summary only provides some indicators of the devices' limitations and strengths. Multi-directional walk refers to the user's ability to change walking direction.

Motor-based locomotion interfaces provide a better sense of movement at the cost of linear acceleration. Humans tend to walk with non-linear acceleration: walking quickly and then suddenly stopping, and then suddenly walking quickly again. However, active motors can only accelerate and decelerate in a linear fashion, hence active locomotion interfaces are inherently incapable of responding rapidly to unpredictable changes of acceleration.

Safety is a potential concern in Gait Master-type locomotion interfaces. The programmable foot platforms used in the Gait Master are essentially powerful robot arms attached to the feet, and any malfunction can cause serious injury to the user.

Compared to motor-based locomotion interfaces, sliding-based locomotion interfaces are less prone to safety issues, but with some trade-offs. Without motors, the sliding movement of the user's feet relies on other factors, either a self-imposed force or gravitational force.

So far, there are only two locomotion interfaces under this sliding-based category: the Virtual Perambulator and the OBDP. The Virtual Perambulator relies on the user's self-imposed force and hence has the disadvantage of reduced immersiveness. On the other hand, the OBDP is very promising since it allows the user to walk in a natural fashion. The combination of the curved shape of the dish and the ball rollers is said to allow the user's foot to slide naturally under gravitational force. However, the OBDP is not an ideal choice because it is very expensive to build and maintain.

Table 2.2: Summary of the walking-based locomotion interfaces

Category       Interface                   Advantages                         Disadvantages
Motor-based    Linear Treadmill            relatively inexpensive             only forward walking available
               Omni-directional Treadmill  multi-directional walk             expensive
               Gait Master                 can simulate terrain height        limited speed
               Cybersphere                 surround image projection          possible hazard when the sphere is rolling too fast
               Virtusphere                 multi-directional walk             possible hazard when the sphere is rolling too fast
               CirculaFloor                creates infinite walking surface   limited speed
               Powered Shoes               relatively light unit              fragile, straight gait only
               String Walker               multi-directional walk             safety may be a concern
               Cybercarpet                 multi-directional walk             safety may be a concern and there is no tracking mechanism in place
Sliding-based  Virtual Perambulator        relatively inexpensive             needs self-imposed force to keep the user in place
               OBDP                        allows natural walk without motor  fragile and expensive to build and maintain

Chapter 3

Mechanical Hardware Design for the OmniWalker

In this chapter, the details of the laboratory layout and the development of all the necessary mechanical hardware for the OmniWalker, including the walking platform, are presented.

3.1 Laboratory Layout

For this project I was given a room sized 2.9m(w) x 5.2m(l) x 2.9m(h). The detailed dimensions of the room and the placement of the instruments are given in Figure 3.1. Figure 3.2 gives a perspective view of the lab's general layout.

3.2 Walking Platform

3.2.1 Ball Transfer Conveyor

One of the main problems with the OBDP [56] is the use of expensive, easily broken, custom-made ball bearing units with embedded switches. Hence my primary challenge was to improve the ball transfer units to address the issues of cost and robustness.


Figure 3.1: Lab room layout

Figure 3.2: Perspective view of the lab room design

Initially I planned to manufacture the ball transfer units myself to obtain a higher density of units on the dish platform. I developed my initial design and went ahead with making prototypes (shown in Figure 3.3). The height of each unit is 4.5cm and the diameter is 2.5cm. The prototype has the appropriate roll property, which can be adjusted by tightening the bottom cap. However, I found the cost of these units to be quite significant since I needed to order the manufacture of thousands of them. In addition, due to the nature of the custom manufacturing of the units, I found the quality to be inconsistent. Some units were not able to roll smoothly, and in some the top cap was a bit too wide and hence could not hold the ball bearing.

Figure 3.3: The custom-made ball transfer unit

The other alternative I examined is the Lemcol ball transfer conveyor (Figure 3.4), which is widely used in manufacturing lines [12]. These units have an excellent rolling property and excellent manufacturing consistency; I only found a few units that were malfunctioning. However, the problem with the Lemcol ball transfer is that the mounting used for holding the unit is about 2cm wide, leaving a significant gap between the units. This gap ranges from 4cm if the units are placed in a matrix pattern to 2cm if they are placed in an interlocking pattern. To fix this problem I had to ship all the units to a mechanical workshop to have the mounting part trimmed. The finished product is shown in Figure 3.4(c). In addition to the reduced cost, another benefit of using the Lemcol ball transfer conveyor is its ability to sustain the weight of a relatively heavy object. Each unit can support a maximum capacity of 45kg in a 36cm2 space. A foot of 20cm × 8cm dimensions can be supported by a minimum of four of these units, hence the platform can support a person with a maximum weight of 180kg standing on one foot.

Figure 3.4: Lemcol ball transfer: (a) Schema diagram, (b) Original form, (c) Trimmed form.

3.2.2 Omni-directional Stroller (Early Designs)

Initially I developed and performed experimentation on an omni-directional stroller. The idea was to improve the Virtual Perambulator developed previously by Iwata and Fujii [67]. Instead of using ordinary rollers, I used omni-directional strollers. These rollers were expected to allow for a better walking experience.

Figure 3.5: The first prototype of the omni-directional stroller. (a) top-view (b) bottom-view

I built two prototypes for this experiment. The first prototype (Figure 3.5) used a custom-made ball transfer unit combined with a shoe attachment made out of 45mm thick wood. On the sides I put six 2mm stainless steel plates for mounting the shoe string to hold the user's shoes. At the bottom I made five holes to attach the ball transfer units. The second prototype (Figure 3.6) uses the Lemcol ball transfer conveyor. Brief experiments showed that the second prototype had better sliding properties than the first one. Inside both of these ball transfer units are several 5mm ball bearings. The first, custom-made one could not slide as much because it had fewer 5mm ball bearings and much less space than the Lemcol unit for them to move around. I decided not to continue in this direction with the omni-directional stroller due to safety and usability issues. First, the sliding was a bit too much to allow a user to walk safely in a comfortable manner; when wearing these strollers the feet tend not to stay put because they are very slippery. Secondly, the stroller itself is quite heavy, which impacts the user's walking experience.

Figure 3.6: The second prototype of the omni-directional stroller. (a) bottom-view showing the Lemcol units (b) top-view showing the shoe strings.

3.2.3 The Dish Platform

Because the omni-directional stroller did not perform as expected, I ventured to build a walking platform in the shape of a dish. The initial idea for my platform originated from the OBDP [56], which uses an arc-shaped dish based on the swing angle of the human walk. Initially I constructed the platform dish as a height-proportional hemispherical dish (that is, where the radius is the distance between waist and foot) made from stainless steel. It has a diameter of 1.5m and a depth of 33cm (Figure 3.7(a)). However, upon an initial trial I found that this arc-shaped dish does not have a sufficient sliding property. The foot tends to stop about 30cm from the centre. The user can certainly pull it back to the centre themselves, but this makes the walk feel unnatural.

I did another test with a smaller dish (Figure 3.7(b)). The smaller dish has a smaller radius and so has a greater angle of inclination. I wanted to test whether it could provide a better feeling of walking. This second dish has a diameter of 75cm and a depth of 20cm. I found a similar result to that for the larger dish; the foot stopped sliding about 15-20cm from the centre.

Figure 3.7: Dish platform: (a)1.5m diameter, (b)0.75m diameter.

As a result of these findings I modified the design of the dish curvature to incorporate a flat inclination angle. Specifically, a horizontal cross-section of the dish is polygonal rather than circular in the new design. A less curved inclination angle was shown to work much better since it provides better footing for the feet to slide. I built an inclination test platform (see Figure 3.8) to measure the minimum inclination angle that can provide sufficient sliding of the feet. Brief trials of different inclination angles were performed and a 15 degree angle was found to be the minimum angle. The original dish in Figure 3.7(a) took several months to be built and shipped. There was significant cost and time in getting this first dish completed. Some of the delay was due to the lack of local manufacturing capabilities, meaning I had to source it from overseas. Because of this I did not create the new proposed design using stainless steel material as before. I created the new dish platform with a flat inclination angle using 17mm plywood. Compared with other wooden materials, plywood has greater strength and greater resistance to cracking.

Figure 3.8: Inclination test platform

Prior to submitting the order to the mechanical workshop, I also produced a simple cardboard prototype and paper patterns to cut the plywood (Figure 3.9(a)).

Figure 3.9: Platform - work in progress: (a) the cardboard prototype, (b) the cut-out plywood dish

3.2.4 Rubber Enclosure

In its original condition the dish was made out of 2mm stainless steel, which had quite sharp edges. I enclosed the edge of the dish with layers of protective covering. For the first layer of protection I used the rubber body seal used around car doors. I also inserted a sponge tube with a diameter of 0.5cm to fill its void (Figure 3.10(a)). On top, I enclosed the rubber body seal with the rubber insulation used for thermal insulation of air conditioning and hot-water pipe systems. This second layer provides further protection in case the user bumps into the edge of the dish (Figure 3.10(b)).

Figure 3.10: Rubber cover: (a) First layer (b) Second layer cover

3.2.5 Holding Frame

Instead of using a rotatable orbiting frame (as in the OBDP [56]), I used a non-rotatable frame. Disabling rotation allowed me to use a smaller number of cameras, reducing the computation power necessary for tracking. Change of direction is done using gesture recognition, which will be described in later chapters. The holding frame in the OmniWalker serves two purposes: to assist the user's balance during walking and to hold the platform against the ground. As was done for the hanging beams (see section 3.3.1), I also made the frame height adjustable to fit people of different heights (Figure 3.11(a)). The holding ring is divided in the middle so it can be pulled apart by sliding the halves outward (Figure 3.11(b)).

3.3 Safety Peripherals

3.3.1 Safety Harness Support Structure

In order to ensure safety, the subject using the OmniWalker platform needs to wear a safety harness supported from above. This serves two purposes: first, to allow the user to safely step onto the walking platform, and second, as a safety precaution should the user lose their footing on the OmniWalker.

Figure 3.11: Holding frame (a) height adjustment mechanism shown (b) opening method

Initially I planned to construct parallel frames with hanging tracks (Figure 3.12) to which the harness would be attached. Upon investigation I found that the material and labour cost to construct the planned parallel design was beyond the project budget.

Figure 3.12: Draft design of the safety platform–using sliding mechanism

I then changed the design to a wall-mounted support structure (Figure 3.13). This design consists of two beams: the first rotates relative to its base, which is attached to the wall, and the second rotates relative to the first beam, allowing the user to change the direction they are facing. The second beam is needed as the spreader bar because I used a parachute style harness, which has two D-rings on the shoulders. The connector between the beams is designed to allow the second beam to slide along the first beam (Figure 3.14(a)). This sliding mechanism allows further adjustment to get the second beam at the right spot above the walking platform.

Figure 3.13: Wall-mounted support structure design for the safety platform

A bolt of diameter 10mm is placed on the top side of the connector. Once the walking platform is in place, this bolt can be fastened to stop the second beam from moving. The beams are mounted on a base using a rotational joint (Figure 3.14(c)). The mounting base is constructed from 10mm steel plates and a 20mm bolt is used as the pivot. At the middle of the base are two bolts for stopping the first beam from hitting the window. At the end of the first beam that is close to the mounting base, a clamp fitted with two bolts is welded. This clamp provides a resisting force to prevent the beams from swinging wildly.

Figure 3.14: Support structure with the beams connector and its base magnified. (a) magnified beams connector (b) the hanging beam, complete view (c) magnified base with rotational joint

3.3.2 Safety Harness

As mentioned in the previous subsection, I used a parachute style harness (also called a confined space type harness) instead of a fall arrest type (see the comparison in Figure 3.15). The reason I chose the parachute style harness is that it gives the user more control and balance when entering the platform.

Figure 3.15: Safety harness: (a) confined space type (b) fall arrest type (MSA Incorporated) (c) Dee Shackles

Initially I used stainless steel chains (Figure 3.16(a)) for strapping the harness to the hanging beam, but then changed to adjustable webbing straps (Figure 3.16(b)) instead. I did this because the chains put significant weight (about 4 kilograms) on the user's shoulders, leading to discomfort. I attached the webbing straps to the beam using Dee Shackles (Figure 3.15(c)).

3.4 Conclusion

At the end of this phase I have developed a new and innovative walking platform. The original idea came from different authors, but through trial and error I have gained a new understanding of the factors necessary to make the walking platform mimic a natural walking gait. The flat inclination of the dish allows the user's feet to slide easily, while the overall dish shape of the walking platform allows the user to walk naturally without slipping. Next, I developed the software tracking mechanism to allow the walking platform to be used as a locomotion interface for driving/controlling the avatar in a virtual reality simulation.

Figure 3.16: Safety strap: (a) stainless steel chain (b) and (c) adjustable cam-buckle

Chapter 4

System Architecture and System Hardware

The walking platform described in the previous chapter allows the user to walk a virtually infinite distance. The part that is still missing is the visualisation and the tracking mechanism used to convert the user's walking action into the avatar's walking action in the virtual simulation. This chapter describes the software system architecture and the hardware needed to achieve these goals.

4.1 System Architecture

The ultimate aim of the system is for multiple users to be able to interact with each other in real-time, enabling the system to be used in the remote delivery of virtual learning. As shown in Figure 4.1, I developed my system as two separate modules: the tracking module and the visualisation module. The tracking module performs image acquisition and computer vision computation. The visualisation module can render the images from either first or third person views. The first person view renders the image as seen from the avatar. The third person view allows external audiences to remotely observe the activities of the user or users.


Figure 4.1: System architecture

The Virtual Reality Peripheral Network (VRPN) library [122] is used as a communication plug-in between the modules. It allows the modules to be developed in a publisher-subscriber model, where a consumer can subscribe to the generated tracking data. This allows other visualisation modules to have a third person view of the subject, enabling real-time interactions between multiple users in the virtual world.
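As an illustration of this publisher-subscriber arrangement, the sketch below shows how a consumer (for example, a third person visualisation module) might subscribe to the published tracking data through VRPN. This is a minimal sketch only: the device name "Tracker0@localhost" and the printed output are illustrative assumptions, not the actual device names or handlers used in the system.

// Minimal VRPN subscriber sketch (device name "Tracker0@localhost" is hypothetical).
#include <cstdio>
#include <vrpn_Tracker.h>

// Callback invoked whenever the tracking module publishes a new pose report.
void VRPN_CALLBACK handle_pose(void* /*userData*/, const vrpn_TRACKERCB t)
{
    // t.sensor identifies the tracked marker/joint; t.pos is its position (x, y, z).
    std::printf("sensor %d: %.3f %.3f %.3f\n",
                static_cast<int>(t.sensor), t.pos[0], t.pos[1], t.pos[2]);
}

int main()
{
    // Subscribe to the tracker published by the tracking module.
    vrpn_Tracker_Remote tracker("Tracker0@localhost");
    tracker.register_change_handler(nullptr, handle_pose);

    for (;;) {
        tracker.mainloop();   // poll the connection and dispatch callbacks
    }
    return 0;
}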

4.2 Computer and Tracking Hardware

4.2.1 Camera

In deciding which camera to use I performed measurements on several cameras (see Table 4.1). This comparison by no means covers all the available cameras but is sufficient for my purpose. The price refers to the quote given at the time of purchase, which may have changed by the time of writing. Some important factors to consider with the cameras are frames per second (fps), resolution, horizontal viewing angle and cost. A high fps allows images of fast moving objects to be acquired without blurring. Most cameras' image resolution ranges from 320x240 (QVGA) to 1920x1080 (2 Megapixels). Greater resolution implies better image detail but at the cost of much higher processing power. Industrial cameras offer the most features at the highest prices. Lumenera and Basler are two of the well-known brands capable of delivering a very high frame rate. Apart from the Lumenera LU085 mentioned in the table, there are other types, such as the Basler A602FC, capable of acquiring images at up to 100fps at 640x480 pixel resolution. The cost is quite steep; a Basler A602FC costs AUD$2,674. An additional benefit of industrial cameras is the wide selection of lenses available. Most industrial cameras have a standard mounting bracket for the lens, hence the lenses are easily replaceable. For example, the Computar H3616FI lens (AUD$200) has a horizontal viewing angle (HVA) of 92.6°, whereas the Computar M1614-MP lens (AUD$120) has an HVA of 30.8°. As seen from the table, the PS3 EyeToy is the clear winner in that it is capable of delivering very high fps at a very low price. However, at the time I started the project the PS3 EyeToy driver for Microsoft Windows had not been released, so in the early stages of the project I used the Logitech Fusion and the Logitech Ultra Vision webcams. The Logitech Ultra Vision webcam has better image quality than the Logitech Fusion webcam but was released several months later. When the driver became available, I used the Sony PS3 EyeToy cameras. The webcams are placed in a triangular arrangement using tripods (see Figure 4.2). There are two reasons for the placement of the three cameras. The first is the trade-off between the processing power and the number of cameras needed: I minimised the number of cameras so I could maximise the processing power available for each camera. The second reason is the space in the lab. Given the maximum viewing angle of the camera and the available distance (about 2-3m after accounting for the size of the walking platform), the best placement is at a slight angle to where the user is facing.

Table 4.1: Camera comparisons

Name                                   Price (AUD$)   Horizontal Viewing Angle   Maximum fps & Resolution
Lumenera LU085 w/ Computar H3616FI     $1750          92.6°                      60fps, 640x480
Vimicro 301 type                       $20            32°                        30fps, 640x480
Logitech 5000                          $90            50°                        30fps, 640x480
Logitech Fusion                        $160           60°                        30fps, 1280x1024
Logitech Ultra Vision                  $200           60°                        30fps, 1280x1024
Sony PS3 EyeToy                        $60            60°                        120fps, 320x240

4.2.2 Controller and Visualisation Hardware

For visualisation I used the Vuzix VR920 head mounted display (HMD) (Figure 4.3(a)) as the primary visual display and the projector (Figure 4.3(b)) as the secondary visual display. The advantage of the Vuzix VR920 HMD is that it comes with an embedded 3 degree of freedom (DOF) head tracker, allowing the user to feel more immersed in the virtual environment. On the other hand the projector does not have head tracking capability but has less tendency to induce motion sickness. I used a Nintendo Wiimote and its Nunchuk controller (Figure 4.4) as an additional controller. They are used in the usability studies described in Chapter 7.

Figure 4.2: The camera positions. Note the camera above the screen.

They are selected because they are wireless and very simple to use.

4.2.3 Markers

Plain coloured marker

In the earlier tracking software prototype [119] I used plain coloured markers (Figure 4.5) made out of coloured paper and rubber balls. These markers come in various colours and were attached to the user's body joints. The disadvantage of using plain coloured markers is the need for bright lighting, which washes out the projected screen image. If the user wears an HMD the lights do not distract as much, but it still makes the visualisation less immersive because the screen images get washed out.

Figure 4.3: Visualisation peripherals: (a) Vuzix VR920 HMD (b) projector

Figure 4.4: Wiimote Nunchuk

LED marker

In the improved prototype [118] I used ten Light Emitting Diode (LED) orbs (refer to Figure 4.6) as markers on the user's body joints. They were attached using velcro and flexible bands to make them comfortable to wear. Each LED orb consists of a diffuser and red, green and blue LEDs; hence it can produce a range of colours. I used five different colours for the upper and lower body parts. The choice of colour for any particular joint is not important as long as the colours are different. This colour scheme greatly reduces the complexity of the matching and searching procedure, which will be described later.

Figure 4.5: Plain coloured markers

Figure 4.6: LED markers

Ultraviolet light based marker

During the project I also conducted a brief experiment with ultraviolet light and fluorescent paint (Figure 4.7). Fluorescent paint comes in various colours and fluoresces under ultraviolet light. This type of light is not visible to the eye and hence does not distract the user. The ultraviolet light, also called a Wood's lamp, consists of an ordinary lamp enclosed by a filter made of nickel-containing glass [82]. Therefore its intensity is not greater than that of an ordinary "energy-saving lamp". However, under the University's "Non Ionising Radiation" procedure [38], all types of ultraviolet generating sources are considered a potential hazard. Any work involving ultraviolet light has to pass through stringent administrative controls. Part of the safety measures required is the use of sunblock and protective eyewear, which would degrade the whole experience. Therefore I did not perform any further experiments under these conditions.

Figure 4.7: (a) UV light bulb (b) UV illuminated markers

Calibration Objects

Calibration objects are used to calibrate the cameras. Chessboards are used to calibrate the parallel camera system (for details see Section 5.3). Initially I used an A4 sized chessboard pattern (Figure 4.8(a)) to calibrate the cameras. This worked well when the distance between the parallel cameras was less than 50mm. However, because I used a larger working space, the distance between the cameras was greater. Therefore, I constructed a second version of the chessboard (Figure 4.8(b)). The second chessboard is of A2 size. To reduce glare it has a plastic coating on its front cover.

Figure 4.8: Chessboard calibration object: (a) A4 size (b) A2 size

Small LED torches are used as calibration objects in the multi-camera system. The small LED keychain torch (Figure 4.9(a)) has its top covered with a yellow tack (similar to Bostick Blu-Tack) to reduce its brightness. This keychain torch is used as a calibration wand which is waved within the working space. The other LED torch (Figure 4.9(b)) is attached to the walking rig for ground projection calibration (see subsection 5.4.3).

Figure 4.9: Calibration markers: (a) LED torch, (b) LED markers

4.2.4 Computer Hardware

In this thesis project I used two PCs (Figure 4.10). The first PC does the main processing and rendering, and the second PC acts as the monitoring console. The specifications of the PCs are described in Table 4.2.

Figure 4.10: PC Workstations: Workstation 1 (right) and Workstation 2 (top-left) attached to the monitor (bottom-left)

Tracking is a combination of image acquisition and processing. I used three PS3 EyeToy cameras, each of which is capable of acquiring very high speed images of up to 120 frames per second (see Table 4.1). Workstation 1 does the main processing for tracking and visualisation. It has two display ports, the first connected to the main monitor and the second connected to the user's HMD.

Table 4.2: Workstation specifications

                  Workstation 1            Workstation 2
CPU               Intel i7-920, 2.67GHz    Intel Pentium 4, 3GHz
RAM               6 GB                     1 GB
Graphics          GeForce GTX 260          GeForce 6600GT
Operating system  Windows 7 Enterprise     Windows XP SP2

What the user sees in the HMD is not visible to the administrator, so I use Workstation 2 for monitoring the user's view in the HMD. Workstation 2 is connected to Workstation 1 using a Gigabit cross-over ethernet cable. The monitoring is done using VNC, software used to remotely control and monitor another computer in real time.

Chapter 5

Tracking Module

5.1 Literature Review on Motion Capture

Tracking human body motion has been one of the most common needs in Virtual Reality simulation. Being able to track the user's body motion allows the simulation to be more intuitive, accurate, responsive and transparent [47,49,127]. Over the past decades various motion capture mechanisms have been developed. The main categories are mechanical, magnetic, inertial, marker-based optical and markerless optical.

5.1.1 Mechanical Trackers

A mechanical tracking system consists of a serial or parallel kinematic structure composed of links interconnected using sensorized joints [23]. It determines the position and orientation of each link relative to the others based on real-time readings of the tracker joint sensors. Its accuracy depends on the resolution of the joint sensors used. One example of a mechanical tracker is the Gypsy motion capture suit (Figure 5.1). In addition to the joint sensors, which measure the relative positions of the joints, the Gypsy 7 is equipped with gyroscopes to detect whole body orientation [5]. The minimum cost for one suit is USD$8,000.


Figure 5.1: Gypsy 7 (Animazoo Ltd.)

The main advantage of mechanical tracking is the greater accuracy of the tracking data [47]. A mechanical tracking system is typically not affected by measurement drift, the tendency of the device output to change over time as the error increases. Neither are mechanical tracking systems affected by electromagnetic signal interference. They are also not affected by metallic devices in the vicinity or by occlusion from other physical objects. There are some disadvantages of using mechanical tracking. Mechanical tracking mechanisms are typically obtrusive and encumbering [47]. They do not allow complete freedom of movement and hence reduce the user's feeling of immersion. Burdea and Coiffet [23] also reported weight problems with an exoskeleton suit, which could lead to fatigue. The main goal of this project is to create a walking platform that allows a natural walking experience, and mechanical tracking devices would not be able to deliver it. Therefore I decided not to use mechanical tracking.

5.1.2 Magnetic Trackers

A magnetic tracker is a measurement device that uses the magnetic field produced by a stationary transmitter to determine the real-time position of a moving receiver element [23]. The transmitter consists of three antennas placed orthogonally to each other. These antennas produce three orthogonal magnetic fields alternately. These fields are received by the receiver and used to determine the position and orientation of the receiver in relation to the transmitter. A significant problem with magnetic trackers is interference from copper and ferromagnetic metals, such as mild steel and ferrite, within their vicinity.

Figure 5.2: Flock of Birds magnetic tracker (Ascension Tech Co.)

One of the well known magnetic-based trackers is the Ascension Flock of Birds (Figure 5.2). The system is capable of managing up to 30 receivers. Each receiver can measure its own position and orientation relative to the transmitter. It costs around USD$2,700 for each receiver [47].

5.1.3 Inertial Trackers

Inertial trackers use sensors that measure the rate of change of an object's orientation and/or translational velocity [23]. The Animazoo IGS-190 (Figure 5.3(a)) is a motion capture suit that uses an inertial gyroscopic mechanism [6]. It uses nineteen inertial sensors attached to a flexible Lycra suit connected to a wireless unit to allow the user free movement. It uses an ultrasonic tracker to track the whole body location. An inertial tracker motion suit is more expensive than a mechanical tracker suit; the IGS-190 sells for about USD$40,000. The Inertia Cube (Figure 5.3(b)) is a simpler inertial tracker unit with dimensions of 26.2mm x 39.22mm x 14.8mm. The PC software interface allows up to 32 Inertia Cubes to be used.

Figure 5.3: IGS-190 (Animazoo Ltd.) and Inertia Cube(i-glassesstore.com)

5.1.4 Optical Trackers

Optical trackers use optical sensing to determine the real-time position of an object [23]. Usually an optical tracking system consists of multiple cameras observing an object equipped with a marker or markers. The cameras are usually positioned to provide redundancy, preventing the markers from being occluded. The following two subsections describe these optical tracker systems in more detail. The first subsection discusses marker-based optical trackers and the second discusses markerless optical trackers.

Marker-based Optical Trackers

The positions of the markers as seen from the cameras are triangulated to obtain their 3D positions [47]. Determining the correspondence of multiple targets is one of the major problems with this technique. One way of distinguishing the targets is to pulse their outputs in sequence with camera detection. One of the systems that uses this technique is Optotrak (Figure 5.4). Another method is to find the markers' correspondence via a vendor's proprietary matching algorithm.

One example of such a system is Vicon (Figure 5.5). The main advantage of this method is that the markers can be made very lightweight and no wires are necessary, hence the system does not obstruct user movement. However, this comes at the cost of the time delay incurred by the software for matching corresponding markers. Traditionally this technique was only used for motion capture in movies and biomechanics research because the delay prevented its use in real-time. However, as computing power has progressively improved, it is now capable of performing in real-time. The approximate cost of the Vicon system, which comes with 8 infra red cameras, is about USD$200,000.

Figure 5.4: Optotrak System: (a)Vertus cameras (b)6DOF rigid body markers (c)7mm and 11mm IRED markers

Another variation of an optical tracking mechanism is inside-looking-out. In this setup both the camera and the sensor are placed on the user [1]. One example of such a device is the HiBall-3100 produced by 3rdTech (Figure 5.6). The system comes with beacon array modules, which are usually fitted on the ceiling, and a 6 DOF HiBall sensor. The sensor unit consists of IR-filtering lenses, lateral-effect photodiodes and miniaturized electronic circuits. There are six narrow view lenses arranged in the six sectors of its hemisphere [23].

Figure 5.5: Vicon tracking system (UC Merced’s School of Engineering, 2008)

Figure 5.6: HiBall-3100 Tracker (3rdTech Inc.)

Markerless Optical Trackers

Traditionally, optical trackers are marker-based, relying on a device or a marker being visible from a camera. Recent developments in computer vision techniques have created some breakthroughs in markerless human motion tracking. However, most of the work in this area is still not available as off-the-shelf products. Wren et al [140] developed the Pfinder tracking system, which tracks the human body position based on colour blobs (Figure 5.7). The system was able to run at 10 fps but its results are limited to obtaining a 2D approximate position of the body, limbs and hands.

Figure 5.7: Pfinder results (Wren, 1996)

Gavrila and Davis [50] developed a multi-view tracking system using a chamfer matching technique to find the similarity between synthesized and real edge images (Figure 5.8). The synthesized images used are from the Humans-In-Action (HIA) database containing 2500 frames in each of four orthogonal views. The results are presented in Figure 5.9. The system is incapable of real-time processing and the user must wear tight fitting clothing. Bregler and Malik [21] developed a motion estimation technique using exponential maps and twist motions to recover articulated human body configurations in video sequences (Figure 5.10). The system was able to recover the body part configurations of several different postures. There are no performance results for this system. Deutscher et al [36] developed a three camera tracking system using an annealed particle filter for tracking the human body with 29 DOF. The system was able to recover not only the body part configuration but also the skeleton configuration (Figure 5.11).

Figure 5.8: Images from the chamfer matching method (Gavrila and Davis, 1996). (a) raw image (b) scene edge image (c) filtered edge image (d) chamfer image

Figure 5.9: Images from the method of chamfer matching results (Gavrila and Davis, 1996)

However, it is not capable of performing in real-time (1 hour of processing for 5 seconds of video). Dockstader and Tekalp [40] and Dockstader et al [39] developed a distributed computing platform for tracking multiple persons in motion (Figure 5.12). The system uses multiple cameras, with each view independently processed on a dedicated processor. The corrected state vectors from each view provide input observations to a Bayesian belief network in the central processor. It uses Kalman filtering to update the 3D state estimates. The system is capable of running in real-time, but on a distributed computing platform.

Figure 5.10: Images from the method of motion estimation using exponential maps and twist motion (Bregler and Malik, 1998)

Figure 5.11: Images from the method of articulated body motion capture by annealed particle filtering (Deutscher et al, 2000)

Figure 5.12: Multi camera tracking results from Dockstader and Tekalp(2001)

Agarwal and Triggs [3] developed a method for recovering 3D human body motion from silhouettes extracted from monocular video sequences (Figure 5.13). The tracker estimates the 3D body pose by using a Relevance Vector Machine to combine a learned autoregressive dynamical model with shape descriptors extracted from the image silhouettes. They reported real-time performance, but only for limited postures (3 angles for each of 18 body joints), so the tracking is very limited in accuracy.

Figure 5.13: Tracking Human Motion from Silhouettes (Agarwal and Triggs, 2004)

Boulic et al [18] developed a system for recovering full human body motion using Inverse Kinematic constraints. Their method runs in the following sequence: chroma-keying, skin colour segmentation, 2D tracking for each camera view and 3D tracking. Chroma-keying is a method where the object to be segmented stands in front of a plain coloured background (Figure 5.14(a)); it yields a foreground image which contains only the user. Skin colour segmentation allows the system to track the user's palms (Figure 5.14(b)). Given this information the system then uses Inverse Kinematic constraints to search for the best positions for the elbows and limbs (Figure 5.14(c)). The cameras capture images at 30 frames per second but, due to the delay introduced by the tracking, the system is only able to run at 20 frames per second. The performance of the system is promising but very limited, firstly because the system can only perform upper body tracking. Secondly, due to the use of skin colour segmentation, the user is restricted to wearing long-sleeved clothing. However, there is significant potential in such a setup given its ability to run on an ordinary PC.

Figure 5.14: Tracking Human Motion using Inverse Kinematic constraints (Boulic et al, 2005) (a) position of user against plain background (b) palms tracked by skin colour segmentation (c) search result for elbows and limbs

5.2 Attempts at Markerless Tracking

5.2.1 Motivation

Markerless motion capture would allow the user to use the OmniWalker without spending extra time attaching all the markers. Hollerbach [55] highlighted that the advantages of treadmill-based locomotion interfaces over other locomotion systems are the relatively unencumbered setup and the greater freedom of movement. In treadmill-based systems such as the Sarcos Treadport [4] there is only a single tracker attached to the user's body harness. This allows the user to jump on and off the platform quickly. Some of the previously discussed research projects were not able to reach real-time performance due to the moving nature of the tracked object. Gavrila and Davis [50] tracked a person moving within a scene performing a variety of tasks. Similar translational movement was also a requirement in Deutscher et al [36] and Dockstader and Tekalp [40]. With the OmniWalker the user is in a standing posture without real translational movement. The OmniWalker allows the user to perform a real walk, but the actual translational movement is cancelled by the sliding motion of the ball transfer units. Therefore, the tracking system only needs to track the positions of the body extremities, such as the hands, elbows, knees and feet.

Figure 5.15: Markerless tracking module design

5.2.2 System Design

The system I aimed to develop was motivated by the work described in Gavrila and Davis [50] and Boulic et al [18]. Figure 5.15 is a diagrammatic representation of the system I aimed to develop. I used a chroma keying technique similar to Boulic et al [18] to segment out the background. After the foreground image is obtained, a chamfer matching process is used to find the most similar synthesized images. After obtaining a set of "similar" synthesized postures, the best-fit selection process is used to combine all the output from the different views and pre-select only postures that are agreed on by at least two of the views. When there is more than one possible posture, the 3D tracking process selects the posture that has the smallest difference from the one found previously.
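To make the chamfer matching step concrete, the sketch below computes a simple chamfer score between the edge image of the captured frame and the edge image of one synthesized posture: a low score means the synthesized edges lie close to real edges. This is only a minimal illustration assuming OpenCV (3.x or later); the function name, the L2 distance transform and the full-frame template comparison are my assumptions, not the thesis's actual implementation.

// Sketch of a chamfer-matching score between a scene edge image and a
// same-sized synthesized edge template (both 8-bit, edges marked as 255).
#include <opencv2/opencv.hpp>

double chamferScore(const cv::Mat& sceneEdges, const cv::Mat& templateEdges)
{
    // Distance transform of the inverted edge map: each pixel holds the
    // distance to the nearest scene edge.
    cv::Mat inverted, dist;
    cv::bitwise_not(sceneEdges, inverted);
    cv::distanceTransform(inverted, dist, cv::DIST_L2, 3);

    double sum = 0.0;
    int count = 0;
    for (int y = 0; y < templateEdges.rows; ++y)
        for (int x = 0; x < templateEdges.cols; ++x)
            if (templateEdges.at<uchar>(y, x) > 0) {   // template edge pixel
                sum += dist.at<float>(y, x);
                ++count;
            }
    return count > 0 ? sum / count : 1e9;   // mean chamfer distance
}

The best-fit selection step would then simply keep the synthesized postures with the lowest scores in each view.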

5.2.3 Experiment Results

To test my initial technique, I recorded several minutes of walking actions. The walking actions were performed on the OmniWalker platform with plain blue coloured linen used as a background (Figure 5.16(a)). Figure 5.16 shows the result of the chroma keying technique. Chroma keying allowed me to restrict the search space, but there was still a significant problem with segmenting out the walking platform. One major constraint with this technique is that the platform cannot be painted; the paint and its residue can easily clog the ball transfer units and prevent them from rolling.

Figure 5.16: Chroma keying result using a blue coloured background, in which the walking platform is not able to be segmented out. (a) raw image (b) foreground image (c) foreground mask

I then used an absolute difference technique to segment out the walking platform from the image (Figure 5.17). With this technique I converted the images to grayscale. The image of the walking platform was pre-captured as the background model and then each incoming image was subtracted from the background model. The resulting image was converted into a binary image using binary thresholding. This technique enabled me to filter out the walking platform image but also degraded the image of the user's legs. The codebook foreground segmentation method was developed by Bradski and Kaehler [20]. The aim of this method is to be able to deal with image pixels that may change levels dramatically, e.g. a windblown tree against a bright blue sky.

Figure 5.17: Absolute difference technique: (a) background model (b) real-time capture (c) absolute difference result (d) binary thresholding (intensity threshold=120)
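A minimal sketch of the absolute-difference step illustrated in Figure 5.17 is shown below, assuming OpenCV. The helper name is illustrative; the threshold of 120 is the intensity threshold quoted in the figure caption.

// Sketch of absolute-difference segmentation against a pre-captured background.
#include <opencv2/opencv.hpp>

cv::Mat segmentByAbsDiff(const cv::Mat& backgroundGray, const cv::Mat& frameBGR)
{
    cv::Mat gray, diff, mask;
    cv::cvtColor(frameBGR, gray, cv::COLOR_BGR2GRAY);        // work in grayscale
    cv::absdiff(backgroundGray, gray, diff);                  // |frame - background|
    cv::threshold(diff, mask, 120, 255, cv::THRESH_BINARY);   // intensity threshold = 120
    return mask;                                              // white = foreground
}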

The algorithm builds a codebook repository containing the three-channel value range for each pixel of the background model. This background model repository is continually updated as the pixels change. The update rate is determined by upper and lower learning thresholds. The result of the codebook foreground segmentation method is presented in Figure 5.18. The codebook technique was able to classify the user's shadow on the walking platform and the reflections on the ball transfers as background. However, as the background model was updated over time (Figure 5.18(a)), foreground elements were increasingly accumulated into the background model. This resulted in some of the foreground being falsely classified as background (Figure 5.18(c)).

Figure 5.18: Codebook foreground segmentation method: (a) background model (b) real-time capture (c) foreground mask (d) segmented foreground

My prototype was developed in the spirit of a "divide-and-conquer" strategy. The segmentation process ran as two subprocesses (Figure 5.19). The dividing rule was based on the "walking platform mask", which corresponds to the problem area around the lower legs clearly visible in Figure 5.17(c) and (d). Figure 5.20 represents the upper body segmentation subprocess, which is only responsible for finding the foreground in the area that is above and around the "walking platform mask". The lower body segmentation subprocess, represented in Figure 5.21, is responsible for segmenting the foreground that lies inside the "walking platform mask". The results from both subprocesses are then combined using the "OR" operation (Figure 5.22).

Figure 5.19: Top level flow diagram of the markerless foreground segmentation

The upper body segmentation subprocess (Figure 5.20) uses chroma keying to pre-segment the foreground. However, the result (the ck foreground mask) is still cluttered with the walking platform. I then used the "XOR" operation to filter out the walking platform foreground mask.

Figure 5.20: Subflow process diagram of the upper body segmentation

The lower body segmentation subprocess (Figure 5.21) uses a combination of chroma keying, absolute differencing and binary thresholding techniques. The platform image is pre-captured before the user stands on the platform. After the user stands on the walking platform, I used absolute differencing and binary thresholding to superimpose the feet foreground image. I then used the "AND" operation to extract the foreground limited to within the area of the "platform mask". Then, using dilation and erosion techniques [20], I was able to obtain a clear foreground mask. The results of the segmentation method are presented in Figure 5.22. Overall, the prototype was able to extract the foreground of the images with a very small number of false positives. More false positives occurred with a side stepping gait due to the shadow (Figure 5.22(d)). This problem could be rectified using better light placement around the walking platform.
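The sketch below illustrates how the two sub-process masks could be combined as described above (XOR against the platform mask for the upper body, AND within it for the feet, OR to merge, followed by dilation and erosion). It assumes OpenCV; the variable names and the 5x5 structuring element are illustrative assumptions rather than the prototype's actual parameters.

// Sketch of the divide-and-conquer mask combination (Figures 5.19-5.21).
// All inputs are 8-bit binary masks of the same size.
#include <opencv2/opencv.hpp>

cv::Mat combineMasks(const cv::Mat& ckForegroundMask,   // chroma-keying result
                     const cv::Mat& platformMask,        // pre-captured walking platform mask
                     const cv::Mat& absDiffMask)         // absolute-difference result for the feet
{
    cv::Mat upper, lower, merged;

    // Upper body: remove the platform region from the chroma-keyed foreground.
    cv::bitwise_xor(ckForegroundMask, platformMask, upper);

    // Lower body: keep only the foreground that lies inside the platform region.
    cv::bitwise_and(absDiffMask, platformMask, lower);

    // Merge the two sub-results.
    cv::bitwise_or(upper, lower, merged);

    // Dilation followed by erosion closes small holes left by reflections on the ball transfers.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::dilate(merged, merged, kernel);
    cv::erode(merged, merged, kernel);
    return merged;
}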

Figure 5.21: Subflow process diagram of the lower body segmentation

The prototype required an average of 11 milliseconds of computation time for each frame, making it quite time intensive. The chroma keying process took approximately 0.7 milliseconds. The process of obtaining the lower body segmentation (without the dilation and erosion processes) took approximately 1 millisecond. Each dilation and erosion took approximately 0.5 milliseconds. The distance transform process took approximately 5 milliseconds.

5.2.4 Discussion

In some ways the tracking environment requirements of the OmniWalker are simpler than the requirements of other projects [3], [18], [50], [36], and in other respects the OmniWalker's requirements are more complex. The OmniWalker has simpler tracking requirements in terms of the user's possible postures: the user stands statically within the holding frame. In addition, the tracking environment's parameters, such as lighting and background, can be flexibly adjusted. The OmniWalker tracking requirement is more complex than for other projects because of the shininess of, and reflections from, the ball transfer units. The reflections tend to produce significant random noise that is very difficult to handle. The prototype was able to produce good results, but at the expense of significant computation time. The motion capture system developed by Boulic et al [18] used the chroma keying method to extract the foreground. The chroma keying process requires only 0.5 milliseconds, compared to the prototype's total time of 11 milliseconds. Due to the high computation time of the markerless tracking method, I did not continue down this path. I considered 30 frames per second as the threshold of real-time performance. This threshold leaves approximately 33 milliseconds available from the start of one frame to the start of the next. For the three images captured from the cameras, my foreground segmentation process would take about 33 milliseconds on average. Without even considering other processes (see Figure 5.15), the foreground segmentation process has already used up all the available time. Therefore, I did not pursue this approach any further.

Figure 5.22: Foreground segmentation results of user in various poses. (a) arms apart in T-stand shape (b) right forward walking gait (c) left forward walking gait (d) side stepping gait

5.3 Marker-based Tracking using a Parallel Camera System

5.3.1 System Design and Overview

I developed a prototype parallel camera system [119] using two side-by-side Logitech Fusion cameras (Figure 5.23). The user wore plain coloured markers attached to the body joints at the shoulders, elbows, arms, ankles and feet.

Figure 5.23: Parallel camera setups and the generated views

The system design is described in Figure 5.24. The images are captured at a size of 320x240 pixels, with a frame rate of 30 fps. I used Dom Anker's dscam library [7] to perform software synchronization between the cameras, with an average of 16 milliseconds out-of-sync delay between the image acquisitions by the first and second cameras. The cameras are calibrated beforehand in order to obtain their positions relative to each other. The calibration process yields matrices that are used to undistort and rectify the images. The markers' 2D positions are tracked independently in the 2D tracking module. These 2D positions are then triangulated and tracked further in the 3D tracking module. I used a Kalman filter [69] to reduce the jitter generated by the triangulation method. The 3D tracking results are then used to control the avatar.
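As an illustration of the jitter-reduction step, the sketch below configures OpenCV's cv::KalmanFilter as a constant-velocity filter over one triangulated 3D marker position. The state layout and the noise covariances are illustrative assumptions; the thesis does not specify the filter parameters that were actually used.

// Constant-velocity Kalman filter for one marker: state = [x y z vx vy vz], measurement = [x y z].
#include <opencv2/opencv.hpp>

cv::KalmanFilter makeMarkerFilter(float dt)
{
    cv::KalmanFilter kf(6, 3, 0, CV_32F);
    kf.transitionMatrix = (cv::Mat_<float>(6, 6) <<
        1, 0, 0, dt, 0,  0,
        0, 1, 0, 0,  dt, 0,
        0, 0, 1, 0,  0,  dt,
        0, 0, 0, 1,  0,  0,
        0, 0, 0, 0,  1,  0,
        0, 0, 0, 0,  0,  1);
    cv::setIdentity(kf.measurementMatrix);                               // observe x, y, z only
    cv::setIdentity(kf.processNoiseCov,     cv::Scalar::all(1e-4));      // assumed values
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-2));      // assumed values
    cv::setIdentity(kf.errorCovPost,        cv::Scalar::all(1));
    return kf;
}

// Per frame: predict, then correct with the triangulated position; the corrected
// state gives a jitter-reduced estimate that drives the avatar.
cv::Point3f filterStep(cv::KalmanFilter& kf, const cv::Point3f& measured)
{
    kf.predict();
    cv::Mat z = (cv::Mat_<float>(3, 1) << measured.x, measured.y, measured.z);
    cv::Mat est = kf.correct(z);
    return cv::Point3f(est.at<float>(0), est.at<float>(1), est.at<float>(2));
}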

Figure 5.24: System design of the parallel camera tracking system

5.3.2 Camera Calibration

The stereo imaging concept is based on the understanding that, given stereo cameras that are perfectly parallel, the depth of an object in front of the cameras can be calculated from the disparity between the right and the left camera images (Figure 5.25). The constraints on a parallel camera system are that the cameras must have perfect alignment, be exactly coplanar with parallel optical axes, be a known distance apart, have equal focal lengths and have calibrated principal points. A calibration method for parallel camera systems has been developed by Bouguet [15]. It is a refinement of the methods presented by Zhang [145] and Tsai [126]. The method generates matrices used for rectifying the stereo images and for 3D triangulation.
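For completeness, in this ideal rectified configuration the depth of a point follows the standard disparity relation (stated here for clarity; it is not quoted from the sources above):

\[ Z = \frac{f\,T}{d}, \qquad d = x_{\text{left}} - x_{\text{right}} \]

where f is the focal length (in pixels), T is the baseline distance between the two cameras and d is the horizontal disparity of the same point in the left and right images.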

Figure 5.25: The stereo coordinate system used in OpenCV for undistorted rectified cameras (Bradski and Kaehler, 2008)

Bouguet's method is implemented as native C++ in OpenCV [20]. OpenCV's implementation of stereo camera calibration is useful because it integrates with the rest of the modules. The OpenCV calibration method uses the chessboard pattern to obtain several point samples (Figure 5.26). The calibration process yields the following parameters:

• Translation (T) and rotation (R) matrices between the first and the second camera.

• Each camera's intrinsic matrix (containing the focal lengths (f) and principal points (c)).

• Distortion coefficients for each camera.
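A minimal sketch of how these parameters can be obtained with OpenCV's chessboard-based routines is shown below. The board dimensions, square size and the two-stage approach (per-camera calibration followed by stereo calibration with fixed intrinsics) are illustrative assumptions, not necessarily the exact procedure used in this project.

// Sketch of chessboard-based stereo calibration with OpenCV.
#include <opencv2/opencv.hpp>
#include <vector>

void calibrateStereoPair(const std::vector<cv::Mat>& leftImages,
                         const std::vector<cv::Mat>& rightImages)
{
    const cv::Size boardSize(9, 6);      // inner corners of the chessboard (assumed)
    const float squareSize = 0.025f;     // square edge length in metres (assumed)

    std::vector<std::vector<cv::Point3f>> objectPoints;
    std::vector<std::vector<cv::Point2f>> leftPoints, rightPoints;

    // Reference corner positions on the (planar) chessboard.
    std::vector<cv::Point3f> board;
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            board.emplace_back(x * squareSize, y * squareSize, 0.0f);

    for (size_t i = 0; i < leftImages.size(); ++i) {
        std::vector<cv::Point2f> cl, cr;
        if (cv::findChessboardCorners(leftImages[i], boardSize, cl) &&
            cv::findChessboardCorners(rightImages[i], boardSize, cr)) {
            leftPoints.push_back(cl);
            rightPoints.push_back(cr);
            objectPoints.push_back(board);
        }
    }

    cv::Mat K1, D1, K2, D2, R, T, E, F;
    std::vector<cv::Mat> rv, tv;
    cv::Size imageSize = leftImages[0].size();

    // Intrinsic matrices and distortion coefficients for each camera ...
    cv::calibrateCamera(objectPoints, leftPoints,  imageSize, K1, D1, rv, tv);
    cv::calibrateCamera(objectPoints, rightPoints, imageSize, K2, D2, rv, tv);

    // ... then the rotation R and translation T between the two cameras.
    cv::stereoCalibrate(objectPoints, leftPoints, rightPoints,
                        K1, D1, K2, D2, imageSize, R, T, E, F,
                        cv::CALIB_FIX_INTRINSIC);

    // Rectification transforms used to undistort/rectify the images and to triangulate (via Q).
    cv::Mat R1, R2, P1, P2, Q;
    cv::stereoRectify(K1, D1, K2, D2, imageSize, R, T, R1, R2, P1, P2, Q);
}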

5.3.3 2D Tracking

There are some subprocesses in 2D Tracking that are run independently for each camera. Figure 5.27 shows the dataflow of these subprocesses.

Figure 5.26: Chessboard samples for calibration

Saturation-value (S&V) thresholding is used to isolate the image pixels associated with a marker based on its colour intensity. It is run once for each image. Histogram backprojection and a mean-shift tracking process are used to differentiate the markers by colour; they are run for each marker in each view. Histogram backprojection, S&V thresholding and mean-shift tracking are explained in the following subsections. Before the user can start using the system, they are required to adopt a T-stand posture to initialise the system. I decided to use a T-stand posture because in this posture all the markers can be seen from all the cameras. The initialisation stage is only run once to initialise all the 2D tracking parameters. These 2D tracking parameters are the colour histograms and the initial positions of each marker. The colour histogram data are used in the histogram backprojection process. The initial tracking positions are used in the mean-shift tracking process.

Figure 5.27: 2D Tracking flow diagram in parallel camera system

Histogram Backprojection

Histogram backprojection is commonly used to generate a probability vector input for the mean-shift algorithm [20]. The basic idea of the histogram backprojection method is to generate a single channel (grayscale) image highlighting the similarity of the original image's pixels to a given histogram. A histogram itself refers to the collected counts of the underlying data grouped into a set of bins (see the example in Figure 5.28). Histograms can be used to represent different things, such as the colour distribution of an object, an edge gradient template of an object, or the distribution of probabilities of an object's location [20]. In this system, histograms are mainly used to represent the colour distribution of the markers.

Figure 5.28: Histogram example: given a set of coloured regions (a), a histogram with a bin size of 5 represents the count for each category of bin (b).

The histogram backprojection method computes the ratio of the colour histogram of an image (I_i) to the colour histogram of the pre-sampled image (H_i) [20]:

R_i = I_i / H_i    (5.1)

Apart from the well-known RGB colour model, there are other colour models. The major colour models are CIE, YUV, HSV, and CMY [51]. Figure 5.29 illustrates these colour models.

Figure 5.29: Colour models: (a) CIE 1931 (b) RGB (c) CMY (d) YUV (U-V colour plane for Y = 0.5) (e) HSV (Rolf G. Kuehni, 2003)

I used the HSV colour model because it isolates the colour hue information from other less important attributes (such as colour saturation and brightness). The HSV colour model represents each colour pixel with three independent variables: hue, saturation and brightness (value). It allows us to work strictly on the colour information without worrying about its intensity. Appendix A provides a brief description of the HSV colour model and the conversion algorithm between RGB and HSV. The hue channel is used as the working space for histogram backprojection. The tracking differentiates the markers using colour information. Figure 5.30 shows how histogram backprojection is used to generate a binary image, highlighting pixels that are close to a given colour histogram (in this case green). A more in-depth discussion of histogram backprojection can be found in Swain and Ballard [121].
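As an illustration of this step, below is a minimal OpenCV (C++) sketch of sampling a hue histogram for one marker and backprojecting it onto a new frame. The marker region of interest and the bin count are assumptions for the sketch; the prototype's actual parameters may differ.

#include <opencv2/opencv.hpp>

// Build a hue histogram from a sampled marker region (taken during T-stand initialisation).
cv::Mat sampleHueHistogram(const cv::Mat& bgrFrame, const cv::Rect& markerRoi)
{
    cv::Mat hsv;
    cv::cvtColor(bgrFrame(markerRoi), hsv, cv::COLOR_BGR2HSV);

    const int channels[] = {0};          // hue channel only
    const int histSize[] = {30};         // 30 hue bins (assumed)
    const float hueRange[] = {0, 180};   // OpenCV stores hue as 0..179
    const float* ranges[] = {hueRange};

    cv::Mat hist;
    cv::calcHist(&hsv, 1, channels, cv::Mat(), hist, 1, histSize, ranges);
    cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);
    return hist;
}

// Backproject the stored histogram onto a new frame: bright pixels in the
// result are similar in hue to the sampled marker.
cv::Mat backprojectHue(const cv::Mat& bgrFrame, const cv::Mat& hist)
{
    cv::Mat hsv, backproj;
    cv::cvtColor(bgrFrame, hsv, cv::COLOR_BGR2HSV);

    const int channels[] = {0};
    const float hueRange[] = {0, 180};
    const float* ranges[] = {hueRange};
    cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);
    return backproj;
}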

S&V Thresholding

S&V thresholding is used to filter out those pixels with a saturation or brightness value less than a minimum threshold (Figure 5.31). The coloured markers I used had strong colour saturation. In the HSV colour model, these pixels map to high "Saturation" and "Value" (S&V) amounts, whereas other pixels on the screen have lower S&V amounts. The result of this thresholding is a binary image representing the positions of the markers.

Figure 5.30: Parallel camera system: histogram backprojection on green colour, user in T-stand posture

Figure 5.31: S&V thresholding: (a) enumerated positions of markers (b) positions of markers as shown by thresholding.
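A minimal OpenCV (C++) sketch of this S&V thresholding step is shown below. The minimum saturation and value thresholds (minS, minV) correspond to the slider-controlled parameters described in Subsection 5.3.7; the exact values used in the prototype are not assumed here.

#include <opencv2/opencv.hpp>

// Keep only pixels whose saturation AND value exceed the configured minima.
// The result is a binary mask in which the strongly coloured markers survive.
cv::Mat svThreshold(const cv::Mat& bgrFrame, int minS, int minV)
{
    cv::Mat hsv, mask;
    cv::cvtColor(bgrFrame, hsv, cv::COLOR_BGR2HSV);

    // Hue is left unconstrained (0..180); only S and V are thresholded.
    cv::inRange(hsv,
                cv::Scalar(0,   minS, minV),
                cv::Scalar(180, 255,  255),
                mask);
    return mask;
}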

Mean-shift tracking

I used a mean-shift tracking algorithm to track the 2D position of each marker. The mean-shift algorithm is often used when the motion of a tracked object cannot be described by a motion model. It was originally proposed by Fukunaga [48] and was further developed by Cheng [25] and Comaniciu [29]. The basic idea of mean-shift is finding the nearest peak of a probability distribution input vector. The key notion of the mean-shift algorithm is the multivariate kernel density estimate, which is defined as:

\hat{f}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)    (5.2)

where X_1, ..., X_n is a set of n points in d-dimensional space. These points are located inside a window of radius h centred at x. The commonly used kernel functions K(x) are the Normal, Uniform, and Epanechnikov kernels. The mean-shift algorithm runs as follows:

1. Given a window of a particular shape, n pixels in size, and a starting loca- tion,

2. Calculate the window’s centre of mass.

3. Move the window’s centre to the newly found centre of mass.

4. Repeat steps 2 and 3 until the centre of mass stops moving.

This algorithm is illustrated in Figure 5.32. To improve performance, OpenCV [20] calculates the centre of mass of the image pixel distribution as follows:

x_c = \frac{M_{10}}{M_{00}}, \qquad y_c = \frac{M_{01}}{M_{00}}    (5.3)

where

M_{00} = \sum_x \sum_y I(x, y), \quad M_{10} = \sum_x \sum_y x\,I(x, y), \quad M_{01} = \sum_x \sum_y y\,I(x, y)    (5.4)

where I(x, y) is the image pixel at location (x, y). If I lose track of a marker, I do a contour search of the area surrounding its parent marker. For example, the parent of the elbow's marker is the marker attached to the shoulder. Out of the contour candidates found, the one closest to the parent's position is picked and marked as the new position of the tracking window.

Figure 5.32: Mean-shift algorithm at work (Bradski and Kaehler, 2008). An initial window is placed over an array of data points at the top right corner and is successively recentered over the local peak of its data distribution until convergence.
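The following is a minimal OpenCV (C++) sketch of how one tracker's window might be updated each frame by combining the backprojection, the S&V mask and cv::meanShift. The helper names and the termination-criteria values are illustrative assumptions rather than the prototype's exact settings.

#include <opencv2/opencv.hpp>

// Update one marker's 2D tracking window on a new frame.
// 'hist' is the marker's presampled hue histogram; 'window' is its last known position.
void updateMarkerWindow(const cv::Mat& bgrFrame, const cv::Mat& hist,
                        int minS, int minV, cv::Rect& window)
{
    cv::Mat hsv, backproj, mask;
    cv::cvtColor(bgrFrame, hsv, cv::COLOR_BGR2HSV);

    // Probability image: hue similarity to the marker, gated by the S&V mask.
    const int channels[] = {0};
    const float hueRange[] = {0, 180};
    const float* ranges[] = {hueRange};
    cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);
    cv::inRange(hsv, cv::Scalar(0, minS, minV), cv::Scalar(180, 255, 255), mask);
    backproj &= mask;

    // Shift the window towards the local peak of the probability image.
    cv::meanShift(backproj, window,
                  cv::TermCriteria(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 10, 1));
}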

The Initialisation process

I attached eleven markers of five different colours to the user's body joints (Figure 5.33). The colours were chosen based on the probability of markers occluding each other. For example, different colours are used for the left and the right arms because they are relatively close and have a greater chance of occluding each other.

Figure 5.33: 2D Tracking initialisation with parallel cameras: (a) rectified image (b) post-S&V thresholding (c) contour count

To initiate tracking, the user adopts a T-stand posture (Figure 5.33(a)). The process generates eleven tracking objects which are organised into a tree data structure that reflects the structure of the body. The procedure for initialisation is shown below; a sketch of the region-counting step used in step 2 follows the listing.

Require: s_threshold ≥ 0 and v_threshold ≥ 0
1: t_image ← doSVThresholding(s_threshold, v_threshold, raw_image)
2: region_total ← regionCount(t_image)
3: if region_total == 11 then
4:     Convert the raw image from RGB to HSV format
5:     Generate a colour histogram for each found marker region
6:     Store the histogram of each marker
7:     Store the current threshold
8:     Store the marker positions found as the initial positions
9: end if
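The regionCount step above can be realised with a connected-region (contour) count on the thresholded mask. Below is a minimal OpenCV (C++) sketch of one way this might be done; the minimum-area filter value is an assumption used to reject isolated noise pixels, not a figure taken from the prototype.

#include <opencv2/opencv.hpp>
#include <vector>

// Count the distinct bright regions (candidate markers) in a binary mask.
int regionCount(const cv::Mat& binaryMask, double minArea = 20.0)
{
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binaryMask.clone(), contours,
                     cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    int count = 0;
    for (const auto& c : contours)
        if (cv::contourArea(c) >= minArea)   // ignore tiny noise blobs
            ++count;
    return count;
}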

5.3.4 3D Tracking

From the calibration matrices generated in Subsection 5.3.2, I can construct the reprojection matrix:

Q = \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 0 & f \\ 0 & 0 & -1/T_x & (c_x - c'_x)/T_x \end{bmatrix}    (5.5)

where (c_x, c_y) is the principal point in the left image, f is the focal length of the camera, c'_x is the principal point x-coordinate in the right image, and T_x is the x-translation between the cameras. I can then project a point (x, y) in the left image into 3D space using:

Q \times \begin{bmatrix} x \\ y \\ d \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}    (5.6)

where d is the x-coordinate disparity between the left and the right images, and the resulting 3D coordinate is (X/W, Y/W, Z/W).
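A small OpenCV (C++) sketch of this reprojection is shown below. cv::perspectiveTransform applies the 4x4 matrix Q to an (x, y, d) triple and performs the homogeneous division, yielding (X/W, Y/W, Z/W) directly; the function and parameter names are illustrative.

#include <opencv2/opencv.hpp>
#include <vector>

// Reproject a marker's left-image position and disparity into 3D using the
// Q matrix produced by the stereo rectification step (Equations 5.5 and 5.6).
cv::Point3f reprojectMarker(const cv::Mat& Q, float x, float y, float disparity)
{
    cv::Mat Qf;
    Q.convertTo(Qf, CV_32F);                 // match the point type

    std::vector<cv::Point3f> in{ cv::Point3f(x, y, disparity) };
    std::vector<cv::Point3f> out;
    cv::perspectiveTransform(in, out, Qf);   // multiplies [x y d 1]^T by Q, divides by W
    return out[0];                           // (X/W, Y/W, Z/W)
}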

5.3.5 Kalman Filter

The parallel camera system generates a very high noise rate, making the data unsuitable for use in its raw form as controller input. Similar to Shiratori and Hodgins [113], I used a Kalman filter to reduce the noise in the reconstructed 3D data. The Kalman filter is a recursive discrete data linear filtering algorithm. It was published by R. E. Kalman in 1960 [69] and since then it has been the subject of extensive research and applications [132]. The Kalman filter allows us to maximize the a posteriori probability without the need to keep a long history of the previous measurements. It behaves in a similar way as minimizing the mean of the squared error but without significant computational implications. As an estimator, the Kalman filter consists of two recursive processes [132] (Figure 5.34). The “predict” process consists of the time update equations. They are responsible for projecting forward the current state and error covariance es- timates to obtain the a priori estimates for the next time step. The “correct” process consists of measurement update equations. They are responsible for in- corporating a new measurement into the a priori estimate to obtain a better a posteriori estimate. More detailed discussion on the Kalman filter can be found in Welch and Bishop [132].

I modelled the state of the system x_k as three position variables (x, y, z) and three velocities (v_x, v_y, v_z). The measurement is represented as a position variable z_k.

Figure 5.34: The recursive cycle of the Kalman filter (Welch and Bishop, 2006)

x_k = \begin{bmatrix} x \\ y \\ z \\ v_x \\ v_y \\ v_z \end{bmatrix}, \quad A = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad z_k = \begin{bmatrix} z_x \\ z_y \\ z_z \end{bmatrix}    (5.7)

Figure 5.35 illustrates the comparison between the raw input and the output of the Kalman filter.
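A minimal OpenCV (C++) sketch of this constant-velocity Kalman filter is shown below. The process and measurement noise magnitudes are placeholder assumptions; in practice they would be tuned to the observed tracking noise.

#include <opencv2/opencv.hpp>

// 6-state (position + velocity), 3-measurement (position) Kalman filter,
// matching the state model of Equation 5.7.
cv::KalmanFilter createMarkerFilter()
{
    cv::KalmanFilter kf(6, 3, 0);

    // Transition matrix A: position is advanced by velocity each step.
    kf.transitionMatrix = (cv::Mat_<float>(6, 6) <<
        1, 0, 0, 1, 0, 0,
        0, 1, 0, 0, 1, 0,
        0, 0, 1, 0, 0, 1,
        0, 0, 0, 1, 0, 0,
        0, 0, 0, 0, 1, 0,
        0, 0, 0, 0, 0, 1);

    cv::setIdentity(kf.measurementMatrix);                            // we observe (x, y, z) only
    cv::setIdentity(kf.processNoiseCov,     cv::Scalar::all(1e-4));   // assumed process noise
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-2));   // assumed measurement noise
    cv::setIdentity(kf.errorCovPost,        cv::Scalar::all(1));
    return kf;
}

// Per frame: predict, then correct with the triangulated 3D marker position.
cv::Point3f filterPosition(cv::KalmanFilter& kf, const cv::Point3f& measured)
{
    kf.predict();
    cv::Mat z = (cv::Mat_<float>(3, 1) << measured.x, measured.y, measured.z);
    cv::Mat est = kf.correct(z);
    return cv::Point3f(est.at<float>(0), est.at<float>(1), est.at<float>(2));
}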

5.3.6 Avatar Control

Once I have the 3D positions of all the markers, I use the information to control the avatar. There are many reasons for using an avatar instead of just dragging the view of the user according to the walking translation. First, the ultimate aim of the system is for multiple users (each using an OmniWalker) to interact with each other in real-time. This requires each user to be able to control their avatar's body parts, which are viewed by other users, thereby enabling real-time distance learning within a common virtual environment. Secondly, many researchers [104] [105] [86] [117] have shown that the use of a high-fidelity full-body avatar enhances the user's perception in virtual reality. It improves the user's ability to estimate distances and the user's awareness of the environment.

Figure 5.35: Kalman filter output of the raw z-position data

I divide the control mechanism into two sections: upper and lower body models. The upper body model is controlled in a similar fashion to 3 DOF motion capture, where the avatar's limbs are translated and rotated according to the relative positions of the tracked markers. However, the lower body parts of the avatar cannot be controlled in the same fashion because the movement of the legs (sliding down an inclined plane) is not the same as a normal walk. In order to synthesize a robust and natural-looking walk, I use a control mechanism similar to Raibert and Hodgins [102]. Like them, I divide the walking animation into several stages and interpolate between them.

Referring to Figure 5.36, the walking states diagram, Vz(L) refers to the velocity of the left foot in the z-direction (forward and backward). Similarly, Vz(R) is the velocity of the right foot forward and backward. Vx(L) and Vx(R) are the velocities of the left and right foot when sliding right to left. Tz and Tx are thresholds which are determined heuristically depending on the tracking accuracy. The walking animation time, t, is interpolated to be between 0 and 1.0. The current state of the walking animation is determined by the foot location relative to its maximum displacement. This yields the current animation time t. Immediately changing the animation time to t would result in jerky movement, so instead the animation is allowed to run in accordance with the rendering update time until it reaches t, though this target will change as the foot moves. This introduces a slight lag, but users find this acceptable.

Figure 5.36: Walking animation states

To change the walking direction the user needs to adopt a side sliding gait that is similar to the natural way of turning while walking. For example, for turning left the walker would swing the right foot forward, then slide it to the right and finally slide it back again. The system will track the sliding motion and rotate the avatar accordingly. Once a change of direction of gait is recognised, instead of continuing with the animation, I run the animation in reverse. A minimal sketch of how this gait classification might be expressed in code is given below.
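The sketch below illustrates the thresholded state decision described above. The state names and the way per-foot velocities are obtained are illustrative assumptions, and Tz and Tx stand for the heuristically tuned thresholds mentioned in the text.

#include <cmath>

// Simplified walking-state classifier based on per-foot velocities,
// following the thresholding idea of Figure 5.36 (illustrative only).
enum class GaitState { Idle, ForwardWalk, Rotation };

struct FootVelocity { double vx; double vz; };   // sideways (x) and forward/back (z)

GaitState classifyGait(const FootVelocity& left, const FootVelocity& right,
                       double Tz, double Tx)
{
    const bool leftForward  = std::fabs(left.vz)  > Tz;
    const bool rightForward = std::fabs(right.vz) > Tz;
    const bool leftSliding  = std::fabs(left.vx)  > Tx;
    const bool rightSliding = std::fabs(right.vx) > Tx;

    // Sideways sliding of either foot is interpreted as a turning gait.
    if (leftSliding || rightSliding)
        return GaitState::Rotation;

    // Forward/backward motion of either foot drives the walk animation.
    if (leftForward || rightForward)
        return GaitState::ForwardWalk;

    return GaitState::Idle;
}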

5.3.7 Tracking GUI Design and Implementation

The tracking module is implemented using the OpenCV HighGUI library. It was selected because it allows better integration with the underlying OpenCV core library. Some of the static configuration for the tracking module is read from a configuration file. The configuration file contains parameters that are less frequently changed and require significant setup initialisation. These parameters are listed in Table 5.1.

Table 5.1: Parallel camera tracking module configuration file parameters
ENABLE_LIVECAM: 0 for video file feed, 1 for live camera feed.
ENABLE_RECORDING: 1 for recording the camera feed (should be used only when the live camera feed is enabled).
VIDEO_FILE1: The filename of video file #1, used when the ENABLE_LIVECAM parameter is set to 0.
VIDEO_FILE2: The filename of video file #2, used when the ENABLE_LIVECAM parameter is set to 0.

The tracking GUI is designed to allow real-time monitoring of the tracking states. It consists of nine different categories (Figure 5.37). The first, on the top-left, is the tracking visualisation. It renders a simple avatar and the positions of the markers in 3D. This allows the administrator to check whether all the markers are tracked correctly. The user can change the viewing angle via the keyboard: "1" for side view, "2" for top view and "3" for front view. Next are two separate windows (titled camera1 and camera2) showing the camera capture from both cameras.

Figure 5.37: GUI design of the tracking module

On top of the "camera1" window are tracking parameters. Each parameter is implemented as a slider with preset values. They are "position", "current node", "initial parent", "min S", "min V", and "search radius". On top of the camera2 window are two sliders, "rewind/stop/play" and "init". On top of the "mean-shift tracking input" is another slider named "left/right". This "left/right" slider is used to switch the monitoring panel between the left camera (or the image from the first video file) and the right camera (or the image from the second video file). The "position" and "rewind/stop/play" parameters are only used when the input is configured to read from video files. The "position" slider displays the image position relative to all the frames in the video file. The "position" slider is also draggable for moving across all the frames. The "run" parameter has three options for backward play (rewind), pause (stop), and forward play. Changing the slider to "rewind" or "play" allows continuous play of the video files in the respective direction. The "view" and "current node" parameters are used to select the currently active camera and marker node respectively. The "view" parameter slider has three options mapped to camera 1, 2, and 3. The "node" slider has 11 options mapped to markers 1 to 11. The selected "view" and "node" parameters determine which camera and which marker will be used by the "mean-shift tracking input" and "histogram backprojection" windows. Together with the "initial parent" parameter, the "current node" parameter is used to manually assign a tracker to a marker. For example, to assign the left green marker in camera 1 as tracker #3, the administrator can set the "current node" parameter to a value of 3, then click and drag the mouse on the shoulder area pixels in the "camera1" window. The system will update the histogram for tracker #3 in camera #1 automatically. Further, if the user would like to assign the red marker attached to the left elbow as tracker #2 with the left shoulder (previously assigned to tracker #3) as its parent, the user can set the "current node" parameter to a value of 2 and set the "initial parent" parameter to a value of 3. Clicking and dragging the mouse on the left elbow area will assign the marker as tracker #2 with tracker #3 as its parent. The "init" slider is an on/off switch for enabling the initialisation subprocess. When enabled, the initialisation process will run, seeking the T-stand posture. Once a match is found, the slider will return to the "off" default state. The "min S" and "min V" set the minimum threshold values used in the S&V thresholding method. The result of adjusting these parameters is displayed in real-time in the "S&V thresholding mask" window. The "search radius" refers to the radius of the area used in contour finding when the mean-shift tracking loses track of a marker. The search area is displayed in the "mean-shift search area display" window for debugging purposes. The "colour histograms" window displays the colour histograms presampled during the initialisation phase. In total there are 11x2 windows, displaying the colour histogram of each marker from each camera.

5.3.8 Experimental Results and Discussion

The prototype was tested on computer hardware with a 2.4GHz Intel Core2 Duo processor, 3 GB DDR2 RAM, and NVidia GeForce8600 GT graphics card. The system was capable of performing in real-time with an image acquisition rate of around 30fps. Image processing and 3D projection produced total latency between 20 and 50ms. Some of the latency (15 to 30ms) was incurred by the mean-shift tracking process, depending on the quality of the markers’ images. The parallel camera system was the first prototype demonstrating the use of computer vision in tracking the user’s walking motion on the OmniWalker. It was not only able to capture the walking gait, but also the user’s full body motion. There are some advantages of using a parallel camera system. These advan- tages are:

• Simple camera setup.

• Camera calibration can be done relatively quickly. Twelve snapshots of the chessboard images took about 30 seconds to complete.

• Low CPU overhead for triangulating the markers’ 3D position.

However, there are also some disadvantages of using the parallel camera sys- tem. They are:

• The single direction of camera look-at means there is a higher chance of losing track of the markers due to occlusion.

• Camera calibration sometimes needs to be repeated several times to get a small error rate.

• The use of the Kalman filter to reduce noise introduced significant lag to the avatar control module. The user can detect some delay between when their leg starts to pull backward and when the avatar control module starts to recognize the gait motion.

Another disadvantage of this first prototype was the construction of the markers. The use of plain coloured markers means that the user could not wear bright coloured clothing. This problem could be solved by giving the user overalls to wear. However this adds further to the intricacies of using the system. In the next prototype (Section 5.4) the markers are improved by using LED orbs.

5.4 Marker-based Tracking using a Multi Cam- era System

5.4.1 System Design and Overview

Figure 5.38 shows images from the tracking module using multicamera system and LED orbs. The input images are fed from three PS3 EyeToy cameras placed in a triangular formation in front of the user. LED orbs instead of plain coloured markers are used.

Figure 5.38: Tracking from three acquired images into 3D tracking points using a multicamera system (3 cameras) and LED orbs (actual images are darker than shown here).

Figure 5.39 shows the design of our tracking module. The images are acquired synchronously from the three cameras at 60fps and 320x240 pixels resolution. Synchronization is done at the software level, resulting in an average of about 6 milliseconds of out-of-sync delay between images acquired by the cameras, which is negligible for our tracking needs. The cameras are pre-calibrated using Svoboda's method [120], which will be discussed in Subsection 5.4.3. Compared to the parallel camera system (Section 5.3), the new system does not use the Kalman filter. This is because the triangulated 3D position in the multicamera system does not have as high an error rate as the 2-camera parallel system. The markers' 2D positions are tracked independently in a 2D tracking module for each camera. These 2D positions are then triangulated and tracked further in the 3D tracking module.

Figure 5.39: System flow diagram

5.4.2 2D Tracking

There are subprocesses in the 2D Tracking process that are run independently for each camera. Figure 5.40 shows the dataflow of these subprocesses. Intensity thresholding, histogram backprojection, and mean-shift tracking are the three main subprocesses that continuously run to track the 2D positions of the markers. The initialisation phase is a one-off subprocess executed at the beginning of the tracking process. It yields three outputs (intensity threshold, colour histograms, and the markers' initial positions) that are used in the subsequent three main subprocesses.

Figure 5.40: 2D Tracking flow diagram. Dashed lines represent one-off processes that are run during initialisation.

Intensity Thresholding

Intensity thresholding is used to filter out pixels with intensity less than the minimum threshold. It is simpler than the previous S&V thresholding technique described in Subsection 5.3.3. Intensity thresholding uses mainly the intensity (Value) parameters from the HSV converted image. Essentially, this technique does a binary thresholding on the combined maximum of the three RGB channel layers:

dst_i = (max(R_i, G_i, B_i) > T) ? 1 : 0    (5.8)

where dst_i is the resulting pixel value and T is the threshold parameter. Part of the reason why intensity thresholding works sufficiently well in this system is that LED orbs are used as markers. In the previous parallel camera system (Section 5.3) I used plain coloured markers, which relied on their high colour saturation to be distinguishable from their surroundings. Comparatively, the LED markers' pixels have greater intensity than their surroundings. Figure 5.41 shows intensity thresholding results for different threshold values.

Figure 5.41: Intensity thresholding examples
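A minimal OpenCV (C++) sketch of Equation 5.8 is given below; it simply thresholds the per-pixel maximum of the three colour channels, producing the binary mask used by the later stages.

#include <opencv2/opencv.hpp>
#include <vector>

// Binary-threshold the per-pixel maximum of the B, G and R channels (Equation 5.8).
cv::Mat intensityThreshold(const cv::Mat& bgrFrame, double T)
{
    std::vector<cv::Mat> channels;
    cv::split(bgrFrame, channels);                  // channels[0..2] = B, G, R

    cv::Mat maxChannel;
    cv::max(channels[0], channels[1], maxChannel);  // max(B, G)
    cv::max(maxChannel, channels[2], maxChannel);   // max(B, G, R)

    cv::Mat mask;
    cv::threshold(maxChannel, mask, T, 255, cv::THRESH_BINARY);
    return mask;
}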

Histogram backprojection

The histogram backprojection technique used in the second prototype was essentially the same as for the first prototype. However, since I used LED markers the results were different. Figure 5.42 presents an example of backprojection on a magenta colour.

Figure 5.42: Multi camera system: histogram backprojection on a magenta colour

Mean-shift tracking

The mean-shift algorithm I used is the same as for the first prototype system. However, the result was better for the second prototype because the intensity thresholding technique yields better results than the S&V thresholding technique (Figure 5.43).

Figure 5.43: Mean-shift tracking inputs generated from intensity thresholding and histogram backprojection

The initialisation process

Compared to the parallel camera system, the multi camera system uses ten LED orb markers (Figure 5.44). The colours of the markers are selected in a similar manner to the colours of the markers for the first prototype; colour choice is based on the probability of the markers occluding each other.

Figure 5.44: 2D Tracking initialisation on multi camera system. All feature points are found at the threshold value of 50.

The use of intensity thresholding enables automation of the initialisation process. In the previous system, during the initialisation phase the administrator heuristically chose the appropriate S&V threshold values. Automating that process is difficult because there are two parameters that need to be adjusted, each with a possible value between 0 and 255, so iterating through these values would generate 255^2 different combinations. The intensity thresholding technique uses only one parameter (value), so auto-initialisation is possible because I only iterate through a small range of values. The procedure for initialisation is shown below.

1: for min_t = START_THRESHOLD to END_THRESHOLD do
2:     t_image ← intensityThresholding(min_t, raw_image)
3:     region_total ← regionCount(t_image)
4:     if region_total == NUMTRACKPOINTS then
5:         Convert the raw image from RGB to HSV format (see Appendix A)
6:         Generate a colour histogram for each found marker region
7:         Store the histogram of each marker
8:         Store the current threshold
9:         Store the marker positions found as the initial positions
10:     end if
11: end for

Some of the parameters considered in the initialisation procedure are shown in Table 5.2.

Table 5.2: Initialisation parameters in the multi camera system. The NUMTRACKPOINTS variable refers to the number of markers that need to be tracked.
START_THRESHOLD: 50
END_THRESHOLD: 240
NUMTRACKPOINTS: 10

5.4.3 3D Tracking and Camera Calibration

Process flow

Compared to the parallel camera system shown in Figure 5.24, the multi camera system has additional processes (Figure 5.45). Ground projection is a process that maps a point from a world coordinate system to a local coordinate system. The projection matrix that maps between the two coordinate systems is generated by the ground calibration process. These additional processes are possible because the multi camera system has greater accuracy than the parallel camera system.

Camera Placement and Calibration

I placed the cameras in a triangular formation in front of the user. The cameras are placed a certain distance apart from each other, as shown in Figure 5.46, to be able to see all the markers with minimum occlusion. There are certainly other alternatives to the positioning of the cameras, as well as the number of cameras. However, our system is constrained by the size of the room and the number of cameras I have. I calibrated the cameras using Svoboda et al's self-calibration toolkit [120]. The toolkit is freely available as a Matlab library package.

Figure 5.45: Multicam process flow diagram

Figure 5.46: Camera placement

The calibration requires pre-recorded scenes of a dark room with the calibration wand waved through the working volume. I used a small LED torch as the calibration marker-wand (Figure 4.9(a)). Although small in size, the LED was still too bright; it generated large pixels in the captured image. Therefore I covered the tip with yellow tack (similar to Bostik Blu-Tack) to reduce the brightness of the LED. Figure 5.47 shows the 3D projected point locations used in calibration and the associated error.

Figure 5.47: Results of multi camera calibration using Svoboda et al’s calibration toolkit (a) the track of the marker’s wand (b) mean and standard deviation of 2D error

3D Triangulation

Once the cameras were calibrated, I used Hartley and Zisserman's [53] method of Direct Linear Triangulation (DLT) to get the 3D positions of the markers. The DLT method allows us to obtain the 3D positions of the markers from two 2D positions of the markers using the pre-calibrated camera matrices. The DLT method finds the 3D position as the unit singular vector corresponding to the smallest singular value of A:

A = \begin{bmatrix} x\,p^{3T} - p^{1T} \\ y\,p^{3T} - p^{2T} \\ x'\,p'^{3T} - p'^{1T} \\ y'\,p'^{3T} - p'^{2T} \end{bmatrix}    (5.9)

where p and p' refer to the camera matrices for the first and second camera respectively, p^{iT} refers to the i-th row of p, and (x, y) and (x', y') refer to the input 2D coordinates in the two views. Given three camera inputs there are three possible combinations of 2D feature points which could be used. The system always selects the first two views, from the left and top cameras. If either of the first two views has lost track of the markers the system then uses the 2D feature points from the third view.
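A minimal OpenCV (C++) sketch of this DLT triangulation is shown below. It builds the A matrix of Equation 5.9 from two 3x4 camera matrices and solves A·X = 0 with cv::SVD::solveZ; the function and parameter names are illustrative.

#include <opencv2/opencv.hpp>

// Triangulate one marker from two views using the DLT of Equation 5.9.
// P1 and P2 are the 3x4 camera matrices; m1 and m2 the 2D marker positions.
cv::Point3d triangulateDLT(const cv::Mat& P1, const cv::Mat& P2,
                           const cv::Point2d& m1, const cv::Point2d& m2)
{
    cv::Mat P1d, P2d;
    P1.convertTo(P1d, CV_64F);
    P2.convertTo(P2d, CV_64F);

    // Assemble the 4x4 system of Equation 5.9.
    cv::Mat A(4, 4, CV_64F);
    cv::Mat(m1.x * P1d.row(2) - P1d.row(0)).copyTo(A.row(0));   // x  p^3T - p^1T
    cv::Mat(m1.y * P1d.row(2) - P1d.row(1)).copyTo(A.row(1));   // y  p^3T - p^2T
    cv::Mat(m2.x * P2d.row(2) - P2d.row(0)).copyTo(A.row(2));   // x' p'^3T - p'^1T
    cv::Mat(m2.y * P2d.row(2) - P2d.row(1)).copyTo(A.row(3));   // y' p'^3T - p'^2T

    // Homogeneous solution: the right singular vector of the smallest singular value.
    cv::Mat X;
    cv::SVD::solveZ(A, X);

    // Dehomogenise to obtain the 3D point.
    return cv::Point3d(X.at<double>(0) / X.at<double>(3),
                       X.at<double>(1) / X.at<double>(3),
                       X.at<double>(2) / X.at<double>(3));
}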

Ground Projection and Calibration

The output from the calibration process was an arbitrary coordinate system based on the distribution of sample points. The purpose of this second calibration process is to generate a projection matrix to map the triangulated points to the pre-measured coordinate system.

Figure 5.48: LED markers placed on the rig to generate the ground projection matrix.

I used three LED lights attached to the holding frame. The distance between the LED markers on the left is 50cm (Figure 5.48). Given this measurement I can then map the triangulated coordinate system to ours, with one unit mapped to one cm. I also translate the point of origin to the bottom left LED marker. This reprojection allows us to work with the 3D points in a more convenient way.
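One way such a ground projection could be constructed is sketched below: an orthonormal basis is built from the three reference LEDs, the scale is fixed by the known 50cm spacing, and the origin is moved to the bottom-left LED. This is an illustrative reconstruction of the idea under those assumptions, not the exact routine used in the prototype.

#include <opencv2/opencv.hpp>

// Similarity transform (rotation + uniform scale + translation) that maps points
// from the arbitrary calibration frame into a cm-based local frame whose origin
// is the bottom-left LED. pOrigin, pUp and pSide are the triangulated LED positions.
struct GroundTransform {
    cv::Matx33d R;      // rows are the local x, y, z axes in the calibration frame
    cv::Vec3d   origin; // bottom-left LED in the calibration frame
    double      scale;  // calibration units -> centimetres
};

GroundTransform computeGroundTransform(const cv::Vec3d& pOrigin,
                                       const cv::Vec3d& pUp,    // LED 50 cm above the origin LED
                                       const cv::Vec3d& pSide)  // third LED, defines the plane
{
    cv::Vec3d up   = pUp - pOrigin;
    cv::Vec3d yAxis = up * (1.0 / cv::norm(up));
    cv::Vec3d zAxis = yAxis.cross(pSide - pOrigin);
    zAxis = zAxis * (1.0 / cv::norm(zAxis));        // normal to the LED plane
    cv::Vec3d xAxis = yAxis.cross(zAxis);

    GroundTransform t;
    t.R = cv::Matx33d(xAxis[0], xAxis[1], xAxis[2],
                      yAxis[0], yAxis[1], yAxis[2],
                      zAxis[0], zAxis[1], zAxis[2]);
    t.origin = pOrigin;
    t.scale  = 50.0 / cv::norm(up);                 // 50 cm between the two left LEDs
    return t;
}

// Map a triangulated point into the local, cm-based coordinate system.
cv::Vec3d toGroundFrame(const GroundTransform& t, const cv::Vec3d& p)
{
    return (t.R * (p - t.origin)) * t.scale;
}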

5.4.4 Avatar Control

The avatar control mechanism is similar to the one in the previous prototype (Subsection 5.3.6). The difference is that the new system measures the actual foot position relative to the centre of the dish (the point of origin) instead of just the relative speed. This results in more precise movement of the avatar in the virtual environment.

5.4.5 The Tracking GUI Design and Implementation

As with the previous parallel camera system (Subsection 5.3.7), the tracking module is implemented using the OpenCV HighGUI library. Again the module is configured to read from a configuration file. The parameters for the configuration file are listed in Table 5.3. The tracking GUI consists of eight separate windows (Figure 5.49). The first from the left is the tracking visualisation. It renders a simple avatar and the positions of the markers in 3D. This allows the administrator to check whether all the markers are tracked correctly. The view perspective can be adjusted by clicking and dragging the mouse cursor on the window. Three separate windows (titled camera 1, camera 2, and camera 3) show the camera capture from all three cameras. On top of each window is a slider showing the threshold amount for intensity thresholding. The intensity threshold amount is different for each camera. The values are decided during the initialisation phase.

Table 5.3: Tracking module configuration file parameters for the multi-camera system. This configuration mechanism allows the system to use different input modes: live camera feed, video file feed, or live camera with recording capability.
ENABLE_LIVECAM: 0 for video file feed, 1 for live camera feed.
ENABLE_RECORDING: 1 for recording the camera feed (should be used only when the live camera feed is enabled).
VIDEO_FILE1: The filename of video file #1, used when the ENABLE_LIVECAM parameter is set to 0.
VIDEO_FILE2: The filename of video file #2, used when the ENABLE_LIVECAM parameter is set to 0.
VIDEO_FILE3: The filename of video file #3, used when the ENABLE_LIVECAM parameter is set to 0.

The tracking parameters window can be used to adjust the parameters view, node, position, run, init and intensity threshold. Adjustment is made with sliders with preset values. The "view" and "node" parameters are used to select the currently active camera and marker node respectively. The "view" parameter slider has three options mapped to cameras 1, 2 and 3. The "node" slider has 10 options mapped to markers 1 to 10. The selected "view" and "node" parameters determine which camera and marker will be used by the "mean-shift tracking input" and "histogram backprojection" windows. Additionally, the "node" parameter can be used to manually assign a tracker to a marker.

Figure 5.49: GUI design of the tracking module

For example, to assign the red marker orb in camera 1 as tracker #3, the administrator can set the "node" parameter to a value of 3, then click and drag the mouse on the red pixels in the "camera1" window. The system will update the histogram for camera #1–tracker #3 automatically. The "position" and "run" parameters are only used when the input is configured to read from video files. The "position" slider displays the image position relative to all the frames in the video file. The "position" slider is draggable for moving across all the frames. The "run" parameter has three options for backward play, pause, and forward play. It allows automatic play of the video files. The "init" slider is an on/off switch for enabling the initialisation subprocess. When enabled, the initialisation process will run, seeking the T-stand posture. Once a match is found, the slider will return to the "off" default state. The "intensity threshold" sets the minimum threshold used in the binary mask window. The "binary mask" window is for debugging, allowing the administrator to view the result of different intensity thresholds for each camera in real-time. The "colour histogram" window displays the colour histograms presampled during the initialisation phase. It consists of 10 by 3 subwindows. The columns are mapped to the three camera views. The rows are mapped to the markers tracked from each camera.

5.4.6 Experimental Results

Figure 5.50 shows the comparison between the new prototype (using the 3-camera system) and the earlier prototype (using parallel cameras). Due to the high noise rate the earlier system relied on the Kalman Filter to smooth out the reconstructed 3D position of the foot. In comparison, the current prototype generates very little noise. On average the Kalman filter introduces about a 20ms delay between the real peak and the Kalman-filter generated peak, hence the current prototype does not use the Kalman filter.

Figure 5.50: Kalman filter output from the current prototype (top) and the earlier prototype (bottom). Dashed line represents the original 3D position trajectory.

Figure 5.51 shows the x-axis (sideways direction) and z-axis (front-back direction) displacement during forward walking and rotation. The differentiating factor between forward walking and rotation is that in rotation the maximum x-axis displacement of the foot occurs after the maximum z-axis displacement. In ordinary walking, the x-axis and z-axis displacements are linearly correlated; the peaks occur, and end, at approximately the same time.

Figure 5.51: X-axis and Z-axis position of the right foot during forward walk- ing gait (a) and during left rotation gait (b). The X-axis represents the image acquisition rate of 2ms in-between

In a few cases the left camera lost track of one or two markers due to occlusion. However, since the user does not need to rotate or move around, these markers were always visible from the other two cameras. Consequently, no loss of tracking ever occurred in the 3D tracking. In the 2D tracking module, when a view lost track of a marker, it recovered instantly when the marker became visible again. The system has been tested with a 2.66 GHz Intel Core i7-920 processor, 6GB DDR3 RAM, and NVidia GeForce GTX 260+ graphics card. The tracking system can perform in real-time with an average processing time of about 3ms per frame. Preliminary tests on a few users indicated user acceptance of the range of walking gaits and body controllability.

Chapter 6

Visualisation Module

6.1 Literature Review

6.1.1 Overview of Graphics APIs, Graphics Rendering Engines and Game Engines

A graphics API provides a software interface to the graphics hardware [107]. It allows a developer to make calls to the underlying graphics functionality. In the early days, each hardware vendor would implement its own API. Over time, common APIs emerged that abstract the function calls so that they can be made regardless of the specific hardware details. Two of the most well-known graphics APIs that provide such abstraction are OpenGL and Direct3D (part of the DirectX library). OpenGL was introduced in 1992 by Silicon Graphics (SGI) as a free and open standard. Originally, the development of the API itself was governed by the OpenGL Architecture Review Board (ARB), a committee formed by companies such as SGI, Microsoft, NVidia, and many others. Since September 2006 the ARB has operated as the OpenGL Working Group. Features are added when consensus is reached, to ensure the API remains elegant and not vendor specific. Direct3D is OpenGL's main competitor in the market. It was developed solely for Microsoft platforms (the Microsoft Windows series and the XBox). Early versions of DirectX were regarded as inferior to OpenGL; over time, starting with DirectX 8, the position has reversed slightly [2].

A game engine is a higher level API providing reusable algorithms and functionality that are essential to create a game [107]. A game engine usually includes different modules such as a rendering pipeline, character animation, physics engine, networking and audio APIs. A graphics engine is a higher level graphics API that mainly deals with high performance 3D graphics rendering [92] [95]. This focus on graphics rendering means that a graphics engine is not meant only for games, but also for visual simulation, virtual reality, scientific visualisation and modelling. Graphics rendering engine and game engine are terms that are sometimes used interchangeably. Almost all game engines ship with a graphics rendering engine included. However, this composition does not always hold true: some graphics rendering engines, such as Ogre3D [92], also ship with added modules, such as character animation, which arguably may not be part of the core graphics rendering engine. About 300 game engines have been built to date [37], some of which are still actively developed while others have become obsolete due to lack of interest. These game engines also vary in licensing arrangements, ranging from strictly commercial licences to free open source, with hybrid arrangements in which only commercial distribution attracts royalty fees. In this project I evaluated only those game engines that are free (and open source), actively developed, and popular (being used in many commercial projects). The free and open source requirement was dictated by budgetary constraints.

OpenSceneGraph(OSG)

The OpenSceneGraph [95] is an open source, high performance 3D graphics toolkit written in C++ and OpenGL. It has been widely used in many other graphics engines, such as Delta3D [79] and VR Juggler [58]. The OSG is strictly a rendering engine only; it does not come complete with other components such as character animation or a physics engine. Substantial additional code is needed to make it work with other components.

OpenSG

Not to be confused with OpenSceneGraph, OpenSG is an entirely different scenegraph project, although both grew from the demise of the SGI and Microsoft Fahrenheit project in the 1990s, and it too is strictly a rendering engine. Compared to the OSG, the main advantages of OpenSG are its multi-threading and clustering support [96].

Crystal Space and CEL

Crystal Space is a graphics rendering engine originally developed by Jorrit Tyberghein in 1997 [31]. Since then, it has been developed by the community as an open source project. It is capable of using OpenGL, SDL, X11, SVGALib, or the DirectX API as the underlying graphics renderer. The plug-in framework in Crystal Space is based on the Shared Class Facility (SCF), initially developed by Andrew Zabolotny back in 1999. It uses the smart pointer concept to keep track of plug-ins and their object instances. Its framework design is arguably the hardest to master compared to other game engines. CEL (or Crystal Entity Layer) is a set of plug-ins and applications built on top of the Crystal Space SDK [32]. CEL is meant to be a stand-alone application with content modifiable via Python and/or XML scripts. Some of the recommended plug-ins for CS [31] are listed in Table 6.1.

Ogre3D

Ogre3D is another rendering engine written in C++ [92]. Its development effort started in 2003 and it has grown significantly since then. It is capable of using either OpenGL or DirectX as the underlying graphics API. Ogre3D has more up to date and extensive documentation compared to the Crystal Space SDK. It also has active community support via a public forum. Some of the common plug-ins for Ogre3D are listed in Table 6.2.

Table 6.1: Crystal Space common plug-ins
Rendering: [Native]
Audio: OpenAL
Physics: ODE
Character Animation: Cal3D
GUI: CEGUI
Input Devices: [Native]

Table 6.2: Ogre3D common plug-ins
Rendering: [Native]
Audio: OpenAL
Physics: ODE, Bullet, Newton, PhysX
Character Animation: [Native]
GUI: [Native], CEGUI
Input Devices: OIS

Irrlicht

Similar to Crystal Space, Irrlicht is developed in C++ and is capable of running on multiple platforms: Windows, OSX, Solaris, and others [61]. It also has extensive documentation and tutorials. There are some claims that Irrlicht is much easier to learn and performs much better than other game engines, but these claims have not been substantiated.

Delta3D

Delta3D was developed by the US Navy's NETC Learning Strategies Division to develop serious games and military simulations using best-of-breed open source components [79]. In the past, most military virtual simulations were created using proprietary platforms. This created an environment in which any further improvements to these virtual simulations were locked in to particular vendors. With this in mind, Delta3D was developed in order to reduce the vendor lock-in that results from the use of proprietary platforms. Table 6.3 lists the Delta3D underlying components.

Table 6.3: Delta3D underlying components
Rendering: OSG
Audio: OpenAL
Physics: ODE
Character Animation: ReplicantBody/Cal3D
GUI: FLTK, glGUI
Input Devices: PLIB, InterSense

6.1.2 Physics Engines

A physics engine defines everything from how objects move when they fall due to gravity to what happens when they hit each other [80]. Some of the well known free physics engines are Newton, ODE, Bullet, and PhysX. Open Dynamics Engine (ODE) is one of the most used physics engines. It has been used in some well known game engines such as Crystal Space, Delta3D, VR Juggler, and Ogre3D. ODE has a broad range of parameters that can be modified to simulate real physics. Due to this feature, it has a tendency to be slower than the other physics engines. The Newton physics engine is developed by Newton Game Dynamics with more focus on real-time physics simulation. For this reason the library is small (the version 1.53 archive is about 14Mb in size), fast and stable. It also tends to be easier to use because the developer only needs to know the basic principles of physics. The Bullet engine is one of the most well known physics engines. It has been used extensively in movies such as Toy Story and A-Team. However, at the time I developed the software (early 2008) the library was not well documented. PhysX is an early effort at using hardware acceleration for physics calculation. It is currently owned by NVidia after several acquisitions. PhysX runs well on a CUDA-enabled NVidia Graphics Processing Unit (GPU) and could possibly run on other GPUs such as ATI. However, there has recently been an effort by NVidia to lock in its API, and as a result the API has not been well received among developers.

6.1.3 Character Animation Library

A character animation library handles the manipulation of characters in games [94]. There are two common ways to simulate a character: mesh-based and skeleton-based (Figure 6.1).

Figure 6.1: Character animation: (a)mesh-based (b)skeleton-based

Mesh-based character animation stores the meshes in its animated key frames. Some examples of this are the MD2 and MD3 animation formats created by id Software. Skeleton-based character animation maps the character's mesh to a skeleton which consists of joints and bones. The animation keyframes are stored as sequences of bone locations. Some examples of this are Cal3D and BVH (used by most motion capture hardware). The benefit of using skeleton-based character animation is the ability to control the avatar's body joints in real-time.

6.1.4 GUI Library

A GUI library provides an easy API for developing windowing and widgets within the graphics rendering engine [93]. Some of the well known GUI libraries are Crazy Eddie's GUI (CEGUI) [30] and QuickGUI [101].

6.1.5 Input Devices Library

An input devices library handles the interfacing to various human interface devices such as keyboards, mice, joypads, and other game controllers [52]. OIS [139] is a well known, free input library provided by WreckedGames and shipped with Ogre3D. WiiYourself! [98] is an input library for some of the Nintendo Wii devices, including the Wiimote and Balance Board. It is developed as a free native C++ implementation.

6.1.6 Modelling Software

Modelling software is used to produce three-dimensional geometry for everything in the virtual game world, such as weapons, vehicles and characters [52]. Given the wide range of available modelling software, in this project I evaluated only three packages: Autodesk 3DSMax, Blender3D, and MilkShape3D. Currently, Autodesk 3DSMax is the state of the art in modelling software. It has been used extensively in several movies and commercial games [9] [8]. However, it comes at a price tag of US$3,495 for the basic version.

Some of the well-known free alternatives to Autodesk 3DSMax are MilkShape3D and Blender3D. MilkShape3D is offered as shareware that can be used freely for the first 30 days, after which the customer must purchase a licence. Blender was initially developed by Ton Roosendaal within a company called NaN [13]. However, due to disappointing sales the company was closed down, and Blender was resurrected under the Blender Foundation. Currently, Blender3D is the state of the art in free modelling software.

6.2 Components of Choice

Given the broad range of components I required, I evaluated the components mostly based on the available (community) support and their design frameworks. Table 6.4 lists all the selected components. The justifications for the selection are given below.

Table 6.4: Components of choice
Rendering: Ogre3D
Physics: Newton
Character Animation: [Ogre3D Native]
GUI: [Ogre3D Native]
Input Devices: OIS and WiiYourself
Modelling Software: Blender3D, MilkShape3D, CharacterFX and 3DSMax

I selected Ogre3D as the main development platform due to its excellent documentation and design. Almost all the provided application samples and documentation are still applicable to the most recent release. In comparison, I could not find similar documentation for Crystal Space and OSG, and a lot of their sample applications are not well documented. Compared to Irrlicht, due to its strong community support, Ogre3D comes with more tutorials and sample applications throughout. An additional consideration in selecting Ogre3D is its well developed character animation library. Earlier, I developed my prototype using Cal3D (and OpenGL) and found some deficiencies in the modelling software support: much of the export from other modelling software has to be done manually by hand-editing the file. In comparison, Ogre3D has much better import mechanisms from other well known modelling software, such as Blender3D and Autodesk 3DSMax. I used Newton Physics as the main physics engine due to the simplicity of its interface. I do not have to tune many different parameters to achieve realistic looking physical behavior, and my early tests also showed that it performed faster than the ODE library. I selected OIS as the interface for reading inputs from the keyboard and mouse since it is the default input library shipped with Ogre3D. I used Blender3D as the main scene development tool due to its features and cost. There is an excellent set of tools for exporting and importing objects to and from other software, such as Autodesk 3DSMax and the Ogre3D mesh filetype. I also used 3DSMax for importing the previously available object files into Blender3D. This will be described in a later section (Modelling the Virtual Environment).

6.3 Modelling the Avatar

Modelling an avatar from scratch is an enormous undertaking. For this project I modified a ninja avatar model provided by Psionic3D [100]. The original model was created using CharacterFX and MilkShape3D. I modified the original ninja avatar using Blender3D (Figure 6.2). The ninja avatar file comes with a skeleton model and the action keyframes listed in Table 6.5. The reason for selecting such an avatar animation is the natural movement provided. Animating human motion is an arduous task; it involves very time intensive tweaking and refining to make it look natural.

Figure 6.2: The first import of a ninja avatar model

I modified the avatar to become a mine worker avatar model. The modifica- tions include:

• Removed the sword mesh

Table 6.5: Avatar key frame animations (Psionic3D)
1-14: Walk (normal)
15-30: Stealth Walk
32-44: Punch and swipe sword
45-59: Swipe and spin sword
60-68: Overhead twohanded downswipe
69-72: Up to block position (play backwards to lower sword if you want)
73-83: Forward kick
84-93: Pick up from floor (or down to crouch at frame 87)
94-102: Jump
103-111: Jump without height (for programmer controlled jumps)
112-125: High jump to Sword Kill
126-133: Side Kick
134-145: Spinning Sword attack (might wanna speed this up in game)
146-158: Backflip
159-165: Climb wall
166-173: Death 1 - Fall back onto ground
174-182: Death 2 - Fall forward onto ground
184-205: Idle 1 - Breathe heavily
206-250: Idle 2
251-300: Idle 3

• Modified the body mesh to be less wrapped and have working cloth shape

• Created ears (the original ninja had the ears covered up).

• Attached helmet, cap lamp & battery, SCSR and gum boots.

• Created new UV texture

• Adjusted the keyframe to fit with the tracking model (in Chapter 5, sec- tion 5.3.6), i.e. keyframe 0-7 will start when the left foot starts moving forward and ends just before the right foot steps forward. Similar adjust- ment was also done for the next keyframe (keyframe 8-14).

The completed avatar can be seen in Figure 6.3.

Figure 6.3: Completed mine worker avatar model

The completed avatar was then exported using an Ogre3D export script previously installed into Blender3D (Figure 6.4). Some of the parameters considered are listed in Table 6.6.

6.4 Modelling the Virtual Environment

The modelling was mainly done using Blender3D. However, some of the imported models had previously been made using Autodesk 3DSMax.

Figure 6.4: Exporting avatar model into OgreMesh file

Table 6.6: OgreMeshExport script parameters
Action name: Default Action
Keyframe start: 14
Keyframe end: 23
Export Materials: Scene.material
Coloured Ambient: No
Copy Textures: Yes
Rendering Materials: Yes
Game Engine Materials: No
Custom Materials: No
Fix Up Axis to Y: No
Require Materials: Yes
Skeleton name follow mesh: Yes
Apply Modifiers: No
OgreXMLConverter: Yes

For those models that were originally developed in 3DSMax, I exported them first to .OBJ files and then imported them into Blender3D (Figure 6.5). The process of exporting/importing from 3DSMax to Blender3D was not as effortless as expected. Some significant reworking, especially of the objects' textures, needed to be done.

Figure 6.5: Importing 3DSMax file to Blender3D and Ogre3D

There are two export scripts in Blender3D for exporting 3D objects into Ogre3D. The first export script is called "Ogre Scene" export. It exports a scene XML file containing the name of each mesh object, its filename, location, and scale. The second export script is called "Ogre Mesh" export, which was used in the previous section for exporting the avatar model. Although the scripts allow us to automate some of the necessary tasks, some manual work is still needed for the material file. Recall from Figure 6.4 that each export creates a new material file instead of appending to an existing one. Hence, after I export the mesh objects for the environment, I need to edit the generated material file and insert the lines from the avatar's material file.

6.5 Software System Build

6.5.1 Class Design and Implementation

Figure 6.6 illustrates my program's class design. Since the number of classes is quite large, I only illustrate some of the most relevant ones. SimApp and SimFrameListener are the two main classes. SimApp handles all the setup routines for setting up the environment scene, which includes lighting, the camera, 3D objects and collision detection. All the 3D objects are loaded from the scene file, which is configured in the resources.cfg config file.

Figure 6.6: Class diagram of the visualisation module

Reading and parsing of the scene XML file is handled by the DotSceneLoader class, which makes use of the TinyXML library. TinyXML is one of the most widely used XML parsers in graphics rendering engines due to its small size and free distribution licence. After setting up all the scene objects (World, Body, SceneNode, and AnimationState objects), SimApp creates an instance of SimFrameListener and passes it the pointers to all those scene objects. The World and Body classes are part of the OgreNewt library. They are used for automatically controlling the physical behavior of the encapsulated SceneNode object. SimFrameListener handles the continuous update of the frame rendering, including polling all the input devices. The VRPN classes (Tracker Remote and Connection) are used to communicate with the tracking module. OIS is the library used for polling the input from the keyboard and mouse. For communication with the Wiimote joystick, I used the WiiYourself [98] library (wiimote class). The Overlay class is used to create the OmniWalker widget on the top left of the screen (Figure 6.9). The purpose of this widget is to give the user feedback on the location of the feet and the walking gait interpretation, whether it is forward walking, turning, or standing idle. This widget is necessary because it allows the user to understand the system's interpretation of the walking gait in real time. The IWR class refers to the VR920 SDK library provided by Vuzix for reading the head tracking data.

Figure 6.7: The OmniWalker widget consists of seven images: one for the (transparent) overlay and six images for the foot tiles

The OmniWalker widget consists of multiple layers of images laid on a static background image (Figure 6.7). The actual walking platform has a radius of 150cm, with the point of origin on its left hand side holding bar (Figure 6.8, left). On the other hand, the widget's overlay is 170 x 170 pixels in size, with the point of origin at the top left. The mappings between the actual view and the overlay are given by the following mapping functions:

Lx = −Fx + 178 (6.1)

Ly = (−Fy ∗ 1.5) + 85 (6.2)

where Lx and Fx refer to the x-position of the foot overlay tile and the 3D position of the real foot, respectively. Similarly, Ly and Fy refer to their y-position mapping. Both mapping functions are applied in the same way for the right and the left foot.

Figure 6.8: The comparison between the coordinate system given by the tracking module and the coordinate system of the Ogre3D overlay. Violet arrows indicate the forward facing direction.
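A minimal C++ sketch of Equations 6.1 and 6.2 is shown below; the constants are taken directly from the equations, while the structure and function names are illustrative.

// Map a tracked foot position (in the platform's coordinate system)
// to the pixel position of its overlay tile (Equations 6.1 and 6.2).
struct OverlayPos { float x; float y; };

OverlayPos footToOverlay(float footX, float footY)
{
    OverlayPos p;
    p.x = -footX + 178.0f;          // Equation 6.1
    p.y = (-footY * 1.5f) + 85.0f;  // Equation 6.2
    return p;
}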

6.5.2 System Prototype

Figure 6.9 shows my visualisation prototype. The module accepts multiple inputs from the keyboard, mouse, joystick, VR920 HMD and the OmniWalker tracking module. At the top-left of the screen is the widget showing the recognized walking gesture fed from the tracking module. From top to bottom these are idle (red), forward walk (green), and rotation (yellow). The system can also render the view from either the first-person or the third-person perspective (Figure 6.10).

Figure 6.9: Visualisation module implementation with all the available input devices.

Figure 6.10: Screenshot of the user's first-person view and third-person view.

Chapter 7

User Evaluation

7.1 Literature Review

7.1.1 Usability Evaluation of Locomotion Interface

The usability evaluation of most of the novel locomotion interfaces has been quite limited. For some of the locomotion interfaces, such as the GaitMaster [64], the OBDP [56], Powered Shoes [65] and the String Walker [68], the result of formal usability tests were not reported. Most of these devices place greater emphasis on the tracking performance measure only. For the Virtual Perambulator [67], the experimental results presented were the proportion that failed rhythmical walking and turning. Also reported were the proportion that could run spontaneously at the first trial. For the OBDP [56] results presented were limited to the tracking outcome. The very limited usability test results from novel locomotion interfaces may be due to the limited versatility and robustness of these new devices. For the GaitMaster [64] it was reported that the 300ms tracking delay caused the system to be speed limited hence only suitable for a few walking scenarios such as mobility therapy [14]. The OBDP [56] has also stopped short from doing any usability tests due to the high tracking delay. This is possibly due to its gait reasoning process and noise filtering mechanism which produces a system frame rate of 17

The Powered Shoes device also suffered frequent breakdowns due to the lack of durability of its flexible shafts [68].

Figure 7.1: WalkingPad: (a) simulated environment labyrinth (b) experiment result for 2 walking trials (Bouguila et al, 2004)

Some formal usability studies have been completed for other locomotion interfaces that emulate walking motion. Bouguila et al [17] employed five participants to use the WalkingPad to walk through a virtual labyrinth for a few minutes (Figure 7.1). The experiment evaluated the number of collisions with the labyrinth's walls in narrow and wide spaces. Iwata and Matsuda [62] employed three participants to evaluate the Virtual Perambulator on distance estimation tasks (Figure 7.2). They reported better performance for real walking than for flying (moving with a joystick). They also found worse performance with a more complex walking path (the closed square path walk). Shiratori and Hodgins [113] evaluated the performance of their accelerometer-based interface using four test tracks selected at random (Figure 7.3). The test was performed with 15 participants, measuring three parameters; straight track completion, test track completion and immersion feel. Shiratori and Hodgins found that, compared to the joystick, participants who used the Wiimote accelerometer-based locomotion interface had significantly fewer failures on the straight track completion and test track completion. There were also significantly better scores on the immersive feeling questionnaire when the participants used the accelerometer-based locomotion interface.

Figure 7.2: Virtual Perambulator: (a) Straight path walk (b) Closed square path walk (Iwata and Matsuda, 1992)

Figure 7.3: Accelerometer-based locomotion interface test tracks (Shiratori and Hodgins, 2008)

Some usability research has been performed on treadmill based locomotion interfaces. Mohler et al [87] investigated the influence of visual motion on speed estimation using a treadmill. They found a tendency for users to underestimate speed based on fast visual motion. Similar studies on the influence of visual motion cues on distance estimation have been done by Pelah et al [99], Mohler et al [88], Durgin et al [43] and Banton et al [10]. Yano et al [141] investigated the use of a linear treadmill in a shared walk environment. The system was aimed at assisting rehabilitation patients in a group therapy setting. Souman et al [114] developed and evaluated a treadmill control algorithm for changing the velocity of the treadmill in response to a change in walking speed. The algorithm aimed to improve the speed control of the treadmill during sudden stopping so that the stopping was less obtrusive. Lichtenstein et al [76] developed a new tracking mechanism for controlling the speed of a linear treadmill and compared it to the self-propelled mode where the user can change the walking speed by pushing the tread with their feet. Iwata [66] performed distance estimation tests on the Torus treadmill, one of the omni-directional locomotion interfaces. He employed 18 participants divided into three test groups to use the three locomotion modes; travelling by walking, travelling using a motion base and travelling by joystick. He found the mean error distance for the group using the Torus treadmill was lower than the mean error distance for the other two groups. Using analysis of variance (ANOVA) with an alpha value of 0.05, the mean error distance showed a significant difference between the Torus treadmill and the other two groups (Figure 7.4).

Figure 7.4: Mean error distances for the three locomotion modes (Iwata, 1999)

7.1.2 Distance Perception in Virtual Reality

Based on the frame of reference used, there are generally two categories of visual perception of distance; egocentric and exocentric distance perception [124]. Egocentric distance refers to the distance from the observer to a particular location. Exocentric distance refers to the distance between two remote points as viewed by the observer. Studies on distance perception in the virtual environment have been surveyed extensively by Ziemer et al [146]. Several studies on distance estimation in HMD-based virtual environments have found that people tend to underestimate absolute distances to a target. Initial investigations by Lampton et al [74] suggested that users are less accurate in estimating distances in virtual environments compared to the real world. Witmer and Kline [137] conducted experiments to assess the contributions of various distance cues in virtual environments. They found that participants tend to underestimate distances in both environments, but more significant underestimation occurred in the virtual environment setting. They also found that different texture densities on the wall and floor had no impact on the participants' performance. However, traversing the actual distance to be estimated did improve the participants' ability to estimate that distance. Similar results have also been found by Knapp and Loomis [72], Durgin et al [43], Loomis and Knapp [78], Thompson et al [124], Sahm et al [106], Willemsen et al [134], Willemsen et al [135], Ziemer et al [146] and Willemsen et al [133]. Thompson et al [124] investigated the degree to which image quality affects egocentric distance judgments in virtual environments. Their testing was performed using three different methods of graphical rendering; 360 degree panoramic images, low-quality textured environments and wireframe rendering (Figure 7.5). Compared to the participants' ability to estimate distances in the real world, all three viewing methods resulted in significant distance compression. Thompson et al found the amount of compression using each of the three methods was very similar (Figure 7.5). Knapp and Loomis [73] examined the relationship between distance underestimation and the limited field of view available when using an HMD.

Figure 7.5: Distance judgment comparisons from different graphical rendering methods (Thompson et al, 2004)

They found no significant impact of the HMD's limited field of view on the distance estimation task. Willemsen et al [134] investigated whether the mechanical aspects of HMDs such as mass and moments of inertia are responsible for the distance compression effects. They found that the mechanical hardware of an HMD does account for some of the distance compression, but not for the larger part of the compression that occurs between the virtual and real worlds (Figure 7.6). Ziemer et al [146] examined whether the order in which people experienced real and virtual environments through HMDs influenced their distance estimates. They found that when estimates were made in the real environment first they were significantly more accurate than when they were made in the virtual environment first. Ziemer et al also conducted their experiments on a Large Screen Immersive Display (LSID) and found that distance compression occurred not only with HMDs but also with the LSID. In one of the trials the error rates were 55% and 74% for virtual-then-real and real-then-virtual, respectively.

Figure 7.6: Judged distance in direct walking: comparison between a real HMD in a virtual world and a mock HMD in the real world (Willemsen et al, 2004)

Similar results for this type of two-way mixed trial were found by Interrante et al [59]. Some researchers have investigated the effect of the presence of an avatar on distance judgments in virtual environments. Mohler et al [86] found significant differences in distance judgments between those with an avatar and those without an avatar (Figure 7.7). Ries et al [105] investigated the effectiveness of avatar self-embodiment. They assessed participants' abilities to estimate egocentric distances under four different conditions; no avatar, a fully tracked high fidelity avatar, a dot point avatar and a stiff avatar. Of the four conditions they found that only the full-bodied high fidelity avatar led to a significant improvement in distance judgment over the baseline no-avatar condition (Figure 7.8).

7.1.3 Measuring Presence in Virtual Reality

Bystrom et al [24] introduced a virtual environment interaction model with immersion, presence, and performance as its key factors.

Figure 7.7: Distance judgment results comparing the use of an avatar with no avatar (Mohler et al, 2008)

The notion of presence was introduced in 1980 by Minsky [85], capturing the idea of "being there" in teleoperation and virtual reality endeavours. The term "presence" has been quite widely defined. Loomis [77] defined presence as the impression of being in the remote or simulated environment. Zahorik and Jenison [143] defined presence in terms of how well one can support an action in the environment, be it virtual or real. They argued that reality is grounded in action rather than just in the appearance of how things look and sound. A more thorough discussion of the various definitions of presence has been covered by Draper et al [42]. There have been two different schools of thought on how one could measure presence. Witmer and Singer [138] developed a presence questionnaire (PQ) consisting of 32 questions, each on a 1 to 7 scale. Each question is linked to a factor category such as control, sensory, auditory, haptic and interface quality. Usoh et al [129] developed a different questionnaire called the Slater, Usoh, and Steed (SUS) questionnaire, with a focus on the sense of being in the virtual environment, the extent to which the virtual environment becomes the dominant reality and the extent to which the virtual environment is remembered as a "place". The SUS questionnaire puts emphasis on measuring the extent to which the user could not discriminate between the virtual and the real environments.

Figure 7.8: Comparison of the difference in participants' average relative error in the virtual and real world (Ries et al, 2009)

Witmer and Singer's questionnaire has found greater use in research than the SUS questionnaire [28]. This may be because Witmer and Singer's questionnaire is more extensive and considers a broader definition of presence, allowing a more detailed analysis.

7.2 Usability Test

In this study I wanted to compare the user's performance using a joystick with their performance using the OmniWalker. I carried out the experiments under two different operating modes; forward walking mode and free exploration mode.

7.2.1 Distance Estimation in Forward Walking (Experiment 1)

Method

In this experiment I compared user performance using a joystick with performance using the OmniWalker when estimating the distance of a forward walk. I built a synthetic environment consisting of a textured floor and a single 1m high by 2m wide, thin, red wall to indicate the destination point. There were 11 runs, with the distance to the red wall being different in each run. The distances used were 2, 6, 10, 14, 18, 22, 26, 30, 34, 38 and 42 metres, presented in a random order. Figure 7.9 shows the first person view for a person with an eye height of 170cm. Figure 7.10 shows the third person design view of the environment used.
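A minimal sketch (not the code used in the study) of how such a randomised presentation order could be produced:

#include <algorithm>
#include <random>
#include <vector>

// Returns the 11 destination distances (metres) in a shuffled order,
// one permutation per participant.
std::vector<int> randomisedDistances()
{
    std::vector<int> d = {2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42};
    std::mt19937 rng(std::random_device{}());
    std::shuffle(d.begin(), d.end(), rng);
    return d;
}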

Figure 7.9: Screenshots of the user’s first person view in the first experiment: (a) 6m (b) 18m (c) 30m (d) 42m.

Participants

For this experiment I recruited seven participants; two female and five male. The participants ranged in age from 20 to 25. Each participant was compensated for their time, which was approximately 1.5 hours.

Figure 7.10: Third person view of the walking track (18m distance)

Apparatus

In addition to the OmniWalker system, I used the Wiimote's Nunchuk joystick for performance comparison. The Wiimote was selected because the Nunchuk is the simplest controller, with no extra buttons or control sticks. The Wiimote was connected using the WiiYourself library [98].

Procedure

Each participant began by reading the aim of the experiment and consenting to the experimental conditions. Next, I measured the eye height of each participant and used it to scale the avatar such that the avatar's eye height was exactly the same as the participant's. Each participant was informed of this. The experiment started by measuring the participant's performance using the Nunchuk joystick. After briefing the participant on how to operate the joystick, the experimenter loaded the scene and presented the destination distances in random order. The participants were not told what the possible distances were. For each of the 11 distances, the participant could move toward, but not away from, the destination point as many times as they wished until they estimated the distance.

Next, the participant donned a harness and attached the tracking markers. The participant was briefed on how to operate the OmniWalker and then repeated the distance estimation tasks. Figure 7.11 shows a usability test participant using the OmniWalker and using the Nunchuk joystick.

Figure 7.11: Usability test participant: (a) using the OmniWalker (b) using the Nunchuk joystick

Results

Figure 7.12 shows the means of the distance estimates. The mean absolute errors are the vertical differences between the physical distance (baseline) and the estimated distance. Figure 7.13 shows the relationship between physical distance and relative error. The data collected can be found in Appendix B and the results are discussed in Section 7.3.
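For clarity, the two error measures can be computed as below. This is only a sketch of the definitions used in the figures, not the analysis code itself; the sign convention (estimated minus physical, so that negative values indicate underestimation) is taken from the caption of Figure 7.13.

// Absolute error: estimated distance minus the physical distance walked,
// expressed in metres.
double absoluteError(double estimated, double physical)
{
    return estimated - physical;
}

// Relative error: the absolute error as a fraction of the physical
// distance. A negative value means the distance was underestimated.
double relativeError(double estimated, double physical)
{
    return (estimated - physical) / physical;
}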

Figure 7.12: Distance estimation absolute error results from Experiment 1. The error bars represent standard error of the mean of the estimated distances.

7.2.2 Assessing Presence and Distance Estimation in Free Exploration Mode (Experiment 2)

Method

In the second experiment I compared not only user performance but also the user's perception of using a joystick and of using the OmniWalker to navigate the virtual environment. In this experiment the user was not restricted to forward walking but was also encouraged to explore the environment and locate objects by themselves. Two types of virtual environment were used in the study; a) a synthetic environment consisting mainly of virtual boxes without any hint of object sizes, and b) an underground coal mine environment which contained some common objects, such as a transport vehicle and the surfaces of the underground coal mine opening. In the first scene (Figure 7.14), the environment consisted of five plain-coloured blocks, three of which had the same colour and dimensions. I put a texture on the floor to provide movement hints to the user.

Figure 7.13: Distance estimation relative error results from Experiment 1. The error bars represent the standard error of the mean of the relative errors. A negative value means the user underestimated the actual distance walked.

In the second scene (Figure 7.15), the environment consisted of a high-fidelity model of underground coal mine tunnels which was imported from a previously developed Hazard Awareness Training Module [116]. Inside the virtual mine environment I also placed a personnel transport vehicle and a lifeline (a marked cord attached to the side of the opening to indicate the direction of the exit). The design of the roof bolts and wire mesh is in accordance with common coal mining industry practices.

Participants

For this study I recruited ten participants, three female and seven male. The participants ranged in age from 22 to 30. Each participant was compensated for their time which was 1.5 to 2 hours.

Apparatus

The apparatus was the same as for the previous experiment. The only difference was that in this experiment the joystick allowed the user to rotate.

Figure 7.14: Plain scene: (left) top view, (right) perspective view

Procedure

As in the first experiment, the participant began by reading the consent form and having their eye height measured. The experimenter then entered the user's eye height into the simulation's configuration. The experiment began with the joystick in the plain scene (Figure 7.14). The experimenter asked the participant to perform some distance estimation tasks after they had spent about 2 minutes exploring the environment. During the 2 minutes the participant was encouraged to explore the virtual world and get used to the joystick control. The participant was asked to estimate the width of the red block (2m) and the distance between the blue block and the green block (17m). I then loaded the second scene, which consisted of the underground mine (Figure 7.15), and the participant repeated the initial exploration task, again using only the joystick. This time the participant was asked to estimate the width of the roadway (5m) and the distance from one intersection to the next (31m). The participant was free to walk around a number of times to get the best estimate. Once these tasks were completed, the participant filled out a presence questionnaire (Table D.1 in Appendix D). The participants then performed the same tasks again using the OmniWalker and completed another presence questionnaire after completing the tasks in both of the scenes.

Figure 7.15: Underground coal mine scene: (a) top view, (b) first-person view showing the personnel transport vehicle and lifeline

The questionnaire I developed (see Table D.1 in Appendix D) is a subset of Witmer and Singer's Presence Questionnaire [138]. I took out some questions that are specific to sound, display and moving objects, which are not relevant since the simulation does not incorporate those elements.

Results

Figure 7.16 shows the relative error of the participants' distance estimation results in the different environments. The data can be found in Appendix C. Figure 7.17 shows the results of the presence questionnaire for the joystick and the OmniWalker. The detailed results can be found in Appendix E.
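As an illustration of how the per-question means and 95% confidence intervals shown in Figure 7.17 can be derived from the raw scores in Appendix E, the following is a minimal sketch (not the analysis code used in the study); the example data are the question 1 joystick scores from Table E.1, and 2.262 is the two-tailed critical t value for 9 degrees of freedom.

#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    // Question 1 ("How much were you able to control events?"),
    // joystick condition, participants A..J (Table E.1).
    std::vector<double> scores = {4, 6, 7, 6, 6, 5, 5, 6, 2, 4};

    const double n = static_cast<double>(scores.size());
    double mean = 0.0;
    for (double s : scores) mean += s;
    mean /= n;

    double var = 0.0;
    for (double s : scores) var += (s - mean) * (s - mean);
    var /= (n - 1.0);                       // sample variance

    const double se = std::sqrt(var / n);   // standard error of the mean
    const double t  = 2.262;                // t(0.975, df = 9)
    std::printf("mean = %.2f, 95%% CI = [%.2f, %.2f]\n",
                mean, mean - t * se, mean + t * se);
}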

7.3 Discussion

As found in previous research [54] [78] [105], people typically tend to underestimate distances in first person views. My results support this: all participants tended to underestimate distances (Figure 7.12, Figure 7.13 and Figure 7.16). In the first experiment, there is a substantial benefit in using the OmniWalker compared to the joystick.

Figure 7.16: Distance estimation relative error results (PS: plain scene, CM: coal mine, Red Width: width of red block, B&Y dist: distance between blue and yellow blocks)

The errors in estimating distances were much larger with a joystick: users underestimated distances by 50-60% with the joystick compared with 0-10% with the OmniWalker (Figure 7.12). However, in the second experiment only the longest distance measurement (31m) follows the same pattern as found in the first experiment. There is a statistically significant difference in the errors when estimating the longer walk covering the 31 metre distance (t=-4.58; n=10; p=0.0013). In the other three measurements, the users tended to estimate by looking at the object relative to other objects' dimensions. For example, in the first scene, some users walked to the red block, which has a height of 1 metre, then moved a distance away from the object and estimated its width based on that 1 metre. This process was also used by participants for the longer distance (17m) estimation. Similarly, when the users were asked to estimate the width of the roadway in the coal mine scene, they tended to estimate based on what they saw instead of walking from one side to the other. In the presence questionnaire, I found no significant difference between using a joystick and the OmniWalker. Referring to Figure 7.17, question 1 (How much were you able to control events?) is the only item that reveals a significant difference, with joystick users saying they had more control.
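The reported statistic for the 31m estimates can be reproduced from the raw data in Appendix C (Table C.1, column D4 of each condition). The following is a minimal sketch of the paired t-test, not the analysis tool actually used in the study; it prints t = -4.58 with 9 degrees of freedom, matching the value quoted above.

#include <cmath>
#include <cstdio>

int main()
{
    // 31 m estimates (metres) for participants A..J, from Table C.1.
    const double joystick[10]   = {9, 20, 10, 13, 15, 20, 10, 15, 7, 30};
    const double omniwalker[10] = {20, 20, 30, 20, 40, 29, 30, 30, 28, 30};
    const int n = 10;

    // Paired differences and their mean.
    double d[10], meanD = 0.0;
    for (int i = 0; i < n; ++i) { d[i] = joystick[i] - omniwalker[i]; meanD += d[i]; }
    meanD /= n;

    // Sample standard deviation of the differences.
    double ss = 0.0;
    for (int i = 0; i < n; ++i) ss += (d[i] - meanD) * (d[i] - meanD);
    const double sd = std::sqrt(ss / (n - 1));

    // Paired t statistic with n - 1 = 9 degrees of freedom.
    const double t = meanD / (sd / std::sqrt(static_cast<double>(n)));
    std::printf("t = %.2f (df = %d)\n", t, n - 1);
}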

Figure 7.17: Presence questionnaire results showing 95% confidence intervals for the means.

Anecdotally, this is because the joystick allowed the users to walk very fast when they wished to. The users' responses to the other presence questions were varied, so I was not able to draw a conclusion as to which option produced a better feeling of immersiveness and presence in the virtual environment. During the interviews, the feedback from the users was very supportive of the direction taken by the project, especially the fact that the users can actually walk instead of just sitting idle. However, some users felt that their experience could have been much better had the tracking and control mechanism been improved in terms of delay and accuracy.

Chapter 8

Conclusion and Future Work

8.1 Conclusion

This thesis has presented the development of the OmniWalker locomotion interface. It is composed of three main components; mechanical hardware, tracking and visualisation software. I have designed and tested many different alternatives; some failed but some worked quite well. All these alternatives are documented in this thesis to serve as a reference for future research. The main contributions and conclusions from this work are summarised as follows:

1. Mechanical Hardware:

• I have constructed and tested ball transfer based walking platforms in different shapes and sizes. Initially, I used hemispherical dishes with different degrees of curvature. My tests indicated that the inclination angle towards the centre of the dish was insufficient for a user to walk comfortably. I improved the design to use a flat inclination angle and a conical dish with a polygonal cross-section, which has worked well.

• I have also tested combinations of "roller skates" on hemispherical dish bases. For these trials, I constructed two different kinds of roller skates, one using custom made ball transfer units and the other using commercially available ball transfer units. Both have their own unique advantages; however, this concept proved unacceptable due to its tendency to cause the user to fall.

• I have also designed and constructed the support and holding platforms to ensure the safety of the users using the OmniWalker. These pieces of mechanical hardware were designed from the ground up, taking into consideration the need for the devices to be usable by people with different heights and weights.

2. Tracking Module:

• I have presented my early work on a markerless tracking mechanism for the OmniWalker. My early testing was very promising, but current "commodity" computer hardware lacked the necessary performance, leaving the concept as one for the future.

• I have built a new tracking mechanism employing inexpensive markers and webcams. Human motion capture systems are relatively expensive given their customised hardware and software. In this thesis project, I have presented the design of my tracking system, which is capable of real-time frame rates using only relatively inexpensive commodity computer hardware.

• The method for the OmniWalker's gait recognition has also been presented. Walking gait on the OmniWalker cannot be mapped directly to the virtual avatar due to the sliding aspect of the walking motion. I have developed and implemented an algorithm for gait recognition. The algorithm performs in real time with a very small delay between gait cycles.

3. Visualisation Module:

• I have presented the design of a visualisation module that integrates well with the OmniWalker tracking. The software module is built on top of free and open source libraries to allow inexpensive, flexible and rapid system development.

• I have also developed some interface widgets for enhancing the user’s experience while using the OmniWalker. These widgets give the user real-time feedback regarding the system’s internal state. This allows the user to learn the system quickly without formal training.

4. Usability Testing:

• I have performed usability studies on distance estimation in two different exploration modes; forward walking and free exploration. I presented the results comparing the use of a joystick with the use of the OmniWalker. In the first experiment the errors in estimating distances were much larger with a joystick. Although the relative errors were fairly constant over different distances, there is still a significant gap (about 20%) between the joystick errors and the OmniWalker errors. In the second experiment the differences in the results were not as significant due to the users' ability to freely explore the environment and gain hints of object sizes. However, I found that over long distances the users made significantly smaller errors when using the OmniWalker.

• I have also presented a study comparing the users' feeling of presence when using a joystick and when using the OmniWalker. The Wiimote joystick was used in this study because it is very popular within the gaming industry for its simplicity and ease of use. Based on the overall user responses, the OmniWalker was rated as having similar ease of use and sense of presence to the joystick.

The contributions listed above have addressed the original goal of this thesis project, that is, to develop a locomotion system that is relatively inexpensive and robust for regular usage yet still capable of delivering superior immersiveness to the user. However, some aspects of these goals are yet to be achieved through the further work listed in the next section.

8.2 Future Work

Several research areas continuing the work presented in this thesis project are possible.

1. Walking Platform:

• Building new hardware is a very time consuming task and can have a significant cost. Given the limited time and budget for this thesis project, there were only a few different hardware designs I could construct and test. The current walking platform could be improved by using different materials that have better elasticity to allow for a better walking motion.

• The walking platform should be made modular for easy portability. The current setup uses heavy materials that are welded to each other.

2. Tracking Module:

• The tracking system can be enhanced by using more cameras to prevent occlusion. With the current setup there are some "blind spots" where the camera can lose track of the markers, especially those attached to the hands. This enhancement will allow more robust tracking of whole human body motion.

• The gait recognition module can be enhanced to recognise other walking gaits such as backward stepping. The current system does not recognise backward stepping due to the occlusion of the foot marker. Enhancement of this aspect will require more cameras, which in turn will require greater tracking space.

• Markerless tracking is the "way of the future". The current limitation of the markerless tracking system is the performance of the hardware. This could be improved in different ways; for example with computer clustering and graphics processing unit (GPU) based computer vision libraries such as OpenVIDIA.

3. Visualisation Module:

• One of the initial goals of the system was to support telepresence, where remotely located users can use the OmniWalker platform and be immersed together in a single shared virtual environment. This goal has been kept in mind throughout the development cycle; however, due to time constraints it has yet to be implemented.

• The animation of the avatar can be improved by incorporating the user's actual walking motion. One possible approach is to pre-build the avatar walking animation from the user's actual walk, captured before the session. In a multi-user environment this would allow other users to see an avatar that walks like the real person does in real life.

• Using a Large Screen Immersive Display (LSID) could also give a better immersive feeling for the user. The current setup uses an HMD with a limited viewing angle, hence to the user it feels like "looking through binoculars" rather than observing a wide, real-life view.

4. Usability Test

• When using a joystick to move around the virtual environment, people tend to move more quickly than their usual walking speed in real life. This means less attention is given to surrounding details, such as potential trip hazards on the floor. Further usability tests using a high fidelity environment with many real-life objects could provide further understanding in this area.

• Some mining safety training simulations expose the trainees to real hazards such as smoke and heat. This training gives first-hand experience to the trainee as to how adverse events could actually happen. However, this type of training is normally limited to a given space which allows trainees to walk less than 50m. In this case, the OmniWalker can enhance the experience by allowing the trainee to walk a further distance in a strictly controlled environment. Since the trainees are individually monitored within a constrained space, different variables such as smoke and heat can be delivered in a controlled way, preventing potentially hazardous situations such as overheating or smoke build-up.

References

[1] 3rdTech, “Hiball-3100 wide-area, high-precision tracker and 3d digitizer,” 2010, http://www.3rdtech.com/HiBall.htm.

[2] F. Abi-Chahla, “OpenGL 3 and DirectX 11: The War is Over,” 2008, http://www.tomshardware.com/reviews/opengl-directx,2019.html.

[3] A. Agarwal and B. Triggs, “Learning to track 3D human motion from sil- houettes,” in Proceedings of the 21 st International Conference on Machine Learning, 2004.

[4] J. Hollerbach, Y. Xu, R. Christensen, and S. Jacobson, "Design specifications for the second generation Sarcos Treadport locomotion interface," in Haptics Symposium, Proc. ASME Dynamic Systems and Control Division, vol. 69, no. 2, 2000, pp. 1293–1298.

[5] Animazoo, “Gypsy 7 brochure,” 2010, http://www.animazoo.com/media/ pdf/Animazoo GYPSY7 Brochure.pdf.

[6] ——, “Igs-190 brochure,” 2010, http://www.animazoo.com/media/pdf/ Animazoo IGS190 Brochure.pdf.

[7] D. Anker, 2008, http://tech.groups.yahoo.com/group/OpenCV/files/ dscam.zip.

[8] Autodesk, “Autodesk 3ds max - customer showcase,” http://usa.autodesk. com/adsk/servlet/pc/index?siteID=123112&id=14838218.


[9] ——, “Autodesk 3ds max featured customer story,” http://www.resources. autodesk.com/med/Autodesk 3ds Max/Customer Stories.

[10] T. Banton, J. Stefanucci, F. H. Durgin, A. Fass, and D. Proffitt, “Perception of walking speed in virtual environments.” in Presence: Teleoperators and Virtual Environments, vol. 14, no. 4, 2005.

[11] barryoneill, “Virtusphere #fail,” http://www.youtube.com/user/ barryoneillpers/.

[12] Blackwood, Blackwoods Catalog 2007 release. J Blackwood & Son Limited, 2007.

[13] Blender, “History,” http://www.blender.org/blenderorg/ blender-foundation/history/.

[14] R. F. Boian, M. Bouzit, G. C. Burdea, and J. E. Deutsch, “Dual Stewart platform mobility simulator,” in Proceedings of the 26th Annual Interna- tional Conference of the IEEE EMBS, 2004, pp. 4848–4851.

[15] J. Y. Bouguet, "Camera calibration toolbox for matlab," 2008, http://www.vision.caltech.edu/bouguetj/calib_doc/.

[16] L. Bouguila, M. Ishii, and M. Sato, “Realizing a new step-in-place loco- motion interface for virtual environment with large display system,” in Proceedings of the workshop on Virtual Environments, 2002, pp. 197–207.

[17] L. Bouguila, F. Evequoz, M. Courant, and B. Hirsbrunner, “Walking-pad: a step-in-place locomotion interface for virtual environments,” in Proceedings of the 6th international conference on Multimodal interfaces. ACM, 2004, pp. 77–81.

[18] R. Boulic, J. Varona, L. Unzueta, M. Peinado, A. Suescun, and F. Perales, "Real-time IK body movement recovery from partial vision input," in Proc. of the 2nd International Conference on Enactive Interfaces, 2005.

[19] D. A. Bowman, E. Kruijff, J. J. LaViola, and I. Poupyrev, 3D User Inter- faces: Theory and Practice. Addison-Wesley, 2005.

[20] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library. O’Reilly Media, Inc., 2008.

[21] C. Bregler and J. Malik, “Tracking people with twists and exponential maps,” 1998.

[22] F. P. Brooks, “Walkthrough – a dynamic graphics system for simulating virtual buildings,” in Proceedings of the 1986 Workshop on Interactive 3D Graphics. ACM, 1987, pp. 9–21.

[23] G. C. Burdea and P. Coiffet, Virtual Reality Technology. John Wiley & Sons, Inc., 2003.

[24] K. E. Bystrom, W. Barfield, and C. Hendrix, “A conceptual model of the sense of presence in virtual environments,” in Presence: Teleoperators and Virtual Environments, vol. 8, no. 2, 1999.

[25] Y. Cheng, “Mean shift, mode seeking, and clustering,” in IEEE Trans. Pattern Analysis Machine Intelligence, vol. 17, 1995, pp. 790–799.

[26] R. Christensen, J. M. Hollerbach, Y. Xu, and S. G. Meek, “Inertial force feedback for a locomotion interface,” in Proceedings ASME Dynamic Sys- tems and Control Division, vol. 64, 1998, pp. 119–126.

[27] G. Cirio, M. Marchal, T. Regio-Corte, and A. Lecuyer, “The magic barrier tape: a novel metaphor for infinite navigation in virtual worlds with a restricted walking workspace,” in Proceedings on Virtual Reality Software and Technology, 2009, pp. 155–162.

[28] CiteSeerX, "Citations on 'Measuring Presence in Virtual Environments: A Presence Questionnaire' article, taken on 19 July 2010," http://citeseerx.ist.psu.edu/showciting?doi=10.1.1.33.7626.

[29] D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of non-rigid objects using mean shift,” in Computer Vision and Pattern Recognition, 2000. Proceedings on IEEE Conference, vol. 2, 2000, pp. 142–149.

[30] Crazy Eddie GUI, “Crazy Eddie GUI API,” http://sourceforge.net/ projects/crayzedsgui/.

[31] Crystal Space, “Features - Crystal Space 3D,” http://www.crystalspace3d. org/main/Features.

[32] ——, “Introduction to Crystal Entity Layer,” http://www.crystalspace3d. org/main/CEL.

[33] R. Darken, W. Cockayne, and D. Carmein, “The omni-directional treadmill: A locomotion device for virtual worlds,” in Proceedings of UIST’97, 1997, pp. 213–221.

[34] G. DeHaan, E. J. Griffith, and F. H. Post, “Using the Wii balance board as a low-cost VR interaction device,” in Proceedings of the 2008 ACM Sym- posium on Virtual Reality Software and Technology, 2008, pp. 289–290.

[35] B. Denby, D. Schofield, D. J. McClarnon, M. W. Williams, and T. Walsha, “Hazard awareness training for mining situations using virtual reality,” in Proceedings 27th APCOM International Symposium, London, UK 19-23 Apr. 1998, 1998.

[36] J. Deutscher, A. Blake, and I. Reid, “Articulated body motion capture by annealed particle filtering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 2000, pp. 126–133.

[37] DevMaster.Net, “3D engines listing,” http://www.devmaster.net/engines/ list.php.

[38] H. Director, "OHS 711 - non ionising radiation," http://www.hr.unsw.edu.au/ohswc/ohs/pdf/Non ionising radiation procedure.pdf.

[39] S. L. Dockstader, N. S. Imennov, and A. M. Tekalp, “Markov-based failure prediction for human motion analysis,” in Proceedings of the Ninth IEEE International Conference on Computer Vision, 2003, p. 1283.

[40] S. L. Dockstader and A. M. Tekalp, “Multiple camera tracking of interacting and occluded human motion,” in Proceedings of the IEEE, 2001, pp. 1441– 1455.

[41] G. Domik, C. J. C. Schauble, L. D. Fosdick, and E. R. Jessup, “Tuto- rial: Color in scientific visualization,” 1999, https://csel.cs.colorado.edu/ ∼csci4576/SciVis/SciVisColor.html.

[42] J. V. Draper, D. B. Kaber, and J. M. Usher, “Telepresence,” Human Fac- tors, vol. 40, no. 3, pp. 354–375, 1998.

[43] F. H. Durgin, K. Gigone, and R. Scott, “Perception of visual speed while moving,” Journal of Experimental Psychology: Human Perception and Per- formance, pp. 339–353, 2005.

[44] D. Engel, C. Curio, L. Tcheang, B. Mohler, and H. Bulthoff, “A psychophys- ically calibrated controller for navigating through large environments in a limited free-walking space,” in Proceedings of the 2008 ACM Symposium on Virtual Reality Software and Technology, 2008, pp. 157–164.

[45] K. J. Fernandes, V. Raja, and J. Eyre, “Cybersphere: The fully immersive spherical projection system,” Communications of the ACM, vol. 46, 2003.

[46] T. Field and P. Vamplew, “Generalised algorithms for redirected walking in virtual environments,” in International Conference on Artificial Intelligence in Science and Technology, 2004.

[47] W. Frey, M. Zyda, R. McGhee, and W. Cockayne, Off-The-Shelf, Real-Time, Human Body Motion Capture for Synthetic Environments. Technical Report NPSCS-96-003, Computer Science Department, Naval Postgraduate School, Monterey, California, 1996.

[48] K. Fukunaga and L. Hostetler, “The estimation of the gradient of a den- sity function, with applications in pattern recognition,” in IEEE Trans. Information Theory, vol. 21, 1975, pp. 32–40.

[49] D. M. Gavrila, “The visual analysis of human movement: A survey,” in Computer Vision and Image Understanding. Academic Press, 1999, pp. 82–98.

[50] D. M. Gavrila and L. S. Davis, “3-D model-based tracking of humans in action: a multi-view approach,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1996, pp. 73–80.

[51] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Prentice-Hall, 2002.

[52] J. Gregory, J. Lander, and M. Whiting, Game Engine Architecture.AK Peters, 2009.

[53] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.

[54] D. Henry and T. Furness, “Spatial perception in virtual environments: eval- uating an architectural application,” in Proceedings of the Virtual Reality Annual International Sysmposium, 1993, pp. 33–40.

[55] J. M. Hollerbach, “Locomotion interfaces,” in Handbook of Virtual Envi- ronments: Design, Implementation, and Applications, 2002, pp. 239–254.

[56] J.-Y. Huang, “An omnidirectional stroll-based virtual reality interface and its application on overhead crane training,” in IEEE Tran. Multimedia, vol. 5, no. 1, 2003, pp. 39–51.

[57] S. C. E. Inc., “PlayStation3 Worldwide Hardware Unit Sales,” http://www. scei.co.jp/corporate/data/bizdataps3 sale e.html.

[58] Infiscape, "VR Juggler," http://www.vrjuggler.org/.

[59] V. Interrante, L. Anderson, and B. Ries, “Distance perception in immersive virtual environments, revisited,” in Proceedings of IEEE Virtual Reality 2006, 2006, pp. 3–10.

[60] V. Interrante, B. Ries, and L. Anderson, “Seven league boots: A new metaphor for augmented locomotion through moderately large scale im- mersive virtual environments,” in Proceedings of IEEE Symposium on 3D User Interfaces, 2007, pp. 167–170.

[61] Irrlicht, "Features," http://irrlicht.sourceforge.net/features.html.

[62] H. Iwata and K. Matsuda, “Haptic walkthrough simulator: Its design and application to studies on cognitive map,” in Proceedings of ICAT 1992 Conference, 1992, pp. 185–192.

[63] H. Iwata, H. Yano, H. Fukushima, and H. Noma, “Circula floor,” in IEEE Computer Graphics and Applications, vol. 25, no. 1, 2005, pp. 64–67.

[64] H. Iwata, H. Yano, and F. Nakaizumi, “Gait Master: A versatile locomotion interface for uneven virtual terrain,” in Proceedings of IEEE Virtual Reality 2001 Conference, 2001, pp. 131–137.

[65] H. Iwata, H. Yano, and H. Tomioka, “Powered shoes,” in SIGGRAPH 2006 Conference DVD, 2006.

[66] H. Iwata, “The Torus Treadmill: Realizing locomotion in ves,” in IEEE Computer Graphics and Applications, vol. 9(6), 1999, pp. 30–35.

[67] H. Iwata and T. Fujii, “Virtual perambulator: A novel interface device for locomotion in virtual environment,” in Proceedings of IEEE 1996 Virtual Reality Annual International Symposium, 1996.

[68] H. Iwata, H. Yano, and M. Tomiyoshi, "String walker," in International Conference on Computer Graphics and Interactive Techniques, no. 20, 2007.

[69] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Transaction of the ASME—Journal of Basic Engineering, pp. 35–45, 1960.

[70] M. S. Kizil, “Virtual reality applications in the australian minerals indus- try,” in International Symposium on Computer Applications in the Minerals Industries Held under auspices of the South African Institute of Mining and Metallurgy, 2003.

[71] M. S. Kizil, A. Kerridge, and M. Hancock, “Use of virtual reality in mining education and training,” in Proceedings of the 2004 CRCMining Research and Effective Technology Transfer Conference, 2004, pp. 15–16.

[72] J. M. Knapp and J. M. Loomis, “The visual perception of egocentric dis- tance in virtual environments,” Ph.D Thesis, University of California at Santa Barbara, 1999.

[73] ——, “Limited field of view of head-mounted displays is not the cause of distance underestimation in virtual environments,” Presence, vol. 13, no. 5, pp. 572–577, 2004.

[74] D. R. Lampton, M. J. Singer, D. McDonald, and J. P. Bliss, “Distance estimation in virtual environments,” in Proceedings of the Human Factors and Ergonomics Society 39th Annual Meeting, 1995, pp. 1268–1272.

[75] J. J. LaViola, D. A. Feliz, D. F. Keefe, and R. C. Zeleznik, “Hands-free multi-scale navigation in virtual environments,” in Proceedings of the ACM symposium on Interactive 3D graphics, 2001, pp. 9–15.

[76] L. Lichtenstein, J. Barabas, R. L. Woods, and E. Peli, “A feedback- controlled interface for treadmill locomotion in virtual environments,” in ACM Transactions on Applied Perception, 2007, pp. 1–17.

[77] J. M. Loomis, "Distal attribution and presence," Presence: Teleoperators and Virtual Environments, vol. 1, pp. 113–119, 1992.

[78] J. M. Loomis and J. M. Knapp, “Visual perception of egocentric distance in real and virtual environments,” in Virtual and Adaptive Environments, 2003, pp. 21–46.

[79] P. McDowell, R. Darken, J. Sullivan, and E. Johnson, “Delta3D: A com- plete open source game and simulation engine for building military training systems,” in Proceedings of the Interservice/Industry Training, Simulation and Education Conference, 2006, pp. 143–154.

[80] M. McShaffry, Game Coding Complete. Paraglyph Press, 2003.

[81] R. C. Merkle, “A new family of six degree of freedom positional devices,” 1994, http://www.zyvex.com/nanotech/6dof.html.

[82] Merriam-Webster Dictionary, “Wood’s lamp,” http://www. merriam-webster.com/medical/wood%27s%20lamp.

[83] Minerals Council of Australia, Safety Performance Report of the Australian Minerals Industry. Minerals Council of Australia, 2008.

[84] Mining Industry Skills Centre Inc., “Project canary–translating risk knowl- edge to safe behaviour,” 2010, http://www.projectcanary.com.

[85] M. Minsky, “Telepresence,” in Omni, June, 1980, pp. 45–51.

[86] B. J. Mohler, H. H. Bulthoff, W. B. Thompson, and S. H. Creem-Regehr, “A full-body avatar improves egocentric distance judgments in an immer- sive virtual environment,” in Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization, 2008, p. 194.

[87] B. J. Mohler, W. B. Thompson, S. Creem-Regehr, H. L. Pick, W. Warren, J. J. Rieser, and P. Willemsen, "Visual motion influences locomotion in a treadmill virtual environment," in Symposium on Applied Perception in Graphics and Visualization, 2004, pp. 19–22.

[88] B. J. Mohler, W. B. Thompson, S. H. Creem-Regehr, and P. Willemsen, “Calibration of locomotion resulting from visual motion in a treadmill-based virtual environment,” in ACM Transactions on Applied Perception, 2007, pp. 1–15.

[89] München Technical University – Faculty of Mechanical Engineering, "Cyberwalk," 2009, http://www.amm.mw.tum.de/index.php?id=250&L=1.

[90] NIOSH Office of Mine Safety and Health Research Training Exercise, “Un- derground coal mine map reading training,” 2009, http://www.cdc.gov/ niosh/mining/products/product165.htm.

[91] H. Noma and T. Miyasato, “Design for locomotion interface in a large scale virtual environment. ATLAS: ATR locomotion interface for active self motion,” in Proc. ASME Dynamic Systems and Control Division, 1998, pp. 111–118.

[92] Ogre3D, "FAQ: Getting started: Is Ogre a game engine?" 2009, http://www.ogre3d.org/tikiwiki/Getting+Started#Is Ogre a game engine.

[93] Ogre3D, “Ogre3D addons,” 2009, http://www.ogre3d.org/developers/ addons.

[94] ——, “Ogre3D features,” 2009, http://www.ogre3d.org/about/features.

[95] OpenSceneGraph, "OpenSceneGraph Website," http://www.openscenegraph.org/projects/osg.

[96] OpenSG, “OpenSG Comparison,” http://www.opensg.org/wiki/ Comparison.

[97] R. Pausch, T. Burnette, D. Brockway, and M. E. Weiblen, "Navigation and locomotion in virtual worlds via flight into hand-held miniatures," in Proceedings of ACM SIGGRAPH, 1995, pp. 399–400.

[98] B. Peek, "Native C++ wiimote library," 2007, http://wiiyourself.gl.tter.org/.

[99] A. Pelah, B. Secker, A. Bishop, and C. Askham, “A wide-field simulator for studying the visuo-motor interactions in locomotion,” in Journal of Physiology, 1998, pp. 13–14.

[100] Psionic3D, “Free3D models,” http://www.psionic3d.co.uk/?page id=25.

[101] QuickGUI, “OgreWiki:QuickGUI,” http://www.ogre3d.org/tikiwiki/ QuickGUI.

[102] M. H. Raibert and J. K. Hodgins, “Animation of dynamic legged locomo- tion,” in Proc. ACM SIGGRAPH 1991, vol. 25, 1991, pp. 349–358.

[103] S. Razzaque, D. Swapp, M. Slater, M. C. Whitton, and A. Steed, “Redi- rected walking in place,” in Proceedings of the Workshop on Virtual Envi- ronments, 2002, pp. 123–130.

[104] B. Ries, V. Interrante, M. Kaeding, and L. Anderson, “The effect of self- embodiment on distance perception in immersive virtual environments,” in Proceedings of the 2008 ACM Symposium on Virtual Reality Software and Technology, 2008, pp. 167–170.

[105] B. Ries, V. Interrante, M. Kaeding, and L. Phillips, “Analyzing the effect of a virtual avatar’s geometric and motion fidelity on ego-centric spatial perception in immersive virtual environments,” in Proc. of the 16th ACM Symposium on Virtual Reality Software and Technology, 2009, pp. 59–66.

[106] C. S. Sahm, S. H. Creem-Regehr, W. B. Thompson, and P. Willemsen, "Throwing versus walking as indicators of distance perception in real and virtual environments," in ACM Transactions on Applied Perception, vol. 1, no. 3, 2005, pp. 35–45.

[107] D. Sanchez and C. Dalmau, Core Techniques and Algorithms in Game Pro- gramming. New Riders Publishing, 2003. REFERENCES 157

[108] D. Schofield, “Do you learn more when your life is in danger?” in New Technologies for Safety Training Conference on Virtual Reality in Training, 2006.

[109] D. Schofield, B. Denby, and R. Hollands, “Mine safety in the twenty-first century: The application of computer graphics and virtual reality,” in Mine Health and Safety Management, 2001, pp. 153–174.

[110] J. Schöning, F. Daiber, A. Kruger, and M. Rohs, "Using hands and feet to navigate and manipulate spatial data," in Proceedings of the 2009 CHI, 2009, pp. 4663–4668.

[111] School of Computing–University of Utah, “Locomotion interfaces,” http: //www.cs.utah.edu/research/areas/ve/Locomotion.html.

[112] M. Schwaiger, T. Thummel, and H. Ulbrich, “Cyberwalk: Implementation of a ball bearing platform for humans,” in Conference on Human Computer Interaction 2007, 2007.

[113] T. Shiratori and J. K. Hodgins, “Accelerometer-based user interface for the control of a physically simulated character,” in ACM Transactions on Graphics, vol. 27, 2008.

[114] J. L. Souman, P. R. Giordano, I. Frissen, A. D. Luca, and M. O. Ernst, “Making virtual walking real: Perceptual evaluation of a new treadmill control algorithm,” in ACM Transactions on Applied Perception, 2010, pp. 11–14.

[115] A. Squelch, “Application of virtual reality for mine safety training,” in Proceedings Minesafe International 2000, 2000.

[116] P. Stothard, R. Mitra, and A. Kovalev, "Assessing levels of immersive tendency and presence experienced by mine workers in interactive training simulators developed for the coal mining industry," in SimTecT 2008 Conference Proceedings, 2008.

[117] E. A. Suma, S. Babu, and L. F. Hodges, “Comparison of travel techniques in a complex, multi-level 3D environment,” in IEEE Symposium on 3D User Interfaces, 2007, pp. 147–153.

[118] M. Suryajaya, C. Fowler, T. Lambert, P. Stothard, D. Laurence, and C. Daly, “Development and evaluation of OmniWalker for navigating im- mersive computer based mine simulations,” in SimTecT 2010 Conference Proceedings, 2010, pp. 209–214.

[119] M. Suryajaya, T. Lambert, and C. Fowler, “Camera-based OBDP loco- motion system,” in Proc. of the 16th ACM Symposium on Virtual Reality Software and Technology, 2009, pp. 31–34.

[120] T. Svoboda, D. Martinec, and T. Pajdla, “A convenient multi-camera self- calibration for virtual environments,” PRESENCE: Teleoperators and Vir- tual Environments, pp. 407–422, 2005.

[121] M. J. Swain and D. H. Ballard, “Indexing via color histograms,” in Pro- ceedings of the Third International Conference on Computer Vision, 1990, pp. 390–393.

[122] R. M. Taylor, T. C. Hudson, A. Seeger, H. Weber, J. Juliano, and A. T. Helser, “VRPN: A device-independent, network-transparent VR peripheral system,” in Proceedings of the 8th ACM Symposium on Virtual Reality Software and Technology, 2001, pp. 55–61.

[123] J. N. Templeman, P. S. Denbrook, and L. E. Sibert, “Virtual locomotion: Walking in place through virtual environments,” in Presence: Teleoperators and Virtual Environments, vol. 8, 1999, pp. 598–617.

[124] W. B. Thompson, P. Willemsen, A. A. Gooch, S. H. Creem-Regehr, J. M. Loomis, and A. C. Beall, "Does the quality of the computer graphics matter when judging distances in visually immersive environments?" in Presence: Teleoperators and Virtual Environments, vol. 13, no. 5, 2004, pp. 560–571.

[125] T. Thorsen, “Wii sales near 71 million, DS almost 129 million,” http:// www.eurogamer.net/articles/ms-shifts-10-3m-xbox-360s-in-fy2010.

[126] R. Y. Tsai, “A versatile camera calibration technique for high accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses,” in IEEE Journal of Robotics and Automation, vol. 3, 1987, pp. 323–344.

[127] L. Unzueta, M. Peinado, R. Boulic, and A. Suescun, “Full-body perfor- mance animation with sequential inverse kinematics,” in Graphical Models, 2008, pp. 87–104.

[128] M. Usoh, K. Arthur, M. C. Whitton, R. Bastos, A. Steed, M. Slater, and F. P. Brooks, “Walking > walking-in-place > flying, in virtual environ- ments,” in SIGGRAPH 1999, 1999, pp. 359–364.

[129] M. Usoh, E. Catena, S. Arman, and M. Slater, “Using presence question- naires in reality,” Presence, pp. 497–503, 2000.

[130] D. Valkov, F. Steinicke, G. Bruder, and K. H. Hinrichs, “Travelling in 3D virtual environments with foot gestures and a multi-touch enabled WIM,” in Proceedings of Virtual Reality International Conference (VRIC 2010), 2010.

[131] Virtusphere, “Products,” http://virtusphere.com/product.html.

[132] G. Welch and G. Bishop, “An introduction to the Kalman filter,” in Report: TR 95-041, Department of Computer Science, University of North Carolina at Chapel Hill, 2006.

[133] P. Willemsen, M. B. Colton, S. H. Creem-Regehr, and W. B. Thompson, "The effects of head-mounted display mechanical properties and field-of-view on distance judgments in virtual environments," in ACM Transactions on Applied Perception, vol. 6, 2009, pp. 8:1–8:14.

[134] P. Willemsen, M. B. Colton, S. H. Creem-Regehr, and W. B. Thompson, "The effects of head-mounted display mechanics on distance judgments in

virtual environments,” in Applied Perception in Graphics and Visualization, vol. 73, 2004, pp. 35–38.

[135] P. Willemsen, A. A. Gooch, W. B. Thompson, and S. H. Creem-Regehr, “Effects of stereo viewing conditions on distance perception in virtual en- vironments,” in Presence, vol. 17, 2008, pp. 91–101.

[136] B. Williams, G. Narasimham, B. Rump, T. P. McNamara, T. H. Carr, J. Rieser, and B. Bodenheimer, “Exploring large virtual environments with an HMD when physical space is limited,” in Proceedings of the 4th sympo- sium on Applied Perception in graphics and visualization, 2007, pp. 41–48.

[137] B. G. Witmer and P. B. Kline, “Judging perceived and traversed distance in virtual environments,” Presence, vol. 7, no. 2, pp. 144–167, 1998.

[138] B. G. Witmer and M. J. Singer, “Measuring presence in virtual environ- ments: A presence questionnaire,” in Presence, vol. 7, 1998, pp. 225–240.

[139] WreckedGames, “OIS Development Website,” http://www.wreckedgames. com/.

[140] C. Wren, A. Azarbayejani, T. Darrel, and A. Pentland, “Pfinder: real-time tracking of the human body,” in Second IEEE International Conference on Automatic Face and Gesture Recognition, 1996, p. 51.

[141] H. Yano, H. Noma, H. Iwata, and T. Miyasato, “Shared walk environment using locomotion interfaces,” in CSCW 2000, 2000, pp. 163–170.

[142] W. Yin-Poole, “MS Shifts 10.3m XBox360s in FY2010,” http://www. eurogamer.net/articles/ms-shifts-10-3m-xbox-360s-in-fy2010.

[143] P. Zahorik and R. L. Jenison, “Presence and being-in-the-world,” in Pres- ence: Teleoperators and Virtual Environments, vol. 7, no. 1, 1998.

[144] C. A. Zanbaka, B. C. Lok, S. V. Babu, C. V. Babu, A. C. Ulinski, and L. F. Hodges, "Comparison of path visualizations and cognitive measures relative

to travel technique in a virtual environment,” in IEEE TVCG, vol. 11, no. 6, 2005, pp. 694–705.

[145] Z. Zhang, “A flexible new technique for camera calibration,” in IEEE Trans- actions on Pattern Analysis and Machine Intelligence, vol. 22, 2000, pp. 1330–1334.

[146] C. J. Ziemer, J. M. Plumert, J. F. Cremer, and J. K. Kearney, "Estimating distance in real and virtual environments: Does order make a difference?" in Percept Psychophys, vol. 71, no. 5, 2009.

Appendices

Appendix A

The HSV colour Model

Figure A.1: The HSV hex cone

The HSV (Hue, Saturation, Value) colour model (Figure A.1) is one of the perceptual colour spaces [41]. It is designed to mimic the way humans perceive colour. The HSV colour model is defined by three parameters: hue, saturation, and value. The hue component is an angular measurement running in a counter-clockwise direction. Its value starts at 0, which represents red. The primary (red, green, and blue) and secondary (yellow, cyan, and magenta) colours

are located at the vertices of the hexagon. The saturation component describes the colour intensity, where zero indicates white and one indicates a fully saturated colour. The value component indicates the brightness of the colour, which varies from zero (black) to one (white). The algorithm for converting pixel colours from RGB to HSV is presented below.

1:  V ← max(R, G, B)
2:  if V ≠ 0 then
3:      S ← (V − min(R, G, B))/V
4:  else
5:      S ← 0
6:  end if
7:  if V == R then
8:      H ← (G − B) ∗ 60/S
9:  end if
10: if V == G then
11:     H ← 180 + (B − R) ∗ 60/S
12: end if
13: if V == B then
14:     H ← 240 + (R − G) ∗ 60/S
15: end if
16: if H < 0 then
17:     H ← H + 360
18: end if

Because we only used 8-bit images throughout the system, the HSV values are scaled between 0 and 255 as follows:

H ← H/2        S ← S ∗ 255        V ← V ∗ 255
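This is the same 8-bit convention used by OpenCV's colour conversion (H stored as H/2 in 0–180, S and V in 0–255), so in practice the per-pixel conversion can be delegated to the library. The following is a minimal sketch under that assumption, not the project's actual code; the file name is only an example.

#include <opencv2/opencv.hpp>

int main()
{
    // Load an 8-bit BGR frame (file name is only an example).
    cv::Mat bgr = cv::imread("frame.png");
    if (bgr.empty()) return 1;

    // Convert to 8-bit HSV: H is stored as H/2 (0..180),
    // S and V are scaled to 0..255, matching the scaling above.
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

    return 0;
}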

Appendix B

Results from Forward Walk Test (Experiment 1)

Table B.1: Distance estimation results from experiment 1.

User  Input        2m  6m  10m  14m  18m  22m
A     Joystick      1   3    4    6    9   11
      OmniWalker    2   5   11   13   18   21
B     Joystick      1   2    3    5    7   10
      OmniWalker    2   4   10   11   16   20
C     Joystick      1   3    5    6   10   11
      OmniWalker    2   6   12   16   21   22
D     Joystick      1   2    4    4    7    6
      OmniWalker    2   6    9   14   16   19
E     Joystick      1   2    3    4    6    8
      OmniWalker    2   7    9   19   22   23
F     Joystick      1   2    3    5    7    8
      OmniWalker    2   5    8   11   13   17
G     Joystick      1   3    4    8   12   12
      OmniWalker    2   6    8   12   14   19

continued on the next page.


User  Input       26m  30m  34m  38m  42m
A     Joystick     14   15   18   19   20
      OmniWalker   25   31   36   35   40
B     Joystick     11   12   14   17   17
      OmniWalker   25   25   35   35   35
C     Joystick     16   17   18   20   24
      OmniWalker   27   36   38   34   44
D     Joystick      8    9   11   13   13
      OmniWalker   21   28   29   35   38
E     Joystick      8   10   12   12   14
      OmniWalker   25   30   35   24   32
F     Joystick     13   15   20   25   28
      OmniWalker   20   24   31   31   36
G     Joystick     20   22   28   32   35
      OmniWalker   22   25   29   34   36

Appendix C

Distance Estimation Results in Free Exploration Mode (Experiment 2)


Table C.1: Distance estimation results from experiment 2.

                          Joystick                    OmniWalker
                   Plain scene   HA scene      Plain scene   HA scene
User               D1     D2     D3     D4     D1     D2     D3     D4
Actual distance     2     17      5     31      2     17      5     31
A                  1.2     9     1.2     9     1.8    15     2.5    20
B                  2      15     4.5    20     1.5    13     4.5    20
C                  1.2    20     3      10     3      15     4      30
D                  0.5    10     3      13     0.8    13     4      20
E                  0.5    20     7      15     1.2    13     6      40
F                  2      12     5      20     2.4    20     4      29
G                  2.5    20     5      10     2      15     4      30
H                  1      20     5      15     1.5    20     5      30
I                  1.8    10     3.5     7     2      18     3      28
J                  1       6     8      30     2      16     5      30

Appendix D

The Presence Questionnaire

The Presence questionnaire list. The numbers refer to the original Witmer and Singer question numbers. Answers are on a scale of 1 (very able) to 7 (completely unable).

Table D.1: Presence questionnaire used in experiment 2

No.  Question Text
1    How much were you able to control events?
2    How responsive was the environment to actions that you initiated?
3    How natural did your interactions with the environment seem?
7    How natural was the mechanism which controlled movement through the environment?
12   How much did your experiences in the virtual environment seem consistent with your real-world experiences?
14   How completely were you able to actively survey or search the environment using vision?


No.  Question Text
18   How compelling was your sense of moving around inside the virtual environment?
22   To what degree did you feel confused or disoriented at the beginning of breaks or at the end of the experimental session?
23   How involved were you in the virtual environment experience?
24   How distracting was the control mechanism?
25   How much delay did you experience between your actions and expected outcomes?
26   How quickly did you adjust to the virtual environment experience?
27   How proficient in moving and interacting with the virtual environment did you feel at the end of the experience?
29   How much did the control devices interfere with the performance of assigned tasks or with other activities?
30   How well could you concentrate on the assigned tasks or required activities rather than on the mechanisms used to perform those tasks or activities?
31   Did you learn new techniques that enabled you to improve your performance?
32   Were you involved in the experimental task to the extent that you lost track of time?

Appendix E

Presence Questionnaire Results


Table E.1: Presence questionnaire results using a joystick in experiment 2. The results are on a 7 point scale.

No.   A  B  C  D  E  F  G  H  I  J
1     4  6  7  6  6  5  5  6  2  4
2     4  5  4  6  5  5  5  4  4  3
3     3  4  5  6  5  4  6  5  2  4
7     4  6  4  5  4  3  5  2  1  4
12    4  1  4  5  4  3  4  4  3  2
14    4  4  4  4  5  4  5  3  2  4
18    6  3  5  4  5  2  5  3  2  4
22    1  5  1  2  3  5  3  2  2  6
23    5  6  4  6  3  3  5  4  4  4
24    1  2  1  4  3  5  3  5  4  4
25    2  2  1  2  2  4  3  6  4  3
26    7  6  6  6  5  2  4  5  4  3
27    4  6  6  4  5  5  4  5  4  4
29    2  2  5  3  4  4  2  6  5  3
30    7  7  6  5  5  3  6  5  3  4
31    7  6  4  6  3  4  4  5  2  3
32    7  1  3  6  2  2  2  5  1  5

Table E.2: Presence questionnaire results using the OmniWalker in experiment 2. The results are on a 7 point scale.

No.   A  B  C  D  E  F  G  H  I  J
1     6  1  5  2  2  5  4  5  4  6
2     6  1  6  2  3  5  4  3  4  6
3     6  1  6  2  4  5  5  4  6  6
7     6  1  5  3  2  5  5  6  6  6
12    6  1  4  1  3  6  5  3  5  6
14    6  2  6  2  3  5  4  6  6  7
18    6  1  5  2  4  4  4  5  6  6
22    1  3  1  3  4  5  4  2  2  3
23    6  4  4  3  3  5  5  6  6  6
24    3  7  1  6  3  4  4  2  4  2
25    2  7  1  6  4  5  4  3  3  2
26    7  1  7  5  3  3  5  6  5  6
27    6  1  6  2  4  4  5  6  5  6
29    1  7  7  4  4  5  4  2  2  2
30    6  2  6  2  4  3  5  6  7  5
31    7  2  5  2  4  5  5  6  4  6
32    7  1  5  1  2  3  4  5  5  2