Vision-Based Human Interaction Devices in a 3D environment
Using Nintendo Wiimote, Webcam and DirectX
Thomas Luel Florent Mazzone
Vision, Graphics and Interactive Systems
Aalborg University
June 1, 2009

AALBORG UNIVERSITY – DEPARTMENT OF ELECTRONIC SYSTEMS
Niels Jernes Vej 12 – DK-9220 Aalborg East – Phone 99 40 86 86

TITLE: Vision-Based Human Interaction Devices in a 3D Environment
PROJECT PERIOD: 10th Semester, February 2009 to June 2009
PROJECT GROUP: 1021
PARTICIPANTS: Thomas Luel, Florent Mazzone
SUPERVISOR: Zheng-Hua Tan
PUBLICATIONS: 3
NUMBER OF PAGES: 99
FINISHED: June 1, 2009

ABSTRACT

This report documents the development of new Human Interface Devices (HID) and their use in a 3D environment to enhance the immersion factor of a modeling tool such as architecture design software. The study focuses on comparing two common devices used in vision-based motion tracking: the wiimote developed by Nintendo, and a webcam. New means of interaction between the user and the software, going beyond the bounds of the mouse and keyboard, are explored and described to interact efficiently and intuitively with desktop objects.

In this system, the user's fingers and head are lit in the infrared spectrum by adapted gloves and frames fitted with infrared diodes. The wiimote provides a high-accuracy camera sensor, whereas the webcam has a good detection range; both are suitable answers to the application requirements. A 3D DirectX architecture application, Master Builder, has been implemented, allowing the user to create his own buildings using his fingers and to view the result in 3D by moving his head.

The resulting project and the user tests revealed that infrared vision-based tracking fits the tracking requirements perfectly, and that handling objects using both hands is much more interactive for the user than using common human interface devices such as the mouse.
This report can be published or reproduced without permission provided that the source is mentioned. Copyright ©2009, project group 1021, Aalborg University
Preface
Report purpose
This report is part of the 10th semester project of the "Vision, Graphics and Interactive Systems" Master program. It was written in 2009 at the Aalborg University Department of Electronic Systems and presents the Master thesis of group VGIS 09gr1021.
The project aim is to design and implement new means of interaction, such as fingers and head tracking, between the user and 3D architecture modeling software, in order to improve the ease of use of this type of application. The overall project is documented in this report, which describes the design choices, implementation, testing and improvements of the solution.
Report overview
Seven main sections compose this report. The Pre-Analysis describes common hardware and software technologies used to improve the user's immersion in an application and to track his motions. The Analysis defines more precisely the technologies used in the project and presents the specification of requirements for the ideal system. The Design section describes the different modules developed for this project: a first sub-section presents the hardware components developed for fingers and head tracking, the tracking software is then described in a second sub-section, and finally the development of the three-dimensional game and of its user interface using DirectX 9 is detailed. The Implementation part describes the development of the modules designed, the problems encountered during this phase, and some modifications of the initial design. The Tests part describes how the usability tests were carried out to validate the application design choices and the results obtained; it then presents all the modifications made to the application to enhance the interaction between the user and the software and to correct some bugs discovered during the test period. Finally, the Conclusion explains the project achievements and the prospects for improvement.
A list of the included figures is available; each figure caption is numbered with its chapter number, separated by a dot from its order number within the section. The appendix contains the references and technical documents. Each quoted reference is marked by a number in brackets.
Technology notice
The CD-ROM includes the binaries and documented source code of MasterBuilder.exe, the 3D environment, and of MotionTracking.dll, the motion-tracking library. The CD-ROM also contains a MotionTracking.dll sample application, the webcam driver, a tool to handle the camera properties, documentation of the MotionTracking.dll external methods (.HTM and Microsoft Compressed HTML), detailed installation instructions, and the electronic components' datasheets.
The source code has been developed in C# on the .NET Framework 2.0, using the Microsoft Visual Studio 2005 development environment. The report has been written with Microsoft Office 2007, the schematics and diagrams have been drawn with Microsoft Visio 2003, and the quality of the screenshots and hardware pictures has been improved using Paint.NET.

Acknowledgements
We would like to thank all the people who helped us to achieve our project, and more particularly:
• Ben Krøyer and Peter Boie Jensen, Electronic Systems Department assistant engineers, for their help and good advice during the electronic components development.
• Lars Bo Larsen, our semester coordinator for his support.
• Mette Billeskov, semester secretary, for her support regarding the administrative issues.
• Zheng-Hua Tan, our project supervisor, for his help, management and advice.
• All the testers who helped to check, validate and improve the design choices.
• The computer science workshop for their help with computer and network issues.
• The components workshop, which helped us order all the needed components.
Table of contents
Part I Introduction ...... 1
  Presentation ...... 2
  Problem statement ...... 3
  Fundamental requirements ...... 3
  Limitations ...... 3
Part II Pre-Analysis ...... 5
  1 "Immersion" factor in software and games ...... 6
  2 Motion tracking ...... 10
Part III Analysis ...... 13
  1 Head Tracking ...... 14
  2 What is Fingers tracking? ...... 17
  3 Installation layout ...... 21
  4 Choice of the Application ...... 24
  5 Choice of the programming language ...... 27
  6 Specification of requirements ...... 32
Part IV Design ...... 33
  1 General Structure ...... 34
  2 Tracking ...... 36
  3 3D Game Design ...... 55
  4 User Interface ...... 68
Part V Implementation ...... 81
  1 Infrared Frames ...... 82
  2 Infrared Gloves ...... 83
  3 MotionTracking DLL ...... 85
  4 Implementation of the 3D Environment ...... 87
Part VI Tests ...... 91
  1 Tests Preparation ...... 92
  2 Results ...... 94
  3 Improvements ...... 96
Part VII Conclusion ...... 99
  Project Achievement ...... 100
  Future Improvements ...... 102
Part VIII References ...... 103
Part IX Appendixes ...... 109

List of figures
Figure 2-1: Picture showing how to use the wiimote [10] ...... 7
Figure 2-2: Vuzix 3D headset [12] ...... 8
Figure 2-3: ELITE Simulation solutions: the all-in-one simulator [14] ...... 8
Figure 3-1: An example of infrared emitter glasses adapted from safety glasses [23] ...... 16
Figure 3-2: Glasses built with reflective materials made by TrackIr [24] ...... 16
Figure 3-3: Posture Acquisition Glove with inertial devices detecting linear and angular motion, rotation and acceleration of the hand [27] ...... 18
Figure 3-4: Image based data glove [28] ...... 19
Figure 3-5: "Inverse Kinematic Infrared Optical Finger Tracker" project [29] ...... 20
Figure 3-6: Wiimotes disposition layout (head tracking) ...... 22
Figure 3-7: Johnny Chung Lee presenting his head-tracking system using wiimotes [30] ...... 24
Figure 3-8: NaturalPoint's TrackIR3 [32] ...... 25
Figure 4-1: Project structure ...... 34
Figure 4-2: Wiimote technical details [49] ...... 36
Figure 4-3: Wiimote device [50] ...... 37
Figure 4-4: Tested diode characteristics ...... 38
Figure 4-5: Diodes tested ...... 38
Figure 4-6: Infrared diodes circuit without boosting, schematics ...... 38
Figure 4-7: Infrared diodes circuit without boosting (with LD 242-3) ...... 39
Figure 4-8: Coupling diodes solution ...... 39
Figure 4-9: Components used ...... 40
Figure 4-10: Switch based boost diode solution ...... 41
Figure 4-11: Permissive pulse handling capability [51] ...... 41
Figure 4-12: Chip output voltage ...... 42
Figure 4-13: Transistor voltage between the drain and the source ...... 42
Figure 4-14: Diode voltage (yellow) and diode current (blue) using a 10µH coil [56] ...... 43
Figure 4-15: PR4401 based coupling diodes circuit design ...... 44
Figure 4-16: Sunnyline webcam specifications ...... 45
Figure 4-17: Infrared diodes viewed by the camera at a 4 meter range ...... 45
Figure 4-18: Fingers tracking infrared diode circuit design ...... 45
Figure 4-19: HeadTracking tools class diagram ...... 46
Figure 4-20: Wiimotes horizontal and vertical views ...... 48
Figure 4-21: Camera class and its related classes ...... 50
Figure 4-22: Motion Tracking sequence diagram ...... 51
Figure 4-23: An example of a model to imitate ...... 56
Figure 4-24: Example of using transformations to place an object at the right position ...... 57
Figure 4-25: Cube A must be moved to the right. The program will check if the space corresponding to this movement is available ...... 60
Figure 4-26: When checking the space for the movement, the program detects that there is a potential collision with Cube B ...... 60
Figure 4-27: The user is trying to put cuboid A side by side with cuboid B. A collision is detected for this movement ...... 61
Figure 4-28: The movement has been reduced. A new space is to be checked for availability ...... 62
Figure 4-29: Example of gravity effect - one of the cubes is moved by the user ...... 63
Figure 4-30: Example of gravity effect - gravity effect is triggered on the 3 cubes ...... 64
Figure 4-31: Example of gravity effect - end of the gravity effect ...... 64
Figure 4-32: The shadow is drawn on the ground ...... 65
Figure 4-33: The shadow is drawn on the ground, on the side and the top of the other stone ...... 66
Figure 4-34: The scene is a square, where objects can be placed. The axes have been added to the image in red ...... 68
Figure 4-35: Objects have been added to the scene ...... 69
Figure 4-36: Menu of the game ...... 69
Figure 4-37: Left menu ...... 70
Figure 4-38: Right menu ...... 70
Figure 4-39: Center menu ...... 71
Figure 4-40: The two parts of the center menu ...... 71
Figure 4-41: "Handling" tab ...... 71
Figure 4-42: "Options" tab ...... 72
Figure 4-43: Step 1 of the creation of a new object ...... 75
Figure 4-44: Red parallelepiped, because it is outside the borders of the scene ...... 75
Figure 4-45: Green parallelepiped ...... 76
Figure 4-46: A new object has been added to the scene ...... 76
Figure 4-47: The cursor is blue because the object is held by the user ...... 77
Figure 4-48: Delete icon disabled ...... 78
Figure 4-49: Delete icon ...... 78
Figure 4-50: Resizing icon ...... 78
Figure 4-51: The user is changing the "selection time" value ...... 79
Figure 4-52: Exit pop-up ...... 79
Figure 5-1: Infrared frames ...... 82
Figure 5-2: Infrared frames left diodes circuit ...... 82
Figure 5-3: Infrared glove diode circuit (side down) ...... 83
Figure 5-4: Infrared glove, right glove (side up) ...... 84
Figure 5-5: Infrared diode and wires sewed to the glove ...... 84
Figure 5-6: MotionTracking dll integration example ...... 85
Figure 5-7: Online MotionTracking API documentation ...... 86
Figure 5-8: Project structure updated ...... 89
Part I Introduction
This chapter presents the origins of the project. The existing technologies, the needs, the problem statement and the delimitations of the project are described in the following parts. The final sub-section presents the motivations that led us to choose this project.
Presentation
Interaction between a user and personal computer software has been a central issue for developers since the invention of the first personal computers in the seventies. The basic command-line interface provided by the first operating systems, such as Microsoft MS-DOS, rapidly revealed its lack of intuitiveness for the user. For that reason, graphical operating systems such as Mac OS, IBM OS/2 and Microsoft Windows were designed in the eighties, allowing an occasional user to interact efficiently with a computer. These graphical operating systems soon revealed the limits of the keyboard, which is unable to handle graphical objects intuitively. The widespread use of the graphical user interface led human interface device (HID) designers to introduce the mouse (invented by Douglas Engelbart in 1964 [1]), which is much better adapted to these needs. This HID was so successful that it overshadowed the development of other interface devices, such as fingers tracking devices. Recently, however, the situation has changed: the mouse is showing its limits, especially in applications where it does not provide a humanly logical way of interacting. When an application needs a good immersion factor, motion-tracking devices such as fingers or head tracking are much better adapted. They provide an easy and intuitive way for the user to handle objects or to move in a three-dimensional environment. Moreover, this spares users the time needed to learn handling techniques before using the software.
Motion tracking has been used in computer science for years, since it can be implemented using numerous well-known technologies: accelerometer-based, inertial, magnetic or vision-based tracking. However, these technologies were so expensive to implement that for decades the use of motion tracking was limited to major companies and laboratories. Fortunately, the improvement of computer processing power and, above all, the decreasing prices of accurate sensors have enabled user interaction designers to implement motion tracking in mainstream applications such as games or immersive applications. Many different cheap devices, such as the wiimote (Nintendo: camera and accelerometer), the PlayStation 3 camera (Sony [2]) or other specific cameras, have been brought successfully to the mass market for dedicated applications. To date, the commercialized programs focus on tracking the position of the head, hand or fingers, but could easily be modified to track other features and be used in other applications.
To date, numerous cheap motion-tracking solutions have been designed and commercialized. Many devices rely on vision-based tracking, such as the wiimote implemented by Nintendo. However, most of them seem to be used for gaming purposes, and their capabilities are rarely exploited in other software. Moreover, no comparison of the tracking capabilities of these devices has been made. This lack of information can be a hindrance for developers wanting to improve the immersion of software using these tools.
Problem statement
The purpose of this project is to compare and implement motion tracking using two widely available devices, the wiimote and a webcam, in order to demonstrate the advantages of vision-based tracking in architecture modeling software.

The wiimote and webcam tracking capabilities will be compared. The advantages and drawbacks of each solution will be presented to allow developers to use the best components in their projects. Moreover, new three-dimensional architecture modeling software will be implemented, to show the intuitiveness and efficiency of fingers tracking for this type of application and the usefulness of head tracking for viewing a scene in detail.
Fundamental requirements
• Track simultaneously at least two fingers and the head position of the user.
• Provide good enough accuracy to execute all the application features at distances of 2 to 4 meters from the screen.
• Allow the user to run the application without using a keyboard or a mouse.
• Present a three-dimensional environment to the user.
• Perform user testing with neophyte testers to validate the intuitiveness and efficiency of handling objects with the fingers.
Limitations
• The solution language is English.
• The number of different objects available in the architecture software is limited, as the aim of the project is to experiment with new means of handling objects, not to provide fully functional architecture software.
• The vision-based tracking used is limited to a room with controlled light, such as a living room or classroom.
Part II Pre-Analysis
This part describes the main concepts of the project. A study of the importance of immersion in games and other kinds of software is carried out, and then an overview of motion-tracking techniques is presented.
1 "Immersion" factor in software and games
There are many factors which make games interesting to the player: the purpose of the game, an interesting story line, etc. In some games, the purpose is to draw the player into another world, to make him "believe" that the game is a world in itself, with its own rules and goals. The factor responsible for this is called "immersion".
1.1 How could we define Immersion?
Immersion is defined as the "suspension of disbelief" [3]. To achieve this state, several criteria that enhance the "reality" of the game have to be met. A game has a good "immersion factor" if the graphics are good, the movements of the characters are realistic, the physics effects (such as an object falling to the ground) match reality (or at least the reality of the game), and so on [4]. When those criteria are met, the player can focus more easily on the goal of the game, and the gaming experience is much better. We will now list the most important criteria [5].
• Good background and foreground graphics make the game look beautiful and, more importantly, realistic. The user will easily identify the elements of the game, recognizing them by associating them with elements of the real world.
• Realistic graphic effects are also important. Small details can make the game more realistic: movement of the dust, interaction of the weather (rain, thunder, snow).
• Sound and music also play an important part in the immersion. They allow the user to "feel" the game world with something more than just vision. The music can create the atmosphere or the feeling that the user should experience. The more detailed the sound, the better the immersion of the game.
• The means of interaction with the game are also very important. For example, a joystick is better than a mouse for a flight simulator, because it is closer to the real controls of a plane, and more intuitive.
When those criteria are met, the immersion into the game is easier, and the game experience is better.
1.2 Why is immersion so important in games?
When a game has a good immersion factor, the player feels drawn into the game world, and the actions performed in the game seem real [6]. The illusion created is strong and the user will have a stronger experience. With good immersion, it is easier for the player to interact with the elements of the game, as he identifies them more easily and can use his intuition to interact with them. That is why, since the beginning of video games, developers have tried to improve all the areas described above, pushing the limits further and further.
1.3 What about other kinds of software?
Other kinds of software can also profit from good immersion. A software application where the user needs to look at something, or needs to get the feel of something, can be used more efficiently if it creates a good illusion of reality.
Applications that simulate an environment belong to this category of software. If the application is immersive, the user will identify objects and know their characteristics with more accuracy. Such applications could be any kind of software helping the user to design an object or a set of objects; they belong to the "computer-aided design" category of software [7].
1.4 What are the new ways to improve the immersion factor?
There are many ways to improve the immersion in a game or in other software. In this report, we will give only a few examples [8][9].
• The Wii console: the main idea of this console is to associate real movements of the player to movements or actions in the game. For example, a tennis game will associate movements of the Wii remote to movements of the racket. The player will find the game nearer to a real tennis match than if it was played with a keyboard. This creates a kind of immersion.
Figure I-1: Picture showing how to use the wiimote[10].
• 3D sound: by using special sound effects, 3D sound gives the user the feeling that he can locate the place the sound comes from [11].
• 3D glasses: there are several ways to use 3D glasses. Most of the time, two images are projected on the same screen, and the 3D glasses force the user to see only one image at a time, switching very quickly between the two images. This creates the impression of 3D.
• 3D headsets: They must be worn on the head. They create images right in front of the user’s eyes.
Figure I-2: Vuzix 3D headset[12].
• Simulation equipment: mainly used by the military and companies, this is a set of tools simulating, for example, a flight [13]. All the commands normally available in the plane or the helicopter are also available in the simulation equipment. The user gets a good impression of immersion in this case.
Figure I-3: ELITE Simulation solutions: the all-in-one simulator [14].
• 3D cave: this is a big cube-shaped box. To play a game, the user goes into the box, where images are projected on every face of the cube. In addition, the user can wear 3D glasses in order to see the images as if they were real 3D. This kind of tool gives a very impressive experience of immersion, because the user really feels like he is inside another world. He can also move a bit inside the box, and this freedom of movement adds to the immersion. Of course, this is not a product available for the mass market, mainly because of its cost and the space needed to use it.
Of course, there are other ways to enhance the immersion feeling of a game; it would take too long to describe them all. What is important to note is that these kinds of tools are becoming more and more used, and cheaper and cheaper. The success of the Wii console is proof in itself. People want games to match reality, to simulate it. This need for immersion is also beginning to appear in some other software as well.
2 Motion tracking
2.1 What is motion tracking? [15]

Motion tracking consists of a set of technologies developed to detect and track the movements of objects over a defined period.
This technique is used in a wide range of products, and more particularly for:

• Surveillance purposes, such as detecting aggressive human motion [16], detecting human gestures in a specified environment, or checking robot motion in factories.
• Virtual reality systems, to observe the user and adapt the virtual environment "on the fly".
2.2 How to track motion?

The type of motion tracking system has to be chosen depending on the application developed, the types of motion to handle, the required tracking quality, the available processing power and the period between each motion detection. We present below the two main parts of a motion detection system:

Hardware: It usually consists of one or more sensors, such as cameras, a gyroscope or a pressure detector, that periodically monitor a specific characteristic of an environment (light, noise, pressure...) and send the retrieved data to dedicated software that analyzes the motion and its evolution over time. Depending on the needs, the system may implement multiple sensors to enhance the detection accuracy or the number of motion types tracked.

Software: Using the information retrieved by the hardware sensors, such as raw picture data or 3D acceleration signals, the software first cleans the data by removing noise, then analyzes it and describes the motion (type, speed, shape...). The most advanced systems are able to recognize a specific part of an object and to follow its motion over time (used in head or fingers detection projects, for example). Among software-based motion detection systems, computational vision has huge potential. Although it requires a lot of processing power and visual cues are complex to analyze [17], it has been in constant development since the 1970s because it removes or reduces the equipment that must be installed on the tracked object. Most of the systems developed compare a signal against a model to extract the features of a picture, such as contours, known shapes stored in a database, or a human face. After having detected the objects to track, the computational vision system has to follow their motions in the next frames. Numerous processing methods can be chosen.
We present below some of the most common ones [17]:

• Brightness detection: if the background and the object to track have different brightness, the new position of the object can easily be detected using pixel comparison. This method is inaccurate for frames with complex backgrounds.
• Background detection: if the camera is fixed during the motion tracking, the position of the object can be detected by pixel subtraction, processing the difference between a background picture stored in the system (with no object to track) and the frames to analyze.
• Optical flow detection: the best way to check the position of objects in a picture is to process the optical flow between each frame. If the number of pixels having a flow value exceeds a threshold, the system searches the region of these pixels and can determine the region of the moving object. In order to track and identify two objects overlapping in a frame, the system has to determine the depth of each pixel using stereo vision (a pair of cameras).

The most advanced systems use a combination of the different methods presented, and predict the next motions to reduce the regions to analyze.
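As an illustration of the background-detection approach described above, the sketch below subtracts a stored background picture from the current frame and thresholds the result to locate a bright moving object. Python with NumPy is used here for brevity (the project itself is written in C#); the function name and threshold values are illustrative, not taken from the project.

```python
import numpy as np

def detect_moving_region(background, frame, diff_threshold=40, min_pixels=25):
    """Background subtraction: compare the current frame against a stored
    background picture (camera fixed, no object to track in the background)
    and return the bounding box of the changed region, or None."""
    # Absolute per-pixel difference between frame and stored background.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    moving = diff > diff_threshold           # pixels that changed enough
    if moving.sum() < min_pixels:            # threshold on the pixel count
        return None
    ys, xs = np.nonzero(moving)
    # Bounding box as (left, top, right, bottom).
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic 8-bit grayscale frames: a dark background and a bright blob.
background = np.zeros((120, 160), dtype=np.uint8)
frame = background.copy()
frame[40:60, 70:90] = 255                    # a bright "tracked object"

print(detect_moving_region(background, frame))  # -> (70, 40, 89, 59)
```

In a real tracker, the bounding box found in one frame would be used to restrict the search region in the next, as noted above for the most advanced systems.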
In this project, we will use a combination of a webcam and a wiimote to follow the fingers and head motions of the user. We know that the system developed will usually be deployed in an indoor room, such as a living room, with controlled light, and that there will be only one user. For this reason, we have the opportunity to use any of the previously presented algorithms to follow the head and fingers motions of the user. To limit the background brightness overlapping with the user, we will develop a solution based on artificial infrared lighting of the user, for both head and fingers tracking.
The main ideas have been described in this part; the Analysis part will define in detail the technologies used and the main choices of the project.
Part III Analysis
In this part, all the technical fields related to our project are analyzed. The two motion-tracking techniques used in the project, head tracking and fingers tracking, are presented. The main choices concerning the installation layout and the software development are described, in order to define the specification of requirements.
1 Head Tracking
Among the applications of human motion detection, we can distinguish head tracking. We present this technology in the following section.
1.1 What is head tracking?
"Head tracking" covers all the methods used to detect the head position and its related motions. Depending on the needs, systems may track the user's head in only two dimensions, neglecting the depth value (if the user is sitting or standing at a constant depth from the detection system), or in three dimensions (if the user is walking during the tracking).
1.2 What for?
Head tracking has numerous uses in virtual reality:
• Adjust the camera viewpoint depending on the position of the head relative to a screen (used in flight and driving simulations).
• Track the localization of a human being over a large surface.
• Reduce the computing power needed for other motion detection (knowing the exact head position reduces the amount of work for an eye detection system, for example) [18].
• Head recognition / pattern matching.
1.3 How to do it?
Before the widespread use of electronic systems, motion detection was done using mechanical tools and equipment such as wires. With such complex equipment, only laboratory experiments were possible, and the wires restricted the user's movements. By enabling the development of lightweight, unobtrusive and smart motion detection, the generalization of electronic systems and their huge processing power revolutionized head tracking technologies. Depending on the needs, we can differentiate two main types of technologies [19]:
• Passive head tracking system: a hardware part consisting of one or more detectors periodically retrieving data describing an environment, and a processing part analyzing these data to find the head position. It also follows the head of a person moving in the specified volume (the face recognition provided by the OpenCV software [20], for example).

Infrared or visible-spectrum cameras, and more rarely ultrasonic sensors, are used to retrieve the environment data because they are harmless for the user, simple, and easy to link into a system (S-Video, FireWire or USB interfaces). This kind of system requires a lot of processing power to retrieve accurate information about head motions, especially in big volumes. However, the system can be hidden from the tracked user. For this reason, it is mostly used for security purposes (house alarms...) or to track users who do not move a lot (face detection when the user is sitting at a fixed distance from a screen, for example).
• Active head tracking system: similar to the passive system, except that it includes an emitter system carried by the user (infrared diodes, lights with specific shapes...), enabling the detection system to find an accurate localization of the user almost instantly, in a bigger volume, and without using a lot of processing power. This system is very accurate, but the user needs to be "aware" of the tracking and to wear a specific device. This type of tracking system is the best solution to avoid the use of "traditional" HIDs 1 and to offer new degrees of freedom to the user. For this reason, it has numerous applications in games [21], medicine [22], etc.
1.4 Infrared glasses
In order to test the infrared camera embedded in the wiimote, we propose developing head tracking software. This device is able to process up to four artificial infrared sources. For this reason, we decided to analyze the different kinds of infrared emitters, which ones are best adapted to this receiver, and the different ways to fix them so as to provide the best head tracking. Even though infrared source tracking is widespread on the motion detection market and numerous documents describe the infrared receivers, we noticed that the documentation about the types of infrared sources used was inadequate. Infrared emitter manufacturers do not give any information about the frequency, power and diodes they are using, and the developers using wiimotes have chosen their diodes empirically. For this reason, we study in the following part the different infrared sources manufactured or adapted, and the materials to obtain in order to build our own solution fully adapted to head tracking.
We noticed that all the solutions offered have a common design. Using a headset or a pair of frames, at least two infrared diodes are positioned on each side of the head. More accurate versions, with three infrared sources, add a diode at the top and center of the head, allowing the receiver to triangulate the user's localization. Most wiimote users chose to modify safety glasses that come with visible-spectrum diodes on each side, removing the diodes and replacing them with infrared diodes.
1 HID: Human Interface Device, such as a mouse or a keyboard.
Figure II-1 : An example of infrared emitter glasses adapted from safety glasses [23].
When the user sits near a camera with a built-in infrared light, manufacturers produce a less complex version. Infrared lights are fixed on the infrared camera and directed toward the user’s head, and reflective materials positioned at the edges of the glasses replace the infrared diodes by reflecting the infrared light back to the camera. The use of reflective materials reduces the weight of the device, since no battery is needed.
Figure II-2 : Glasses built with reflective materials made by TrackIr [24].
2 What is finger tracking?
“Finger tracking” covers all the methods used to detect the positions of the fingers and their motions. As with the head tracking solutions, some systems give a three-dimensional position of the fingers and others a two-dimensional one. Finger tracking appears in many electronic devices, such as touch screens, which use two-dimensional tracking by neglecting the depth value.
2.1 What for?
Finger tracking is one of the oldest motion tracking solutions developed. As it has been integrated in many electronic devices and serves numerous purposes, we present two important applications of finger tracking:
• Replacement of the usual interface devices: Finger tracking is in this case used to replace common Human Interface Devices (HIDs) such as a mouse, a keyboard or a joystick in a user interface. Most of the time, finger tracking is only a complementary solution that improves the intuitiveness and ease of use of a user interface. For example, the touch screen installed in a Tablet PC lets the user choose between finger tracking and a mouse for the same tasks.
• Development of new ways of interacting between a user and an interface: Whereas common HIDs limit the possible actions of the user (a mouse is restricted to 2D, etc.), finger tracking allows developers to invent new “human” ways of user-computer communication. For example, several fingers can drive several cursors on the same screen, the user can work in 3D spaces, and complex gestures can be tracked. The user is then able, through specific gestures, to move a virtual object in a virtual world where using HIDs such as a mouse or a keyboard would be almost impossible. For example, the tracking system of the augmented reality chess game [25] and of the Intelligent Touch Screen Tables implement only one interface device: the user’s hands.
2.2 How to do it?
Designers of tracking systems have explored numerous ways to detect and follow finger motions. Depending on the complexity of the gestures to track, the system may track only the positions of the fingers or describe the whole morphology of a hand.
• 2D finger tracking: This tracking follows only the horizontal and vertical positions of the fingers on a flat surface, since the depth between the finger and the origin is constant. It is used in touch screens, where the user points at a position by pressing the screen with one or more fingers.
• 3D finger tracking: The system retrieves the positions of fingers moving in a three-dimensional space and may be able to reconstruct the hand morphology. To track the finger positions, the user usually wears a glove that carries the tracking hardware.
As the user must be able to look at the screen and walk at the same time, his fingers must be tracked at any depth. For this reason, suitable finger tracking techniques are presented in the next subsections:
2.2.1 Inertial tracking [26]
This solution uses inertial sensors such as accelerometers or gyroscopes to capture the variations of the finger locations over time. The sensors are usually embedded in gloves and must be calibrated. Almost all gestures can be captured, and the solution is less dependent on the environment than the others (no electromagnetic or light perturbations). However, it needs an interface (wired or wireless) to transmit the data to an acquisition system, and it is more expensive than a light-based tracking solution.
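As a minimal sketch of the principle (not the glove’s actual firmware), recovering a position from inertial data amounts to integrating the acceleration samples twice over time; the sample rate below is an assumption.

```python
DT = 0.01  # assumed 100 Hz sample rate

def integrate_position(accels, v0=0.0, x0=0.0):
    """Double-integrate 1-D acceleration samples (m/s^2) into a
    list of positions (m), starting from velocity v0 and position x0."""
    v, x = v0, x0
    positions = []
    for a in accels:
        v += a * DT   # first integration: acceleration -> velocity
        x += v * DT   # second integration: velocity -> position
        positions.append(x)
    return positions
```

This also shows why calibration matters: a constant sensor bias grows quadratically in the position, so an uncorrected offset of only 0.01 m/s² already drifts by half a metre after ten seconds.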
Figure II-3 : Posture Acquisition Glove with inertial devices detecting linear and angular motion, rotation and acceleration of the hand [27]
2.2.2 Magnetic tracking [26]
This system uses coils on the emitter and the receiver to detect the magnetic flux between them and compute the position of the fingers. The tracking has very good accuracy but is more expensive than the other solutions and is vulnerable to magnetic or electrical fields. The “cave” built by Aalborg University used this technology to track finger gestures.
2.2.3 Image tracking
This system uses a camera to film the fingers of the hand. The user wears a glove with specific images, such as bar codes or shapes, fixed on the tip of each finger. The captured pictures are then processed: the tracking software first finds the position of the hand in the grabbed image, matches each marker against its image library, and computes the exact position of the fingers. This system has good accuracy and is cheap. By adding cameras at different angles, the software can retrieve “stereo” pictures of the scene and give a 3D position of the fingers. However, this solution works only in a controlled environment (with little interfering infrared light) and only if the user is not too far from the camera.
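The matching step can be sketched as a sliding-window comparison: each known marker pattern (template) is slid over the grabbed image, and the position minimizing the sum of squared differences wins. This is a toy illustration with invented patterns, not the algorithm from [28].

```python
def best_match(image, template):
    """Return the (row, col) of the best template position in a
    2-D list-of-lists grey-level image, by minimum sum of squared
    differences over all placements."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best, best_pos = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum((image[r + i][c + j] - template[i][j]) ** 2
                      for i in range(th) for j in range(tw))
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos
```

Production systems replace this brute-force scan with normalized cross-correlation to tolerate lighting changes, but the principle is the same.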
Figure II-4 : Image based data glove [28]
2.2.4 Light tracking
This solution is quite similar to the image tracking system: a camera films the hand of the user, who wears a glove where light sources such as diodes replace the images or shapes on the fingertips. The tracking software has to find the positions of the lights in the grabbed pictures and compute the exact positions of the diodes with pairing methods. As visible-spectrum light depends too much on the ambient light of the room, infrared sources are used most of the time. This solution can track very complex gestures of the user, as in the “Inverse Kinematic Infrared Optical Finger Tracker” shown in Figure II-5. However, the hand of the user always has to remain in the field of view of the camera. The solution is also quite cheap to develop.
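The first step, finding the light positions in a grabbed picture, can be sketched as thresholding followed by connected-component centroids. The code below is a minimal illustration under assumed inputs (a 2-D list of grey levels and an arbitrary threshold), not this project’s implementation.

```python
THRESHOLD = 200  # assumed intensity cut-off for the infrared spots

def find_spots(frame):
    """Return the (x, y) centroids of connected bright regions
    (4-connectivity) in a 2-D list-of-lists grey-level frame."""
    h, w = len(frame), len(frame[0])
    seen = [[False] * w for _ in range(h)]
    spots = []
    for y in range(h):
        for x in range(w):
            if frame[y][x] >= THRESHOLD and not seen[y][x]:
                # flood-fill this bright blob and collect its pixels
                stack, pixels = [(y, x)], []
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and frame[ny][nx] >= THRESHOLD):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                spots.append((cx, cy))
    return spots
```

Each returned centroid is one candidate diode; the pairing methods mentioned above then assign the centroids to fingers across frames.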
Figure II-5 : “Inverse Kinematic Infrared Optical Finger Tracker” project [29]
In order to implement finger tracking in the project, we use the light tracking method. This solution is cheap, and the excellent accuracy of the magnetic tracking system is not needed: as the objects displayed on the screen are quite big, the user can select them without problems with a pointer accuracy of ±3 pixels. Furthermore, the infrared light tracking method is quite similar to the infrared head tracking explained in the previous part, so realization techniques can be shared and the development time reduced.
3 Installation layout
In order to show the immersion provided by the system, the user is allowed to move over a large surface (around 1.5 m × 2 m). These surface parameters determine all the other settings of the installation:
• Minimal depth between the user and the screen.
• Minimal range of the infrared diodes.
• Size and resolution of the screen.
• Coordinate accuracy needed.
3.1 Minimal depth between the user and the screen
Knowing the field of view of the wiimotes (≈44° measured for the project wiimotes) and the range width (2 m wanted; we use 2.5 meters for the calculation), we compute the minimum depth between the user and the wiimote using the following formula: