CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

Utilizing Smart Watch Motion Sensors in Human Computer Interaction via Pattern Detection

A thesis submitted in partial fulfillment of the requirements

For the degree of Master of Science in Computer Science

By

Danial Moazen

May 2015


SIGNATURE PAGE

The thesis of Danial Moazen is approved:

______

Professor Vahab Pournaghshband Date

______

Professor Gloria Melara Date

______

Professor Ani Nahapetian, Chair Date

California State University, Northridge


Contents

SIGNATURE PAGE ...... ii

LIST OF TABLES ...... v

LIST OF FIGURES ...... vii

ABSTRACT ...... viii

1. INTRODUCTION ...... 1

2. RELATED WORK ...... 3

2.1. Wearable Computing ...... 3

2.1.1. Smart Watch...... 4

2.2. Gesture Recognition...... 5

2.2.1. Letter Recognition ...... 5

2.2.2. Challenges of Gesture Recognition ...... 6

3. SYSTEM OVERVIEW ...... 8

3.1. Hardware Overview ...... 8

3.2. Software Overview ...... 10

3.3. Weighted Moving Average Algorithm ...... 13

3.4. Angle Calculation ...... 14

3.5. Session Detection Algorithm ...... 16

3.6. Letter Detection with Machine Learning ...... 17

4. APPROACH ...... 18


4.1. Inertial Navigation ...... 18

4.2. Pattern Detection ...... 21

4.2.1 HMM...... 21

4.2.2 DTW ...... 23

5. RESULTS ...... 24

6. CONCLUSION ...... 29

7. REFERENCES ...... 30


LIST OF TABLES

Table 3.1: Comparison of the devices' specifications used in this project; Nexus 7 as the handheld device and Gear Live as the wearable device...... 9

Table 3.2: Comparison of the number of data updates (average of 5 measurements) transferred between devices with a continuous connection versus a connection active only during the sessions. The words are the 5 most-searched food-related words in the U.S. in 2014...... 17

Table 4.1: Gyroscope average data of 10,000 samples...... 19

Table 4.2: Linear accelerometer average data of 10,000 samples...... 19

Table 5.1: Detection accuracy of the HMM algorithm for 5 non-similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter...... 24

Table 5.2: Detection accuracy of the DTW algorithm for 5 non-similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter...... 25

Table 5.3: Detection accuracy of the HMM algorithm for 5 non-similar letters. The size of the letters is 6 inches by 6 inches. The test is done 100 times for each letter...... 25

Table 5.4: Detection accuracy of the DTW algorithm for 5 non-similar letters. The size of the letters is 6 inches by 6 inches. The test is done 100 times for each letter...... 26

Table 5.5: Detection accuracy of the HMM algorithm for 5 similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter...... 26

Table 5.6: Detection accuracy of the DTW algorithm for 5 similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter...... 27


Table 5.7: Accuracy percentage of the HMM algorithm for the detection of the lowercase English alphabet. The test is repeated 20 times for each letter. The size of the letters is 12 inches by 12 inches...... 28


LIST OF FIGURES

Figure 3.1: Hardware schema; handheld device, wearable device and the Bluetooth connection...... 8

Figure 3.2: Software overview; Wear application and Mobile application...... 10

Figure 3.3: Comparing the signal with and without the weighted moving average filter. The blue line shows the acceleration signal on the z-axis while writing the letter a without any filter applied, while the red line shows the same signal for the very same movement with the filter applied...... 14

Figure 3.4: Smart watch's fixed frame of reference...... 14

Figure 3.5: Comparing the acceleration data before and after applying the rotation of the frame of reference while the device is moving up and down repeatedly and rotating 90 degrees around the z-axis. 3.5.a, on the top, shows the acceleration without applying the rotation, and 3.5.b, on the bottom, shows the acceleration after applying the rotation...... 16

Figure 4.1: Comparing the velocity signal before and after applying the velocity filter. Figure 4.1.a, on the right, is the velocity signal before applying the velocity filter, while Figure 4.1.b is the velocity signal after applying the velocity filter...... 20


ABSTRACT

Utilizing Smart Watch Motion Sensors in Human Computer Interaction via Pattern Detection

By

Danial Moazen

Master of Science in Computer Science

Wearable computing is one of the fastest growing technologies today. Smart watches are poised to take over at least half of the wearable devices market in the near future. Smart watch screen size, however, is a limiting factor for growth, as it restricts practical text input. On the other hand, wearable devices have features, such as constant user interaction and hands-free, heads-up operation, which pave the way for gesture-recognition methods of text entry. This thesis proposes a new text input method for smart watches which utilizes motion sensor data and machine learning approaches to detect letters written in the air by the user. This method is less computationally intensive and less expensive than computer vision approaches. It is also unaffected by the lighting conditions that limit computer vision solutions. The AirDraw system prototype developed to test this approach is presented, along with experimental results approaching 71% accuracy.


1. INTRODUCTION

Wearable computing is one of the fastest growing areas among today's technologies [23]. According to a Yahoo Finance article published in April 2015 [24], based on a Business Insider article [25], the global wearable market will grow at the dramatic rate of 35% annually for the next 5 years. Among the different types of wearable devices, such as glasses, vests, and helmets, wrist-based wearable devices make up a great proportion of today's market. Forecasts suggest that smart watches, one type of wrist-based wearable device, will alone take over half of the wearable computing market by 2018 [11][24][25].

In the area of gesture detection and recognition, most existing systems rely on computer vision [8]. Computer vision approaches require various steps, such as image enhancement and image segmentation, to be practical [9]. The need for multiple cameras to build a 3D model, for optimal lighting, and for intensive image processing are some of the drawbacks of these systems [8]. These shortfalls call for alternative approaches. One such alternative is leveraging motion sensors such as accelerometers.

Wearable computing is a very promising area for gesture detection user interfaces. Constant user interaction and hands-free, heads-up operation are among its promising features. Comparing smart phones and smart watches clarifies this point. Smart phones are often in hand only during direct interaction; otherwise, they are placed on a table or in a pocket. Smart watches, on the other hand, stay on the wrist during most of the day, even when there is no direct interaction with the user. Therefore, the watch can still be used to collect information about activities. One type of application that can take advantage of this constant interaction is health monitoring. Another is what is proposed in this thesis: writing in the air, hands-free and heads-up.

Many scenarios come to mind for hands-free, heads-up writing applications. Sandip Agrawal et al. survey some of them [1]. Making quick notes while waiting at a red light, interacting with a system while providing first aid, and avoiding the use of small keyboards are among these scenarios.

Besides these applications, this method can serve as the primary text input method for smart watches. One of the limiting factors in smart watches' market growth is their small screens and the resulting difficulty of interacting with them [14].

This thesis proposes a new text input method for smart watches in the form of an Android Wear application, AirDraw. The method utilizes the motion sensors on the smart watch as the source of 3D arm movement information, then matches the motion pattern against known motion patterns to detect the letter the user intended to write in the air.

The system uses a machine learning algorithm to find a match for the pattern. The two machine learning algorithms implemented in this system are the Hidden Markov Model and Dynamic Time Warping. These approaches are compared with one another, and results for each are provided in the results section.


2. RELATED WORK

This chapter is an overview of wearable technologies and existing gesture recognition systems.

2.1. Wearable Computing

Wearable computing is one of the successful attempts in the field of ubiquitous computing. It brings computers into people's everyday lives. The history of wearable computing goes back to 1955, when Edward O. Thorp invented the first wearable computer, built to predict roulette [10]. Some of the advantages of wearable computers are hands-free, heads-up operation, mobility, and constant user interaction. Not only do these characteristics help resolve existing problems, but they also introduce new opportunities and applications.

Fitbit's newest product, the Surge, is a wrist-based device. Taking advantage of a 3-axis accelerometer, a gyroscope, a compass, an ambient light sensor, and an optical heart-rate sensor, it is able to detect, recognize, and measure several daily activities, including steps and sleep patterns [12].

Abhinav Parate et al. try to distinguish the smoking gesture from other gestures, some of which, such as food intake, are very similar to smoking. Their project utilizes a low-power 9-axis inertial measurement unit (IMU) on a wristband. The IMU provides the 3D orientation of the wrist using data from the accelerometer, gyroscope, and compass [2]. Yujie Dong et al. also built a wrist-based wearable device, using an expensive sensor ($2,000 US), to measure food intake [20]. The advantage of the approach proposed in this thesis is that it uses an off-the-shelf device, which is cheaper, easier to access, and also usable for other applications beyond the purpose of this thesis.

2.1.1. Smart Watch

Smart watches hold a significant place among wearable devices as they grow more popular [11, 13]. Major tech companies such as Samsung, LG, Sony, and recently Apple produce their own smart watches, each with certain capabilities and functionalities.

The term "smart" comes from the fact that these devices can sense their environment and act accordingly. In most cases, however, the action is limited to collecting and/or sending data.

The user interaction difficulties caused by smart watches' small screens have encouraged new UI approaches. One of these has been proposed by Andreas Komninos and Mark Dunlop [14]. Their approach relies on taps on the touch screen as the basic text input method. They proposed a new keyboard layout with only 6 buttons, so the buttons can be large enough to use on very small screens such as those of smart watches. Some handwriting recognition approaches, like Unistrokes [15] and Graffiti [16], have also tried to make text input easier. None of these approaches is hands-free or heads-up. In contrast, the method proposed in this thesis is a hands-free, heads-up approach to text input that tracks the user's arm movement, freeing the user from interacting with the device directly.

Agrawal et al. leverage the built-in accelerometer in mobile phones to capture information written in the air [1]. They introduce PhonePoint Pen as a new input method for mobile phones. The need for new input methods is even more significant for smart watches because of their smaller touch screens. AirDraw introduces such an input method for smart watches.

2.2. Gesture Recognition

A gesture is "a motion of the body that conveys information" [17]. Gesture recognition is a simple daily activity for humans: we receive the data related to a gesture from our environment, in some form such as sound or image, match it to a known gesture almost immediately, and then act accordingly. Although the basics are the same for computers, the whole process is one of the most computationally intensive and time-consuming ones. It has two major phases: the training phase, where the system is given the information it needs to learn the gesture, and the classification phase, where the system determines how close the input is to the learned gestures.

The Hidden Markov Model (HMM) is very popular in gesture recognition applications [18]. Byung-woo Min et al. utilize a 4-state HMM for hand gesture recognition in their system [19]. The AirDraw system utilizes a variant of HMM that determines the number of states dynamically.

2.2.1. Letter Recognition

The letter recognition in this thesis is a specific type of gesture recognition. When a user writes a letter in the air, the movement of the user's arm can be seen as a gesture. In this scenario, the data from the entire movement is used to train the algorithm and later to classify the sample. Another approach, proposed in the PhonePoint Pen system [1], treats each written character as a set of strokes and tries to detect the strokes; it then detects the letters by putting the detected strokes together.


David Goldberg and Cate Richardson proposed a unistrokes approach [15]. Their system defines a stroke for each letter. Since the system does not use the actual letter shapes, the user needs to learn in advance how to interact with it. Our proposed system uses the natural form of the letters, so the only thing the user needs in order to interact with the system is a basic knowledge of writing.

2.2.2. Challenges of Gesture Recognition

"Signal variation due to orientation changes" is one of the challenges Parate et al. encountered in their project [2]. The signal coming from motion sensors varies with the user's body orientation. This happens because their calculations are based on an Earth-fixed frame of reference. To overcome this challenge, the AirDraw system bases its calculations on the device-fixed frame of reference, so that the orientation of the body cannot affect the signals.

Another challenge is "concurrent activity while smoking" [2]. Concurrent activities like running may significantly affect the motion sensors' signals. A person can perform a wide range of other activities while smoking; walking, running, driving, and interacting with other systems are just a few examples. In the case of writing, however, the range of activities the user may perform is very limited. This suggests we may be able to recognize those activities, model how they change the signals, and remove their effect from our calculations. Although this could be an interesting area for later improvement, within the scope of this project the user is assumed not to be performing any concurrent activities while using the system.


The Parate et al. project includes both gesture detection and recognition: first the system detects that something is going on, and then it tries to determine whether it is a smoking session or not [2]. AirDraw does the same. Its session detection algorithm first detects the session (the time window during which the user is writing one letter), and then AirDraw tries to recognize what letter the user has written or which shape they have drawn.


3. SYSTEM OVERVIEW

This chapter provides an overview of the AirDraw system. AirDraw's hardware component consists of two off-the-shelf devices, a smart watch and a handheld device, with a Bluetooth connection facilitating communication between them. AirDraw's software component also has two parts: the smart watch software and the handheld software. The software developed for the smart watch is referred to as Wear, and the software developed for the tablet is referred to as Mobile. Detailed explanations of these components appear in the hardware overview and software overview sections, respectively.

3.1. Hardware Overview

This section explains AirDraw's hardware component. Two off-the-shelf devices are used in this project: a smart watch as the wearable device and a tablet as the handheld device. The following paragraphs discuss the smart watch's task, the communication between the devices, and the tablet's task. The hardware schema is shown in Figure 3.1.

Figure 3.1: Hardware schema; handheld device, wearable device and the Bluetooth connection.

The data needed by the system to detect arm motion is the gravity and the linear acceleration of the device on all three axes: x, y, and z. The data collection is done by the smart watch; its accelerometer provides all the motion data the system needs.

The next step is to send the data to the handheld device, where there are more resources for data analysis and for displaying results. These resources include processing power, storage capacity, and the device's interface, such as a larger touch screen compared to the wearable device. This larger screen can be leveraged for both input and output. Table 3.1 compares the specifications of the two devices used in this project.

Table 3.1: Comparison of the devices' specifications used in this project; Nexus 7 as the handheld device and Gear Live as the wearable device.

Device             CPU       CPU cores   RAM      Screen Size   Storage
Nexus 7            1.5 GHz   Quad-core   2 GB     7 inches      16 GB
Samsung Gear Live  1.2 GHz   Quad-core   512 MB   1.2 inches    4 GB


3.2. Software Overview

The software component of AirDraw is an Android Wear application consisting of two Android applications: Wear and Mobile. Wear is the application installed on the wearable device; all data collection and filtering happen there. Mobile is installed on the handheld device, and all process-heavy tasks, such as data analysis and classification, take place in it. The communication between the two applications is facilitated by GoogleApiClient and done over a Bluetooth connection. The following paragraphs give an overview of the architecture and structure of AirDraw's software component and how it works. Figure 3.2 illustrates the software overview of the system.

Figure 3.2: Software overview; Wear application and Mobile application.


The data needed in this thesis to detect and recognize arm movement is the acceleration data on all three axes. The Wear application is responsible for collecting and filtering this data and making it accessible to the Mobile application. The raw data straight from the accelerometer is not usable, because it is not linear acceleration and it fluctuates too much. Linear acceleration can be thought of as acceleration minus gravity on all axes; it can be calculated from the acceleration data, and Android makes it available to applications [4]. But the data still needs smoothing. The weighted moving average smoothing algorithm implemented in the Wear application smooths the signals coming from the linear accelerometer. The Weighted Moving Average Algorithm section discusses the details of this filter's implementation in this project.

The other data produced by the Wear application is the angle the user's arm makes with the horizon. This angle is later used by the Mobile application to cancel the effect of the user's arm orientation on the linear acceleration data while drawing or writing in the air.

The gravity data, like linear acceleration, can also be calculated from acceleration and is available in the Android libraries [4]. The Angle Calculator in the Wear application is the function that calculates this angle: it takes the gravity data and returns the angle. The details of its implementation are discussed in the Angle Calculation section.

The last important function implemented in the Wear application is session detection. This algorithm detects writing sessions; each session is the time window in which the user writes a letter or draws a shape. A session starts with a significant change in the acceleration data and ends when there has been no significant acceleration for a while. The data is updated only during a session, to avoid sending useless data to the handheld device. The Session Detection Algorithm section gives the details of this algorithm and compares the number of data transfers with and without it.

The data synchronization between the devices is done through GoogleApiClient. When the Wear application is connected to the GoogleApiClient, Wear is ready to connect to Mobile. On the other side, a listener is implemented in the Mobile application. As long as Mobile is connected to the GoogleApiClient, the listener listens for changes to the data instance in the Wear application and, whenever a change occurs, updates the data instance in the Mobile application accordingly.

At this point, the Mobile application has the linear acceleration in the smart watch's fixed frame of reference and the angle the arm makes with the horizon during each session. It uses a rotation matrix to rotate the frame of reference around the z-axis by the angle it receives from the Wear application, canceling the effect of arm orientation. Mobile then has the rotated linear acceleration data for each session, and the only remaining step is to send it to a classifier algorithm to classify and recognize the motion. The classifier algorithms take the rotated acceleration data on the three axes for each session, compare it against the trained models, and return the nearest match. These algorithms and the way they work are explained thoroughly in the Machine Learning section.
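The thesis does not publish the rotation code itself; as a minimal sketch of the step described above, assuming the angle arrives in radians and only the x and y components need rotating (the z-axis is left untouched, as the text states), the rotation matrix application could look like:

```python
import math

def rotate_about_z(ax, ay, angle_rad):
    """Rotate an (ax, ay) acceleration sample about the z-axis by angle_rad.

    This cancels the arm-orientation effect: after rotation, up-and-down
    motion maps to y and back-and-forth motion maps to x.
    """
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    ax_rot = cos_a * ax - sin_a * ay
    ay_rot = sin_a * ax + cos_a * ay
    return ax_rot, ay_rot

# A 90-degree rotation maps acceleration that was on x entirely onto y.
print(rotate_about_z(1.0, 0.0, math.pi / 2))
```

The function name and sample-at-a-time interface are illustrative assumptions; the real application presumably applies this per sample as the data arrives.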

The following paragraph takes a quick look at the interfaces designed for the applications. The Wear application has a very simple interface: a button appears on the screen when the GoogleApiClient connects, allowing the user to start sending data from the motion sensors to the handheld device. The user can pause and restart the data transfer by tapping the button, as long as the GoogleApiClient remains connected. On the other end, the user can set up the machine learning algorithm using a simple control panel in the Mobile application. The interface helps the user train the algorithm and also shows the detected letter and the system log.

3.3. Weighted Moving Average Algorithm

The weighted moving average is the filter used in this project to smooth the signals from the sensors. The idea of a moving average algorithm is to average the fluctuating signal over time in order to obtain a smoother signal. There are two types of moving average: the simple moving average and the weighted moving average. The simple moving average takes the average of all the data received so far, whereas the weighted moving average gives more weight to certain data points. In the case of this thesis, the more recent data is the more significant, which is why the weighted moving average is used. The smoothing is done by a function implemented in the Wear application following the principles of the weighted moving average, with a weight of 0.1. Figure 3.3 shows the improvement achieved by applying this smoothing algorithm: the blue line is the original acceleration signal on the z-axis while drawing a circle, without any smoothing filter applied, and the red line shows the same signal for the very same movement with the smoothing algorithm applied.
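One plausible reading of the filter described above is an exponentially weighted moving average, where each new sample contributes the stated weight of 0.1 and the running average contributes the rest; the sketch below is written under that assumption and is not the thesis's actual code:

```python
def smooth(samples, weight=0.1):
    """Exponentially weighted moving average of a signal.

    Each incoming sample contributes `weight` (0.1 in the text) and the
    running average contributes the remainder, so the most recent sample
    always carries more weight than any single older one while high-
    frequency fluctuation is damped.
    """
    smoothed = []
    avg = samples[0]  # seed with the first sample
    for x in samples:
        avg = weight * x + (1.0 - weight) * avg
        smoothed.append(avg)
    return smoothed

noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # a deliberately jittery signal
print(smooth(noisy))  # the output fluctuates far less than the input
```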


Figure 3.3: Comparing the signal with and without the weighted moving average filter. The blue line shows the acceleration signal on the z-axis while writing the letter a without any filter applied, while the red line shows the same signal for the very same movement with the filter applied.

3.4. Angle Calculation

This algorithm is implemented in the Wear application and calculates the angle between the user's arm and the horizon. Smoothed gravity data is the input, and the output is the angle that the smart watch's x-axis (or the user's arm) makes with the horizon. Figure 3.4 shows the smart watch's fixed frame of reference and how the x-axis coincides with the user's arm.

Figure 3.4: Smart watch's fixed frame of reference.


The orientation of the arm matters because the user's arm can be in different orientations while writing or drawing, and this orientation affects the acceleration data on the x-axis and the y-axis. To cancel the effect of arm orientation, we first calculate the angle the arm makes with the horizon, and then rotate our frame of reference accordingly. Once this is done, the orientation of the arm no longer matters: up-and-down movement changes the acceleration on the y-axis, and back-and-forth movement changes the acceleration on the x-axis. Left-and-right movement always affects the acceleration on the z-axis, so we do not touch it.

In order to calculate the angle, we first need the norm of the gravity vector. In 3D this is computed as shown in Equation 3.1.

Equation 3.1: |g| = sqrt(gx^2 + gy^2 + gz^2)

With the norm in hand, we normalize the gravity on each axis by dividing that axis's gravity component by the norm. The example for x is shown in Equation 3.2.

Equation 3.2: gx_normal = gx / |g|

The angle that the x-axis makes with the horizon can then be calculated as the arctangent of gx_normal over gy_normal. The updateAngle function takes care of calculating this angle.

Every time the sensor listener method in the Wear application receives new data from the gravity sensor, it calls the updateAngle function with the smoothed gravity data, and the function returns the newly calculated angle, which is later sent to the handheld device.
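A minimal sketch of the updateAngle computation described above follows; the function name comes from the text, but everything else is an illustrative reconstruction from Equations 3.1 and 3.2, using atan2 rather than a bare arctangent for numerical robustness:

```python
import math

def update_angle(gx, gy, gz):
    """Sketch of updateAngle: normalize the gravity vector
    (Equations 3.1 and 3.2) and take the arctangent of the normalized
    x component over the normalized y component."""
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)  # Equation 3.1
    gx_n, gy_n = gx / norm, gy / norm              # Equation 3.2
    return math.atan2(gx_n, gy_n)  # angle of the x-axis with the horizon

# With gravity entirely on the y-axis the computed angle is zero;
# equal x and y components give 45 degrees (pi/4).
print(update_angle(0.0, 9.81, 0.0))
print(update_angle(9.81, 9.81, 0.0))
```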

Figure 3.5 shows the effect of the rotation on the x and y signals. Both Figure 3.5.a and Figure 3.5.b show the acceleration data when the wearable device is moving up and down repeatedly while rotating 90 degrees around the z-axis. Figure 3.5.a shows the data before rotation and Figure 3.5.b shows the data after rotation. Note how the effect of the device's orientation is canceled on the x-axis and added to the y-axis. This addition and removal is in the reverse direction when the device is moving back and forth.

Figure 3.5: Comparing the acceleration data before and after applying the rotation of the frame of reference while the device is moving up and down repeatedly and rotating 90 degrees around the z-axis. 3.5.a, on the top, shows the acceleration without applying the rotation of the frame of reference, and 3.5.b, on the bottom, shows the acceleration after applying the rotation.

3.5. Session Detection Algorithm

Communication between the devices is expensive in terms of power. To minimize this communication and data synchronization, session detection is done in the Wear application, and synchronization happens only during sessions. The time window in which the user is writing a letter or drawing a shape is referred to as a session. The indicator that the user is about to start a session is the acceleration signal: if the acceleration exceeds 1 m/s2, the session starts; once the acceleration stays below 1 m/s2 for more than 400 milliseconds, the session ends. Table 3.2 compares the average number of data transfers between the devices needed to write certain words with a continuous connection versus a connection active only during sessions. The data for each word is the average of five samplings. The words are the 5 most-searched food-related words in the U.S. in 2014 according to Google Trends [7].

Table 3.2: Comparing the number of data updates (average of 5 measurements) transferred between devices with a continuous connection versus a connection active only during the sessions. The words are the 5 most-searched food-related words in the U.S. in 2014.

Words                               Pizza   Chicken   Cake    Wine    Coffee
Continuous connection               281.4   305.4     228.8   201.2   297
Connection only during the session  186     198.8     118.2   118.2   155.8
Percentage of unused data           33.9%   34.9%     48.3%   41.5%   47.5%
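The session detection rule described in this section (start when acceleration exceeds 1 m/s2, end after 400 ms of quiet) can be sketched as a small state machine. The thesis does not publish its implementation, so the function name and the input format here are assumptions:

```python
START_THRESHOLD = 1.0  # m/s^2, from the text
END_TIMEOUT_MS = 400   # ms below threshold before the session ends

def detect_sessions(samples):
    """Split a stream of (timestamp_ms, acceleration_magnitude) pairs
    into sessions.

    A session opens when the magnitude exceeds the threshold and closes
    once it has stayed below the threshold for the timeout. Returns a
    list of (start_ms, end_ms) pairs for the closed sessions.
    """
    sessions = []
    start = quiet_since = None
    for t, mag in samples:
        if mag > START_THRESHOLD:
            if start is None:
                start = t       # session opens on the first loud sample
            quiet_since = None  # any loud sample resets the quiet timer
        elif start is not None:
            if quiet_since is None:
                quiet_since = t
            elif t - quiet_since >= END_TIMEOUT_MS:
                sessions.append((start, quiet_since))
                start = quiet_since = None
    return sessions

stream = [(0, 0.2), (50, 1.5), (100, 2.0), (150, 0.3),
          (300, 0.2), (600, 0.1), (650, 1.8), (700, 0.4)]
print(detect_sessions(stream))  # one closed session: [(50, 150)]
```

Real sensor callbacks would feed this one sample at a time; the batch form here just keeps the sketch self-contained.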

3.6. Letter Detection with Machine Learning

This section introduces the two pattern detection algorithms used in the AirDraw system: the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW). The user can choose either one to work with the system. The HMM training phase is done before using the system, by training the algorithm with several instances of the data. DTW, on the other hand, uses only one instance of the data for training, which can be provided just before using the system.

The detailed explanation of how these algorithms are implemented is in chapter 4, under the Pattern Detection section.
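As an illustration of the template matching that a DTW classifier performs on the per-axis acceleration signals, the textbook dynamic programming formulation of the DTW distance is sketched below. This is the standard algorithm, not the thesis's actual code, and the 1-D signals are illustrative:

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j]; each cell
    pays the local difference plus the best of the three predecessor
    alignments (stretch a, stretch b, or advance both).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

template = [0.0, 1.0, 2.0, 1.0, 0.0]
sample = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]  # same shape, time-warped
print(dtw_distance(template, sample))  # 0.0: warping absorbs the stretch
```

A classifier in this style would compute this distance against one stored template per letter and return the letter with the smallest distance.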


4. APPROACH

4.1. Inertial Navigation

This section discusses the first approach taken to detect the letters written by the user in the air. The approach utilizes the principles of inertial navigation. Since its results were not satisfying, another approach (letter detection with machine learning) was pursued, which improved the results by a great amount.

The inertial navigation is done in the Mobile application. Integrating acceleration gives velocity, and integrating velocity gives position. This double integration magnifies the error [5]; "this is the obvious cause of drift in the tracked position," says Oliver J. Woodman in [5]. There are two major errors associated with every MEMS accelerometer and MEMS gyroscope: bias error and random walk [5]. Bias error is the difference between the real value and the value reported by the sensor. In the case of the gyroscope, we can easily average this error while the device is still and remove it from the calculation. Each cell of Table 4.1 is the average of 10,000 samples of gyroscope data collected while the device was motionless. The sampling was done four times, with a different device orientation each time; the rows of the table represent the samplings in those orientations. The results are almost constant across the different orientations, so we can simply subtract this average from our calculations to remove the gyroscope bias error. This is not the case for the linear acceleration data: Table 4.2 shows how different the results are for different orientations. In conclusion, the bias error of the accelerometer depends tightly on the orientation of the device and is therefore not easy to remove.
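The gyroscope bias removal just described can be sketched as follows; the function and data layout are illustrative, not taken from the thesis:

```python
def remove_gyro_bias(readings, stationary):
    """Estimate the per-axis gyroscope bias as the mean of samples taken
    while the device is motionless (as in Table 4.1) and subtract it
    from subsequent readings.

    This works for the gyroscope because its bias is nearly constant
    across orientations; the text notes it does NOT work for the
    accelerometer, whose bias depends on orientation.
    """
    n = len(stationary)
    bias = [sum(s[axis] for s in stationary) / n for axis in range(3)]
    return [tuple(r[axis] - bias[axis] for axis in range(3))
            for r in readings]

# Illustrative stationary data close to the Table 4.1 averages.
still = [(0.0104, -0.0036, 0.0017)] * 100
corrected = remove_gyro_bias([(0.0104, -0.0036, 0.0017)], still)
print(corrected)  # ~ (0, 0, 0) once the bias is subtracted
```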


Table 4.1: Gyroscope average data of 10,000 samples.

     Gyro. x        Gyro. y         Gyro. z
1    0.010410108    -0.0036096694   0.001742135
2    0.009559085    -0.0038097943   0.0020044514
3    0.009556783    -0.0037695186   0.002265009
4    0.0105297975   -0.0030996492   0.0019788065

Table 4.2: Linear accelerometer average data of 10,000 samples.

     Linear Acc. x   Linear Acc. y    Linear Acc. z
1    -0.15858515     0.10599122       -0.22445583
2    0.02882182      0.06583694       -0.25512913
3    0.021968732     0.10870765       -0.16708066
4    0.06863882      -0.0021310293    -0.48662505

One of the filters designed and implemented in the course of this project is the velocity filter. The velocity filter borrows its idea from the Kalman filter, which essentially corrects the current error in order to avoid propagating it into the next calculation [6]. Kalman filter implementations usually use an external source of information to correct the error; in positioning, this external source can be GPS, for example [3]. In this project, the assumption is that the movement of a human arm writing a letter or drawing a single shape happens within a limited time window (a session). Therefore, we can assume that the velocity of the hand becomes zero at the end of that window, and reset the velocity to zero at the end of each session to avoid velocity random walk. A small amount of random walk error remains within a session, but given that we have no external data source with which to correct the position, this filter enhances the accuracy of our calculation [1]. Figure 4.1 shows the velocity signals before and after applying the velocity filter. It can easily be seen in Figure 4.1.a that a random walk error is associated with the velocity, which is highly mitigated by applying the velocity filter (Figure 4.1.b). The cause of this error is that each time a new position is calculated, we must use the current velocity, which has been cumulatively calculated from previous velocities. In principle, the same amount of force that starts a movement should be applied in the opposite direction to stop it, but since the sensors are not precise enough, a residual always remains, and it accumulates into the random walk.

Figure 4.1: Comparing the velocity signal before and after applying the velocity filter. Figure 4.1.a, on the right, is the velocity signal before applying the velocity filter, while Figure 4.1.b is the velocity signal after applying the velocity filter.

Since human arm movement has a very wide range of acceleration and changes direction very quickly, we need to define a restricted problem. Then, with the help of some specifically designed filters, we may be able to do the positioning.
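The zero-velocity reset described above can be sketched as a small integrator that clears its velocity state when a writing session ends. This is a minimal illustration under the session assumption stated above; the class and method names are hypothetical, not from the AirDraw source.

```java
/** Minimal sketch of the velocity filter: acceleration is integrated into
 *  velocity and position each sample, and the velocity is forced back to
 *  zero when a writing session ends. Names are illustrative. */
public class VelocityFilter {
    private final double[] velocity = new double[3];
    private final double[] position = new double[3];

    /** Integrate one acceleration sample (m/s^2) over dt seconds. */
    public void step(double[] accel, double dt) {
        for (int axis = 0; axis < 3; axis++) {
            velocity[axis] += accel[axis] * dt;    // v = v + a*dt
            position[axis] += velocity[axis] * dt; // p = p + v*dt
        }
    }

    /** The hand is assumed to be still between sessions, so any residual
     *  velocity at the end of a session is treated as drift and cleared. */
    public void endSession() {
        for (int axis = 0; axis < 3; axis++) velocity[axis] = 0.0;
    }

    public double[] getVelocity() { return velocity.clone(); }
}
```

The reset bounds the velocity random walk to a single session instead of letting it accumulate across the whole writing episode.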

Because of the shortcomings of this approach, and the fact that positioning errors affect the pattern recognition approach less, pattern recognition is more suitable for this application.

The next section describes the implementation of the system with regard to pattern detection, in order to recognize the letters written by the user in the air.


4.2. Pattern Detection

This section is an overview of how pattern detection is done in AirDraw. Pattern recognition is a branch of machine learning in which a set of labeled training data is compared against newly produced data in order to recognize the new data (supervised learning). The type of pattern detection that works without labeled training data (unsupervised learning) is out of the scope of this thesis.

Two pattern detection algorithms are used in this system: Hidden Markov Model (HMM) and Dynamic Time Warping (DTW). The rest of this section presents the implementation details of each of them, and in Chapter 5 you can see a comparison of how accurately the two algorithms detect the letters.

4.2.1 HMM

Hidden Markov Model is a very popular algorithm in the area of gesture recognition [18].

It can be explained through the urn problem with replacement [21]. There are many urns in a room, and each contains many types of balls. The urns are hidden from the observer, who only sees the sequence of balls that have been taken out of the urns; based on this information, the observer can make predictions, such as the probability of the urn from which the next ball will come. The basics of the Markov model are derived from the Markov process, whose prediction depends solely on the process's present state and not on its full history.

Joel Rarang implemented an HMM with a dynamic number of states in his thesis [22]. The HMM implementation in this thesis uses Rarang's classes to detect the patterns.


The training phase for this algorithm starts with collecting examples of writing the letters with AirDraw. We collected four different examples for each letter; these examples differ in size, speed, and form of writing. Each example is made by taking the acceleration data and converting it to discrete values. By putting the values from each session together, we make a string. These strings are later used by the training algorithm in Rarang's system to produce the models. Finally, the models are used by the classifier in AirDraw.
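The discretization step just described can be sketched as follows: each acceleration sample is binned into one of a few symbols, and the symbols for a session are concatenated into a training string. The bin edges and the symbol alphabet below are assumptions for illustration, not values taken from the AirDraw source.

```java
/** Illustrative sketch of the discretization step: each acceleration sample
 *  is binned into a symbol, and a session's symbols form a training string.
 *  The thresholds and alphabet are hypothetical. */
public class Discretizer {
    // Hypothetical three-symbol alphabet: strong negative, weak, strong positive.
    public static char toSymbol(double accel) {
        if (accel < -1.0) return 'a';
        if (accel <= 1.0) return 'b';
        return 'c';
    }

    /** Build the per-axis observation string for one writing session. */
    public static String sessionString(double[] axisSamples) {
        StringBuilder sb = new StringBuilder();
        for (double a : axisSamples) sb.append(toSymbol(a));
        return sb.toString();
    }
}
```

One such string is produced per axis per session, and these strings are what the HMM training consumes.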

The classification phase runs at the end of each session, so that the user immediately sees what he or she has written before proceeding to the next letter. During each session, the acceleration data is converted to discrete values and saved in a separate array for each axis. At the end of each session, the getClosestLetterWithHMM function returns the letter closest to the movement, using the data recorded during the session.

The getClosestLetterWithHMM function iterates through the alphabet (or a limited subset of it), compares the model for each letter with the recently produced pattern, and calculates the probability that the letter matches the pattern. The probability is calculated separately for each axis, and the per-axis probabilities are then multiplied together to make up the overall probability. In the end, the most probable letter is chosen as the guessed letter and is shown on the screen.

The process of comparing the data on each axis with the models starts by making an instance of the ForwardAlgo class. To make the instance, we need to provide the constructor with the model and the recent pattern from the user's arm movement. Models are made from JSON objects that were produced in the training session in Rarang's system from the examples for each letter on each axis. These JSON objects are read from separate files saved on the system and passed to the constructor of the HMM class in order to make the models. A method of ForwardAlgo then provides us with the probability.
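The per-letter scoring loop described above (per-axis probabilities multiplied into an overall score, highest score wins) can be sketched as follows. The probability values stand in for the output of Rarang's ForwardAlgo, whose exact interface is not reproduced here; all names are illustrative.

```java
import java.util.Map;

/** Sketch of the per-letter scoring in getClosestLetterWithHMM: the
 *  probability of the observed pattern is computed per axis, multiplied
 *  into an overall score, and the highest-scoring letter wins. The
 *  probabilities stand in for ForwardAlgo output. */
public class HmmClassifier {
    /** perAxisProb.get(letter) holds {P(x-pattern), P(y-pattern), P(z-pattern)}. */
    public static char closestLetter(Map<Character, double[]> perAxisProb) {
        char best = '?';
        double bestScore = -1.0;
        for (Map.Entry<Character, double[]> e : perAxisProb.entrySet()) {
            double score = 1.0;
            for (double p : e.getValue()) score *= p; // combine the three axes
            if (score > bestScore) { bestScore = score; best = e.getKey(); }
        }
        return best;
    }
}
```

Multiplying the per-axis probabilities treats the three axes as independent evidence for the same letter.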

4.2.2 DTW

The Dynamic Time Warping algorithm compares two time-indexed sequences in terms of similarity. Any data that can be converted into a linear sequence can be compared with DTW, and the fact that the acceleration data on each axis is a linear sequence of numbers makes it a perfect fit for this algorithm. When DTW compares two signals, it returns a distance indicating how far apart or close together they are in terms of similarity; the observer can then use that number to decide whether it is a match.
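The distance DTW returns can be computed with the textbook dynamic programming recurrence, sketched below for two 1-D signals such as per-axis acceleration sequences. This is the standard formulation, not necessarily the exact implementation used in AirDraw.

```java
/** Minimal sketch of the DTW distance between two 1-D signals, using the
 *  classic dynamic programming recurrence with absolute-difference cost. */
public class Dtw {
    public static double distance(double[] s, double[] t) {
        int n = s.length, m = t.length;
        double[][] d = new double[n + 1][m + 1];
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= m; j++)
                d[i][j] = Double.POSITIVE_INFINITY;
        d[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(s[i - 1] - t[j - 1]);
                // Extend the cheapest of the three predecessor alignments.
                d[i][j] = cost + Math.min(d[i - 1][j - 1],
                          Math.min(d[i - 1][j], d[i][j - 1]));
            }
        }
        return d[n][m];
    }
}
```

Identical signals yield a distance of zero, and the distance grows as the signals diverge, which is what makes it usable as a match score.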

Before the user starts using the system to write in the air, the algorithm must be trained. DTW's training phase is so simple that we could implement it in the mobile application. The user chooses the training option from the dashboard and starts to train the algorithm: the user writes all the letters that are going to be used and labels them one by one. At the end of each session, AirDraw calls the trainDTW function to save the acceleration data on the x-axis, y-axis, and z-axis for each letter, to be used later to detect the letters in the classification phase.

In this approach, we also do the comparison on each axis separately to detect the letters; the total distance is calculated by adding the distances for the x-axis, y-axis, and z-axis. When the user writes a letter in the air, the DTW classifier compares the acceleration data of the newly written letter with the data saved in the training phase for each letter and chooses the closest letter, that is, the letter with the minimum total distance.
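The classification step just described can be sketched as follows: each trained letter's total distance is the sum of its per-axis DTW distances, and the letter with the minimum total wins. The per-axis distances are assumed to be precomputed; the names here are illustrative, not from the AirDraw source.

```java
import java.util.Map;

/** Sketch of DTW classification: sum the per-axis DTW distances for each
 *  candidate letter and pick the letter with the minimum total. */
public class DtwClassifier {
    /** perAxisDist.get(letter) holds {x-distance, y-distance, z-distance}. */
    public static char closestLetter(Map<Character, double[]> perAxisDist) {
        char best = '?';
        double bestTotal = Double.POSITIVE_INFINITY;
        for (Map.Entry<Character, double[]> e : perAxisDist.entrySet()) {
            double total = 0.0;
            for (double d : e.getValue()) total += d; // sum x, y, z distances
            if (total < bestTotal) { bestTotal = total; best = e.getKey(); }
        }
        return best;
    }
}
```

Summing distances plays the same role here that multiplying probabilities plays in the HMM classifier, except that smaller is better.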


5. RESULTS

This chapter presents the test results of the AirDraw system. All the tests were done by one person (the author), with the smart watch worn on the author's dominant (right) hand. In all the tables in this chapter, the first column shows the letter written in the air, and the remaining columns give the percentages of the resulting letters shown on the screen. Table 5.7 shows the results of repeating the test 20 times for each letter; the rest of the tables are the results of repeating the test 100 times for each letter.

The test for each letter was completed before starting the next one, and it took 1.5 seconds on average to write each letter. The training phase was done once for the HMM algorithm, but for the DTW algorithm the training phase was done anew for each test.

Table 5.1 illustrates the results for 5 letters (a, b, j, w, and z) using the HMM algorithm. These letters were chosen because they are distinct from one another, so the data can show how the algorithm works for letters with different forms of writing. Table 5.2 shows the same data for the DTW algorithm. The size of the letters written in the air is 12 inches by 12 inches.

Table 5.1: Detection accuracy of the HMM algorithm for 5 non-similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter.

a b j w z

a 90% 5% 0% 5% 0%

b 5% 75% 20% 0% 0%

j 0% 30% 67% 3% 0%

w 10% 3% 3% 84% 0%

z 6% 20% 4% 0% 70%


Table 5.2: Detection accuracy of the DTW algorithm for 5 non-similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter.

a b j w z

a 100% 0% 0% 0% 0%

b 5% 95% 0% 0% 0%

j 0% 4% 84% 10% 2%

w 0% 10% 16% 74% 0%

z 0% 0% 0% 4% 96%

As seen in Table 5.1 and Table 5.2, DTW is more accurate than HMM at detecting the non-similar letters.

When we reduce the range of the arm movement (the size of the letters) to 6 inches by 6 inches, Table 5.3 shows that for some letters, such as z and j, the accuracy of the HMM algorithm drops dramatically.

Table 5.3: Detection accuracy of the HMM algorithm for 5 non-similar letters. The size of the letters is 6 inches by 6 inches. The test is done 100 times for each letter.

a b j w z

a 100% 0% 0% 0% 0%

b 34% 61% 4% 1% 0%

j 9% 27% 46% 18% 0%

w 18% 6% 6% 70% 0%

z 16% 52% 22% 0% 10%


Table 5.4 shows that reducing the size of the letters does not necessarily lower the accuracy of the DTW algorithm, unlike the HMM algorithm.

Table 5.4: Detection accuracy of the DTW algorithm for 5 non-similar letters. The size of the letters is 6 inches by 6 inches. The test is done 100 times for each letter.

a b j w z

a 96% 4% 0% 0% 0%

b 22% 77% 0% 0% 1%

j 0% 12% 70% 18% 0%

w 0% 6% 0% 94% 0%

z 0% 0% 0% 14% 86%

A set of 5 letters was chosen to test the performance of the HMM and DTW algorithms on similar letters. The letters are a, d, g, q, and u; they were chosen because of the similarity in their forms of writing.

Table 5.5: Detection accuracy of the HMM algorithm for 5 similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter.

a d g q u

a 76% 24% 0% 0% 0%

d 73% 13% 14% 0% 0%

g 59% 41% 0% 0% 0%

q 55% 44% 1% 0% 0%

u 56% 44% 0% 0% 0%


Table 5.6: Detection accuracy of the DTW algorithm for 5 similar letters. The size of the letters is 12 inches by 12 inches. The test is done 100 times for each letter.

a d g q u

a 54% 0% 19% 6% 11%

d 0% 100% 0% 0% 0%

g 0% 17% 18% 65% 0%

q 3% 0% 16% 81% 0%

u 0% 0% 0% 0% 100%

Comparing the results of Table 5.5 and Table 5.6 shows that the DTW algorithm also performs better at detecting the similar letters.

Since the data indicate that the overall performance of the DTW algorithm is more accurate than that of HMM, we tested all the lowercase English letters with DTW. Table 5.7 presents the results of this test.


Table 5.7: Accuracy percentage of the DTW algorithm for the detection of the lowercase English alphabet. The test is repeated 20 times for each letter. The size of the letters is 12 inches by 12 inches.

   a b c d e f g h i j k l m n o p q r s t u v w x y z
a  90 10
b  80 20 10
c  60 5 35
d  60 15 5 5 5 10
e  45 10 15 15 15
f  85 15
g  20 80
h  80 5 10 15
i  95 5
j  50 45 5
k  5 35 60
l  85 15
m  85 15
n  60 10 30
o  5 5 80 5 5
p  5 5 85 5
q  15 5 35 45
r  5 80 10 5
s  5 25 5 5 5 55
t  10 5 5 80
u  15 5 35 45
v  20 35 55
w  100
x  40 60
y  15 10 5 5 5 10 50
z  15 85


6. CONCLUSION

This thesis proposed a new method of input for smart watches. The proposed method uses the acceleration data provided by the smart watch's motion sensors as the input to the system. The acceleration data is smoothed with the Moving Average algorithm, and the Angle Calculation algorithm helps cancel the effect of the user's arm orientation. The smart watch is connected to a handheld device via Bluetooth, and the more intensive processing of the data is done on the handheld device. Two letter detection algorithms are implemented on the handheld device to detect the letters written in the air by the user: HMM and DTW.

Many factors, such as the form of writing, the sampling frequency of the data, and the similarity of the letters, affect the accuracy of the algorithms. Each algorithm has proved to have its own strengths and weaknesses under different circumstances.

In conclusion, it is believed that writing in the air as a new method of text input, using the principles of gesture recognition and taking advantage of the natural features of wearable computing, can be considered a very promising user interface for smart watches.


REFERENCES

[1] S. Agrawal, I. Constandache, S. Gaonkar, R. Roy Choudhury, K. Caves, and F. DeRuyter, "Using mobile phones to write in air," in MobiSys, 2011.

[2] A. Parate, M.-C. Chiu, C. Chadowitz, D. Ganesan, and E. Kalogerakis, "RisQ: Recognizing smoking gestures with inertial sensors on a wristband," in Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys '14), New York, NY, USA, 2014, pp. 149-161. ACM.

[3] H. H. S. Liu and G. K. H. Pang, "Accelerometer for mobile robot positioning," IEEE Transactions on Industry Applications, vol. 37, no. 3, pp. 812-819, May/Jun 2001.

[4] http://developer.android.com/guide/topics/sensors/sensors_motion.html, 3/13/2015.

[5] O. J. Woodman, "An introduction to inertial navigation," Technical Report UCAM-CL-TR-696, University of Cambridge, Aug 2007.

[6] R. E. Kalman, "A New Approach to Linear Filtering and Prediction Problems," Journal of Basic Engineering, Transactions of the ASME, vol. 82, Series D, pp. 35-45, 1960.

[7] http://www.google.com/trends/topcharts#vm=chart&cid=foods&geo=US&date=2014&cat=, 3/31/2015.

[8] S. Gouthaman, A. Pandya, O. Karande, and D. R. Kalbande, "Gesture detection system using smart watch based motion sensors," in 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA), 4-5 Apr 2014, pp. 311-316.

[9] D. Weinland, R. Ronfard, and E. Boyer, "A survey of vision-based methods for action representation, segmentation and recognition," Computer Vision and Image Understanding, vol. 115, no. 2, pp. 224-241, 2011.

[10] Quincy, "The invention of the first wearable computer," in The Second International Symposium on Wearable Computers: Digest of Papers, IEEE Computer Society, 1998, pp. 4-8.

[11] http://www.ccsinsight.com/press/company-news/1944-smartwatches-and-smart-bands-dominate-fast-growing-wearables-market, 4/2/2015.

[12] https://www.fitbit.com/surge, 4/2/2015.

[13] http://www.businessinsider.com/how-apple-watch-will-dominate-the-smartwatch-market-2015-3, 4/2/2015.

[14] A. Komninos and M. Dunlop, "Text Input on a Smart Watch," IEEE Pervasive Computing, vol. 13, no. 4, pp. 50-58, Oct.-Dec. 2014.

[15] D. Goldberg and C. Richardson, "Touch-typing with a stylus," in Proc. INTERACT Human Factors Comput. Syst., 1993, pp. 80-87.

[16] C. H. Blickenstorfer, "Graffiti: Wow!!!!," Pen Comput. Mag., vol. 1, pp. 30-31, 1995.

[17] R. Khan and N. Ibraheem, "Hand Gesture Recognition: A Literature Review," International Journal of Artificial Intelligence & Applications, pp. 161-174, 2012.

[18] S. Yang, P. Premaratne, and P. Vial, "Hand gesture recognition: An overview," in 2013 5th IEEE International Conference on Broadband Network & Multimedia Technology (IC-BNMT), 17-19 Nov 2013, pp. 63-69.

[19] B. Min, H. Yoon, J. Soh, Y. Yang, and T. Ejima, "Hand gesture recognition using hidden Markov models," in 1997 IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation, vol. 5, 12-15 Oct 1997, pp. 4232-4235.

[20] Y. Dong, A. Hoover, J. Scisco, and E. Muth, "A new method for measuring meal intake in humans via automated wrist motion tracking," Applied Psychophysiology and Biofeedback, vol. 37, no. 3, pp. 205-215, 2012.

[21] L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, February 1989.

[22] J. C. Rarang, "Implementing the Simultaneous Temporal and Contextual Splitting Algorithm for Modeling Discrete Valued Emission and Functionally Homologous Proteins," Thesis, California State University, Northridge, 2015. Print.

[23] http://campustechnology.com/Articles/2015/02/26/Wearable-Market-Outlook.aspx, 4/7/2015.

[24] http://finance.yahoo.com/news/wearable-computing-market-report-growth-192528241.html, 4/7/2015.

[25] http://www.businessinsider.com/the-wearable-computing-market-report-2014-10, 4/7/2015.
