A Serious Game in Aid of Speech Recognition
SpeechIsHard - A Serious Game in Aid of Speech Recognition

Brian Maguire
M.A.I.
Supervisor: Dr. Saturnino Luz
Trinity College Dublin
Submitted to the University of Dublin, Trinity College, May 21, 2015

Declaration

I, Brian Maguire, declare that the following dissertation, except where otherwise stated, is entirely my own work; that it has not previously been submitted as an exercise for a degree, either in Trinity College Dublin, or in any other University; and that the library may lend or copy it or any part thereof on request.

May 21, 2015
Brian Maguire

Summary

This project aimed to research, design and build a serious game that would aid speech recognition research. The end product is SpeechIsHard, available on the Google Play Store. It is a two-player gamification of a map task, a popular experiment used in speech research. The report outlines the research into speech recognition, mobile game design and serious games.

Research into speech recognition focused on areas where a serious game, or game with a purpose, could be of help. The map task design was chosen for the ease with which it could be converted into a game and for the experiment's future use in the field as a method of collecting realistic speech data.

The report also looks into the designs used in currently popular mobile games. This research focuses on the aspects of the design which have been described as addictive. The aim was to pinpoint the key features that make mobile games widely popular and to make use of them in the game design.

The report includes a look at the field of serious games, or more generally gamification. It looks at what serious games are, and what they are being used for today. The report outlines the final design chosen, as well as an earlier design. The designs were based on what was learned from the research undertaken. The final design of SpeechIsHard was then implemented.
This report describes some of the technical challenges which the design presented and the solutions that were found for them. The finished implementation was then evaluated by way of a survey. The survey's results showed that the game concept was an enjoyable one. The survey also highlighted several flaws. Among them were issues with the game's poor graphics and frustration caused by the slow communication offered by the speech recognition. The survey also asked participants about their experience with speech recognition. One comment given in the survey suggests interesting further work on how people change their speech when communicating through a recognition system.

The report concludes that, in its aim of producing an enjoyable game concept that might improve speech recognition technology through its play, this project has been somewhat successful. This comes with the caveat that SpeechIsHard's ability to function as a research tool is not evaluated. The project only evaluates SpeechIsHard on its merits as a game. The report goes on to suggest further work, including further development of the game to address some issues that came up during the evaluation, as well as some different fields in which SpeechIsHard, or a game like it, may be of use.

Abstract

This report describes the design and build of a serious game to aid speech recognition. It covers a brief review of speech recognition technology, how it works and how it might be improved by a game. The final game is available on the Play Store under the name SpeechIsHard, and is a two-player game that connects players over the Internet. It is loosely based on a gamification of a map task, an experiment used in speech recognition research. The game should allow for the collection of a corpus of speech data as people play. The report includes an evaluation survey on the enjoyability of the game. This evaluation suggests that the game concept has potential as an enjoyable game.
SpeechIsHard has the advantage over normal map task experiments in that it provides a much greater scale of use.

Acknowledgements

I would like to thank Dr. Saturnino Luz for his help and guidance on this project. A special thanks to those that gave their time and took part in the survey. I would also like to thank my family and friends for their love and support.

Contents

1 Introduction
  1.1 Background
    1.1.1 Treadris
    1.1.2 Mobile Gaming
    1.1.3 Speech Recognition
  1.2 Outline
2 Research
  2.1 Speech Recognition
    2.1.1 The Speech Recognition Process
    2.1.2 Speech Recognition Evaluation
    2.1.3 Map Tasks
    2.1.4 Possible Areas of Work
  2.2 Gamification & Serious Games
  2.3 Game Design and Mobile Gaming
    2.3.1 Top Current Mobile Games
    2.3.2 Hedonic Adaption
    2.3.3 The Zeigarnik Effect
    2.3.4 Behavioral Game Design and The Skinner Box
3 Ethics
  3.1 Addictive Properties
  3.2 Data Protection
4 Design
  4.1 Early Design
  4.2 Final Design
    4.2.1 Limitations
5 Implementation
  5.1 Communication
    5.1.1 Communication Requirements
    5.1.2 Peer 2 Peer
    5.1.3 Google Play Games Services (GPGS)
  5.2 Speech Recognition
    5.2.1 Recognition Requirements
  5.3 3d Game Engine
    5.3.1 Android Graphics
    5.3.2 Unity Game Engine
  5.4 Android App
6 Evaluation
  6.1 Survey
  6.2 Discussion
    6.2.1 Positive Feedback
    6.2.2 Negative Feedback
  6.3 Limitations of The Evaluation
7 Conclusions
  7.1 Further Works
    7.1.1 Further Game Development
    7.1.2 New Fields of Research
A Evaluation Survey
B Survey Results

List of Figures

1.1 A screen shot from the original Treadris game (20)
2.1 A HMM based speech recogniser from (10)
2.2 Each Phone has a HMM which produces feature vectors (10)
2.3 The Map used in (19)
2.4 Screen shot of HabitRPG showing To-Do tasks as characters (12)
2.5 Candy Crush Saga Game Play
2.6 CCS forces players to pay for extra lives or wait before continuing play
2.7 A Skinner Box with a rat as the specimen
4.1 Early design mock up
4.2 Player 1 (left) & Player 2 (right)
4.3 Flowchart explaining how to play the game
5.1 NAT punch through from (8)
5.2 Unity work environment
5.3 The Rooms Model (left) & The House Model (right)
5.4 Overview of the technologies and how they interact
6.1 Answers from question 1 & 4 of the survey
6.2 Answers from question 10 & 14 of the survey
6.3 Answers from question 2 & 7 of the survey
6.4 Answers from question 17 & 15 of the survey

Chapter 1

Introduction

In the following chapter I will introduce the project by describing some of the background and the motivation for choosing this subject. I will then give an overview of what is to be expected throughout the rest of the report.

1.1 Background

1.1.1 Treadris

This project is based on the serious game Treadris. The aim of the game is to correct results produced by a speech recognition system. The results were shown to the player as word lattices that would start at the top of the screen and move steadily back and towards the bottom of the screen. If not corrected by the time it reached the end of the screen, the word lattice would stay on screen, reducing the amount of time available for each subsequent sentence. Player corrections that were likely to be accurate could then be used to adjust the speech recognition model so that it might not make similar errors in future. The game could also be used as a check for automatic transcriptions of video or podcasts.

To progress the ideas started by Treadris, I chose to develop a mobile game for Android smart devices that would continue the work of Treadris as a game with a purpose that improves speech recognition.
Figure 1.1: A screen shot from the original Treadris game (20)

1.1.2 Mobile Gaming

I chose to develop a game for the mobile platform in particular. The mobile gaming market has been one of the fastest growing games markets. It has introduced a new type of gaming. Mobile games have taken off with the spread of affordable smart devices. More and more people carry a potential gaming machine in their pocket as they go about their day. Mobile games are developed to take advantage of the many short moments of boredom people experience throughout their day. It is my aim with this project to put some of the hours spent playing mobile games to productive use.

1.1.3 Speech Recognition

Speech recognition has become an important technology for most people. Google Now, Siri and Microsoft's Cortana are all dependent on quick, accurate speech recognition. With the advance of wearable technology, such as smart watches, the need for an oral interface has become even more important due to their small screens.