How Can Game Developers Leverage Data from Online
Total Page:16
File Type:pdf, Size:1020Kb
HOW CAN GAME DEVELOPERS LEVERAGE DATA FROM ONLINE DISTRIBUTION PLATFORMS?A CASE STUDY OF THE STEAM PLATFORM by DAYI LIN A thesis submitted to the School of Computing in conformity with the requirements for the Degree of Doctor of Philosophy Queen’s University Kingston, Ontario, Canada January 2019 Copyright © Dayi Lin, 2019 Abstract EVELOPING a successful game is challenging. Prior work shows that gamers are extremely difficult to satisfy, making the quality of games an important D issue. Prior work has yielded important results from mining data that is available on the online distribution platforms for software applications, helping practi- tioners save valuable resources, and improving the user-perceived quality of software that is distributed through these platforms. However, much of the work on mining online distribution platforms focuses on mining mobile app stores (e.g., Google Play Store, Apple App Store). Video game development differs from the development of other types of software. Hence, knowledge derived from mining mobile app stores may not be directly applicable to game development. In this Ph.D. thesis, we focused on mining online distribution platforms for games. In particular, we mined data from the Steam platform, the largest digital distribution platforms for PC gaming, with over 23,000 games available and over 184 million active i users. More specifically, we analyzed the following four aspects of online distribution for games: urgent updates; the early access model (which enables game developers to sell unfinished versions of their games); user reviews; and user-recorded gameplay videos on the Steam platform. We observed that the choice of update strategy is as- sociated with the proportion of urgent updates that developers have to release. Early access game developers can use the early access model as a method for eliciting early feedback and more positive reviews to attract additional new players. In addition, al- though negative reviews contain more valuable information for developers, the por- tion of useful information in positive reviews should not be ignored by developers and researchers. Finally, we proposed an approach for determining the likelihood that a gameplay video demonstrates a game bug, with both a mean average precision at 10 and a mean average precision at 100 of 0.91. Our approach can help game developers leverage readily available gameplay videos to automatically collect otherwise hard-to- gather bug reports for games. The results of our empirical studies highlight the value of mining online distribution platforms for games in offering practical suggestions for game developers. ii Acknowledgments am extremely grateful for the supervision I received from Prof. Ahmed E. Hassan. Without his continuous guidance and support, this thesis would not be possi- I ble. I would like to thank Prof. Cor-Paul Bezemer for his patient help and guidance throughout my graduate study. I would also like to thank my co-authors Prof. Ying Zou and Prof. Chakkrit Tantithamthavorn for their contributions and feedback in the studies. In addition, I appreciate Prof. Nicholas Graham, Prof. Hossam Hassanein, Prof. Mohammad Zulkernine, Prof. Kai Salomaa, Prof. Robin Dawes, Prof. Yuan Tian, Prof. Ali Etemad, and Prof. Ali Ouni for the advice that they gave me. I thank Prof. Jennifer R. Whitson from the University of Waterloo and Mr. Patrick Walker from EEDAR for the inspiring discussions about the gaming industry, and Sergey Galyonkin, the owner of Steam Spy, who generously gave us access to all the historical data of Steam as collected by Steam Spy for my research. I am also iii grateful for the attention and feedback that I received from the gaming community. In particular, I appreciate Lars Doucet for introducing me to the indie game developer community, and providing me with their data. My appreciation extends to my fellow labmates and alumni, visiting researchers, and friends from the Software Analysis and Intelligence Lab (SAIL), including Prof. Tse- Hsun (Peter) Chen, Prof. Weiyi Shang, Prof. Yasutaka Kamei, Prof. Daniel Alencar da Costa, Dr. Gustavo Ansaldi Oliva, Dr. Mark D. Syer, Dr. Mohamed Sami Rakha, Dr. Shaowei Wang, Daniel Lee, Gopi Krishnan Rajbahadur, Heng Li, Jiayuan Zhou, Safwat Hassan. I also thank my friends Chenyang Song, Chunyang Zhu, David Cox, Hao Li, Huixiang Huang, Michal Pasternak, Wenbo Liu, Xiaohe Zhang, for all the memories we shared in the journey. I would also like to thank Prof. Xiaoge Li from Xi’an University of Posts and Telecommunication, who inspired me to pursue graduate studies. Special thanks to my father and mother, Huijian Lin and Wenzhi Chen, who have been offering me their unconditional love throughout my entire life. I dedicate this thesis to my father and mother. iv Co-authorship N all chapters and related publications of the thesis, my contributions are: draft- ing the initial research idea; researching background knowledge and related I work; collecting data; conducting experiments and data analysis; and writing and polishing drafts. Earlier versions of the work in the thesis were published as listed below: 1. Studying the urgent updates of popular games on the Steam platform. Dayi Lin, Cor-Paul Bezemer and Ahmed E. Hassan. Empirical Software Engineer- ing Journal (EMSE), 2017, 22(4), pp.2095-2126. 2. An empirical study of early access games on the Steam platform. Dayi Lin, Cor-Paul Bezemer and Ahmed E. Hassan. Empirical Software Engineer- ing Journal (EMSE), 2018, 23(2), pp.771-799. v 3. An empirical study of game reviews on the Steam platform. Dayi Lin, Cor-Paul Bezemer, Ying Zou and Ahmed E. Hassan. Empirical Software Engineering Journal (EMSE), 2018, In Press. vi Statement of Originality , Dayi Lin, hereby declare that I am the sole author of this thesis. All ideas and in- ventions attributed to others have been properly referenced. This is a true copy I of the thesis, including any required final revisions, as accepted by my examin- ers. I understand that my thesis may be made electronically available to the public. vii Table of Contents Abstract i Acknowledgments iii Co-authorship v Statement of Originality vii List of Tables x List of Figures xii 1 Introduction 1 1.1 Thesis Statement ......................................... 2 1.2 Thesis Overview .......................................... 3 1.3 Thesis Contribution ....................................... 6 2 Background on the Steam Platform 8 3 Literature Survey 12 3.1 Literature Selection ....................................... 12 3.2 Mining Online Distribution Platforms .......................... 14 3.3 Software Engineering and Games ............................. 19 4 Studying Urgent Updates of Popular Games on the Steam Platform 24 4.1 Introduction ............................................ 25 4.2 Background on the Study of Urgent Updates ..................... 26 4.3 Our Methodology for Studying Urgent Updates ................... 30 4.4 Preliminary Study of the Update Cycles of the Studied Steam Games . 38 4.5 Results of our Study of Urgent Updates of Popular Steam Games . 50 4.6 Threats to Validity ........................................ 61 4.7 Chapter Summary ........................................ 62 viii 5 Studying Early Access Games on the Steam Platform 65 5.1 Introduction ............................................ 66 5.2 Background on the Study of Early Access Games . 69 5.3 Our Methodology for Studying Early Access Games . 73 5.4 Results of our Study of Early Access Games (EAGs) . 78 5.5 Additional Interesting Observations ........................... 94 5.6 Threats to Validity ........................................100 5.7 Chapter Summary ........................................102 6 Studying Game Reviews on the Steam Platform 105 6.1 Introduction ............................................106 6.2 Our Methodology for Studying Game Reviews . 108 6.3 Preliminary Study of the Characteristics of Game Reviews . 112 6.4 Results of our Study of Game Reviews . 128 6.5 Threats to Validity ........................................145 6.6 Chapter Summary ........................................148 7 Studying Gameplay Videos on the Steam Platform 150 7.1 Introduction ............................................151 7.2 Background on the Study of Gameplay Videos . 153 7.3 Data Collection of Our Study of Gameplay Videos . 155 7.4 Preliminary Study of Gameplay Videos on the Steam Platform . 159 7.5 Determining the Likelihood that a Gameplay Video Showcases Game Bugs167 7.6 Related Work ............................................180 7.7 Threats to Validity ........................................182 7.8 Chapter Summary ........................................183 8 Conclusion and Future Work 185 8.1 Thesis Contributions ......................................186 8.2 Future Research Directions ..................................188 ix List of Tables 2.1 Comparison between the number of games on Steam and on other PC gaming distribution platforms (as of Dec 19, 2017) ................. 9 2.2 All available Steam channels ................................. 10 3.1 Names of conferences and journals as starting venues of the literature survey ................................................. 13 4.1 Basic information about the studied games on Steam, sorted by the number of players (as of January 12, 2016) ....................... 33 4.2 Update note for the Team Fortress 2 game ....................... 34 4.3 Dataset description of our study of urgent updates . 37 4.4 Updates of studied games on Steam, sorted by the kurtosis