Multi-Class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games
Wright State University, CORE Scholar
Theses and Dissertations, 2017
Naga Sai Nikhil Maguluri, Wright State University

Part of the Computer Engineering Commons, and the Computer Sciences Commons

Repository Citation
Maguluri, Naga Sai Nikhil, "Multi-Class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games" (2017). Browse all Theses and Dissertations. 1717.
https://corescholar.libraries.wright.edu/etd_all/1717

This thesis is brought to you for free and open access by the Theses and Dissertations at CORE Scholar. It has been accepted for inclusion in Browse all Theses and Dissertations by an authorized administrator of CORE Scholar.

MULTI-CLASS CLASSIFICATION OF TEXTUAL DATA: DETECTION AND MITIGATION OF CHEATING IN MASSIVELY MULTIPLAYER ONLINE ROLE PLAYING GAMES

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science

By NAGA SAI NIKHIL MAGULURI
B.Tech., Jawaharlal Nehru Technological University, India, 2015

2017
Wright State University

WRIGHT STATE UNIVERSITY GRADUATE SCHOOL
April 13, 2017

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY SUPERVISION BY Naga Sai Nikhil Maguluri ENTITLED Multi-class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science.

Michelle A. Cheatham, Ph.D.
Thesis Director

Mateen M. Rizki, Ph.D.
Chair, Department of Computer Science and Engineering

Committee on Final Examination
Michelle A. Cheatham, Ph.D.
Tanvi Banerjee, Ph.D.
Mateen M. Rizki, Ph.D.

Robert E.W. Fyffe, Ph.D.
Vice President for Research and Dean of the Graduate School

ABSTRACT

Maguluri, Naga Sai Nikhil. M.S., Department of Computer Science and Engineering, Wright State University, 2017. Multi-class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games.

The success of any multiplayer game depends on its players' experience. Cheating and hacking undermine that experience and, with it, the success of the game. Cheaters who use hacks, bots, or trainers ruin the game for other players and drive them to leave. Because the video game industry is a steadily growing multibillion-dollar economy, it is crucial to establish and maintain security. Players describe their gaming experience in one of the following places: multiplayer chat, game reviews, and social media. This thesis is an exploratory study. Its goal is to experiment with and propose a new way to detect and mitigate cheating in Massively Multiplayer Online Role Playing Games by performing multi-class classification on this unstructured textual data, categorizing cheaters and victims with classification accuracy that is acceptable for practical applications.

First, we study the current state of cheating and anti-cheating in online games. Second, we study various Natural Language Processing and Machine Learning methods and tools for text classification. Third, we propose a general method for automatic player categorization. Finally, we evaluate its performance by experimenting on various datasets.

Table of Contents

1. Introduction
1.1. Overview
1.2. Motivation for Cheaters
1.3. Problem Statement
1.4. Current Trends
1.5. Purpose, Scope, and Contribution
1.6. Methodology
2. Literature Survey
2.1. Cheating in Online Gaming
2.1.1. Overview
2.1.2. Game Types
2.1.3. Qualitative risk analysis of Game types
2.1.4. General Architecture of MMO and MO games
2.1.5. Cheats and Exploits
2.1.1. Anti-cheat Software
2.1.2. Related Work
2.2. Machine Learning
2.2.1. Overview
2.2.2. Types of Machine Learning
2.3. Natural Language Processing
2.3.1. Overview
2.3.2. N-grams
2.3.3. Tokenization and Sentence Segmentation
2.3.4. Term Frequency – Inverse Document Frequency
2.3.5. Stemming
2.4. Automatic Text Categorization
2.4.1. Overview
2.4.2. Logistic Regression
2.4.3. Naïve Bayes
2.4.4. Random Forest
2.4.5. Support Vector Machine
2.4.6. Related Work
3. Building Classifier
3.1. Introduction
3.2. Identification of Keywords
3.3. Datasets
3.3.1. Data Collection SM-GEN, SM-CSGO
3.3.2. Data Collection MC-TF2C, MC-TF2S
3.3.3. Data Collection RV-CSGO
3.4. Labeling the data
3.5. Preprocessing the data
3.5.1. SM-GEN, SM-CSGO
3.5.2. MC-TF2C, MC-TF2S
3.5.3. RV-CSGO
3.6. Feature Extraction
4. Experiment and Results
4.1. Introduction
4.2. Classifiers
4.3. Evaluation Metric