Identification of Machine-Generated Reviews: 1D CNN Applied on the GPT-2 Neural Language Model

DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2020

STAFFAN AL-KADHIMI
PAUL LÖWENSTRÖM

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Degree Project in Computer Science, DD142X
Date: June 8, 2020
Supervisor: Christopher Peters
Examiner: Pawel Herman
Swedish title: Identifiering av maskingenererade recensioner: 1D CNN applicerat på den neurala språkmodellen GPT-2

Abstract

With recent advances in machine learning, computers are able to create more convincing text, raising concerns about an increase in fake information on the internet. At the same time, researchers are creating tools for detecting computer-generated text. Researchers have been able to exploit flaws in neural language models and use them against themselves; for example, GLTR provides human users with a visual representation of texts that assists in classifying them as human-written or machine-generated. By training a convolutional neural network (CNN) on GLTR output data from analysis of machine-generated and human-written movie reviews, we are able to take GLTR a step further and use it to perform this classification automatically. However, using a CNN with GLTR as the main source of data for classification does not appear to be enough to be on par with the best existing approaches.

Summary (Sammanfattning)

With the latest advances in machine learning, computers can create increasingly convincing text, which raises concern about an increase in false information on the internet. At the same time, this is counterbalanced by researchers creating tools for identifying computer-generated text. Researchers have been able to exploit weaknesses in neural language models and use these against them. For example, GLTR provides users with a visual representation of texts, as an aid for classifying them as human-written or machine-generated. By training a convolutional neural network (CNN) on output from GLTR analysis of machine-generated and human-written movie reviews, we take GLTR a step further and use it to perform the classification automatically. However, using a CNN with GLTR as the main data source does not appear to be sufficient to classify at a level comparable to the best existing methods.

Contents

1 Introduction
1.1 Research Question
1.2 Scope
1.3 Approach
1.4 Thesis Outline
2 Background
2.1 Neural language models
2.1.1 Transfer learning
2.1.2 GPT-2
2.2 Detecting machine-generated text
2.2.1 GAN-based approaches
2.2.2 Neural LMs as weapons against themselves
2.3 Text classification using CNNs
3 Method
3.1 Text preparation stage
3.2 GLTR stage
3.3 CNN training stage
3.4 CNN testing stage
4 Results
4.1 Overall performance
4.2 Impact of text length
5 Discussion
5.1 Limitations
5.2 Future Work
6 Conclusions
Bibliography
A Structure Diagram
B Review Samples

Chapter 1 Introduction

“OpenAI has published the text-generating AI it said was too dangerous to share” (The Verge, November 2019) [1]

Never before has it been easier to publish information, and with the recent improvements in machine learning, computers can now create text that is hard to differentiate from human writing. Simple methods for machine generation of text have existed for a relatively long time, but in recent years it has become possible to use deep learning to achieve better results (Gatt and Krahmer 2017).
In 2019, the research organization OpenAI released GPT-2, one of the latest text-generating neural language models, which has been recognized for its capability to generate false information, including fake news [1] (Radford et al. 2019). This has raised concerns about the future and the dangers of text-generating artificial intelligence (AI) [1] (Radford et al. 2019). As technology continues to improve, we may need ways to distinguish real from fake automatically, considering the potentially massive amount of false information that could be published on different media. Detection of automatically generated text is being actively researched. For example, GLTR is a tool that has been recognized for how it can help people detect machine-generated text by visualizing important data given by GPT-2 [2].

[1] Vincent, James (2019). OpenAI has published the text-generating AI it said was too dangerous to share. url: https://www.theverge.com/2019/11/7/20953040/openai-text-generation-ai-gpt-2-full-model-release-1-5b-parameters (visited 2020-02-15)
[2] Quach, Katyanna (2019). Remember the OpenAI text spewer that was too dangerous to release? Fear not, boffins have built a BS detector for it. url: https://www.theregister.co.uk/2019/03/11/openai_gltr_ai/ (visited 2020-05-13)

Today people choose movies, restaurants, airlines, hair salons, and more based on online reviews that they read. The integrity of reviews could be challenged if computers start to fill the internet with computer-generated reviews. This could in turn deceive people into buying a product or service, or more seriously, cripple companies and potential adversaries. The number of fake reviews on the web today is likely large; Akoglu, Chandy, and Faloutsos (2013) estimate that around 20% of the reviews on Yelp are faked by paid human writers. Automatically generating convincing reviews is also already possible, as evidenced by Adelani et al. (2019).
In a capitalistic society where everyone wants to outperform their competitors, it may only be a matter of time before computer-generated reviews become a common weapon in the company arsenal.

1.1 Research Question

The purpose of our work is to investigate a possible extension of GLTR that could be used as part of spam filters or similar utilities. In this thesis, we will therefore study:

“Is it possible to automatically classify reviews as human-written or machine-generated, using the text body mapped to GLTR values as input?”

Under the assumption that the answer to this research question is “yes”, the following will be studied as subquestions:

• “How well would this adaptation work for detecting reviews written by models other than the one GLTR was trained on?”
• “How does the length of a given review influence the classification performance?”
• “Would this adaptation be on par with other automatic detection methods?”

1.2 Scope

There are many ways to generate text. We therefore chose to limit our study to the GPT-2 language model only, and further to its smaller versions (124M and 355M) due to limited available resources. There are also many types of reviews; by limiting ourselves to a single topic we can reduce the amount of variation in our text, and thus both reduce the complexity of our project and make our tool more useful in practice for entities such as websites. In our case, we focus on movie reviews from the media website and database IMDb, which makes our work potentially useful for identifying computer-generated reviews posted there.

1.3 Approach

In short, we use GPT-2 to estimate how likely it would be to write text similar to the text being analyzed. A similar project, GLTR, has previously analyzed text with this approach with promising results (Gehrmann, Strobelt, and Rush 2019), and we integrate it as the core of our framework.
However, unlike GLTR, which is a tool for making it easier for humans to identify machine-generated text (Gehrmann, Strobelt, and Rush 2019), we use a convolutional neural network to automatically classify the text based on the data extracted from the analysis.

1.4 Thesis Outline

The following chapter explores the current state of fake text and fake review detection, in particular related work that has emerged following the development of GPT-2 and similar research. It also introduces the building blocks of the project: neural language models, GPT-2, GLTR, and convolutional neural networks. This is followed by Chapter 3, where we describe our approach in detail and go through the structure of the neural network we create. In Chapter 4, we present the results and compare the models we create. Chapter 5 is then dedicated to discussion of the results as well as limitations and future work. Finally, Chapter 6 holds the conclusion, where we summarize our work and discuss it in a broader context.

Chapter 2 Background

2.1 Neural language models

A language model (LM) essentially describes the probability of each possible token (e.g., a word) appearing as the next one given an input text (Mikolov et al. 2010). Traditional models are based on n-grams and look at the n previous words to decide what word should follow (Arisoy et al. 2012). More recently, neural networks have come into the spotlight for language modeling because of how they handle data sparseness: by embedding words in a continuous space, they avoid the problem of small changes in probabilities having a large impact (Arisoy et al. 2012). In particular, recurrent neural networks (RNNs) with long short-term memory (LSTM) have been able to build on previous approaches by enhancing the network's memory capability (Józefowicz et al. 2016). Vaswani et al. (2017) introduced transformer architectures, which have the potential to take the place of RNNs.
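As a rough illustration of the definition of a language model given above, the following Python sketch builds a toy bigram model, that is, an n-gram model that looks only at the single previous word. It estimates the probability of each next token by relative frequency and then reports where each observed token ranks among the model's predictions. The corpus, the function names, and the rank computation are all assumptions made for illustration; this is neither GPT-2 nor the GLTR implementation, and the rank output only loosely mirrors the kind of per-token signal that a neural model can provide.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Estimate P(next | prev) by relative frequency; empty dict if prev is unseen."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}
    return {tok: c / total for tok, c in counts[prev].items()}


def token_ranks(counts, tokens):
    """For each token after the first, return its 1-based rank among the
    model's predictions for that position, or None if it was never seen there."""
    ranks = []
    for prev, actual in zip(tokens, tokens[1:]):
        ordered = [tok for tok, _ in counts[prev].most_common()]
        ranks.append(ordered.index(actual) + 1 if actual in ordered else None)
    return ranks


# Hypothetical toy corpus, not data from the thesis.
corpus = "the movie was great . the movie was boring . the plot was great .".split()
model = train_bigram(corpus)

probs = next_token_probs(model, "was")                     # "great": 2/3, "boring": 1/3
ranks = token_ranks(model, "the plot was boring".split())  # [2, 1, 2]

print(probs)
print(ranks)
```

A text whose tokens mostly receive low rank numbers is one the model itself would have been likely to write. A neural model such as GPT-2 conditions on the whole preceding text rather than on one previous word, but the per-token output has the same shape.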
