StRE: Self Attentive Edit Quality Prediction in Wikipedia

Soumya Sarkar*1, Bhanu Prakash Reddy*2, Sandipan Sikdar3, Animesh Mukherjee4

IIT Kharagpur, India1,2,4; RWTH Aachen, Germany3
[email protected], [email protected], [email protected], [email protected]
*Both authors contributed equally

Abstract

Wikipedia can easily be justified as a behemoth, considering the sheer volume of content that is added or removed every minute to its several projects. This creates an immense scope, in the field of natural language processing, toward developing automated tools for content moderation and review. In this paper we propose the Self Attentive Revision Encoder (StRE), which leverages orthographic similarity of lexical units toward predicting the quality of new edits. In contrast to existing propositions, which primarily employ features like page reputation, editor activity or rule based heuristics, we utilize the textual content of the edits which, we believe, contains superior signatures of their quality. More specifically, we deploy deep encoders to generate representations of the edits from their text content, which we then leverage to infer quality. We further contribute a novel dataset containing ~21M revisions across 32K Wikipedia pages and demonstrate that StRE outperforms existing methods by a significant margin – at least 17% and at most 103%. Our pre-trained model achieves such results after retraining on a set as small as 20% of the edits in a wikipage. This, to the best of our knowledge, is also the first attempt towards employing deep language models in the enormous domain of automated content moderation and review in Wikipedia.

1 Introduction

Wikipedia is the largest multilingual encyclopedia known to mankind, with the current English version consisting of more than 5M articles on highly diverse topics which are segregated into categories, constructed by a large editor base of more than 32M editors (Hube and Fetahu, 2019). To encourage transparency and openness, Wikipedia allows anyone to edit its pages, albeit with certain guidelines for them¹.

Problem: The inherent openness of Wikipedia has also made it vulnerable to external agents who intentionally attempt to divert the unbiased, objective discourse to a narrative which is aligned with the interest of the malicious actors. Our pilot study on 100 manually annotated Wikipedia pages of four categories (25 pages per category) shows that at most 30% of the edits are reverted (see Fig 1). The global average of the fraction of reverted damaging edits is ~9%². This makes manual intervention to detect these edits with potentially inconsistent content infeasible. Wikipedia hence deploys machine learning based classifiers (West et al., 2010; Halfaker and Taraborelli, 2015) which primarily leverage hand-crafted features from three aspects of a revision: (i) basic text features like repeated characters, long words, capitalized words etc., (ii) temporal features like inter-arrival time between events of interest, and (iii) dictionary based features like the presence of curse words or informal words (e.g., 'hello', 'yolo'). Other feature based approaches include (Daxenberger and Gurevych, 2013; Bronner and Monz, 2012), which generally follow a similar archetype.

Proposed model: In most of the cases, the edits are reverted because they fail to abide by the edit guidelines, e.g., usage of inflammatory wording or expressing opinion instead of fact, among others (see Fig 2). These flaws are fundamentally related to the textual content rather than the temporal patterns or editor behavior that have been deployed in existing methods. Although dictionary based approaches do look into text to a small extent (swear words, long words etc.), they account for only a small subset of the edit patterns. We further hypothesize that owing to the volume and variety of Wikipedia data, it is impossible to develop a feature driven approach which can encompass the wide array of dependencies present in text. In fact, we show that such approaches are inefficacious in identifying most of the damaging edits owing to these obvious limitations. We hence propose the Self Attentive Revision Encoder (StRE), which extracts rich feature representations of an edit that can be further utilized to predict whether the edit has damaging intent. In specific, we use two stacked recurrent neural networks to encode the semantic information from the sequence of characters and the sequence of words, which serves a twofold advantage. While character embeddings extract information from out of vocabulary tokens, i.e., repeated characters, misspelled words, malicious capitalized characters, unnecessary punctuation etc., word embeddings extract meaningful features from curse words, informal words, imperative tone, facts without references etc. We further employ attention mechanisms (Bahdanau et al., 2014) to quantify the importance of a particular character/word. Finally we leverage this learned representation to classify an edit as damaging or valid. Note that StRE is reminiscent of the structured self attentive model proposed in (Lin et al., 2017), albeit used in a different setting.

Findings: To determine the effectiveness of our model, we develop an enormous dataset consisting of ~21M edits across 32K wikipages. We observe that StRE outperforms the closest baseline by at least 17% and at most 103% in terms of AUPRC. Since it is impossible to develop a universal model which performs equally well for all categories, we develop a transfer learning (Howard and Ruder, 2018) setup which allows us to deploy our model to newer categories without training from scratch. This further allows us to employ our model on pages with a lower number of edits.

Contributions: Our primary contributions in this paper are summarized below -
(i) We propose a deep neural network based model to predict edit quality in Wikipedia which utilizes language modeling techniques to encode semantic information in natural language.
(ii) We develop a novel dataset consisting of ~21M unique edits extracted from ~32K Wikipedia pages. In fact, our proposed model outperforms all the existing methods in detecting damaging edits on this dataset.
(iii) We further develop a transfer learning setup which allows us to deploy our model to newer categories without the need for training from scratch.
Code and sample data related to the paper are available at https://github.com/bhanu77prakash/StRE.

¹en.wikipedia.org/Wikipedia:List_of_policies
²stats.wikimedia.org/EN/PlotsPngEditHistoryTop.htm

Figure 1: Average number of edits and average number of damaging edits, i.e., reverted edits, for four different categories of pages. A fraction (at most 30%) of user generated edits are damaging edits.

Figure 2: Examples of edits in the Facebook and Google Wikipedia pages. The blue bubbles are the original sentences. The orange bubbles indicate damaging edits while the green bubbles indicate 'good faith' edits. Good faith edits are unbiased formal English sentences while damaging edits often correspond to incoherent use of language, abusive language, imperative mood, opinionated sentences etc.
2 The Model

In this section we give a detailed description of our model. We consider an edit to be a pair of sentences, one representing the original version (P_or) and the other representing the edited version (P_ed). The input to the model is the concatenation of P_or and P_ed (say P = {P_or || P_ed}), separated by a delimiter ('||'). We assume P consists of w_i words and c_i characters. Essentially we consider two levels of encoding - (i) a character level to extract patterns like repeated characters, misspelled words, unnecessary punctuation etc., and (ii) a word level to identify curse words, imperative tone, opinionated phrases etc. In the following we present how we generate a representation of the edit and utilize it to detect malicious edits. The overall architecture of StRE is presented in Fig 3.
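As a concrete illustration of the input construction described above, the following is a minimal sketch. The whitespace tokenization and the helper name build_input are illustrative assumptions; the authors' actual preprocessing may differ.

```python
# Minimal sketch of building the model input P = {P_or || P_ed}.
# Simple whitespace tokenization is assumed for illustration only.

DELIMITER = "||"

def build_input(p_or: str, p_ed: str):
    """Concatenate the original and edited sentences with a delimiter
    and expose both the word-level and character-level views."""
    p = f"{p_or} {DELIMITER} {p_ed}"
    words = p.split()   # sequence of words w_i
    chars = list(p)     # sequence of characters c_i
    return p, words, chars

p, words, chars = build_input(
    "Google Maps offers detailed streetmaps and route planning information.",
    "Google Maps offers detailed streetmaps and route planning information in United States and Canada.",
)
```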

2.1 Word encoder

Given an edit P with w_i, i ∈ [0, L] words, we first embed the words through a pre-trained embedding matrix W_e such that x_i = W_e w_i. This sequence of embedded words is then provided as an input to a bidirectional LSTM (Hochreiter and Schmidhuber, 1997), which provides representations of the words by summarizing information from both directions.

$x_i = W_e w_i, \quad i \in [0, L]$   (1)
$\overrightarrow{v_i} = \overrightarrow{\mathrm{LSTM}}(x_i), \quad i \in [0, L]$   (2)
$\overleftarrow{v_i} = \overleftarrow{\mathrm{LSTM}}(x_i), \quad i \in [L, 0]$   (3)

We obtain the representation for each word by concatenating the forward and the backward hidden states, $v_i = [\overrightarrow{v_i}, \overleftarrow{v_i}]$. Since not all words contribute equally to the context, we deploy an attention mechanism to quantify the importance of each word. The final representation is then a weighted aggregation of the words.

$u_i = \sigma(W_w v_i + b_w)$   (4)
$\beta_i = \frac{\exp(u_i^T u_w)}{\sum_{i=0}^{L} \exp(u_i^T u_w)}$   (5)
$R_w = \sum_i \beta_i v_i$   (6)

To calculate the attention weight (β_i) for a hidden state v_i, the state is first fed through a single layer perceptron and then a softmax function is used to calculate the weights. Note that we use a word context vector u_w which is randomly initialized and is learnt during the training process. The use of a context vector as a higher level representation of a fixed query has been argued in (Sukhbaatar et al., 2015; Kumar et al., 2016). Note that the attention score calculation is reminiscent of the one proposed in (Yang et al., 2016).
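The following PyTorch sketch illustrates Eqs. (1)-(6): embed the tokens, run a bidirectional LSTM, and pool the hidden states with additive attention. Class and variable names, the 64-dimensional sizes (taken from Section 4.2), and the use of sigmoid for σ are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class AttentiveBiLSTMEncoder(nn.Module):
    """Sketch of the word encoder of Eqs. (1)-(6): embedding, BiLSTM,
    and attention pooling into a single edit-level vector."""

    def __init__(self, embedding: nn.Embedding, hidden_dim: int = 64, ctx_dim: int = 64):
        super().__init__()
        self.embedding = embedding                          # W_e (pre-trained for words)
        self.bilstm = nn.LSTM(embedding.embedding_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, ctx_dim)      # W_w, b_w in Eq. (4)
        self.context = nn.Parameter(torch.randn(ctx_dim))   # u_w, learned context vector

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embedding(token_ids)                # Eq. (1): x_i = W_e w_i
        v, _ = self.bilstm(x)                        # Eqs. (2)-(3): v_i = [fwd; bwd]
        u = torch.sigmoid(self.proj(v))              # Eq. (4)
        scores = u @ self.context                    # u_i^T u_w
        beta = torch.softmax(scores, dim=1)          # Eq. (5)
        r = (beta.unsqueeze(-1) * v).sum(dim=1)      # Eq. (6): R_w
        return r
```

The same module, instantiated with a character vocabulary and a trainable (rather than pre-trained) embedding, can serve as the character encoder of Section 2.2.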

2.2 Character encoder

The character encoder module is similar to the word encoder module with minor differences. Formally, we consider P ({P_or || P_ed}) as a sequence of T characters c_i, i ∈ [0, T]. Instead of using pre-trained embeddings as in the case of the word encoder, we define an embedding module whose parameters are also learned during training and which is basically an MLP. Each embedded character is then passed through a bidirectional LSTM to obtain the hidden states for each character. Formally, we have

$y_i = \sigma(W_c c_i + b_c), \quad i \in [0, T]$   (7)
$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(y_i), \quad i \in [0, T]$   (8)
$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(y_i), \quad i \in [T, 0]$   (9)

We next calculate the attention score for each hidden state h_i as

$z_i = \sigma(W_c h_i + b_c)$   (10)
$\alpha_i = \frac{\exp(z_i^T u_c)}{\sum_{i=0}^{T} \exp(z_i^T u_c)}$   (11)
$R_c = \sum_i \alpha_i h_i$   (12)

Note that u_c is a character context vector which is learned during training.

2.3 Edit classification

The edit vector E_p (for an edit P) is the concatenation of the character and word level encodings, E_p = [R_c, R_w], which we then use to classify whether an edit is valid or damaging. Typically, we perform

$p = \mathrm{softmax}(W_p E_p + b_p)$

Finally, we use binary cross entropy between the predicted and the true labels as our training loss.
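Continuing the sketch above, the classifier of Section 2.3 concatenates the two pooled encodings and passes them through dense layers to a softmax. The dense sizes (256/64/16) and dropout follow Section 4.2; the ReLU activations, the 2-way output head, and the layer names are assumptions, not the authors' released code.

```python
class StRE(nn.Module):
    """Sketch of Section 2.3: E_p = [R_c, R_w] followed by dense layers."""

    def __init__(self, word_emb: nn.Embedding, char_emb: nn.Embedding):
        super().__init__()
        self.word_encoder = AttentiveBiLSTMEncoder(word_emb)   # produces R_w
        self.char_encoder = AttentiveBiLSTMEncoder(char_emb)   # produces R_c
        in_dim = 4 * 64                                         # [R_c, R_w], each 2*64-dimensional
        self.dense = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, 64), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(64, 16), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(16, 2),                                   # W_p, b_p; softmax applied inside the loss
        )

    def forward(self, word_ids, char_ids):
        e_p = torch.cat([self.char_encoder(char_ids),
                         self.word_encoder(word_ids)], dim=-1)  # E_p = [R_c, R_w]
        return self.dense(e_p)                                  # class logits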

2.4 Transfer learning setup

Note that it is not feasible to train the model from scratch every time a new page in an existing or a new category is introduced. Hence we propose a transfer learning setup whereby, for a new page, we use the pre-trained model and only update the weights of the dense layers during training. The advantages are twofold - (i) the model needs only a limited amount of training data and hence can easily be trained on the new pages, and (ii) we benefit significantly on training time.
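A minimal sketch of the dense-layer-only fine-tuning described above, assuming the PyTorch-style StRE module sketched earlier. Freezing parameters via requires_grad is the generic PyTorch idiom; the exact recipe and hyperparameters (borrowed from Section 4.2) are assumptions.

```python
def fine_tune_dense_only(model: StRE, lr: float = 0.01, weight_decay: float = 1e-4):
    """Freeze both encoders and update only the dense layers (Section 2.4)."""
    for p in model.parameters():
        p.requires_grad = False
    for p in model.dense.parameters():
        p.requires_grad = True
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr, weight_decay=weight_decay)
```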

3 Dataset

Wikipedia provides access to all Wikimedia project pages in the form of XML dumps, which are periodically updated³. We collect data from dumps made available by the Wikimedia project on June 2017, which contain information about 5.5M pages.

³https://dumps.wikimedia.org/enwiki/20181120

Resources        Count
Pages            32,394
Total edits      21,848,960
Positive edits   15,791,575
Negative edits   6,057,385

Table 1: Summary of the dataset.

Figure 3: Overall architecture of StRE. The character encoding and the word encoding components of the model are shown on the left and right respectively. This is followed by the attention layer, followed by concatenation and softmax.

We extract a subset of pages related to the Computer Science category in Wikipedia. Utilizing the category hierarchy⁴ (Auer et al., 2007) (typically a directed graph containing parent and child categories), we extract all articles under the Computer Science category up to a depth of four levels, which accounts for 48.5K Wikipedia pages across 1.5K categories⁵. We filter out pages with fewer than 100 edits, which leaves us with 32K pages. For each page in our dataset we performed a pairwise difference operation⁶ between its current and previous versions to obtain a set of pairs, each consisting of a sentence and its subsequent modified version.

Edit quality: In order to train our model to identify quality edits from damaging edits, we need a deterministic score for an edit. Our quality score is based on the intuition that if the changes introduced by the edit are preserved, it signals that the edit was beneficial, whereas if the changes are reverted, the edit likely had a negative effect. This idea is adapted from previous work of Adler et al. (2011). Consider a particular article and denote by v_k its k-th revision (i.e., the state of the article after the k-th edit). Let d(u, v) be the Levenshtein distance between two revisions. We define the quality of edit k from the perspective of the article's state after ℓ ≥ 1 subsequent edits as

$q_{k|\ell} = \frac{d(v_{k-1}, v_{k+\ell}) - d(v_k, v_{k+\ell})}{d(v_{k-1}, v_k)}$

Intuitively, the quantity q_{k|ℓ} captures the proportion of work done on edit k that remains in revision k+ℓ, and it varies between q_{k|ℓ} ∈ [−1, 1]; when the value falls outside this range, it is capped within these two values. We compute the mean quality of the edit by averaging over multiple future revisions as follows

$q_k = \frac{1}{L}\sum_{\ell=1}^{L} q_{k|\ell}$

where L is the minimum among the number of subsequent revisions of the article. We have taken L = 10, which is consistent with the previous work of Yardım et al. (2018).

Edit label: For each pair of edits we compute the edit quality score. If the quality score is ≥ 0 we label an edit as −1, i.e., done in good faith. However, all edits with quality score < 0 are labeled 1, i.e., damaging edits. We further check that bad quality edits are indeed damaging edits by calculating what fraction of low score edits are reverted and what fraction of high score edits are not reverted. This result is illustrated in Figure 4. Information on whether an edit is reverted or not can be calculated by mining Wikipedia's revert graph following the same technique illustrated by (Kittur et al., 2007). The results clearly show that a large proportion of bad quality edits are indeed reverted by the editors and, similarly, a large fraction of good quality edits are not reverted. Though bad quality edits are often reverted, not all reverted edits are bad. Malicious agents often engage in interleaving reverts, i.e., edit wars (Kiesel et al., 2017), as well as pseudo reverts. Hence we use the quality metric to label damaging edits, which is well accepted in the literature (Adler et al., 2008). We provide a summary of the data in Table 1. Our final data can be represented by a triplet <s_i, s_f, l> where s_i is the initial sentence, s_f is the modified sentence and l indicates the edit label.
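The quality score and label above can be computed directly from the revision texts. The sketch below assumes plain-text revisions, a basic dynamic-programming Levenshtein distance, and illustrative function names; the authors' actual pipeline (e.g., their diff tooling) may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein distance d(u, v)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def edit_quality(revisions, k, horizon=10):
    """Mean quality q_k of edit k over up to `horizon` future revisions,
    with each q_{k|l} capped to [-1, 1] as described in Section 3."""
    L = min(horizon, len(revisions) - 1 - k)   # available future revisions
    scores = []
    for l in range(1, L + 1):
        num = levenshtein(revisions[k - 1], revisions[k + l]) \
              - levenshtein(revisions[k], revisions[k + l])
        den = levenshtein(revisions[k - 1], revisions[k])
        q = max(-1.0, min(1.0, num / den)) if den else 0.0
        scores.append(q)
    return sum(scores) / len(scores) if scores else 0.0

def edit_label(q_k: float) -> int:
    """Label: -1 (good faith) if the quality score is >= 0, else 1 (damaging)."""
    return -1 if q_k >= 0 else 1
```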

⁴Dbpedia.org
⁵Typical categories include 'Computational Science', 'Artificial Intelligence' etc.
⁶https://docs.python.org/2/library/difflib.html

4 Experiments

In this section we demonstrate the effectiveness of our model compared to other existing techniques.



Figure 4: Distribution of the quality score for reverted edits (left) and non-reverted edits (right). The y-axis is in log scale. The plot shows that a large proportion of low quality edits are reverted and a large proportion of high quality edits are not reverted; hence this observation acts as a validation for our quality score metric.

Typically, we consider two sets of experiments - (i) category level and (ii) page level. In category level experiments (see Section 4.3), we first form a random sample of data points belonging to pages in a fixed category. Our objective is to first train on edits related to a fixed page category and test on new edits belonging to pages of the same category. We further show through rigorous experiments that existing approaches of transfer learning and fine tuning (Howard and Ruder, 2018) can be applied to increase the efficacy of our approach. In page level experiments (Section 4.4), we abandon the category constraint used in the category level experiments and train (test) on edits irrespective of the category of the page to which they belong, and demonstrate that our model is equally effective.

4.1 Baseline approaches

We use two variants of our proposed model – word embedding with attention (Word+Att) and character embedding with attention (Char+Att) – as two baselines to compare to our model. We also compare against existing feature based and event based approaches for edit quality prediction. We give a brief description of the other state-of-the-art baselines in the subsequent subsections.

4.1.1 ORES

The Objective Revision Evaluation Service (ORES) (Wikimedia, 2019) is a web service developed by Wikimedia that provides a machine learning-based scoring system for edits. More specifically, given an edit, ORES infers whether the edit causes damage using linguistic features and edit based features (e.g., size of the revision etc.).

4.1.2 ORES++

In order to make it more competitive, we further augment ORES by adding linguistic quality indicators as additional features obtained from the Empath tool (Fast et al., 2016). This tool scores edits on 16 lexical dimensions such as 'ugliness', 'irritability', 'violence' etc. We also use the count of POS tags following Manning et al. (2014) as well as the count of misspelled words as features using the aspell dictionary (Atkinson, 2006).

4.1.3 Interrank

Interrank (Yardım et al., 2018) is a recent quality-prediction method which does not use any explicit content-based features but rather predicts the quality of an edit by learning editor competence and page reputation from prior edit actions. The performance of Interrank has been revealed to be very close to ORES.

4.2 Model configuration

We use 300 dimensional pre-trained GloVe word vectors (Pennington et al., 2014) and 300 dimensional ASCII character embeddings (Woolf, 2017). We also use a 64 dimensional hidden layer in our model, followed by an attention layer and three stacked dense layers. Our context vector in the attention layer is of 64 dimensions and the dense layers are of 256, 64 and 16 dimensions. We further utilize a dropout probability of 0.5 in the dense layers. We also employ binary cross entropy as the loss function and the Adam (Kingma and Ba, 2014) optimizer with learning rate 0.01 and weight decay of 0.0001 to train our model. The batch size is set to 250.
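The training-loop sketch below wires the configuration of Section 4.2 (Adam with learning rate 0.01 and weight decay 1e-4, batch size 250) to the StRE module sketched earlier. The epoch count, the data pipeline, and the use of a 2-way cross entropy in place of binary cross entropy are illustrative assumptions.

```python
from torch.utils.data import DataLoader

def train(model: StRE, dataset, epochs: int = 5):
    """Training loop sketch using the hyperparameters of Section 4.2.
    `dataset` is assumed to yield (word_ids, char_ids, label) tuples."""
    loader = DataLoader(dataset, batch_size=250, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-4)
    criterion = nn.CrossEntropyLoss()   # 2-way equivalent of the binary cross entropy loss
    for _ in range(epochs):
        for word_ids, char_ids, label in loader:
            optimizer.zero_grad()
            loss = criterion(model(word_ids, char_ids), label)
            loss.backward()
            optimizer.step()
```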

4.3 Category level experiments

In this set of experiments we essentially train and test on pages in the same category.

4.3.1 Page specific model

As a first step towards determining the potential of our model, we train our model on a set of edits of a particular page and predict on the rest. To this aim we manually annotate the top 100 pages in terms of total number of edits into four categories, i.e., company, concept, technology, and person. Such granular level category annotation is not available from the Wikipedia hierarchy, which directs us towards manual annotation. For each category we tabulate the count of positive and negative datapoints in Table 2. For each page we randomly select 80% of the edits for training, 10% for validation and 10% as the held out set. We train our model on the 80% and tune it on the validation set. We finally test on the held out set. The same procedure is followed for Word+Att, Char+Att and ORES++. For all these models the AUPRC (mean, std) across all pages are presented in Table 3. Since ORES is already a pre-trained model, we test it on the combined held out set of the pages. Note that Interrank is not designed for page level training and further requires large training data. Hence, for Interrank we train on the combined training set of the pages and test on the combined held out set. Results obtained on the held out set are reported in Table 3. Our experiments clearly show that StRE outperforms the baselines by a significant margin (at least 10%). We also see that the individual components of our model, i.e., Char+Att and Word+Att, do not perform as well as StRE, which further validates our architecture. Moreover, Interrank performs poorly despite the combined dataset, which shows that language modelling is essential in edit quality prediction.

Edits            Company     Concept    Technology    Person
+ve examples     813,400     227,308    294,125       79,035
-ve examples     649,078     124,323    169,091       28,505
Total examples   1,462,478   351,631    463,216       107,540

Table 2: Total number of data points along with positive and negative samples for the top five pages in terms of edit count in each category.

4.3.2 New page: same category

We now explore a more challenging setup for our model whereby, instead of training and testing on edits of a specific annotated category, we train on edits of pages of a particular category but test on a previously unseen (during training) page of the same category. Specifically, for a given category, we train our model on 90% of the pages and test our models on the edits of unseen pages in the same category from our annotated dataset. The obtained results are tabulated in Table 4(a). Results show that such an approach is indeed fruitful and can be applied to pages which have very few edits by utilizing intra-category pages with large edit counts.

Transfer learning results: Our results can be further improved by applying ideas of transfer learning. For each new page, we can initialize our model with pre-trained weights learned from training on other intra-category pages. We can then train the dense layer with only 20% of new datapoints randomly selected from the new page and test on the remaining 80%. This approach is adapted from the state-of-the-art transfer learning approaches (Howard and Ruder, 2018; Dehghani et al., 2017), where it has been shown to work on diverse NLP tasks. Such an approach achieves at least 3% and at most 27% improvement over prior results.

4.3.3 New page: different category

We now probe into how our model performs when tested on a page belonging to a previously unseen category. As a proof of concept, we train on all pages belonging to three categories (inter-category training) and test on a new page from the fourth category. We perform this experiment considering each of the four categories as the unknown one, one by one in turn. The obtained results are presented in Table 4(b). Clearly, the results are inferior compared to intra-category training, which corroborates our argument that different categories of pages have unique patterns of edits.

Transfer learning results: However, we alleviate the above problem by utilizing transfer learning approaches. In specific, we initialize our model with weights pre-trained on inter-category pages and train only the final dense layer on 20% of the new edits from the fourth category. Results show that we can obtain significant improvements, i.e., at least 10% and at most 28%. This is a very promising direction to pursue further investigations, since it is very likely that abundant edits may be present in distant categories while very limited edits may manifest in a niche category that has low visibility.

4.3.4 Multi category training

Finally, we proceed toward a category agnostic training paradigm. Essentially, we hold out 10% of the pages of the annotated set for each category. We train on all remaining pages irrespective of the category information and test on the held out pages from each category. We report the results in Table 4(c). Since our model learns from edits in all categories of pages, we are able to obtain better results than in the inter-category setup. We further employ transfer learning (as in the previous sections) on the new page, which improves the results significantly (at least 6% and at most 16%).

To summarize the results in this section, we observe that testing on a previously unseen category leads to under-performance. However, retraining the dense layers with a few training examples drastically improves the performance of our model.

Models      Company        Concept        Technology     Person
            AUPRC          AUPRC          AUPRC          AUPRC
ORES        0.72           0.76           0.71           0.63
ORES++      0.84 ± 0.03    0.85 ± 0.03    0.87 ± 0.02    0.85 ± 0.03
Interrank   0.35           0.47           0.42           0.38
Word+Att    0.63 ± 0.02    0.74 ± 0.03    0.72 ± 0.01    0.78 ± 0.02
Char+Att    0.91 ± 0.01    0.84 ± 0.02    0.83 ± 0.02    0.81 ± 0.02
StRE        0.95 ± 0.02    0.89 ± 0.01    0.91 ± 0.01    0.87 ± 0.02

Table 3: AUPRC scores, with the best results in bold and gray background on the annotated dataset.

(a) Intra category AUPRC
Category     Testing without Retraining   Testing with 20% Retraining
Person       0.81                         0.85
Concept      0.77                         0.91
Company      0.76                         0.88
Technology   0.68                         0.88

(b) Inter category AUPRC
Category     Testing without Retraining   Testing with 20% Retraining
Person       0.67                         0.82
Concept      0.63                         0.81
Company      0.71                         0.82
Technology   0.72                         0.89

(c) Category agnostic AUPRC
Category     Testing without Retraining   Testing with 20% Retraining
Person       0.71                         0.83
Concept      0.85                         0.90
Company      0.74                         0.86
Technology   0.77                         0.84

Table 4: Results for intra-category, inter-category and category agnostic predictions without and with transfer learning. The transfer learning approach is always beneficial.

Model       Training AUPRC   Testing AUPRC
ORES        0.77             0.75
Interrank   0.41             0.42
Word+Att    0.64             0.77 ± 0.1
Char+Att    0.92             0.83 ± 0.09
StRE        0.95             0.88 ± 0.09

Table 5: Comparison between StRE and baselines on the complete dataset.

4.4 Page level experiments

We now consider an experimental setup agnostic of any category. In this setting, to train our model we form a set of edits which comprises 20% of the total edits in the dataset. These edits are taken from the pages which have the largest edit counts. Quantitatively, we impose a cap on the total number of training edits at 20% of the entire edit count. Subsequently, we start pooling training data from the largest page, followed by the second largest page and so on until our budget is fulfilled. The whole data so accumulated is divided into 80% training, 10% validation and 10% test sets. Results on this 10% held out data are reported in Table 5 as training AUPRC. We compare our model against other text based and event based quality predictor baselines. Since ORES is an already pre-trained web based service, we obtained its AUPRC on the 10% held out set. In the case of Interrank, 90% of the data is used for training and 10% is used as the held out set (as reported in the paper (Yardım et al., 2018)). Results show that our model performs significantly better than the baselines (by 24% in the case of ORES and by 131% in the case of Interrank).

Transfer learning results: For each of the remaining pages in our data we first utilize our pre-trained model from the last step. However, we train the dense layers with a randomly selected 20% of the datapoints from the page to be tested. The remaining data is used for testing. We follow this procedure for all remaining pages and calculate the mean test AUPRC along with the standard deviation, which we report in Table 5. In the case of ORES we evaluate on the 80% data. In the case of Interrank, we merge all remaining data into a single dataset and use 90% of the data for training and 10% for test. We show that the transfer learning approach can be useful in this setting and we obtain 17% improvement compared to ORES and 103% improvement compared to Interrank.

5 Discussion

Model retraining: We demonstrate in our experiments that a fraction of edits from unseen pages results in improvement over pretrained models. We further investigate the model performance if we increase the volume of the retraining data (results are shown for the intra-category setup; all other setups show exactly similar trends). We vary the unseen data used for fine tuning the model from 5% to 50% and show that the growth in AUPRC stabilizes (see Fig 5), which validates our proposal to utilize a smaller fraction.
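The edit-budget pooling described in Section 4.4 can be sketched as follows. The function and variable names are illustrative; the authors' exact selection code may differ.

```python
def pool_training_edits(edits_by_page: dict, budget_fraction: float = 0.2):
    """Sketch of Section 4.4: starting from the page with the most edits,
    keep adding whole pages until roughly `budget_fraction` of all edits
    has been collected."""
    total = sum(len(edits) for edits in edits_by_page.values())
    budget = int(budget_fraction * total)
    pooled = []
    for page, edits in sorted(edits_by_page.items(), key=lambda kv: len(kv[1]), reverse=True):
        if len(pooled) >= budget:
            break
        pooled.extend(edits)
    return pooled
```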

Original version: Google Maps offers detailed streetmaps and route planning information in United States and Canada.
Revised version: Google Maps offers detailed streetmaps and route planning information.

Original version: Proponents of trusted computing argue that privacy complaints have been addressed in the existing specifications - possibly as a result of criticism of early versions of the specifications.
Revised version: Proponents argued that privacy complaints are baseless.

Table 6: Anecdotal examples of edits in Google Maps and Internet Privacy wikipage. Here the general model fails to identify negative examples while retraining the dense layer learns better representations and identifies the negative examples correctly. Page specific tokens are colored in blue.

Figure 5: AUPRC using transfer learning in the intra-category setup with gradual increase in retraining percentages. Similar trends are obtained with the other setups.

Anecdotal examples: In order to obtain a deeper understanding of the results, we explore a few examples where the general model fails while retraining the dense layers leads to correct classification. In Table 6 we present two such examples. Note that our general model (without retraining the dense layers) wrongly classifies them as damaging edits while retraining leads to correct classification. We believe that retraining the dense layers leads to obtaining a superior representation of edits, whereby page specific words like 'streetmaps', 'route planning' in Google Maps or 'privacy complaints', 'trusted computing' in Internet Privacy are more pronounced.

Timing benefits: Another potential benefit is the amount of time saved per epoch, as we are only back propagating through the dense layers. To quantify the benefit in terms of time, we select a random sample of pages and train one version of our model end-to-end across all layers and another version only up to the dense layers. For our model, the average time taken per epoch achieves a ~5x improvement over the traditional approach. The performance in the two cases is almost the same. In fact, for some cases the traditional end-to-end training leads to inferior results, as the LSTM layers fail to learn the best weights with so few examples.

6 Related work

Edit quality prediction in Wikipedia has mostly been pursued in the lines of vandalism detection. Kumar et al. (2015) developed a system which utilized novel patterns embedded in user editing history to predict potential vandalism. A similar feature based approach has also been applied in both standard (Green and Spezzano, 2017) and sister projects of Wikipedia such as Wikidata (Heindorf et al., 2016; Sarabadani et al., 2017). Yuan et al. (2017) propose to use a modified version of LSTM to solve this problem, hence avoiding feature engineering. A complementary direction of investigation has been undertaken by (Daxenberger and Gurevych, 2013; Bronner and Monz, 2012), who bring forth a feature driven approach to distinguish spam edits from quality edits. A feature learning based approach has been proposed by Agrawal and Dealfaro (2016); Yardım et al. (2018), which observes all the past edits of a user to predict the quality of future edits. Temporal traces generated by edit activity have also been shown (Tabibian et al., 2017) to be a key indicator toward estimating the reliability of edits and page reputation. One of the major problems in these approaches is that they require user level history information, which is difficult to obtain because the same user may edit different Wikipedia pages of diverse categories and it would be time consuming to comb through millions of pages for each user.

There has also been no work towards understanding the possibility of predicting edit quality based on edits in pages of a common category, nor any work that leverages the advanced machinery developed in language modeling toward predicting edit quality.

Transfer learning: Several works (Long et al., 2015b; Sharif Razavian et al., 2014) in computer vision (CV) focus on the transfer learning approach, as deep learning architectures in CV tend to learn generic to specific tasks from the first to the last layer. More recently, (Long et al., 2015a; Donahue et al., 2014) have shown that fine tuning the last or several of the last layers and keeping the rest of the layers frozen can have similar benefits. In the natural language processing (NLP) literature, (Severyn and Moschitti, 2015) showed that unsupervised language model based embeddings can be tuned using a distant large corpus and then further applied on a specialized task such as sentiment classification. This approach of weak supervision followed by full supervision to learn a confident model (Dehghani et al., 2017; Howard and Ruder, 2018; Jan et al., 2016) has been shown to reduce training times in several NLP tasks. In this paper we apply a similar framework for the first time in predicting the edit quality of Wikipedia pages in one category by initializing parameters from a model trained on a different category. This is very effective in cases where the former category has a limited number of data points.

7 Conclusion

In this paper we proposed a novel deep learning based model, StRE, for quality prediction of edits in Wikipedia. Our model combines word level as well as character level signals in the orthography of Wikipedia edits for extracting a rich representation of an edit. We validate our model on a novel dataset comprising millions of edits and show the efficacy of our approach compared to approaches that utilize handcrafted features and event based modelling. One of the remarkable findings of this study is that only 20% of the training data is able to boost the performance of the model by a significant margin.

To the best of our knowledge, this is the first work which attempts to predict the edit quality of a page by learning signals from similar category pages as well as cross category pages. We further show applications of recent advances in transfer learning in this problem and obtain significant improvements in accuracy without compromising training times. We believe this work will usher considerable interest in understanding linguistic patterns in Wikipedia edit history and the application of deep models in this domain.

References

B Thomas Adler, Luca De Alfaro, Santiago M Mola-Velasco, Paolo Rosso, and Andrew G West. 2011. Wikipedia vandalism detection: Combining natural language, metadata, and reputation features. In International Conference on Intelligent Text Processing and Computational Linguistics, pages 277–288. Springer.

B Thomas Adler, Luca De Alfaro, Ian Pye, and Vishwanath Raman. 2008. Measuring author contributions to the wikipedia. In Proceedings of the 4th International Symposium on Wikis, page 15. ACM.

Rakshit Agrawal and Luca Dealfaro. 2016. Predicting the quality of user contributions via lstms. In Proceedings of the 12th International Symposium on Open Collaboration, page 19. ACM.

Kevin Atkinson. 2006. Gnu aspell 0.60.4.

Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. Dbpedia: A nucleus for a web of open data. In The Semantic Web, pages 722–735. Springer.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

Amit Bronner and Christof Monz. 2012. User edits classification using document revision histories. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 356–366. Association for Computational Linguistics.

Johannes Daxenberger and Iryna Gurevych. 2013. Automatically classifying edit categories in wikipedia revisions. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 578–589.

Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, and Jaap Kamps. 2017. Learning to learn from weak supervision by full supervision. arXiv preprint arXiv:1711.11383.

Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. 2014. Decaf: A deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning, pages 647–655.

Ethan Fast, Binbin Chen, and Michael S Bernstein. 2016. Empath: Understanding topic signals in large-scale text. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 4647–4657. ACM.

Srijan Kumar, Francesca Spezzano, and VS Subrahmanian. 2015. Vews: A wikipedia vandal early warning system. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 607–616. ACM.

Thomas Green and Francesca Spezzano. 2017. Spam users identification in wikipedia via editing behavior. In Eleventh International AAAI Conference on Web and Social Media.

Aaron Halfaker and Dario Taraborelli. 2015. Artificial intelligence service ORES gives wikipedians x-ray specs to see through bad edits.

Stefan Heindorf, Martin Potthast, Benno Stein, and Gregor Engels. 2016. Vandalism detection in wikidata. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pages 327–336. ACM.

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.

Jeremy Howard and Sebastian Ruder. 2018. Universal language model fine-tuning for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 328–339.

Christoph Hube and Besnik Fetahu. 2019. Neural based statement classification for biased language. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pages 195–203. ACM.

Deriu Jan et al. 2016. Sentiment classification using an ensemble of convolutional neural networks with distant supervision. Proceedings of SemEval (2016), pages 1124–1128.

Johannes Kiesel, Martin Potthast, Matthias Hagen, and Benno Stein. 2017. Spatio-temporal analysis of reverted wikipedia edits. In Eleventh International AAAI Conference on Web and Social Media.

Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130.

Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015a. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440.

Mingsheng Long, Yue Cao, Jianmin Wang, and Michael I Jordan. 2015b. Learning transferable features with deep adaptation networks. arXiv preprint arXiv:1502.02791.

Christopher Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven Bethard, and David McClosky. 2014. The stanford corenlp natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55–60.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.

Amir Sarabadani, Aaron Halfaker, and Dario Taraborelli. 2017. Building automated vandalism detection tools for wikidata. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 1647–1654. International World Wide Web Conferences Steering Committee.

Aliaksei Severyn and Alessandro Moschitti. 2015. Unitn: Training deep convolutional neural network for twitter sentiment classification. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 464–469.

Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. 2014. Cnn features off-the-shelf: an astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 806–813.
Aniket Kittur, Bongwon Suh, Bryan A Pendleton, and Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. Ed H Chi. 2007. He says, she says: conflict and 2015. End-to-end memory networks. In Advances coordination in wikipedia. In Proceedings of the in neural information processing systems, pages SIGCHI conference on Human factors in computing 2440–2448. systems, pages 453–462. ACM. Behzad Tabibian, Isabel Valera, Mehrdad Farajtabar, Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Le Song, Bernhard Scholkopf,¨ and Manuel Gomez- Iyyer, James Bradbury, Ishaan Gulrajani, Victor Rodriguez. 2017. Distilling information reliability Zhong, Romain Paulus, and Richard Socher. 2016. and source trustworthiness from digital traces. In Ask me anything: Dynamic memory networks for Proceedings of the 26th International Conference natural language processing. In International Con- on World Wide Web, pages 847–855. International ference on Machine Learning, pages 1378–1387. World Wide Web Conferences Steering Committee.

3971 Andrew G West, Sampath Kannan, and Insup Lee. 2010. Stiki: an anti-vandalism tool for wikipedia using spatio-temporal analysis of revision metadata. In Proceedings of the 6th International Symposium on Wikis and Open Collaboration, page 32. ACM. Wikimedia. 2019. Objective revision evaluation ser- vice ORES. Max Woolf. 2017. Pretrained word embeddings file into a character embeddings.

Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchi- cal attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computa- tional Linguistics: Human Language Technologies, pages 1480–1489. Ali Batuhan Yardım, Victor Kristof, Lucas Maystre, and Matthias Grossglauser. 2018. Can who- edits-what predict edit survival? arXiv preprint arXiv:1801.04159. Shuhan Yuan, Panpan Zheng, Xintao Wu, and Yang Xi- ang. 2017. Wikipedia vandal early detection: from user behavior to user embedding. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 832–846. Springer.
