arXiv:2103.17191v2 [cs.CL] 14 Apr 2021

Modeling Users and Online Communities for Abuse Detection: A Position on Ethics and Explainability

Pushkar Mishra, Helen Yannakoudakis, [...]

Facebook AI, London, United Kingdom
Department of Informatics, King's College London, United Kingdom
Institute for Logic, Language and Computation, University [...]

[email protected], [email protected], [...]

Abstract

Abuse on the Internet is an important societal problem of our time. Millions of Internet users face harassment, racism, personal attacks, and other types of abuse across various platforms. The psychological effects of abuse on individuals can be profound and lasting. Consequently, over the past few years, there has been a substantial research effort towards automated abusive language detection in the field of NLP. In this position paper, we discuss the role that modeling of users and online communities plays in abuse detection. Specifically, we review and analyze the state of the art methods that leverage user or community information to enhance the understanding and detection of abusive language. We then explore the ethical challenges of incorporating user and community information, laying out considerations to guide future research. Finally, we address the topic of explainability in abusive language detection, proposing properties that an explainable method should aim to exhibit. We describe how user and community information can facilitate the realization of these properties and discuss the effective operationalization of explainability in view of the properties.

1 Introduction

With the advent of social media, anti-social and abusive behavior has become a prominent occurrence online. Undesirable psychological effects of abuse on individuals make it an important societal problem of our time. Munro (2011) studied the ill-effects of online abuse on children, concluding that children may develop depression, anxiety, and other mental health problems as a result of their encounters online. Pew Research Center, in its latest report on online harassment (Duggan, 2017), revealed that 40% of adults in the United States have experienced abusive behavior online, of which 18% have faced severe forms of harassment, e.g., that of a sexual nature. These statistics stress the need for automated detection and moderation systems. Hence, in recent years, a new research effort on abusive language detection has sprung up in NLP.

That said, the notion of abuse has proven elusive and difficult to formalize. Different norms across different (online) platforms can affect what is considered abusive ([...]). In the context of natural language, [...] term that encompasses many different types of fine-grained negative expressions. For example, Nobata et al. ([...]) [...] speech, derogatory language and profanity, while Mishra et al. ([...]) [...] sexism. The definitions for different types of abuse tend to be overlapping and ambiguous. However, regardless of the specific type, we define abuse as [...] offend a particular person or group. [...] coarse-grained view, Waseem et al. ([...]) classify abuse into broad categories based on [...]. Directed [...] as opposed to [...] a larger group such as a particular gender or ethnicity. [...] form of expletives, derogatory words or threats, while [...] characterized by the presence of ambiguous terms and figures of speech such as metaphor or sarcasm.

To date, several approaches to automated detection of abusive language have been proposed, including rule-based ([...]), [...] feature engineering ([...] et al., 2010 [...]) [...] from neural networks ([...] Mehdad and Tetreault [...]