Explainable Tsetlin Machine framework for fake news detection with credibility score assessment

Bimal Bhattarai Ole-Christoffer Granmo Lei Jiao University of Agder University of Agder University of Agder [email protected] [email protected] [email protected]

Abstract particularly problematic as they seek to deceive people for political and financial gain (Gottfried The proliferation of fake news, i.e., news in- and Shearer, 2016). tentionally spread for misinformation, poses a threat to individuals and society. Despite In recent years, we have witnessed extensive various fact-checking websites such as Politi- growth of fake news in social media, spread across Fact, robust detection techniques are required news blogs, Twitter, and other social platforms. to deal with the increase in fake news. Sev- At present, most online misinformation is manu- eral models show promising re- ally written (Vargo et al., 2018). However, natural sults for fake news classification, however, language models like GPT-3 enable automatic gen- their black-box nature makes it difficult to ex- eration of realistic-looking fake news, which may plain their classification decisions and quality- assure the models. We here address this prob- accelerate future growth. Such growth is problem- lem by proposing a novel interpretable fake atic as most people nowadays digest news stories news detection framework based on the re- from social media and news blogs (Allcott and cently introduced Tsetlin Machine (TM). In Gentzkow, 2017). Indeed, the spread of fake news brief, we utilize the conjunctive clauses of poses a severe threat to journalism, individuals, and the TM to capture lexical and semantic prop- society. It has the potential of breaking societal be- erties of both true and fake news text. Fur- lief systems and encourages biases and false hopes. ther, we use the clause ensembles to calcu- late the credibility of fake news. For eval- For instance, fake news related to religion or gender uation, we conduct experiments on two pub- inequality can produce harmful prejudices. Fake licly available datasets, PolitiFact and Gossip- news can also trigger violence or conflict. When Cop, and demonstrate that the TM framework people are frequently exposed to fake news, they significantly outperforms previously published tend to distrust real news, affecting their ability baselines by at least 5% in terms of accuracy, to distinguish between truth and untruth. To re- with the added benefit of an interpretable logic- duce these negative impacts, it is critical to develop based representation. Further, our approach methods that can automatically expose fake news. provides higher F1-score than BERT and XL- Net, however, we obtain slightly lower accu- Fake news detection introduces various challeng- racy. We finally present a case study on our ing research problems. By design, fake news in- model’s explainability, demonstrating how it tentionally deceives the recipient. It is therefore arXiv:2105.09114v1 [cs.CL] 19 May 2021 decomposes into meaningful words and their difficult to detect fake news based on linguistic con- negations. tent, style, and diverseness. For example, fake news may narrate actual events and context to support 1 Introduction false claims (Feng et al., 2012). Thus, other than Social media platforms on the Internet have be- hand-crafted and data-specific features, we need to come an integral part of everyday life, and more employ a knowledge base of linguistic patterns for and more people obtain news from social rather effective detection. Training fake news classifiers than traditional media, such as newspapers. Key on crowdsourced data may further provide a poor drivers include cost-effectiveness, freedom to com- fit for future news events. Fake news is emerging ment, and shareability among friends. Despite continuously, quickly rendering previous textual these advantages, social media exposes people to content obsolete. Accordingly, some studies, such abundant misinformation. Fake news stories are as (Guo et al., 2018), have tried to incorporate so- cial context and hierarchical neural networks using Lukoianova, 2015). Many studies such as (Wang, attention to uncover more lasting semantic patterns. 2017)(Mitra and Gilbert, 2015)(Shu et al., 2020a) Despite significant advances in deep learning- incorporate publicly available datasets, providing a based techniques for fake news detection, few ap- basis for detailed analysis of fake news and detec- proaches can explain their classification decisions. tion methods. Currently, knowledge on the dynamics underlying Recently, deep learning-based latent representa- fake news is lacking. Thus, explaining why cer- tion of text has significantly improved fake news tain news items are considered fake may uncover classification accuracy (Karimi and Tang, 2019). new understanding. Making the reasons for deci- However, the latent representations are generally sions transparent also facilitates discovering and difficult to interpret, providing limited insight into rectifying model weaknesses. To the best of our the nature of fake news. In (Castillo et al., 2011), knowledge, previous research has not yet addressed the authors introduced features based on social con- explainability in fake news detection. text, obtained from the profiles of users and their Paper contributions: In this paper, we propose activity patterns. Other approaches depend upon so- an explainable framework for fake news detection cial platform-specific features such as likes, tweets, built using the Tsetlin Machine (TM). Our TM- and retweets for (Volkova et al., based framework captures the frequent patterns that 2017)(Ruchansky et al., 2017). characterize real news, distilling both linguistic and While the progress on detecting fake news has semantic patterns unique for fake news stories. The been significant, limited effort has been directed to- resulting model constitutes a global interpretation wards interpretability. Existing deep learning meth- of fake news. To provide a more refined view on ods generally extract features to train classifiers individual fake news (local interpretability), we without giving any interpretable explanation. This also propose a credibility score for measuring the lack of transparency makes them black boxes when credibility of news. Finally, our framework allows it comes to understandability (Du et al., 2019). In the practitioner to see what features are critical for this paper, we propose a novel fake news detection making news fake (global interpretability). method that builds upon the TM (Granmo, 2018). Paper organization: The remainder of the pa- The TM is a recent approach to pattern classifica- per is organized as follows. In Section2, we briefly tion, regression, and novelty detection (Abeyrathna summarize related work. We then detail our ex- et al., 2021; Yadav et al., 2021; Abeyrathna et al., plainable framework in Section3. In Section4, the 2020; Bhattarai et al., 2021) that attempts to bridge datasets and experimental configurations are pre- the present gap between interpretability and accu- sented. We analyze our empirical results in Section racy in state-of-the-art . By using 5, before we conclude the work in the last section. the AND-rules of the TM to capture lexical and semantic properties of fake news, our aim is to 2 Related Work improve the performance of fake news detection. More importantly, our framework is intrinsically The problem of detecting deception is not new to interpretable, both globally, at the model level, and natural language processing (Vargo et al., 2018). locally for the individual false news predictions. Significant application domains include detecting false online advertising, fake consumer reviews, 3 Explainable Fake News Detection and spam emails (Ott et al., 2011)(Zhang and Framework Guan, 2008). The detection of fake news focuses on uncovering spread of misleading news articles In this section, we present the details of our TM- (Zhou and Zafarani, 2020)(Zhou et al., 2019). Typ- based fake news detection framework. ical detection techniques use either text-based lin- guistic features (Potthast et al., 2018) or visual 3.1 TM Architecture features (Gupta et al., 2013). Overall, fake news The TM consists of a team of two-action Tsetlin detection methods fall into two groups: knowledge- Automata (TAs) with 2N states. Each TA performs based models based on fact-checking news arti- either action “Include” (in State 1 to N) or action cles using external sources (WuYou et al., 2014), “Exclude” (in State N to 2N). Jointly, the actions and style-based models, which leverage linguis- specify a pattern recognition task. Action “Include” tic features capturing writing style (Rubin and incorporates a specific sub-pattern, while action Text Input Input Binary Encoded Input Real News Fake News

Clauses

Summation

Threshold Argmax

Output y

Figure 1: Interpretable Tsetlin machine architecture.

“Exclude” rejects the sub-pattern. The TAs are up- clauses. Positive polarity is assigned to one half of + dated based on iterative feedback in the form of the clauses, denoted by Cj . These are to capture rewards or penalties. The feedback signals how patterns for the target class (y = 0 for TM1 and well the pattern recognition task is solved, with the y = 1 for TM2). Negative polarity is assigned to − intent of maximizing classification accuracy. Re- the other half, denoted by Cj . Negative polarity wards reinforce the actions performed by the TAs, clauses are to capture patterns for the non-target while penalties suppress the actions. class (y = 1 for TM1 and y = 0 for TM2). In By orchestrating the TA team with rewards and effect, the positive polarity clauses vote for the in- penalties, a TM can capture frequent and discrimi- put belonging to the target class, while negative native patterns from labeled training data. Each pat- polarity clauses vote against the target class. + tern is represented as a conjunctive clause in propo- Any clause Cj for a certain target class is + sitional logic, based on the human-interpretable formed by ANDing a subset Lj ⊆ L of the lit- disjunctive normal form (Valiant, 1984). That is, eral set, written as: the TM produces AND-rules formed as conjunc- + ^ Y tions of propositional variables and their negations. Cj (X) = lk = lk, (1) l ∈L+ l ∈L+ A two-class TM structure is shown in Figure1, k j k j consisting of two separate TMs (TM1 and TM2). As seen in the Input-step, each TM takes a Boolean where j = (1, . . . , m/2) denotes the clause in- dex, and the superscript decides the polarity of the (propositional) vector X = (x1, . . . , xo), xk ∈ + {0, 1}, k ∈ {1, . . . , o} as input, which is ob- clause. For instance, the clause Cj (X) = x1x2 + tained by booleanizing the text input as suggested consists of the literals Lj = {x1, x2}, and it out- in (Berge et al., 2019; Yadav et al., 2021). That puts 1 if both of the literals are 1-valued. Similarly, − is, the text input is modelled as a set of words, we have clauses Cj for the non-target class. with each Boolean input signaling the presence The final classification decision is done after or absence of a specific word. From the input summing up the clause outputs (Summation-step vector, we obtain 2o literals L = (l1, l2, . . . , l2o). in the figure). That is, the negative outputs are The literals consist of the inputs xk and their substracted from the positive outputs. Employing a negated counterparts x¯k = ¬xk = 1 − xk, i.e., single TM, the sum is then thresholded using the L = (x1, . . . , xn, ¬x1,..., ¬xn). unit step function u, u(v) = 1 if v ≥ 0 else 0, as shown in Eq. (2): A TM forms patterns using m conjunctive clauses Cj (Figure1 – Clauses). How the pat- m/2 m/2  terns relate to the two output classes (y = 0 and X + X − yˆ = u  Cj (X) − Cj (X) . (2) y = 1) is captured by assigning polarities to the j=1 j=1 Type I (a) feedback Type I (b) feedback Type II feedback Include Exclude Include Exclude Include Exclude

time stamp for time stamp for time stamp for

Type I (a) feedback Type I (b) feedback Type II feedback Include Exclude Include Exclude Include Exclude

time stamp for time stamp for time stamp for

Figure 2: Visualization of Tsetlin machine learning.

The summation of clause outputs produces when y = 1 and to negative polarity clauses when an adaptive ensemble effect designed to y = 0. Type II feedback, on the other hand, in- help dealing with noisy and diverse data creases the discriminating power of the patterns, (Granmo, 2018). For example, the classifier suppresses false positives, and makes clauses eval- yˆ = u (x1x¯2 +x ¯1x2 − x1x2 − x¯1x¯2) captures the uate to 0. Type II feedback is given to positive XOR-relation. polarity clauses when y = 0 and to negative polar- With multi-class problems, the classification is ity clauses when y = 1. instead performed using the argmax operator, as The feedback is further regulated by the sum of shown in the figure. Then the target class of the votes v for each output class. That is, the voting TM with the largest vote sum is given as output. sum is compared against a voting margin T , which is employed to guide distinct clauses to learn differ- 3.2 Interpretable Learning Process ent sub-patterns. The details of the learning process As introduced briefly above, the conjunctive can be found in (Granmo, 2018). clauses in a TM are formed by a collection of We use the following sentence as an example TAs. Each TA decides whether to “Include” or to show how the inference and learning process “Exclude” a certain literal in a specific clause based can be interpreted: X= [“Building a wall on the on reinforcement, i.e., rewards, penalties, or inac- U.S-Mexico border will take literally years.”], with tion feedback. The reinforcement depends on six output target “true news” i.e., y = 1. First, the in- factors: (1) target output (y = 0 or y = 1), (2) put is tokenized and negated: X = [“build” = clause polarity, (3) clause output (Cj = 0 or 1), (4) 1, “¬build” = 0, “wall” = 1, “¬wall” = literals value (i.e., x = 1, and ¬x = 0), (5) vote 0, “U.S − Mexico” = 1, “¬U.S − Mexico” = sum, and (6) the current state of the TA. 0, “take” = 1, “¬take” = 0, “years” = 1, The TM learning process carefully guides the “¬years” = 0]. We consider two positive polar- + TAs to converge towards optimal decisions. To ity clauses and one negative polarity one (i.e., C1 , + − this end, the TM organizes the feedback it gives C2 , and C1 ) to show different feedback conditions to the TAs into two feedback types. Type I feed- occurring while learning the propositional rules. back is designed to produce frequent patterns, com- The learning dynamics are illustrated in Figure2. bat false negatives, and make clauses evaluate to 1. The upper part of the figure shows the current con- Type I feedback is given to positive polarity clauses figuration of clauses and TA states for three dif- ferent scenarios. On the left side of the vertical 3.3 Problem Statement for Fake News bar, the literals are included in the clause, and on Detection the right side of the bar, the literals are excluded. We now proceed with defining the fake news detec- The upper part also shows how the three different tion problem formally. Let A be a news article with kinds of feedback modifies the states of the TAs, ei- si sentences, where i = 1 to S and S is the total ther reinforcing “Include” or “Exclude”. The lower number of sentences. Each sentence can be written part depicts the new clause configurations, after the i i i as si = (w1, w2, . . . , wW ), consisting of W words. reinforcement has been persistently applied. Given the booleanized input vector X ∈ {0, 1}o, the fake news classification is a Boolean classifica- For the first time stamp in the figure, we as- tion problem, where we predict whether the news sume the clauses are initialized randomly as fol- article A is fake or not, i.e., F : X → y ∈ {0, 1}. + + We also aim to learn the credibility of news by lows: C1 = (build ∧ wall ∧ years), C2 = (build∧wall∧years∧¬take∧¬U.S −Mexico), formulating a TM-based credibility score that mea- − sures how check-worthy the news is. With this and C1 = (build ∧ take ∧ years). Since clause + addition, the classifier function can be written as C1 consists of non-negated literals of value 1 in the input X, the output becomes 1 (because F : X → (y, Q), where y is a classification output + and Q is the credibility score. C1 = 1 ∧ 1 ∧ 1). This invokes the condition + (C1 = 1 and y = 1). So, as shown in upper left 3.4 Credibility Assessment of Figure2, when the actual output is 1 and the clause output is 1, Type I (a) feedback is used to re- The credibility score can be calculated as fol- inforce the Include action. Therefore, literals such lows. Firstly, the TM architecture in Section 3.1 is as U.S−Mexico and take are eventually included, slightly tweaked for the score generation. In the because they appear in X. This process makes the architecture, instead of identifying the class with + the largest voting sum using Argmax, we obtain the clause C1 eventually have one sub-pattern cap- tured for y = 1, as shown on the lower left side of raw score from both the output classes. The raw Figure2. I.e., take and U.S − Mexico have been score is generated using clauses, which represent included in the clause. frequent lexical and semantic patterns that charac- terizes the respective classes. Therefore, the score contains information on how the input resembles + Similarly, in C2 , the clause output is 0 because the patterns captured by the clauses. We thus use of the negated inputs ¬take and ¬U.S − Mexico this score to measure the credibility of news. For + (i.e., C2 = 1 ∧ 1 ∧ 1 ∧ 0 ∧ 0). This invokes the instance, consider the vote ratios of 2:1 and 10:1 + condition (C2 = 0 and y = 1). Here, so-called for two classes (i.e., real vs. fake news) obtained Type I (b) feedback is used to reinforce Exclude from two different inputs. Then the first class wins actions. In this example, literals such as ¬take the majority vote in both cases, however, the 10:1 and ¬U.S − Mexico are eventually excluded from vote ratio suggests higher credibility than for the + C2 as shown in the middle part of Figure2, thus input that gave a ratio of 2:1. + establishing another sub-pattern for C2 . To normalize the credibility score so that it falls between 0 and 1, we apply the logistic function to the formula in Eq. (2) as shown in Eq. (3). Type II feedback (right side of figure) only oc- curs when y = 0 (for positive polarity clauses) 1 Qi = . (3) −k vF −vT and y = 1 (for negative polarity clauses). We 1 + exp ( i i ) − here use the negative polarity clause C1 to demon- th strate the effect of Type II feedback. This type of Above, Qi is the credibility score of the i sample, F T feedback is triggered when the clause output is 1. and k is the logistic growth rate. vi and vi are The goal is now to make the output of the affected the total sum of votes collected by clauses for fake clause change from 1 to 0 by adding 0-valued in- and true news respectively. Here, k is a user config- puts. Type II feedback can be observed on the urable parameter deciding the slope of the function. right side of Figure2, where all negated literals For example, consider the scores (43, -47) and (124, − finally are included in C1 to ensure that the clause -177), both predicting fake class. The credibility outputs 0. scores obtained from Eq. (3) with k = 0.012 are 0.74 and 0.97, which shows that the second news Dataset #Real #Fake #Total is more credible than the first one. PolitiFact 563 391 954 GossipCop 15,338 4,895 20,233 4 Experiment Setup Table 1: Dataset statistics. In this section, we present the experiment configu- rations we use to evaluate our proposed TM frame- • RST (Rubin et al., 2015): Rhetorical Structure work. Theory (RST) represents the relationship be- tween words in a document by building a tree 4.1 Datasets structure. It extracts the news style features from a bag-of-words by mapping them into a For evaluation, we adopt the publicly available latent feature representation. fake news detection data repository FakeNewsNet (Shu et al., 2017). The repository consists of news • LIWC (Pennebaker et al., 2015): Linguistic content from different fact-checking websites, in- Inquiry and Word Count (LIWC) is used to ex- cluding social context and dynamic information. tract and learn features from psycholinguistic We here use news content annotated with labels and deception categories. by professional journalists from the fact-checking websites PolitiFact and GossipCop. PolitiFact is • HAN (Yang et al., 2016): HAN uses a hier- a fact-checking website that focuses on U.S polit- archical attention neural network (HAN) for ical news. We extract news articles published till embedding word-level attention on each sen- 2017. GossipCop focuses on fact-checking of en- tence and sentence-level attention on news tertainment news collected from various medias. content, for fake news detection. While GossipCop has more fake news articles than PolitiFact, PolitiFact is more balanced, as shown in • CNN-text (Kim, 2014): CNN-text utilizes Table1. convolutional neural network (CNN) with pre- trained word vectors for sentence-level clas- 4.2 Preprocessing sification. The model can capture different granularities of text features from news arti- The relevant news content and the associated la- cles via multiple convolution filters. bels, i.e., True or False news, are extracted from both of the datasets. The preprocessing steps in- • LSTM-ATT (Lin et al., 2019): LSTM-ATT clude cleaning, tokenization, lemmatization, and utilizes long short term memory (LSTM) with feature selection. The cleaning is done by remov- an attention mechanism. The model takes ing less important information such as hyperlinks, a 300-dimensional vector representation of stop words, punctuation, and emojis. The datasets news articles as input to a two-layer LSTM are then booleanized into a sparse matrix for the for fake news detection. TM to process. The matrix is obtained by build- ing a vocabulary consisting of all unique words • RoBERTa-MWSS: The Multiple Sources of in the dataset, followed by encoding the input as Weak Social Supervision (MWSS) approach, a set of words using that vocabulary. To reduce built upon RoBERTa (Liu et al., 2019), was the sparseness of the input, we adopt two methods: proposed in (Shu et al., 2020b). 1) Chi-squared test statistics as a feature selection technique, and 2) selecting the most frequent liter- • BERT (Devlin et al., 2019): Bidirectional als from the dataset. The experiment is performed Encoder Representations from Transformers using both methods, and the best results are in- (BERT) is a Transformer-based model which cluded. For the PolitiFact and GossipCop datasets, contains an encoder with 12 transformer we selected the 20 000 and 25 000 most significant blocks, self-attention heads and hidden shape features, respectively. size of 768. • XLNet (Yang et al., 2019): XLNet is a gener- 4.3 Baseline alized autoregressive pretraining model that We first summarize the state-of-the-art of the fake integrates autoencoding and a segment recur- news detection approaches. rence mechanism from transformers. In addition, we compare our results with other is arguably because HAN can capture the syntactic machine learning baseline models such as logis- and semantic rules using hierarchical attention to tic regression (LR), naive Bayes, support vector detect fake news. Similarly, the LIWC performs machines (SVM), and (RF). For a better than RST. One possible explanation for this fair comparison, we select the methods that only is that LIWC can capture the linguistic features in extract textual features from news articles. The news articles based on words that denote psycholin- performance for these baseline models has been re- guistic characteristics. The LSTM-ATT, which has ported in (Shu et al., 2019) and (Shu et al., 2020a). extensive preprocessing using count features and sentiment features along with hyperparameter tun- 4.4 Training and Testing ing (Lin et al., 2019), has similar performance com- We use a random training-test split of 75%/25%. pared with HAN in PolitiFact, however, outper- The classification results is obtained on the test forms HAN on GossipCop. One reason for this can set. The process is repeated five times, and the be that the attention mechanism is able to capture average accuracy and F1 scores are reported. To the relevant representation of the input. In addition, provide robust results, we calculated an ensemble we see that machine learning models such as LR, average by first take the average of 50 stable epochs, SVM, and Naive Bayes are not very effective in followed by taking the average of the resulting five neither of the datasets. averages. We run our TM for 200 epochs with The recent transformer-based models BERT and hyperparameter configuration of 10 000 clauses, a XLNet outperform our model in terms of accuracy. threshold T of 200, and sensitivity s of 25.0. To Our TM-based approach obtains slightly lower ac- ensure a fair comparison of the results obtained curacy, achieving 87.1% for PolitiFact and 84.2% by other machine learning algorithms, we use the for GossipCop. However, the F1 score for poli- Python Scikit-learn framework that includes LR, tiFact is 0.90, whereas for GossipCop it is 0.89, SVM and Na¨ıve Bayes, using default parameter which is marginally better than XLNet and BERT. settings. Thus, we achieve state-of-the-art performance with 5 Results and Discussion respect to F1-score overall, and with respect to accuracy when comparing with interpretable ap- We now compare our framework with above men- proaches. Since GossipCop is unbalanced with tioned baseline models. sample ratio of 3:1 for real and fake news, we sub- mit that the F1 score is a more appropriate perfor- 5.1 Comparison with State-of-the-Art mance measure than accuracy. Overall, our model The experimental results are shown in Table2. is much simpler than the deep learning models be- Clearly, our model outperforms several of the other cause we do not use any pre-trained embeddings baseline models in terms of accuracy and F1 score. for preprocessing. This also helps in making our model more transparent, interpretable, and explain- PolitiFact GossipCop able. Models Acc. F1 Acc. F1 RST 0.607 0.569 0.531 0.512 The credibility assessment is performed as de- LIWC 0.769 0.818 0.736 0.572 scribed in Subsection 3.4. The Figs.3 and4 show HAN 0.837 0.860 0.742 0.672 the credibility scores for a fake news sample from CNN-text 0.653 0.760 0.739 0.569 LSTM-ATT 0.833 0.836 0.793 0.798 PolitiFact and GossipCop, resepectively. As seen, LR 0.642 0.633 0.648 0.646 the fake news can be ranked quite distinctly. This SVM 0.580 0.659 0.497 0.595 facilitates manual checking according to credibility, Na¨ıve Bayes 0.617 0.651 0.624 0.649 RoBERTa-MWSS 0.825 0.805 0.803 0.807 allowing users to focus on the fake news articles BERT 0.88 0.87 0.85 0.79 with highest scores. However, when we observe the XLNet 0.895 0.90 0.855 0.78 soft scores e.g. obtained from XLNet, most of the TM 0.871 0.901 0.842 0.896 samples are given rather extreme scores. That is, Table 2: Performance comparison of our model with 8 classifications are in general submitted with very baseline models. high confidence. This is in contrast to the more cautious and diverse credibility scores produced by One can further observe that HAN outperforms TM clause voting. If we for instance set a credi- LIWC, CNN-text, and RST for both datasets. This bility score threshold of 0.8 in TM, we narrow the 1.0 negated features from the negative polarity clauses of other classes, as well as the the plain features 0.8 from positive polarity clauses for the intended class. Taking the positive- and negative polarity claues 0.6 together, one gets stronger discrimination power. Table3 and Table4 exemplify the above be- 0.4 Credibility havior for fake news detection. To showcase the explainability of our approach, we list the ten most 0.2 captured plain and negated words per class, for both datasets. We observe that negated literals ap- 0.0 pear quite frequently in the clauses. This allows the 0 100 200 300 400 trained TM to represent the class by the features Samples that characterize the class as well as the features Figure 3: Credibility assessment for fake news in Poli- that contrast it from the other class. TM clauses that tiFact. contain negated and unnegated inputs are termed non-monotone clauses, and these are crucial for 1.0 human commonsense reasoning, as explored in “Non-monotonic reasoning” (Reiter, 1987). 0.8 Also observe that the TM clauses include both descriptive words and discriminative words. For ex- 0.6 ample, in PolitiFact, the Fake class captures words like “trump”, while the True class captures the 0.4 Credibility negated version, i.e., “¬trump”. However, this does not mean that all the news related to Trump are 0.2 Fake. Actually, it means that if we break down the clauses into literals, we see “trump” as a descrip- 0.0 tor/discriminator for the Fake News class. There- 0 1000 2000 3000 4000 5000 Samples fore, most of the clauses captured the word “trump” in plain and in negated form (for the True news Figure 4: Credibility assessment for fake news in Gos- class). However, one single word cannot typically sipCop. produce an accurate classification decision. It is selection of fake news down to around 300 out of the joint contribution of the literals in a clause that 400 for PolitiFact and to around 2 800 out of 4 895 contributes to the high accuracy. Hence, we have for GossipCop. to look at all word pattern captured by the clauses. When an input is passed into the trained TM, the 5.2 Case Study: Interpretability clauses from both classes that capture the input We now investigate the interpretability of our TM word pattern are activated, to vote for their respec- approach in fake news classification by analyzing tive class. Finally, the classification is made based the words captured by the clauses for both true on the total votes gathered by both classes for the and fake news. In brief, the clauses capture two particular input. different types of literals: 6 Conclusions • Plain literals, which are plain words from the target class. In this paper, we propose an explainable and inter- pretable Tsetlin Machine (TM) framework for fake • Negated literals, which are negated words news classification. Our TM framework employs from the other classes. clauses to capture the lexical and semantic features The TM utilizes both plain and negated word pat- based on word patterns in a document. We also ex- terns for classification. When we analyze the plain the transparent TM learning of clauses from clauses, we see that most of the sub-patterns cap- labelled text. The extensive experimental evalua- tured consist of negated literals. This helps TM tions demonstrate the effectiveness of our model make decisions robustly because it can use both on real-world datasets over various baselines. Our PolitiFact Fake True Plain times Negated times Plain times Negated times trump 297 candidate 529 congress 136 trump 1252 said 290 debate 413 tax 104 profession 1226 comment 112 civil 410 support 70 navigate 1223 donald 110 reform 369 senate 64 hackings 1218 story 78 congress 365 president 60 reported 1216 medium 63 iraq 361 economic 57 arrest 1222 president 48 lawsuit 351 americans 49 camps 1206 reported 45 secretary 348 candidate 48 investigation 1159 investigation 38 tax 332 debate 44 medium 1152 domain 34 economy 321 federal 41 domain 1153

Table 3: Top ten Literals captured by clauses of TM for PolitiFact.

GossipCop Fake True Plain times Negated times Plain times Negated times source 357 stream 794 season 150 insider 918 insider 152 aggregate 767 show 103 source 802 rumors 86 bold 723 series 79 hollywood 802 hollywood 80 refreshing 722 like 78 radar 646 gossip 49 castmates 721 feature 70 cop 588 relationship 37 judgment 720 video 44 publication 579 claim 33 prank 719 said 33 exclusively 551 split 32 poised 718 sexual 32 rumor 537 radar 32 resilient 714 notification 25 recalls 535 magazine 30 predicted 714 character 25 kardashian 525

Table 4: Top ten Literals captured by clauses of TM for GossipCop.

results show that our approach is competitive with and fake news in the 2016 election. NBER Working far more complex and non-transparent methods, in- Paper Series. cluding BERT and XLNet. In addition, we demon- Geir Thore Berge, Ole-Christoffer Granmo, T. Tveit, strate how fake news can be ranked according to a M. Goodwin, Lei Jiao, and Bernt Viggo Matheussen. credibility score based on classification confidence. 2019. Using the tsetlin machine to learn human- We finally demonstrate the explainability of our interpretable rules for high-accuracy text catego- rization with medical applications. IEEE Access, model using a case study. In our future work, we 7:115134–115146. intend to go beyond using pure text features, also incorporating spatio-temporal and other meta-data Bimal Bhattarai, Ole-Christoffer Granmo, and L. Jiao. features available from social media content, for 2021. Measuring the Novelty of Natural Language Text Using the Conjunctive Clauses of a Tsetlin Ma- potentially improved accuracy. chine Text Classifier. In 13th International Confer- ence on Agents and Artificial Intelligence (ICAART 2021). INSTICC. References C. Castillo, Marcelo Mendoza, and B. Poblete. 2011. K. Darshana Abeyrathna, Bimal Bhattarai, Morten Information credibility on twitter. In WWW. Goodwin, Saeed Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, and Rohan K. Yadav. 2021. J. Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Massively Parallel and Asynchronous Tsetlin Ma- Toutanova. 2019. Bert: Pre-training of deep bidirec- chine Architecture Supporting Almost Constant- tional transformers for language understanding. In Time Scaling. In The Thirty-eighth International NAACL-HLT. Conference on Machine Learning (ICML 2021). ICML. Mengnan Du, Ninghao Liu, and Xia Hu. 2019. Tech- niques for interpretable machine learning. Commun. K. Darshana Abeyrathna, Ole-Christoffer Granmo, ACM, 63:68–77. Xuan Zhang, Lei Jiao, and Morten Goodwin. 2020. The Regression Tsetlin Machine - A Novel Song Feng, Ritwik Banerjee, and Yejin Choi. 2012. Approach to Interpretable Non-. Syntactic stylometry for deception detection. In Philosophical Transactions of the Royal Society A, ACL. 378. Jeffrey A. Gottfried and E. Shearer. 2016. News use H. Allcott and Matthew Gentzkow. 2017. Social media across social media platforms 2016. Ole-Christoffer Granmo. 2018. The Tsetlin machine - N. Ruchansky, Sungyong Seo, and Y. Liu. 2017. Csi: A a game theoretic bandit driven approach to optimal hybrid deep model for fake news detection. Proceed- pattern recognition with propositional logic. ArXiv, ings of the 2017 ACM on Conference on Information abs/1804.01508. and Knowledge Management.

Han Guo, J. Cao, Y. Zhang, Junbo Guo, and J. Li. Kai Shu, Limeng Cui, Suhang Wang, D. Lee, and Huan 2018. Rumor detection with hierarchical social at- Liu. 2019. defend: Explainable fake news detec- tention network. Proceedings of the 27th ACM Inter- tion. Proceedings of the 25th ACM SIGKDD In- national Conference on Information and Knowledge ternational Conference on Knowledge Discovery & Management. . A. Gupta, H. Lamba, P. Kumaraguru, and A. Joshi. Kai Shu, Deepak Mahudeswaran, Suhang Wang, Dong- 2013. Faking sandy: characterizing and identifying won Lee, and Huan Liu. 2020a. FakeNewsNet: A fake images on twitter during hurricane sandy. In data repository with news content, social context, WWW ’13 Companion. and spatiotemporal information for studying fake Hamid Karimi and Jiliang Tang. 2019. Learning hierar- news on social media. Big Data, 8(3):171–188. chical discourse-level structure for fake news detec- tion. In Proceedings of the 2019 Conference of the Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, and North. Association for Computational Linguistics. Huan Liu. 2017. Fake news detection on social me- dia: A data mining perspective. ACM SIGKDD Ex- Yoon Kim. 2014. Convolutional neural networks for plorations Newsletter, 19(1):22–36. sentence classification. In EMNLP. Kai Shu, Guoqing Zheng, Yichuan Li, Subhabrata J. Lin, Glenna Tremblay-Taylor, Guanyi Mou, Di You, Mukherjee, Ahmed Hassan Awadallah, Scott W. and Kyumin Lee. 2019. Detecting fake news arti- Ruston, and Huan Liu. 2020b. Leveraging multi- cles. 2019 IEEE International Conference on Big source weak social supervision for early detection Data (Big Data), pages 3021–3025. of fake news. ArXiv, abs/2004.01732.

Y. Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Leslie G Valiant. 1984. A Theory of the Learnable. Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Communications of the ACM, 27(11):1134–1142. Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. Chris J. Vargo, Lei Guo, and Michelle A. Amazeen. ArXiv, abs/1907.11692. 2018. The agenda-setting power of fake news: A T. Mitra and E. Gilbert. 2015. Credbank: A large-scale big data analysis of the online media landscape from social media corpus with associated credibility anno- 2014 to 2016. New Media & Society, 20:2028 – tations. In ICWSM. 2049. Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Han- Svitlana Volkova, Kyle Shaffer, Jin Yea Jang, and cock. 2011. Finding deceptive opinion spam by any Nathan Oken Hodas. 2017. Separating facts from stretch of the imagination. In The 49th Annual Meet- fiction: Linguistic models to classify suspicious and ing of the Association for Computational Linguis- trusted news posts on twitter. In ACL. tics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Ore- William Yang Wang. 2017. ”liar, liar pants on fire”: A gon, USA, pages 309–319. The Association for Com- new benchmark dataset for fake news detection. In puter Linguistics. ACL.

J. Pennebaker, Ryan L. Boyd, K. Jordan, and Kate G. WuYou, K. AgarwalPankaj, LiChengkai, Yang-jun, Blackburn. 2015. The development and psychomet- and YuCong. 2014. Toward computational fact- ric properties of liwc2015. checking. In VLDB 2014.

Martin Potthast, Johannes Kiesel, K. Reinartz, Janek Rohan Yadav, Lei Jiao, Ole-Christoffer Granmo, and Bevendorff, and Benno Stein. 2018. A stylometric Morten Goodwin. 2021. Human-level interpretable inquiry into hyperpartisan and fake news. In ACL. learning for aspect-based sentiment analysis. In AAAI. R Reiter. 1987. Nonmonotonic reasoning. Annual Re- view of Computer Science, 2(1):147–186. Z. Yang, Zihang Dai, Yiming Yang, J. Carbonell, Victoria L. Rubin, Niall Conroy, and Yimin Chen. R. Salakhutdinov, and Quoc V. Le. 2019. Xlnet: 2015. Towards news verification: Deception detec- Generalized autoregressive pretraining for language tion methods for news discourse. understanding. In NeurIPS. Victoria L. Rubin and Tatiana Lukoianova. 2015. Truth Zichao Yang, Diyi Yang, Chris Dyer, X. He, Alex and deception at the rhetorical structure level. Jour- Smola, and E. Hovy. 2016. Hierarchical atten- nal of the Association for Information Science and tion networks for document classification. In HLT- Technology, 66. NAACL. Linfeng Zhang and Yong Guan. 2008. Detecting click fraud in pay-per-click streams of online advertising networks. 2008 The 28th International Conference on Distributed Computing Systems, pages 77–84. X. Zhou, R. Zafarani, Kai Shu, and H. Liu. 2019. Fake news: Fundamental theories, detection strate- gies and challenges. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. Xinyi Zhou and Reza Zafarani. 2020. A survey of fake news. ACM Computing Surveys, 53(5):1–40.