A Stylometric Inquiry Into Hyperpartisan and Fake News

Martin Potthast*, Johannes Kiesel†, Kevin Reinartz†, Janek Bevendorff†, Benno Stein†
*Leipzig University, †Bauhaus-Universität Weimar
webis.de
ACL, July 16th, 2018
@KieselJohannes

What are Fake News?
Disinformation displayed as news articles.
Image: Claire Wardle, First Draft

A Stylometric Inquiry into Hyperpartisan “News” and “News” in False Context and/or with Content that is Impostered, Manipulated, and/or Fabricated

The Political Spectrum
"The left-right political spectrum is a system of classifying political positions, ideologies and parties. Left-wing politics and right-wing politics are often presented as opposed, although either may adopt stances from the other side." [Wikipedia]

Alt-left | Left | Center | Right | Alt-right
Liberal (left side) | Conservative (right side)
Hyperpartisan | Partisan | (Center) | Partisan | Hyperpartisan

Partisan: someone with a psychological identification with one major party. [Wikipedia]

News media reporting on politics can be aligned on this spectrum as well. We are observing an increasing number of hyperpartisan news publishers.

Fake News and Hyperpartisan News
Why are Fake News Published by Hyperpartisan Pages?
Image: Claire Wardle, First Draft

Fake News Detection: Taxonomy of Approaches

Knowledge-based (also called fact checking)
Sub-approaches: information retrieval; semantic web / LOD
- Requires political knowledge base
- Unavailable ahead of time
- We cannot trust the web

Context-based
Sub-approaches: social network analysis
- Limited to social media platforms
- Part of damage already done

Style-based
Sub-approaches: deception detection; text categorization
- Allows for pre-posting check
- Real-time reaction possible
- Hard to mask
- But are style differences sufficient?

Work cited on the taxonomy slide: Etzioni et al., 2018; Magdy and Wanas, 2010; Ginsca et al., 2015; Wu et al., 2014; Ciampaglia et al., 2015; Shi and Weninger, 2016; Long et al., 2017; Mocanu et al., 2015; Acemoglu et al., 2010; Kwon et al., 2013; Ma et al., 2017; Volkova et al., 2017; Budak et al., 2011; Nguyen et al., 2012; Derczynski et al., 2017; Tambuscio et al., 2015; Wei et al., 2013; Chen et al., 2015; Rubin et al., 2015; Wang et al., 2017; Bourgonje et al., 2017; Afroz et al., 2012; Badaskar et al., 2008; Rubin et al., 2016; Yang et al., 2017; Rashkin et al., 2017; Horne and Adali, 2017; Pérez-Rosas et al., 2017.

Fake News and Hyperpartisan News: Corpus Construction

                           Fact-checking results
Orientation / Publisher    true    mix  false    n/a      Σ
Center                      806      8      0     12    826
  ABC News                   90      2      0      3     95
  CNN                       295      4      0      8    307
  Politico                  421      2      0      1    424
Left-wing                   182     51     15      8    256
  Addicting Info             95     25      8      7    135
  Occupy Democrats           59     25      7      0     91
  The Other 98%              28      1      0      1     30
Right-wing                  276    153     72     44    545
  Eagle Rising              106     47     25     36    214
  Freedom Daily              49     24     22      4     99
  Right Wing News           121     82     25      4    232
Σ                          1264    212     87     64   1627

Annotations provided by journalists at BuzzFeed.

Fake News and Hyperpartisan News: Selected Results
- Fake news detection: precision ≈ 42%, recall ≈ 41%
- Orientation detection: precision ≈ 21%, recall ≈ 20%; precision ≈ 56%, recall ≈ 59%
- Hyperpartisanship detection: precision ≈ 69%, recall ≈ 89%

Fake News and Hyperpartisan News
How can it be that the alt left and the alt right cannot be distinguished from the mainstream, when both together (hyperpartisan news) can be?

Linear view: Alt-left | Left | Center | Right | Alt-right
Horseshoe view: Center at the bend, Left and Right (partisan) on the two arms, Alt-left and Alt-right (hyperpartisan) at the two ends.

"The horseshoe theory asserts that the alt left and the alt right, rather than being at opposite and opposing ends of a linear political continuum, in fact closely resemble one another, much like the ends of a horseshoe." [Wikipedia]

Horseshoe Validation Experiment I: Leave-out Classification
- Classifier is trained to distinguish left-wing and center articles
- Right-wing articles are used for testing
- Majority of right-wing articles are classified as left-wing rather than center

Result for held-out right-wing articles: 74% | 26%
A second diagram over left-wing, center, and right-wing shows a 34% | 66% split.

Horseshoe Validation Experiment II: Unmasking [Koppel/Schler 2004]
[Figure: two sets of texts, A and B, each shown as a matrix of feature values, with the question of whether A and B are the same ("? =").]
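The leave-out setup of Experiment I can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the nearest-centroid classifier, the two-dimensional feature vectors, and the names `train`, `predict`, and `leave_out_shares` are all hypothetical, and the toy numbers are invented, so the output does not reproduce the percentages on the slides.

```python
from collections import Counter


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(train_sets):
    """train_sets maps a class name to its list of feature vectors."""
    return {label: centroid(vecs) for label, vecs in train_sets.items()}


def predict(model, vector):
    """Return the class whose centroid is closest to the vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


def leave_out_shares(train_sets, held_out):
    """Train on two classes, then report how the unseen third class is split."""
    model = train(train_sets)
    counts = Counter(predict(model, v) for v in held_out)
    return {label: counts[label] / len(held_out) for label in train_sets}


# Invented toy style features (assumption: two features per article).
left = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7]]
center = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
right = [[0.8, 0.7], [0.9, 0.9], [0.7, 0.8], [0.2, 0.2], [0.6, 0.9]]

# The classifier never sees right-wing articles during training.
shares = leave_out_shares({"left-wing": left, "center": center}, right)
print(shares)  # {'left-wing': 0.8, 'center': 0.2}
```

Because the held-out class has no label of its own in the model, each of its articles must fall on one of the two trained sides. The resulting split is what the experiment reads as a similarity signal.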
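The idea behind unmasking in Experiment II can be sketched as follows: repeatedly measure how well two text sets A and B can be told apart, remove the most discriminative features, and watch how fast the accuracy degrades. In Koppel and Schler's formulation, a fast drop suggests the two sets differ only superficially. The sketch below is a simplification under stated assumptions: it swaps in a nearest-centroid classifier and a fixed train/test split where the original uses a linear SVM with cross-validation, and the function names and toy feature values are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def closer_to(vec, cent_a, cent_b, active):
    """Classify vec as "A" or "B" using only the active feature indices."""
    dist_a = sum((vec[i] - cent_a[i]) ** 2 for i in active)
    dist_b = sum((vec[i] - cent_b[i]) ** 2 for i in active)
    return "A" if dist_a <= dist_b else "B"


def unmask(chunks_a, chunks_b, rounds, drop_per_round=1):
    """Return the accuracy curve over successive feature-removal rounds."""
    # Assumption: even-indexed chunks train, odd-indexed chunks test.
    train_a, test_a = chunks_a[0::2], chunks_a[1::2]
    train_b, test_b = chunks_b[0::2], chunks_b[1::2]
    cent_a, cent_b = centroid(train_a), centroid(train_b)
    tests = [(v, "A") for v in test_a] + [(v, "B") for v in test_b]

    active = list(range(len(cent_a)))
    curve = []
    for _ in range(rounds):
        if not active:
            break
        hits = sum(closer_to(v, cent_a, cent_b, active) == label
                   for v, label in tests)
        curve.append(hits / len(tests))
        # Drop the features on which the two training centroids differ most.
        active.sort(key=lambda i: abs(cent_a[i] - cent_b[i]), reverse=True)
        active = active[drop_per_round:]
    return curve


# Invented toy data: "shallow" sets differ in one feature, "deep" sets in all.
shallow_a = [[0.9, 0.2, 0.1], [0.8, 0.1, 0.2], [0.9, 0.2, 0.2], [0.8, 0.2, 0.1]]
shallow_b = [[0.1, 0.1, 0.2], [0.2, 0.2, 0.1], [0.1, 0.1, 0.1], [0.2, 0.1, 0.2]]
deep_a = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.8], [0.9, 0.9, 0.7], [0.8, 0.8, 0.8]]
deep_b = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.2], [0.1, 0.1, 0.1], [0.2, 0.2, 0.2]]

shallow_curve = unmask(shallow_a, shallow_b, rounds=2)
deep_curve = unmask(deep_a, deep_b, rounds=2)
print(shallow_curve)  # [1.0, 0.5]
print(deep_curve)     # [1.0, 1.0]
```

In this toy run the "shallow" pair falls to chance level as soon as its single distinguishing feature is removed, while the "deep" pair stays separable. Comparing such curves between pairs of text sets is the signal unmasking relies on.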
