arXiv:2004.14907v1 [cs.CL] 30 Apr 2020 utainbsfie ( bushfires Australian amn ols hn2 than less to warming int nunetenraieo lmt change. climate of narrative the influence to tion wildfires Australian such fight headlines with articles as see animals. we billion a times, of these death the of and hectares million land 17 of destruction the to leading the of heart the at been has change climate pogenic anclrtadteei edt ii global limit to need a is there and culprit the main assessment are 5th humans that its concluded categorically in report (IPCC) Change Climate Panel Intergovernmental on The dreadful crisis. the this of of some effects just are and species droughts, of defin- extinction severe ice, the patterns, polar melting weather at levels, changing are sea Rising we moment. challenges and ing biggest world, the the of threatening one is change Climate Introduction 1 lmt hnehscue oeri,helping rain, more caused has Change Climate 3 2 1 https://www.heartland.org/news-opinion/news/climate https://www.aph.gov.au/About_Parliament/Parliamenta https://www.ipcc.ch/report/ar5/wg1/ ii h ieaueo lmt misinformation climate on literature the visit ntecmuiyof community the introduce to in it repackage and sciences social re- in We change. climate around narrative the counter change ( climate movement by care- constructed are fully articles cli- These spreading articles misinformation. with mate flooded is warming, web the global anthropogenic about com- climate scientific munity in of consensus challenge the Despite the crisis. facing is world The ilswt nw lmt hnemisinforma- tion. change climate ar- known releasing with and ticles scraping try by We gap this change. bridge climate to domain.of the is to spe- there cific is news, that available fake dataset misinformation of no detection in work able o r ih.Ia LRE u yCiaeCag Counter Change Climate by But – ALARMED am I right. are You CCCM [email protected] Abstract hayBai e a a ioh Baldwin Timothy Lau Han Jey Bhatia Shraey raiain oinfluence to organizations ) a lebrhe al. et Oldenborgh van ◦ C. NLP 1 colo optn n nomto Systems Information and Computing of School pedn misinforma- spreading oercnl,anthro- recently, More ept consider- Despite . h nvriyo Melbourne of University The 2 , During 2020 Movement , [email protected] 3 ), ( ´ rzRsse al. P´erez-Rosas et 2016 eeto raetsse,smlrt htwe ( what news to fake similar for seen system, have alert automatic an or ap- via users of detection inform development can climate the that motivates of plications domain This the to change. news tailored fake ( specifically arena in is political seen the that Table in mirrors in examples broadly in proach as seen satire, be and can clickbait sen- of melodrama, elements sationalism, stylistic the with sprinkled and hoax, propaganda narra- misinformation, of ingredients a disinformation, principle of the around consisting structured formula tive a use ganizations ato aeul rfe taeyb climate by strategy ( crafted movement counter carefully change a of part al 1 Table ulpadJacques and Dunlap org hmt a colt rts o more for protest action. to school en- wag ... to lunatics them climate An- courage The cooling? even? Age Global Ice other return? it to will sanity What for ending? take is education world the about believe cares you Who if about crisis. children climate terrify the to schools Zealand New interview an CO in that told Kutschera Ulrich Dr. Prof. physiologist and biologist evolutionary German eterhha et n hspatc idof kind a cult. practice religious thus and death heat earth immi- nent fictitious, a the predict who them alarmists among climate extremes, a rejected is he scientists ... myth. among consensus 97% claimed h ako noae aenw aaspurred data news fake annotated of lack The h ieo hsmsedn nomto is information misleading this of rise The ; Farrell rilscetdby created Articles : 2 sabesn o akn n htthe that and mankind for blessing a is ry_Departments/Parliamentary_Library/pubs/rp/rp1920 -change-has-caused-more-rain-helping-fight-australi , 2016 , , [email protected] 2017 ; , McKie 2013 ahi tal. et Rashkin ; in n Wilson and Jiang ; CCCM CCCM ahi tal. et Rashkin , osai n Coan and Boussalis 2018 organizations ) organizations. .Teeor- These ). 1 , hi ap- Their . 2017 , , 2018 ,but ), 2017 ). ; , /Quick_Guides/AustralianBushfires an-wildfires the creation of misinformation datasets. The ation plays an important role in the identifica- first public dataset for fake news detection tion of misinformation. Biyani et al. (2016) stud- (Vlachos and Riedel, 2014) and claim/stance ver- ied stylistic aspects of clickbait and formalised ification (Ferreira and Vlachos, 2016) are moder- it into 8 different categories ranging from exag- ately small with 221 and 300 instances, respec- geration to teasing, and proposed a clickbait clas- tively. More recently, larger datasets have been sifier based on novel informality features. Sim- developed, such as LIAR (Wang, 2017), collected ilarly, Kumar et al. (2016) examined the unique from PolitiFact and labelled with 6 levels of ve- linguistic characteristics of hoax documents in racity, and FEVER (Thorne et al., 2018), a dataset Wikipedia and built a classifier using a range of generated from Wikipedia with supported, hand-engineered features. Rashkin et al. (2017) refuted and not enough info labels. Ex- proposed using stylistic lexicons (e.g. Linguis- tending the task to full articles, FakeNewsNet is tic Inquiry and Word Count (LIWC)), subjective a valuable resource (Shu et al., 2017, 2018). But words, and intensifying lexicons for fact check- to the best of our knowledge there is no misinfor- ing, and demonstrated that words used to exag- mation dataset that is specific to the domain of cli- gerate like superlatives, subjectives, and modal ad- mate change. We attempt to fill this gap by releas- verbs are prominent in fake news, whereas trusted ing a large set of documents with known climate sources are dominated by assertive words. Wang change misinformation. (2017) experimented with detecting fake news us- ing meta data features with convolutional neural 2 Related Work networks adapted for text (Kim, 2014). Although articles with misinformation are pre- The way the public perceives and reacts to the con- dominantly human-written, the recent emergence stant supply of information around of large pre-trained language models means they is a function of how the facts and narrative are can now be automatically generated. Radford et al. presented to them (Fløttum, 2014; Fløttum et al., (2019) introduced a large auto-regressive model 2016). Flottum (2017) emphasizes that language (GPT2) with the ability to generate high-quality and communication around climate change are synthetic text. One limitation of GPT2 is its in- significant, as climate is not just the physical ability to perform controlled generation for a spe- science but has political, social and ethical as- cific domain, and Keskar et al. (2019) proposed pects, and involves various stakeholders, inter- a model to tackle this. Building on this further, ests and voices. A range of corpus linguis- Dathathri et al. (2019) introduced a plug and play tic methods have been used to study the topi- language model, where the language model is a cal and stylistic aspects of language around cli- pretrained model similar to GPT2 but with control- mate change. Tvinnereim and Fløttum (2015) lable components that can be fine-tuned through proposed the use of structured topic modelling attribute classifiers. (Roberts et al., 2014) to derive insights about the public opinion from 2115 open-ended survey re- 3 Climate Change Counter Movement sponses. Salway et al. (2014) leveraged unsu- Organisations pervised grammar induction and pattern extrac- tion methods to find common phrases in climate Despite the findings of the IPCC’s 5th Assesment change communication. Atanasova and Koteyko Report and more than 97 percent consensus in (2017) analysed frequently-used metaphors manu- the scientific community to support anthropogenic ally in editorials and op-eds, and concluded that global warming (Cook et al., 2013), coordinated the communication in the Guardian (U.K.) was efforts to tackle the climate crisis are lacking. This predominantly war based (e.g. threat of climate can be attributed to the rise in opposing voices in- change), Seuddeutsche (Germany) based on ill- cluding the lobby, conservative think- ness (e.g. earth has fever), and the NYTimes tanks, big corporations, and digital/print media (U.S.A) based on the idea of a journey (e.g. many questioning the science and research around cli- small steps in the right direction). mate change. These organisations are collectively In linguistics, style broadly refers to the proper- referred to as climate change counter movement ties of a sentence beyond its content or meaning (“CCCM”) organizations (Oreskes and Conway, (Pennebaker and King, 1999), and stylistic vari- 2010; Dunlap and Jacques, 2013; Farrell, 2016; Argument Frame news. CO2 is plant food and is good for the planet Science Climate change is natural and has always been changing Science 4 Dataset We are entering another ice age Science Adapting to global warming is cheaper than preventing it Policy To construct our dataset, we scrape articles with Renewable energy is way too expensive Policy known climate change misinformation from 15 dif- Table 2: Examples of counter climate arguments ferent CCCM organizations. These organizations and their frames. are selected from three sources: (1) McKie (2018), (2) .com,4 a website that maintains a database of individuals and organizations that have Boussalis and Coan, 2016; McKie, 2018). McKie been identified to perpetuate climate disinforma- (2018) argues that the motivation behind these tion; and (3) an organizations cited on the web- organizations is to maintain the status quo of site selected from above 2 sources. A number the hegemony of fossil fuel-based neo-liberal of considerations were made when developing the global capitalism. These organizations are found dataset: around the globe and can masquerade as philan- • We only scrape articles from organizations thropic organizations to fund climate misinforma- active in English-speaking countries: the tion (Farrell, 2019), hide behind libertarian ideas United States, , United Kingdom, (McKie, 2018) to question the scientists, and aug- Australia, and New Zealand. ment scepticism to promote pseudo science or ‘al- ternative facts’. Some of these organizations have • A considerable number of organizations are catchy names such as carbonsense.com or either dormant or have a very low level of friendsofscience.org, and organize their activity. To make sure our dataset is up to own scientific conferences. date, we only scrape articles from organiza- Oreskes and Conway (2010) concluded in their tions with a reasonable level of activity, e.g. analysis that the strategies employed by CCCM to they publish at least 1 article every month, construct the narrative to spread misinformation and their latest publication is in 2020. resembles the ones historically used by the to- • We set a minimum and maximum threshold bacco lobby. For instance, targeting researchers of 10 and 400 articles respectively for each and questioning the methodology of their re- organization. We set a maximum threshold search, and blaming scientific standards are con- so as to avoid bias towards one organization. sistent strategies used by both CCCM and tobacco Note, however, that there is a considerable lobby groups (Oreskes and Conway, 2010; McKie, variance in the article length for different or- 2018). Dunlap and Brulle (2015); Farrell (2016); ganizations. For instance, one organization Boussalis and Coan (2016) categorized their mis- with only 10 articles has an average length of information arguments into 2 frames: science and 342.1 words, while another organization with policy. Science frame arguments question the sci- 400 articles has an average length of 85.8 entific facts and deliberately plant a lie to sway words. the public towards pseudo science, whereas pol- icy frame arguments target issues of cost and econ- • As explained in Section 3, counter climate ar- omy (e.g. carbon tax) or pass the blame for action guments can be broadly categorised into the to other nations. We present several examples of science and policy frames. As organizations arguments in the science and policy frames in Ta- generally prefer one type of frame in their ble 2 . narrative, we manually identify frames asso- We believe the narrative of CCCM articles have ciated with organizations, and select a set of two aspects: topical and stylistic. The topical as- organizations that produces a balanced repre- pects describe common issues discussed in CCCM sentation of both frames in the dataset. articles (e.g. carbon tax, fossil fuel, and renewable energy). The stylistic aspects capture how the nar- We split the documents into training and test rative is presented — e.g. the use of exaggeration partitions at the organization level, where the train- and sensationalism, as evident in the examples in ing set comprises 12 organizations and the test set Table 1 — and bear similar characteristics to fake 4https://www.desmogblog.com/global-warming-denier-database # Organizations 12 PM Meets With Cricket Side To Discuss The # Articles 1168 1.7m Hectares Of NSW Forests Destroyed By Mean Length 559.3 Bushfires. Not even six months after being of- Median Length 332.0 ficially elected as the Australian Prime Minis- Std Length 640.2 ter with absolutely no policies, let alone any ac- knowledgement of his government’s denialism- Table 3: Statistics of CCCM training data led inaction on climate change, Scott Morrison has today had the opportunity to meet some more sportsmen! While the drought-stricken 3 organizations. We split at the organization level communities of rural Australian continue to because it allows us to test whether detection mod- burn at the hands of record-breaking and out- els are able to generalize their predictions to arti- of-control bushfires, ScoMo has today met with cles published by unseen/new CCCM organisations. the Australian cricket side for his ideal media We present some statistics for the training docu- appearance...... The cricketers appeared dis- ments in Table 3, and the list of organizations is tressed while also having to pose for goofy pho- provided in the supplementary file. tos with the Prime Minister, ...... planet’s tem- 4.1 Test data perature that will result in the certain deaths of the billions of people that haven’t been given We extend our test set to include documents that permission to join Gina Rinehart and her Lib- do not have climate change misinformation, to eral Party employees in the spaceship. create a standard evaluation dataset for climate change misinformation detection. We collect doc- uments from reputable sources, some of which are Table 4: A Beetota Advocate article on climate not climate-related, and some are satirical in na- change. ture. Sources of the full test documents are as fol- lows: change. Although there is a tone of sensation- • Guardian: A trusted source for indepen- alism in the writing, the articles are created dent journalism. We scrape articles under with the intent of humour. An example of a the category of climate change from both its Beetota Advocate article is given in Table 4. U.K. and Australian editions. These articles These articles test whether detection models CCCM test whether a detection system can correctly are able to distinguish them from arti- identify these articles as not having climate cles, as both have similar stylistic characteris- change misinformation. tics. • SSA • BBC: Similar to Guardian, we scrape articles Sceptical Science Arguments ( ) and SSB from their website under the category of cli- Sceptical Science Blogs ( ): This re- mate change. source focuses on explaining what science says about climate change.7 It publishes • Newsroom: This is a dataset released by general climate blogs and counters common Grusky et al. (2018) and consists of articles climate myths by putting forth arguments and summaries compiled from 38 different backed by peer-reviewed research. publications.5 We take a random sample of • CCCM CCCM articles which are not climate related. These : These are articles from the 3 articles test whether a detection system is organizations, as detailed in Section 4. These able to identify non-climate-related articles documents are the only documents with cli- as not having climate change misinformation. mate change misinformation in the test data. • Beetota Advocate: This is an Australian Table 5 contains statistics of the test set. satirical website which publishes articles on 5 Conclusion current affairs happening locally and interna- tionally;6 we scrape articles related to climate We introduced climate misinformation to the do- main of fake news and in the community of NLP. 5https://summari.es/ 6https://www.betootaadvocate.com/ 7https://www.skepticalscience.com/ Source #Doc Justin Farrell. 2016. Network structure and influence of the climate change counter-movement. Nature Guardian 80 Climate Change, 6(4):370. BBC 60 Justin Farrell. 2019. The growth of climate change mis- Beetota 70 information in us philanthropy: evidence from nat- Newsroom 100 ural language processing. Environmental Research SSA + SSB 50 Letters, 14(3):034013. CCCM 150 William Ferreira and Andreas Vlachos. 2016. Emer- gent: a novel data-set for stance classification. In Table 5: Test set document statistics. Proceedings of the 2016 conference of the North American chapter of the association for computa- tional linguistics: Human language technologies, We explored the literature around emergence of pages 1163–1168. CCCM organizations, the strategies and linguistic Kjersti Fløttum. 2014. Linguistic mediation of climate elements used by these organizations to construct change discourse. ASp. la revue du GERAS, (65):7– a narrative of climate misinformation. To help 20. in countering its spread, we scrape articles with Kjersti Flottum. 2017. The role of language in the cli- known sources of misinformation and release it to mate change debate. Taylor & Francis. the community. Kjersti Fløttum, Trine Dahl, and Vegard Rivenes. 2016. Young norwegians and their views on climate change and the future: findings from a climate con- References cerned and oil-rich nation. Journal of Youth Studies, 19(8):1128–1143. Dimitrinka Atanasova and Nelya Koteyko. 2017. Metaphors in guardian online and mail online Max Grusky, Mor Naaman, and Yoav Artzi. 2018. opinion-page content on climate change: War, reli- Newsroom: A dataset of 1.3 million summaries gion, and politics. Environmental Communication, with diverse extractive strategies. arXiv preprint 11(4):452–469. arXiv:1804.11283.

Prakhar Biyani, Kostas Tsioutsiouliklis, and John Shan Jiang and Christo Wilson. 2018. Linguistic sig- Blackmer. 2016. ” 8 amazing secrets for getting nals under misinformation and fact-checking: Evi- more clicks”: Detecting clickbaits in news streams dence from user comments on social media. Pro- using article informality. In Thirtieth AAAI Confer- ceedings of the ACM on Human-Computer Interac- ence on Artificial Intelligence. tion, 2(CSCW):1–23. Nitish Shirish Keskar, Bryan McCann, Lav R Varshney, Constantine Boussalis and Travis G Coan. 2016. Text- Caiming Xiong, and Richard Socher. 2019. Ctrl: A mining the signals of climate change doubt. Global conditional transformer language model for control- Environmental Change, 36:89–100. lable generation. arXiv preprint arXiv:1909.05858. John Cook, Dana Nuccitelli, Sarah A Green, Mark Yoon Kim. 2014. Convolutional neural networks for Richardson, B¨arbel Winkler, Rob Painting, Robert sentence classification. In EMNLP. Way, Peter Jacobs, and Andrew Skuce. 2013. Quan- tifying the consensus on anthropogenicglobal warm- Srijan Kumar, Robert West, and Jure Leskovec. 2016. ing in the scientific literature. Environmental re- Disinformation on the web: Impact, characteristics, search letters, 8(2):024024. and detection of wikipedia hoaxes. In Proceedings of the 25th international conference on World Wide Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Web, pages 591–602. Hung, Eric Frank, Piero Molino, Jason Yosinski, and Ruth McKie. 2018. Rebranding the Climate Change Rosanne Liu. 2019. Plug and play language mod- Counter Movement through a Criminological and els: a simple approach to controlled text generation. Political Economic Lens. Ph.D. thesis, Northumbria arXiv preprint arXiv:1912.02164. University. Riley E Dunlap and Robert J Brulle. 2015. Climate G. J. van Oldenborgh, F. Krikken, S. Lewis, N. J. change and society: Sociological perspectives. Ox- Leach, F. Lehner, K. R. Saunders, M. van ford University Press. Weele, K. Haustein, S. Li, D. Wallom, S. Spar- row, J. Arrighi, R. P. Singh, M. K. van Aalst, Riley E Dunlap and Peter J Jacques. 2013. Climate S. Y. Philip, R. Vautard, and F. E. L. Otto. 2020. change denial books and conservative think tanks: Attribution of the australian bushfire risk to anthropogenic climate change. Exploring the connection. American Behavioral Sci- Natural Hazards and Earth System Sciences Discus- entist, 57(6):699–731. sions, 2020:1–46. and Erik M Conway. 2010. Defeat- William Yang Wang. 2017. ” liar, liar pants on fire”: ing the merchants of doubt. Nature, 465(7299):686– A new benchmark dataset for fake news detection. 687. arXiv preprint arXiv:1705.00648.

James W Pennebaker and Laura A King. 1999. Lin- guistic styles: Language use as an individual differ- ence. Journal of personality and social psychology, 77(6):1296.

Ver´onica P´erez-Rosas, Bennett Kleinberg, Alexan- dra Lefevre, and Rada Mihalcea. 2017. Auto- matic detection of fake news. arXiv preprint arXiv:1708.07104.

Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.

Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova, and Yejin Choi. 2017. Truth of varying shades: Analyzing language in fake news and polit- ical fact-checking. In Proceedings of the 2017 Con- ference on Empirical Methods in Natural Language Processing, pages 2931–2937.

Margaret E Roberts, Brandon M Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G Rand. 2014. Structural topic models for open-ended survey responses. American Journal of Political Science, 58(4):1064–1082.

Andrew Salway, Samia Touileb, and Endre Tvinnereim. 2014. Inducing information structures for data- driven text analysis. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computa- tional Social Science, pages 28–32.

Kai Shu, Deepak Mahudeswaran, Suhang Wang, Dong- won Lee, and Huan Liu. 2018. Fakenewsnet: A data repository with news content, social context and dy- namic information for studying fake news on social media. arXiv preprint arXiv:1809.01286.

Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, and Huan Liu. 2017. Fake news detection on social me- dia: A data mining perspective. ACM SIGKDD Ex- plorations Newsletter, 19(1):22–36.

James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal. 2018. Fever: a large-scale dataset for fact extraction and verification. arXiv preprint arXiv:1803.05355.

Endre Tvinnereim and Kjersti Fløttum. 2015. Explain- ing topic prevalence in answers to open-ended sur- vey questions about climate change. Nature Climate Change, 5(8):744–747.

Andreas Vlachos and Sebastian Riedel. 2014. Fact checking: Task definition and dataset construction. In Proceedings of the ACL 2014 Workshop on Lan- guage Technologies and Computational Social Sci- ence, pages 18–22. arXiv:2004.14907v1 [cs.CL] 30 Apr 2020 aoIsiuehttps://www.cato.org/ https://wattsupwiththat.com/ www.manhattan-institute.org/ https://www.heartland.org/ https://friendsofscience.org/ https://www.fraserinstitute.org/ https://ipa.org.au/ Institute https://co2coalition.org/Cato https://www.climat that with up Watts Coalition Science Climate Zealand New Institute Manhattan Affairs Public of Institute The https://www.thegwpf Institute https://carbon-sense.com/ Heartland Foundation Policy Website Warming Global The https://australianclimatema Science of Friends Institute Fraser Coalition CO2 Coalition Sense Carbon The https://www.austra Madness Climate Australian Foundation Environmental Australian Name Organization h F https://thebfd.co.nz/ https://www.heritage.org/ Foundation Heritage BFD The al 1 Table al of Table : CCCM Organisations lianenvironment.org/ escience.org.nz/ .org/ dness.com/