News Source Credibility in the Eyes of Different Assessors
Martino Mensio, Harith Alani
Knowledge Media Institute, The Open University, UK
{martino.mensio, harith.alani}@open.ac.uk

Abstract

With misinformation being one of the biggest issues of current times, many organisations are emerging to offer verifications of information and assessments of news sources. However, it remains unclear how they relate in terms of coverage, overlap and agreement. In this paper we introduce a comparison of the assessments produced by different organisations, in order to measure their overlap and agreement on news sources. Relying on the general term of credibility, we map each of the different assessments to a unified scale. Then we compare two different levels of credibility assessments (source level and document level) by using the data published by various organisations, including fact-checkers, to see which sources they assess more than others, how much overlap there is between them, and how much agreement there is between their verdicts. Our results show that the overlap between the different origins is generally quite low, meaning that different experts and tools provide evaluations for a rather disjoint set of sources, even when considering fact-checking. For agreement, instead, we find that some origins agree more than others on the verdicts.

1 Introduction

In a public sphere polluted by different shapes of misinformation, the most important role is played by the citizens (Cooke, 2017), with their ability to think critically and investigate while consuming a piece of news. In order to investigate, there is a wide variety of tools that can help, by providing indicators of credibility on different levels.

Checking the source that published a piece of news is one of the first steps often referred to in media literacy (Wineburg et al., 2016). Identifying who created the content and where it has been published can tell a lot about its credibility even before inspecting the news itself.

There are many efforts by journalists, fact-checkers and communities that annotate the relationship between sources and misinformation. Various tools exist that rely on such annotations to notify users of the validity (or invalidity) of the information they are looking at.

It is inevitable that different verification organisations will sometimes produce slightly different, or even conflicting, assessments of certain news articles or sources. It is also natural for verification sources to focus on some news sources and not others. Such diversity is often needed, to reduce bias and to encourage further debate. This paper aims to reach a better awareness of such overlap and diversity, as a first step towards understanding how to relay such information to end users, and the potential impact on their final judgements.

To this end, this study compares the different information assessments available, looking for similarities and differences between them. In order to perform this comparison, however, there is a set of problems that need to be addressed. Each assessor uses a different set of labels and scores in their assessments. This variety stems from different criteria and methods of assessment.

Given the above, in this paper we investigate the following Research Questions:

1. How much do different assessments overlap when evaluating the same sources?
2. How often do different assessments, and assessors, agree? And which ones are more similar or different to each other?
3. Do fact-checkers check claims from sources that have been assessed at the source level? And do the outcomes agree?

To answer these questions, this work presents two main contributions. First, after describing in Section 2 existing works that define credibility, in Section 3 we present our strategy to combine and make comparable different existing assessments, mapping from their specific features to a common measure of credibility. The second is the measurement of overlap and agreement in Section 4, to express how much the considered assessments evaluate the same sources and whether or not they have the same outcomes. Then Section 5 presents the main challenges that we see for the next steps, and Section 6 concludes the paper.
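As a preview of the combination strategy of Section 3, the following minimal sketch illustrates the kind of unified scale such a mapping can target. The scale itself (a credibility value in [-1, 1] paired with a confidence in [0, 1]) and the example conversions are assumptions made for illustration, not the exact mapping tables used in the paper or by any assessor.

```python
# A minimal sketch (not the paper's actual mapping tables): every origin's
# output is converted to a credibility value in [-1, 1] plus a confidence
# in [0, 1], so that assessments from different origins become comparable.
from dataclasses import dataclass

@dataclass
class Credibility:
    value: float       # -1.0 = not credible ... 1.0 = credible
    confidence: float  # 0.0 = no evidence  ... 1.0 = certain

def from_score(score: float, lo: float, hi: float, confidence: float) -> Credibility:
    """Rescale a numeric score from [lo, hi] onto [-1, 1]."""
    return Credibility(2 * (score - lo) / (hi - lo) - 1, confidence)

def from_label(label: str, table: dict[str, float], confidence: float) -> Credibility:
    """Convert a categorical verdict via an origin-specific table."""
    return Credibility(table[label], confidence)

# Hypothetical usage: a 0-100 score and a textual verdict end up on one scale.
print(from_score(72.5, lo=0, hi=100, confidence=0.9))
print(from_label("mostly false", {"true": 1.0, "mostly false": -0.6}, 0.8))
```

Once each origin is expressed this way, the overlap and agreement questions above reduce to set intersection and score comparison over a shared representation.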
2 Related work

In this section, we first look into studies that define and formalise the concept of credibility. Then, after presenting the main assessments that are already available, we take a look at some methods for combining them together.

2.1 Credibility formalisation

One of the first works on source credibility (Hovland and Weiss, 1951) identified two main components (trustworthiness and expertise), and subsequent studies underlined the many different dimensions to be analysed (Gaziano and McGrath, 1986). Meyer (1988) identified two main components of credibility (believability and community affiliation), analysing their correlation with a set of items (bias, fairness, accuracy, patriotism, trust, security) retrieved by interviewing users. Later works (Abdulla et al., 2004; Yale et al., 2015; Kohring and Matthes, 2007) use this model and find variations in the number of factors and in the importance of the items. A recent work (Prochazka and Schweiger, 2019) compares measures of credibility and trust, to obtain a generalised measure of trustworthiness of news media. From the point of view of the W3C, the Credible Web Community Group (https://credweb.org/) aims to standardise assessments and credibility signals.

From the analysis of these studies, our objectives are to define which specific declination of credibility we want to use and to understand the relationships of different components (bias, factuality, security, intent to deceive) with the selected measure of credibility. Of the two components identified in Meyer (1988), we are more interested in the believability component, because we want a measure of how much factual information a certain source provides and of the quality of its journalism. The component of community affiliation, instead, is more related to opinions and points of view, and it may be tied to political position, partisanship and a set of factors that go beyond the scope of evaluating credibility. There are also mixed factors (e.g., bias) that can be generated by the community affiliation but affect the believability as well.

2.2 Real-world assessments

Here we present the existing origins of credibility-related assessments, focusing mainly on human-generated evaluations coming from experts in journalism. We can identify two main levels of evaluation: source-level and document-level.

2.2.1 NG: NewsGuard

On the source-level assessments, we start with NewsGuard (https://www.newsguardtech.com/), which provides ratings of the reliability of online news brands through "nutrition labels". The ratings are generated by journalists who rate each website's compliance with a set of criteria (https://www.newsguardtech.com/ratings/rating-process-criteria/). These criteria are divided into two groups: credibility (not publishing false content, presenting information responsibly, regularly correcting errors, distinguishing news and opinion, avoiding deceptive headlines) and transparency (ownership and financing, clearly labelling advertising, management, providing names of content creators). The websites also receive an overall score (a sum of points for each criterion) and a label, which can be Trustworthy, Negative, Satire or Platform (user-generated content, e.g., social networks or blogs).

2.2.2 MBFC: Media Bias/Fact Check

Another origin of ratings is Media Bias/Fact Check (https://mediabiasfactcheck.com/), which provides reports assessing the factual level of sources (VERY LOW, LOW, MIXED, HIGH, VERY HIGH) and their kind of bias (CONSPIRACY-PSEUDOSCIENCE, LEAST BIASED, LEFT BIAS, LEFT-CENTER BIAS, PRO-SCIENCE, QUESTIONABLE SOURCE, RIGHT BIAS, RIGHT-CENTER BIAS, SATIRE). The evaluations are produced by a team of editors following their own methodology (https://mediabiasfactcheck.com/methodology/).
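As an illustration of how heterogeneous these two origins are, the following sketch normalises both onto the assumed [-1, 1] scale introduced earlier. The conversion values (a 0-100 NewsGuard score range, equidistant MBFC factual levels) are illustrative assumptions, not official conversions published by NewsGuard or MBFC.

```python
# Illustrative conversions for the two origins above; the numeric choices
# (0-100 NewsGuard range, equidistant MBFC factual levels) are assumptions,
# not official mappings of either organisation.

def newsguard_to_credibility(score: float, label: str) -> float | None:
    """NewsGuard: an overall 0-100 score plus a label. Satire and Platform
    labels carry no factuality verdict, so no credibility value is derived."""
    if label in ("Satire", "Platform"):
        return None
    return 2 * score / 100 - 1  # 0 -> -1.0, 100 -> 1.0

# MBFC factual levels, assumed to be equidistant steps on the ordinal scale.
MBFC_FACTUAL = {"VERY LOW": -1.0, "LOW": -0.5, "MIXED": 0.0,
                "HIGH": 0.5, "VERY HIGH": 1.0}

def mbfc_to_credibility(factual_level: str) -> float:
    return MBFC_FACTUAL[factual_level]

print(newsguard_to_credibility(87.5, "Trustworthy"))  # 0.75
print(mbfc_to_credibility("MIXED"))                   # 0.0
```

Note that MBFC's bias categories are deliberately left out of this sketch: as discussed in Section 2.1, bias is a mixed factor that does not translate directly into believability.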
2.2.3 WOT: My Web of Trust

My Web of Trust is a crowdsourced reputation service that provides assessments of websites through a browser extension. It provides two components: trustworthiness ("How much do you trust this site?") and safety ("How suitable is this site for children?"), each expressed as a score and a confidence measure. Users can see the overall ratings and comments coming from the community, and can provide their own rating.

2.2.4 OS: OpenSources

OpenSources was a resource for assessing online information sources. The list was created looking for overall inaccuracy, extreme biases, lack of transparency, and other kinds of misinformation. The possible tags given to websites are the following: fake news, satire, extreme bias, conspiracy theory, rumor mill, state news, junk science, hate news, clickbait, proceed with caution, political and reliable. A domain can have up to three different tags. The list was recently taken down, possibly due to a lack of updates, as many of the websites it assessed have been shut down or renamed.

2.2.5 IFCN: International Fact-Checking Network

The International Fact-Checking Network (IFCN) assesses fact-checking organisations against its code of principles; its list of verified signatories is relied upon by big platforms (e.g., Facebook, Youtube and Google Search).

2.2.6 NTT: Newsroom Transparency Tracker

Newsroom Transparency Tracker is an initiative that is part of The Trust Project, sharing the information published by media outlets with respect to four Trust Indicators (best practices, journalist expertise, type of work, diverse voices). Each source is characterised by 15 attributes, each belonging to one of the indicators, whose compliance can be full, partial or none.

2.2.7 FC: Fact-checkers

On the document and claim level we have fact-checkers, who manually review and debunk stories using different rating methods. In recent years there has been a standardisation process towards the ClaimReview schema, in order to publish the reviews in a structured format. However, only some fact-checkers have adopted the standard, and those who did sometimes differ in how they use some of the ClaimReview schema fields.

2.3 Combining assessments

Looking for ways to combine existing assessments, we found studies spanning several ideas.
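One simple instance of such a combination, assuming the per-origin credibility/confidence pairs sketched earlier, is a confidence-weighted average; this is a minimal illustration of one such idea, not the method this paper ultimately adopts.

```python
# A minimal sketch of one combination idea: a confidence-weighted average
# over the (value, confidence) pairs that different origins assign to the
# same source. Illustrative only; not the combination method adopted here.

def combine(assessments: list[tuple[float, float]]) -> tuple[float, float] | None:
    """assessments: one (value in [-1, 1], confidence in [0, 1]) per origin."""
    total_conf = sum(conf for _, conf in assessments)
    if total_conf == 0:
        return None  # no origin provides any evidence for this source
    value = sum(val * conf for val, conf in assessments) / total_conf
    avg_conf = total_conf / len(assessments)  # simple aggregate confidence
    return value, avg_conf

# e.g. three origins disagreeing on one source:
print(combine([(0.75, 0.9), (-0.5, 0.6), (0.0, 0.3)]))  # ≈ (0.208, 0.6)
```

A weighted average is only one option: other schemes weight origins by their own reputation, or keep conflicting verdicts separate instead of collapsing them into a single value.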