
Framing the mass media: Exploring ‘fake news’ as a frame embedded in political discourse

Jan R. Riebling* University of Wuppertal,

Ina von der Wense* University of Bamberg, Germany

The recent growth of alternative media sites and sources has been accompanied by the rise of an aggressive rhetoric decrying the mass media, or parts thereof, as untrustworthy and politically biased. While it is unclear whether the ‘fake news’ debate is directly connected with this development, it is surely a framing of mass media. In this article, we use techniques of quantitative text analysis to analyse how the ‘fake news’ frame is structured and to understand its central determinants in terms of social context and political orientation. We analyse frame usage and semantic embeddedness in eight weblogs. We find evidence for a generalised frame that tends to be independent of the political orientation of the blogs.

Keywords

Alternative media, embedded frames, fake news, quantitative text analysis, structural embeddedness

Journal of Alternative and Community Media, vol. 4 (2019)

Introduction

The mass media are currently facing a crisis of trust (e.g. Denner & Peter, 2017; Schultz et al., 2017; Ziegele et al., 2018). Terms like ‘fake news’ and, in German media, the infamous ‘Lügenpresse’ have seen increased usage in public discourse and in the media (e.g. Bernhard, 2018; Denner & Peter, 2017). While a rational media critique based on a scientific foundation helps to improve functional and democratic mass media, spreading conspiracy theories and baseless allegations enlarges the social divide and increases the mistrust in established media (Schultz et al., 2017). In their study about the crisis of trust in German media, Ziegele and colleagues (2018) discovered a deficiency in knowledge about media and their mode of operation – for example, selection of issues, claim for balance and so on – which further increases scepticism towards the German media. This mistrust and disenchantment have led some activists to create their own alternatives to the established mass media (Downing, 2003). This growing divide, coupled with the overall growth of alternative media and sources of information, leads to the question of the state of this relationship. What are its determining factors? Does the social context matter? We try to shed some light on the semantic structure of this possibly antagonistic relationship by trying to reconstruct the framing of mass media from the perspective of alternative online media. Our central assumption is that this relationship depends on its social context. This question is addressed using weblogs as data and techniques of quantitative text analysis as a methodology. The study is based on the fundamental concept of framing as developed by Erving Goffman (1986) and extended by Robert Entman (e.g. 1993). 
We supplement this concept by asking how a frame can be thought of and operationalised as being situated within a broader social context as well as embedded within the semantics of an already established discourse. By taking the words and concepts used in the same vein as ‘fake news’ and ‘Lügenpresse’ accusations, we focus on a clearly delimited frame that is almost exclusively targeted at the mass media. This provides us with a relatively stable and easily identifiable frame, which we then use to develop our methodology for the analysis of the semantic embeddedness of frames. Working from the idea of frames embedded in a broader semantic space, we use a vector space model of text (Salton, 1979) and the spatial technique of latent semantic analysis (Deerwester et al., 1990) to operationalise our central concepts. Two working hypotheses are formulated in order to guide our research effort. First, we assume that the specific structural position in which alternative media sources find themselves with regard to the mass media and each other leads to a common frame that shows little differentiation, regardless of who is employing it. Second, pre-existing social and political differentiations should make a difference in terms of framing, how the frame is used and how it is embedded.

The role of frames in media research

The concept of framing is a relatively new approach in media research and has received increasing attention, particularly in communication studies and in the research into media impacts over the last two decades (e.g. Bonfadelli, 2015; Matthes, 2007, 2014; Pürer, 2014; Scheufele, 1999). When talking about frames and framing, researchers usually refer to Entman’s (1993: 52) widely accepted definition of this concept:

To frame is to select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described.

Framing thus means to highlight specific aspects while others get pushed to the background. Frames can therefore be seen as offering ‘interpretation patterns’ (Scheufele, 2003: 46) that facilitate the processing of new information and affect the perception of the framed information (e.g. Bonfadelli, 2015; Pürer, 2014; Scheufele, 1999). The way a message is communicated influences how we react to it. Several studies have since confirmed this effect (e.g. Simon & Jerit, 2007; Tversky & Kahneman, 1986). Simon and Jerit (2007), for example, analysed the impact of different wording on recipients’ attitudes. They determined that ‘exposure to articles featuring the exclusive use of “baby” or “fetus”, respectively, increased or decreased support for banning ’ (Simon & Jerit, 2007: 254). This effect means that frames can be used for strategic purposes. In particular, political actors try to convey and establish their own perspective or interpretation by using specific frames (Matthes, 2007). This is often described as frame-building (e.g. Bonfadelli, 2015; Scheufele, 1999). Despite the tremendous increase in frame research, there is still no consensus concerning theory, and especially methodology (e.g. Jecker, 2014; Matthes & Kohring, 2004, 2008; Scheufele, 1999; Scheufele & Scheufele, 2010). While this line of inquiry has undoubtedly been one of the most productive in media research, recent studies have remarked critically on the lack of social context (Scheufele & Scheufele, 2010). Two dimensions of this critique are of interest to our discussion. 
First, it has been suggested that the scope of the theory needs to be widened by including political power as a central dimension (Entman, 2007) as well as the multi-level effects produced by the interaction of framing and politics as a form of ‘cascading hierarchies’ (Entman, 2004: 9–13). Second, beyond the political context and power structures, a general lack of social context and a connection to sociological theories have been noted (Entman, 2010; Jecker, 2014). We are trying to address these desiderata by providing a quantitative methodology for the analysis of an embedded frame by focusing on the semantic and social context.

From ‘Lügenpresse’ to ‘fake news’ frame

In the last few years, the term ‘Lügenpresse’1 has become more important in the political discourse. Particularly due to its popularisation by the PEGIDA movement, ‘Lügenpresse’ turned into a central notion in the political fight for opinion. The right-wing populist PEGIDA movement is a movement against German and EU immigration policies. Since 2014, it has organised demonstrations in Dresden and other German cities. The concept of ‘Lügenpresse’ can be defined as a term for a general mistrust of the media: it includes mistakes made by the media as well as accusations of news outlets taking unilateral perspectives, of journalists telling lies and of potential or supposed influence of political or economic players (Ziegele et al., 2018). Donald Trump’s statements and acts against the press in the context of the US election – especially his ‘fake news’ accusations – strengthened this anti-media atmosphere. The German media then intensified their reporting of ‘fake news’ and ‘Lügenpresse’ (Denner & Peter, 2017), and there was talk of a crisis of confidence in the media (e.g. Denner & Peter, 2017; Schultz et al., 2017; Ziegele et al., 2018). Most of this criticism lacks any scientific foundation or empirical evidence (Schultz et al., 2017). Recent studies even reject the assumption that confidence in Germany’s media is decreasing (e.g. Schultz et al., 2017; Ziegele et al., 2018). Beyond these current examples, we find a broad range of expressions and terms related to the same underlying idea: parts (or all) of the media cannot be trusted since they are reporting falsely and are corrupted through political or economic power. We therefore take these words and their attached meanings as a concrete example of an underlying frame, which we venture to call the ‘fake news’ frame. It will serve as our example of the embeddedness of a frame within a broader social context. Compared with media in other countries, the German media are significantly more trusted. 
This concerns in particular public service broadcasting and (regional) . At the same time, there is little trust in the and especially alternative and blogs (Schultz et al., 2017; Ziegele et al., 2018). Yet a significant number of people are still fundamentally opposed to mass media (Ziegele et al., 2018), and they try to convince others that they are being lied to by the media (Schultz et al., 2017). These oppositionists attempt to resist opposing views and negative media reports, and tend to over-estimate the effects of media on attitudes (Neumann, 2015). By concentrating on this media-critical minority, media and public discourse tend to make them appear much more important than they actually are. Schultz and colleagues (2017) see an inherent danger in such a distorted perception. At the same time, there is competition between mass media and alternative media for the same resources: attention and reach. One issue can usually be addressed by more than one frame at the same time (Matthes, 2007; Matthes & Kohring, 2004): ‘On most issues, highly vocal and well-organized promoters appear on both sides of a debate; therefore, one group’s frame will almost certainly be “countered” by another … Thus, no theme emerges without a countertheme’ (Callaghan & Schnell, 2005: 6). Since the most general theme/counter-theme division in contemporary political discourse is the distinction between left- and right-wing ideologies, we tend to assume, as a first working hypothesis, that the left–right divide between the analysed blogs should lead to two distinct ‘fake news’ frames, reflecting the political orientations.

Weblogs: A case of alternative media

The rapid development of and information technologies is providing an ongoing challenge to the study of media and communication, by transforming these phenomena ever more into ‘moving targets’ (Lecheler & Kruikemeier, 2016: 163). The patterns and trends in today’s online usage could be gone tomorrow, yet the abundance of new social data sources of unprecedented scale and detail has at the same time revealed the potential of the social sciences to become the ‘science of the twenty-first century’ (Watts, 2007: 489). It has led to the establishment of computational social science, a young but growing transdisciplinary effort to develop methods of collection and analysis of process-generated and large-scale digital data (Lazer et al., 2009). Weblogs and similar blogging services have received a lot of attention from researchers because they embody a combination of the two central data types in this field: text (e.g. articles and comments) and social networks (e.g. hyperlinks and blogrolls) (Heiberger & Riebling, 2016). The attention of media and communications research, as well as computational social science in general, has shifted more towards micro-blogging (almost always Twitter) in recent years (e.g. Bollen, Mao & Zeng, 2011; Tumasjan et al., 2010). While this line of research has led to many important discoveries, it has also garnered some criticism. Two specific criticisms are of particular interest here. First, it has been suggested that Twitter data are often interpreted as being purely the product of social interaction, thereby neglecting the technical framework that creates and inevitably influences the resulting observations. This is a common problem in all data that are generated through social interaction in a technical medium (Riebling, 2018). 
Second, the techniques of quantitative text analysis have been criticised for overly generalising results that apply only to very specific sub-populations of Twitter users (Cohen & Ruths, 2013). While the latter problem will be addressed more thoroughly in our technical section, the problem of socio-technical interference does not seem as prevalent in traditional weblogs as in contemporary micro-blogging services. The content hosted on a specific weblog is in most cases curated by a small group and in some cases only by a single author. Moreover, the interaction between authors on these platforms is not facilitated through the blogging infrastructure, meaning that content is not produced by a wide-ranging community through the use of the technical infrastructure of the site, like retweeting and . Traditional weblogs are, overall, not very sophisticated from a technical perspective. They allow for the presentation of content and the usage of hyperlinks, and provide a comment section to the readers. In comparison to newer social media, blogs are more akin to classic print media such as newspapers and . This reduces the problem of socio-technical interference significantly. Aside from providing easy access to relatively undistorted digital data, there are also substantive reasons for the importance of blogs in media and communication research. First, as already pointed out, they are structurally similar to the established mass media. They produce similar content, meaning articles, documentaries, opinion pieces, reviews and so on. However, they are not bound by institutional rules and regulations, as is the case for accredited journalists and mass media outlets. This, combined with the ease with which a weblog can be created, leads to a much wider range of diverse opinions and topics being expressed in weblogs. Since weblogs and mass media are structurally similar, they compete for the same resource: the attention of readers. 
This competition is characterised by a strong asymmetry as well as a mutual dependence. Weblogs generally have a faster production cycle, but are also much more specialised and lack the tremendous resources of the established media. Yet there is also a strong dependence between the two in the form of a ‘source cycle’, in which the publication of one side becomes the source of the other (Messner & Distaso, 2008). Traditional mass media have successfully utilised the information provided by bloggers as well as their techniques of content creation, while bloggers use the broader information content of the mass media as a basis for commentary and investigative (Graves, 2015; Lecheler & Kruikemeier, 2016; Meraz, 2009). As the current debate about ‘fake news’ suggests, this mutual dependence may very well be antagonistic, whereby both media outlets engage in criticism directed at the other. We assume that this antagonism is made manifest through the use of framing, and that it therefore can be observed in the way the opposition is framed. This leads to our second working hypothesis: the construction of an antagonistic framing of mass media by the alternative media through discursive means. Because of the similar position the blogs occupy in their relationship with the mass media, we predict the existence of a common ‘fake news’ frame. The resource asymmetry between blogs and mass media concerns almost all relevant positions: readership, staff, information sources, technical equipment and more. This raises the question of why blogs should be relevant at all in the mass media system. There are two main interconnected reasons to think of blogs as hidden champions in this arena: their networked structure and their high degree of specialisation. Blogs have a diverse readership and are connected through a network of hyperlinks, also known as the ‘blogosphere’. 
This network structure shows a high connectivity as well as a high local clustering (Farrell & Drezner, 2008; Meraz, 2009). Such networks are known in social network analysis as ‘small-world networks’ (Watts & Strogatz, 1998) or ‘scale-free graphs’ (Newman, 2003).2 They facilitate fast and easy access to information through their high connectivity and short distances from other actors. Additionally, the high local clustering makes information dissemination easy and builds redundancies against data loss. Both effects increase the reach of a blog tremendously. It is therefore not surprising that they are a premier source for journalists, thereby further increasing their reach (Farrell & Drezner, 2008). While the actual readership might be small, they often act as information carriers by blogging or writing about the topic as well (Nahon et al., 2011). Information published by blogs can therefore easily reach a broad and diverse audience. The high degree of specialisation among blogs mostly seems to be a result of their historical development from a form of personalised online (Hookway, 2008: 94–6). This also allows them to focus their resources on a specific topic without having to worry about losing their audience. In fact, since they are able and willing to focus on topics and views of the , they are able to bind audiences that can hardly be reached by mass media. Whether this is seen as the deliberating discussions of alternative media (e.g. Harcup, 2003) or as a pretext to radicalisation and political action (Etling et al., 2010), it showcases the influential position of blogs’ ability to frame topics and set the agenda with regard to a specific audience. This power to bind audiences from the fringes of politics is further enhanced through the networked structure of the blogosphere, which makes it highly likely that a blog can be found with politics that align with one’s own. 
Considering this, weblogs seem like a natural choice to analyse framing processes in alternative media. They are highly relevant to the public discourse, while at the same time being locked in a competition with the mass media. This gives them a high incentive to frame their competition in specific ways in order to bind their audience and ‘symbolically’ defend their niche, thereby providing us with an opportunity to study framing and frames in a specific setting characterised by a power differential and competition. Furthermore, the specialisation strategy of blogs combined with their connection to fringe discourses makes it natural to use political orientation as a second dimension to observe. In this way, two effects can be observed and incorporated into our model. The antagonistic mutual dependence makes a negative framing of the mass media by the alternative media highly likely; at the same time, the high degree of specialisation and the connection to a specific audience should lead to more diverse frames in line with political orientations.

Creation and composition of the text corpus

In order to reconstruct the framing of mass media through the lens of alternative media, we selected weblogs that either described themselves in opposition to mass media or were described as such by other online sources. The reason for the focus on blogs opposing traditional media was to ensure that they would have a reason to be confrontational and utilise the ‘fake news’ frame. We therefore chose a most-similar sampling approach in terms of general orientation towards the media, which should make differences more pronounced and less dependent on our sampling strategy. In terms of their political orientation as well as other properties, we went for a strategy that sought to maximise the difference between the blogs. Out of an original pool of 55 weblogs, eight were selected after researching the blogs’ general political orientation and the direction of the content. A summary of the general characteristics of these blogs can be found in Table 1.

Table 1: Overview of the general characteristics of the selected weblogs

We used a web-crawling routine to retrieve the HTML text of the weblog posts and subsequently parsed it using Python’s BeautifulSoup package. All the posts that were accessible through each blog’s archive were used. Consequently, the window of observation starts with the oldest blog posts and stops in the first week of January 2018. Only the actual articles and some metadata (author, publication date, categories, etc.) were extracted, leaving out the comment section. This decision was made because the textual properties of the comment section sometimes differed very widely from the content observed in the rest of the weblog. Furthermore, it was not always clear how much influence the blogs’ moderators had on the comment section.

Following the extraction, the main body of text was broken down into tokens, consisting of single words and valid expressions. Additionally, the part of speech (the grammatical role) of the tokens was determined. Both tasks were done using the TreeTagger engine for German natural language processing (Schmid, 1995). After the tokenisation, the corpus was inspected and cleaned. We removed a list of German stop words, consisting of the most common words, that was obtained from Python’s Natural Language Toolkit (Bird, Klein & Loper, 2009).3 This is a common step in natural language processing, since these words tend to add no semantic value to the text. In addition to this, we also removed everything that was identified as a number or digit, special characters, URLs and any words that were riddled with special characters or had a length greater than characters. A descriptive overview of the general properties of the resulting corpus is provided in Table 2. We denote the absolute frequency by and the relative frequency by , with respect to either the posts or the tokens .

Table 2: Table of valid observations/tokens per blog following the removal of stop words and further exclusion of non-essential elements
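
The cleaning steps described before Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the routine used in the study: the stop word set is a tiny hypothetical stand-in for the NLTK list, the length cut-off `MAX_LEN` is a placeholder because the actual threshold is not given here, and lower-casing the tokens is an assumption as well.

```python
import re

# Hypothetical stand-ins: the study used NLTK's German stop word list and a
# length cut-off whose value is not stated in the text.
STOP_WORDS = {"der", "die", "das", "und", "ist", "in"}
MAX_LEN = 30  # assumed placeholder, not the authors' threshold

URL_RE = re.compile(r"^(https?://|www\.)", re.IGNORECASE)
# Letters only (including umlauts), optionally joined by inner hyphens.
WORD_RE = re.compile(r"^[^\W\d_]+(-[^\W\d_]+)*$")


def clean_tokens(tokens, stop_words=STOP_WORDS, max_len=MAX_LEN):
    """Drop stop words, numbers, URLs, symbol-ridden and over-long tokens."""
    kept = []
    for tok in tokens:
        low = tok.lower()  # lower-casing is an assumption of this sketch
        if low in stop_words:
            continue
        if URL_RE.match(low):
            continue
        if len(low) > max_len:
            continue
        if not WORD_RE.match(low):  # digits or special characters inside
            continue
        kept.append(low)
    return kept


sample = ["Die", "Lügenpresse", "ist", "2018", "http://example.org", "fake-news", "a#b!"]
cleaned = clean_tokens(sample)
```

Keeping the filters as separate checks makes it easy to count how many tokens each rule removes, which is useful when reporting corpus sizes before and after cleaning.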

The blogs were coded to have either a left- or right-wing orientation. This decision was based on an extensive reading of the weblog as well as taking note of the external description by other web sources. It was a conscious decision to code this as a simple dichotomy of left/right, since a more detailed approach would have raised many additional questions – for example, how appropriate a linear distinction would be in general, given the modern political discourse. In essence, this would have been a separate research question.

In order to operationalise the concept of a ‘fake news’ frame, we extracted a list of words from our corpora that were connected to media or the press. The original list of roughly 2000 words was then reviewed to arrive at a corpus of 64 distinct words, selected on the basis that they were connected to a specific representation of media as being disingenuous, politically charged or outright fabricated. To give an example, the main candidates were words like ‘Lügenpresse’, ‘fake news’ and ‘’. We will be referring to this corpus as the FN-corpus in the rest of this article, thereby denoting the difference between the empirical corpus (FN) and the theoretical concept (‘fake news’).

The step from qualitative to quantitative analysis inevitably leads to a loss in context-sensitivity and the amount of detail that can be included. What we gain, however, is the ability to scale our research to ever bigger corpora as well as being able to more easily reproduce our results. The loss of context-sensitivity is most pronounced when transferring the corpus from a collection of tokens into a numerical representation that can be subjected to statistical analysis. We use a bag-of-words (BOW) approach to represent our textual data numerically. This is a classical technique from the field of information retrieval (Salton, 1979). A single document is represented as a vector containing the absolute frequency of all unique words

$$\mathbf{d} = \{w_1, \dots, w_n\}$$

in an ordered index. The combined document vectors can be considered as a common space of words forming an $m \times n$ matrix $X$, with one row for each of the $m$ documents and one column for each of the $n$ unique words, that is also known as a document-term matrix. Representing a document as a numerical vector means we lose the information on the sequential ordering of the tokens in the original document, yet this is a small price to pay for unlocking the power of linear algebra. It allows us to numerically transform the vectors and describe their properties in terms of vector geometries. As Karlgren, Holst and Sahlgren (2008: 531) note:

Vector space models have attractive qualities: processing vector spaces is a manageable implementational framework, they are mathematically well-defined and understood, and they are intuitively appealing, conforming to everyday metaphors such as ‘near in meaning’.

In this way, vector spaces can be interpreted as a model of meaning, as semantic spaces. Having access to the techniques of linear algebra, we are able to operationalise properties like similarity by reformulating them. The law of cosines can be used to describe the similarity between document vectors in terms of the cosine of the angle between them (Salton, 1979). Following from the definition of the dot-product, we can describe the cosine between two vectors of arbitrary dimensionality by dividing the dot-product of the two vectors by the product of their lengths (Euclidean norms):

$$\cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert} = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}$$

Here the two vectors $\mathbf{a}$ and $\mathbf{b}$ refer to two document vectors containing the frequency of all unique words in the corpus. Therefore both have indexes of the same length $n$. In essence, this can be seen as finding the correlation between all vectors that make up the document-term matrix.

Calculating the cosine similarity is often prefaced by weighting the vectors in order to arrive at a more stable model and counteract outliers in the numerical value. The TF-IDF weighting scheme is often seen as the gold standard in quantitative text analysis and information retrieval (Soucy & Mineau, 2005; Spärck Jones, 1972). It weighs the term frequency by the inverse document frequency, the latter being defined as the logarithm of the ratio of the number of all texts $N$ to the number of posts $\mathit{df}_t$ containing that word $t$, that is, $\mathrm{idf}_t = \log(N / \mathit{df}_t)$. This results in heavier weights for words that occur in some but not all documents, while at the same time reducing the weight for words that occur very often or in all texts.

Building on this general framework, Deerwester and colleagues (1990) introduced the technique of latent semantic analysis (LSA). Using singular value decomposition, the document-term matrix $X_{m \times n}$ can be decomposed into three matrices:

$$X_{m \times n} = U_{m \times m} \cdot \Sigma_{m \times n} \cdot V^{*}_{n \times n}$$

The matrix $U$ contains the left singular vectors, the matrix $\Sigma$ consists of singular values along its diagonal, while $V^{*}$ is the Hermitian transpose of the right singular vectors. This matrix can be reduced down to a rank $k$, giving us a matrix $\hat{X}$ expressing the following identity:

$$\hat{X} = U_{m \times k} \cdot \Sigma_{k \times k} \cdot V^{*}_{k \times n}$$

The reduced matrix $\hat{X}$ can be seen as a semantic space reduced down to its underlying dimensions. Choosing $k$ to be equal to the entire length of all documents would produce the original document-term matrix. Anything below this would give a reduced version in which the documents are described through $k$ dimensions, which in turn are associated, to varying degrees, with the words in the corpus. The semantic content of a dimension can be interpreted through the absolute value of the association towards the words. The higher the absolute value, the stronger the connection between the dimension and the word.
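
The counting and weighting steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function names `document_term_matrix` and `tfidf` and the toy documents are hypothetical, and the weighting is the plain $\log(N / \mathit{df}_t)$ variant described in the text without any smoothing.

```python
import math
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows) with one row of absolute counts per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf(rows):
    """Weight each count by log(N / df), the inverse document frequency."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    # Document frequency: in how many documents does each term occur?
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in rows]


# Hypothetical toy corpus of three already tokenised posts.
docs = [
    ["presse", "propaganda"],
    ["presse", "fake"],
    ["presse", "presse"],
]
vocab, rows = document_term_matrix(docs)
weighted = tfidf(rows)
```

In this toy corpus the word `presse` occurs in every post, so its inverse document frequency is $\log(3/3) = 0$ and it drops out of the weighted vectors, whereas `fake` and `propaganda`, each found in a single post, keep a positive weight.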

It is also possible to place arbitrary vectors in the latent semantic space via a technique known as folding-in, which was also pioneered by Deerwester and colleagues (1990). This is simply done by constructing a vector (or matrix) $\mathbf{q}$ and placing it at the centroid of its respective dimensions:

$$\hat{\mathbf{q}} = \mathbf{q} \cdot V_{n \times k} \cdot \Sigma^{-1}_{k \times k}$$

The resulting vector (or matrix) provides us with the strength of the association between the folded-in vector and the dimensions to which the semantic space was reduced.
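
To make the reduction and the folding-in step more tangible, the following sketch computes only the first latent dimension ($k = 1$) of a small matrix and then folds a vector into it. It is an illustration under explicit assumptions: power iteration is used here as a simple stand-in for a full singular value decomposition and is not the procedure described by the authors, the helper names are hypothetical, and the toy matrix is invented.

```python
import math


def matvec(A, v):
    """Multiply matrix A (list of rows) with vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def leading_singular_triplet(X, iterations=200):
    """Approximate (u, sigma, v) for the largest singular value of X."""
    Xt = transpose(X)
    # Simple start vector; it must not be orthogonal to the leading direction.
    v = [1.0] * len(X[0])
    for _ in range(iterations):
        w = matvec(Xt, matvec(X, v))  # one power-iteration step on X^T X
        length = norm(w)
        v = [x / length for x in w]
    Xv = matvec(X, v)
    sigma = norm(Xv)
    u = [x / sigma for x in Xv]
    return u, sigma, v


def fold_in(q, v, sigma):
    """Fold a term vector q into the single retained dimension: q . v / sigma."""
    return sum(a * b for a, b in zip(q, v)) / sigma


# Invented document-term matrix: three documents, three terms.
X = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]
u, sigma, v = leading_singular_triplet(X)
coordinate = fold_in(X[0], v, sigma)
```

Folding in a row that is already part of the matrix reproduces that document's coordinate in `u`, which is a convenient sanity check before folding in new vectors such as an FN-word profile.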

Framing media

Even though the presence of words signifying a FN-frame was not a criterion for selecting a blog into our dataset, we find them included in the posts of all eight weblogs. Overall, 1056 postings contain words also included in the FN-corpus, which in relative terms amounts to 7.6 per cent of the dataset. However, their distribution is very uneven, as can be seen in Figure 1. There also seems to be no clear distinction in terms of the political orientation of the blogs when it comes to framing mass media as untrustworthy and manipulative. While two of the right-wing oriented blogs occupy the top positions on the graph, the middle is dominated by left-wing oriented blogs. The uneven distribution of posts and tokens across blogs (see Table 2) makes the overall picture even less clear-cut. For example, the weblog ‘netzfrauen.de’ accounts for 28.2 per cent of all blog posts as well as 43.3 per cent of all tokens in the corpus. Yet, at the same time, it contains only 143 FN-tokens, which make up 0.2 per cent of the overall token count.

Note: The political orientation of the blogs is represented by colour, with left-leaning blogs coloured red and right-leaning blogs coloured blue.

Figure 1: Bar chart showing the number of tokens in the FN-corpus per blog.

The presence of words from the FN-corpus only shows the general presence of such a frame, but not how it is used and whether its use differs between blogs. During the construction of our corpus denoting the FN-frame, we noticed the wide variety of different terms and usages associated with the selected words. For example, the German ‘Lügenpresse’ could be used as a description of the mass media or it could be referring to the term itself and its usage in the wider political discourse. In some cases, it was even turned around semantically by referring to persons and outlets as ‘“Lügenpresse”-Rufer’ – someone who uses the ‘fake news’ accusation as a political attack. While these special cases were not very common, they still shed light on the general problem of the context sensitivity of the framing process. Another special case that deserves mention is the creation of new words to address and frame the mass media in a specific way. The most instructive example is ‘Hochleistungspresse’, a word coined by the ‘neopresse.com’ weblog. It is hard to translate directly, but refers to a mainstream press dominated by intellectuals and the elite. Overall, this suggests that, instead of focusing on the overall prevalence of FN-words, the specific usage across blogs needs to be considered. As Table 4, located in the appendix, suggests, there is an overall tendency of all blogs to focus on a similar wording, at least when the sum of FN-words is considered. The top five words used across blogs tend to be rather similar. The list contains words like ‘propaganda’, ‘fake’ and ‘lügenpresse’. In many cases, the top words are also newly coined words or composites, thereby tending to be less context sensitive than words that are already politically charged. The trend of posts containing FN-words (see Figure 2) also suggests the existence of a general FN-frame. 
Since 2013, the number of posts containing terms from the FN-corpus has continuously increased. The overall growth seems to be exponential, which is suggestive of an internal dynamic fuelling this process. One has to keep in mind, however, that the overall number of blog posts has increased over time as well. In short, looking at the top five FN-words per blog as well as the development over time, we find some evidence for the existence of a general frame that addresses the mass media, or at least a part thereof.

Figure 2: Absolute frequency of posts containing FN-words over time

Yet this alone does not preclude systematic differences across blogs. It is entirely reasonable to assume that a specific term is used differently in certain publications. We tried to test this by calculating the cosine similarity between the blogs based on the vectors of the absolute frequencies of the words from our FN-corpus. Each vector can be thought of as the profile of FN-words for a single blog. If two blogs had the same profile, the cosine between them would be exactly 1, as the two vectors would be parallel. The resulting heat map is shown in Figure 3.

Note: Similarity is calculated as the cosine values of the tf-idf-weighted word vectors.

Figure 3: Heat map showing the relative similarities between the selected weblogs in usage of words belonging to the FN-corpus
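The profile comparison behind Figure 3 can be illustrated with a minimal sketch. The blog names and counts below are hypothetical and serve only to show the mechanics. The sketch uses raw counts, as described in the text; the note to Figure 3 mentions tf-idf weighting, which would have to be applied to the vectors beforehand.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a blog without any FN-words has no defined direction
    return dot / (norm_u * norm_v)

# Hypothetical counts for three FN-words: propaganda, fake, lügenpresse
profiles = {
    "blog_a": [10, 4, 2],
    "blog_b": [20, 8, 4],  # same proportions as blog_a, twice the volume
    "blog_c": [0, 1, 9],
}

# Pairwise similarities, i.e. the values a heat map would display
names = sorted(profiles)
similarity = {
    (a, b): cosine_similarity(profiles[a], profiles[b])
    for a in names
    for b in names
}
```

Because the cosine depends only on direction, blog_a and blog_b come out as identical in profile even though blog_b uses the words twice as often; the measure compares the mix of FN-words rather than their volume.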

We find that the overall similarity is relatively high within the right-wing group of the weblogs (fourth quadrant). At first glance, the left-wing blogs seem to be more diverse internally; however, this can be attributed mostly to the relative dissimilarity of the blogs uebermedien.de and neulandrebellen.de. Both are similar to each other but dissimilar to almost all of the other blogs, as far as the usage of FN-words is concerned. The reason for this could be that both blogs are explicitly concerned with media critique,4 while at the same time being unable to use ‘Lügenpresse’ or similar wordings since these are associated strongly with right-wing extremism in the German political discourse. This is demonstrated first in their stronger reliance on words like ‘fake news’ and ‘fakes’ compared with the other blogs. Only the weblog tichyseinblick.de is somewhat similar in its stronger reliance on the terms from the English discourse, which seems to fit our description well since this blog is the next closest in terms of similarity. Second, in the case of uebermedien.de, we also find the aforementioned phenomenon of discussing and criticising the term ‘Lügenpresse’ itself. Yet this is also complemented by framing the mass media as ‘’ and relying more heavily on the term ‘propaganda’. While we do not find conclusive evidence for a common frame concerning mass media among all weblogs under consideration, we do find some evidence for a generalised frame among the majority of them. Furthermore, the components of the frame seem to be relatively indifferent to the political left–right dichotomy. This is demonstrated by two blogs of the left having more in common with the right-wing blogs than with those in their own group. The framing of some specific media as being untrustworthy and deliverers of propaganda therefore seems to be a rather robust frame that exists relatively independently of political orientation.
This is also highlighted by the dominant usage of the same words across all blogs and the increasing usage over time. However, while political orientation might not be very influential in terms of the general composition of the FN-frame, it still seems to have an effect when it comes to stronger political agendas and explicit media critique.

Embeddedness of frames

The notion of a generalised FN-frame is supported further when compared with the overall similarity in terms of topics addressed by the blogs. In order to find this similarity, a latent semantic analysis was carried out to reduce the space of all words down to the 30 most prominent semantic dimensions. Since the focus of our article is on the embeddedness of frames and the existence of a generalised FN-frame, the actual topics contained in the semantic dimensions are not interpreted here. Instead, we use the LSA mostly to reduce the dimensionality of the semantic space with the goals of having a more robust model of the general similarity between the blogs and finding the region of semantic space in which the FN-words are located. In order to make the model more transparent and to provide the interested reader with some background information, we have included the 12 most prominent dimensions as well as their strongest associations with specific words in the Appendix (see Tables 5 and 6). Using these 30 dimensions, the tokens of the blog posts are combined in a vector containing the frequency of each word. This vector is then situated in the semantic space of our model via folding in. The result is a vector for each blog giving the association of that blog with every one of the 30 dimensions of the semantic space, thereby providing us with a reduced profile detailing the topics and overall semantic directions of the blog. These profiles can then be related in the same way that the FN-profiles were compared before. Using the cosine as a measure of similarity again gives us Figure 4, a heat map detailing the overall similarity of the blogs with regard to the topics addressed by them.

Figure 4: Heat map showing the similarities between the selected weblogs in terms of the top 30 topics
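As a rough illustration of how blog profiles of this kind can be obtained, the following sketch assumes that the LSA is computed as a truncated singular value decomposition of a term-by-post count matrix, in the manner of Deerwester et al. (1990). The matrix values, the blog names, the assignment of posts to blogs and the choice of two retained dimensions (the article retains 30) are all hypothetical.

```python
import numpy as np

# Hypothetical term-by-post count matrix: rows are terms, columns are posts.
A = np.array([
    [2, 0, 1, 0],
    [1, 0, 2, 0],
    [0, 3, 0, 1],
    [0, 1, 0, 2],
    [1, 1, 1, 1],
], dtype=float)

k = 2  # number of retained dimensions (the article retains 30)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]

def fold_in(term_counts):
    """Project a vector of term counts into the k-dimensional LSA space."""
    return (U_k.T @ term_counts) / s_k

# Hypothetical assignment of posts (column indices) to blogs.
posts_by_blog = {"blog_x": [0, 2], "blog_y": [1, 3]}

# A blog profile is the folded-in sum of the term counts of its posts.
blog_profiles = {
    blog: fold_in(A[:, cols].sum(axis=1))
    for blog, cols in posts_by_blog.items()
}

def cosine(u, v):
    """Cosine similarity between two profile vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

topic_similarity = cosine(blog_profiles["blog_x"], blog_profiles["blog_y"])
```

Folding in a column of the original matrix reproduces that post's coordinates from the decomposition, which is a convenient sanity check. Any vector over the same vocabulary can be positioned in the space in the same way.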

The heat map shows that the blogs seem to be relatively similar with regard to the topics they address. Moreover, there are no major distinctions between the political orientations or among the blogs. In addition to this, the similarity is generally at a mid- to high level, implying that these similarities could be constrained by some upper boundary. This helps to illustrate two central points. First, although the similarity is high, it is not as consistently high as was the case with the similarity of the FN-words. This helps to showcase the relatively higher integration of the FN-frame and also its relative independence from the main content in the blog; otherwise, we would observe a comparable pattern of similarities as before. Second, the lack of any significant outliers – except for ‘netzfrauen.org’, which is still not as much of an outlier when compared to the previous similarity heat map – in either direction can be seen as the result of the aforementioned tendency towards specialisation of alternative media. This is not evident at first glance. If we assume that there exists a general pressure on alternative blogs to be up to date and able to provide surprising news to their audience, as is generally the case for all media, then it would follow that they try to engage with topics that are currently important. At the same time, they are in a competitive arena, which restricts them from becoming too similar to their competition. This would help to explain the relative absence of peak values and discernible groups. Placing vectors in the LSA space through folding-in allows us to position not only the original documents (or the sum thereof) but also arbitrary vectors of words, as long as they have been included in the construction of the vector space model. We made use of this fact by creating a vector on the basis of the words contained in the FN-corpus.
Folding-in of this FN-vector gives us its relationship with the 30 dimensions of our semantic space. Of these, we decided to take a closer look only at the six most prominent, since these were the ones with the highest association to the FN-vector. The resulting selection of dimensions, including their topmost 20 words, is shown in Table 3.

Table 3: Words and strength of their association with those LSA dimensions that have the strongest association to the FN-corpus. Table shows only the top 6 dimensions.
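The selection of dimensions for the FN-vector can be sketched along the same lines. The example below again assumes an SVD-based LSA; the vocabulary, the counts, the helper name top_dimensions and the choice of three retained dimensions are hypothetical. Ranking by absolute value is an assumption of the sketch, motivated by the fact that the sign of an association cannot be interpreted.

```python
import numpy as np

def top_dimensions(U_k, s_k, fn_vector, n_top):
    """Fold fn_vector into the LSA space and rank dimensions by |association|."""
    coords = (U_k.T @ fn_vector) / s_k
    order = np.argsort(-np.abs(coords))[:n_top]
    return [(int(i), float(coords[i])) for i in order]

# Hypothetical vocabulary and term-by-post counts (rows follow vocab order).
vocab = ["propaganda", "fake", "euro", "schulden", "klima"]
A = np.array([
    [3, 0, 1, 0],
    [2, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 3, 1, 0],
    [0, 0, 2, 3],
], dtype=float)

k = 3  # retained dimensions in this toy example
U, s, _ = np.linalg.svd(A, full_matrices=False)

# Indicator vector marking which vocabulary entries belong to the FN-corpus.
fn_words = {"propaganda", "fake"}
fn_vector = np.array([1.0 if w in fn_words else 0.0 for w in vocab])

# List of (dimension index, signed association), strongest first.
ranking = top_dimensions(U[:, :k], s[:k], fn_vector, n_top=2)
```

The signed value is kept in the output so that the strongest words of each selected dimension can still be looked up, while the ordering itself ignores the sign.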

As already described in the method section, the sign of the negative association values cannot be interpreted in this case. Looking at the words associated with these dimensions, we find that they describe the dominant political and social issues of the last five years, which also coincides with the growth period of posts containing FN-words (see Figure 2). Our interpretation of these dimensions is as follows:

• Dimension 26 describes German domestic issues with a focus on party politics and politicians.
• Dimension 8 seems to be associated mostly with topics regarding the migration and refugee crisis as well as the foreign politics associated with it.
• Dimension 21 points towards the European Union and its regulations, as emphasised by containing major EU policy terms and players.
• Dimension 4 has a clear connection with the state debt crisis and EU monetary policies.
• Dimension 19 consists of words associated mostly with environmental issues and problems.
• Dimension 20 addresses the recently strained relationship between Western countries and .

From this list of dimensions, we can see that the FN-frame – as an element of alternative media publications – seems to be embedded mostly in divisive societal issues. If we look at the association of the blogs with the selected six dimensions, we see no clear pattern emerge. Figure 5 shows these associations in the form of parallel coordinates. From this, we can see, for example, that ‘geolitico.de’ and ‘nachdenkseiten.de’ both have a strong association with dimension 4, which talks about the state debt crisis, meaning both blogs have engaged with this topic comparatively strongly. Yet looking at dimension 8 (migration and refugees), we find ‘geolitico.de’ only weakly associated with that specific topic. This could be seen as further evidence of the specialisation undergone by alternative media in order to cultivate a specific niche.

Figure 5: Parallel coordinates showing the absolute strength of the weblogs’ relationship with the top five dimensions associated with the FN-corpus by folding-in

With regard to the left–right divide, the specialisation becomes even clearer, while the evidence for political orientation governing the embedding of the FN-frame becomes much weaker. In order not to rely solely on graphical interpretations, we used k-means clustering to see how well the parallel coordinates could be expressed along the left–right divide. Comparing the original left–right grouping with the one predicted by the model, we found scarcely any overlap (homogeneity score 0.048). While we have not found a strong indication of political embeddedness, we have found a strong focus on contemporary issues. The ‘fake news’ frame is embedded in diverse societal issues such as the migration and refugee crisis as well as the state debt crisis. We also interpret these findings as a confirmation of the above-mentioned specialisation of alternative media due to their competition for attention.
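To make the reported score easier to read, the following sketch shows how a clustering can be compared with a known grouping. It assumes the conventional entropy-based definition of homogeneity, 1 − H(class | cluster) / H(class), and uses a plain, deterministically initialised k-means; whether this matches the exact procedure used for the article is an assumption. The points and orientation labels are hypothetical.

```python
import math
from collections import Counter

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def homogeneity(true_labels, cluster_labels):
    """1 - H(class | cluster) / H(class); 1.0 means every cluster is pure."""
    h_class = entropy(true_labels)
    if h_class == 0:
        return 1.0
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    cluster_sizes = Counter(cluster_labels)
    h_cond = -sum(
        (n_ck / n) * math.log(n_ck / cluster_sizes[cluster])
        for (_, cluster), n_ck in joint.items()
    )
    return 1.0 - h_cond / h_class

# Hypothetical associations of four blogs with two dimensions.
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
orientation = ["left", "right", "left", "right"]

clusters = kmeans(points, k=2)
score = homogeneity(orientation, clusters)
```

In this toy case the clusters follow the topical profile rather than the political label, so each cluster contains one blog of each orientation and the score is zero. A value close to zero, such as the one reported above, is read in the same way.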

Discussion

We began our discussion by analysing the embeddedness of the ‘fake news’ frame in terms of its wider social context. First, we assumed that the left–right divide between the analysed blogs should lead to two distinct ‘fake news’ frames, reflecting the political orientation. This distinction could not be found in our data. On the contrary, the components of the frame seem to be relatively indifferent towards the political left–right dichotomy. While political orientation might not be very influential in terms of the general composition of the ‘fake news’ frame, it still seems to have an effect when it comes to stronger political agendas and explicit media critique.

Our second working hypothesis suggested that the relationship between mass media and alternative media leads to an antagonistic framing of mass media by the alternative media through the use of a common frame. This assumption was grounded in the observation of a significant power differential between mass media and alternative media. While we find some evidence for a generalised ‘fake news’ frame, the existence of outliers suggests that some social contexts can change the frame fundamentally. We also find that the frames are tied to highly relevant societal issues, but even so they are embedded in differing contexts and content, so they remain internally consistent. While our study was able to shed some light on the complex interactions between politics and media frames, there are still some limitations that should be borne in mind. First and foremost, the operationalisation of political orientation seems to have been too broad. The simple dichotomy of left- and right-leaning politics does not capture the variety and complexity of political ideas in the online world. Second, we focused on specific online blogs and the topics contained in them. Although this choice can be defended on methodological grounds, including articles in our research would have allowed for a better understanding of the general topics. Quantitative text analysis, and specifically models of semantic space, seem to provide a promising new way towards a deeper understanding of frames and how they are influenced by social context; however, there are still many open questions. The operationalisation of semantic embeddedness could be improved further by larger datasets and methods that take further external variables into account. Promising candidates include, but are not limited to, the influence of readership, network positions relative to other alternative media, multi-level models combining the aforementioned factors and so on. 
On the theoretical side of things, we hope that our methodology can lead to a more formal and overall integrative theoretical notion of frames.

Appendix

Table 4: Top 5 words designated as signifiers for the ‘fake news’ topic per weblog.


Table 5: Words and strength of their association with LSA dimensions – dimensions 0 to 5

Table 6: Words and strength of their association with LSA dimensions – dimensions 6 to 11


References

Bernhard U (2018) ‘Lügenpresse, lügenpolitik, lügensystem’. Wie die berichterstattung über die pegida-bewegung wahrgenommen wird und welche konsequenzen dies hat. Medien & Kommunikationswissenschaft 66(2): 170–87.
Bird S, Klein E & Loper E (2009) Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit. Newton, MA: O’Reilly Media.
Bollen J, Mao H & Zeng X (2011) Twitter mood predicts the stock market. Journal of Computational Science 2(1): 1–8.
Bonfadelli H (2015) Medienwirkungsforschung 5, Überarbeitete Auflage. Konstanz: UVK Verlagsgesellschaft mbH.
Callaghan K & Schnell F (eds) (2005) Framing American Politics. Pittsburgh, PA: University of Pittsburgh Press.
Cohen R & Ruths D (2013) Classifying political orientation on Twitter: It’s not easy! Proceedings of the ICWSM.
Deerwester S, Dumais ST, Furnas GW, Landauer TK & Harshman R (1990) Indexing by latent semantic analysis. Journal of the American Society for Information Science 41(6): 391–407.
Denner N & Peter C (2017) Der begriff lügenpresse in deutschen tageszeitungen. Publizistik 62(3): 273–97.
Downing JD (2003) Audiences and readers of alternative media: The absent lure of the virtually unknown. Media, Culture & Society 25(5): 625–45.
Entman RM (1993) Framing: Toward clarification of a fractured paradigm. Journal of Communication 43(4): 51–8.
—— (2004) Projections of Power. Chicago: University of Chicago Press.
—— (2007) Framing bias: Media in the distribution of power. Journal of Communication 57(1): 163–73.
—— (2010) Media framing biases and political power: Explaining slant in news of campaign 2008. Journalism 11(4): 389–408.
Etling B, Kelly J, Faris R & Palfrey J (2010) Mapping the Arabic blogosphere: Politics and online. New Media & Society 12(8): 1225–43.
Farrell H & Drezner DW (2008) The power and politics of blogs. Public Choice 134(1/2): 15–30.
Goffman E (1986) Frame Analysis: An Essay on the Organization of Experience. Boston: Northeastern University Press.
Graves L (2015) Blogging back then: Annotative journalism in IF Stone’s Weekly and Talking Points Memo. Journalism 16(1): 99–118.
Harcup T (2003) The ‘unspoken-said’ the journalism of alternative media. Journalism 4(3): 356–76.
Heiberger RH & Riebling JR (2016) Installing computational social science: Facing the challenges of new information and communication technologies in social science. Methodological Innovations 9: 1–11.
Hookway N (2008) ‘Entering the blogosphere’: Some strategies for using blogs in social research. Qualitative Research 8(1): 91–113.
Jecker C (2014) Entmans Framing-ansatz: Theoretische grundlegung und empirische umsetzung. Munich: UVK.
Karlgren J, Holst A & Sahlgren M (2008) Filaments of meaning in word space. Proceedings of European Conference on Information Retrieval. Dordrecht: Springer, 531–8.
Lazer D, Pentland A, Adamic L, Aral S, Barabási A-L et al. (2009) Computational social science. Science 323(5915): 721–3.
Lecheler S & Kruikemeier S (2016) Re-evaluating journalistic routines in a digital age: A review of research on the use of online sources. New Media & Society 18(1): 156–71.
Matthes J (2007) Framing-effekte. Zum einfluss der politikberichterstattung auf die einstellungen der rezipienten. Vol. 13. Munich: Reinhard Fischer.
—— (2014) Framing. 1. Auflage. Baden-Baden: Nomos.
Matthes J & Kohring M (2004) Die empirische erfassung von medien-frames. Medien & Kommunikationswissenschaft 52(1): 56–75.
—— (2008) The content analysis of media frames: Toward improving reliability and validity. Journal of Communication 58(2): 258–79.
Meraz S (2009) Is there an elite hold? Traditional media to social media agenda setting influence in blog networks. Journal of Computer-Mediated Communication 14(3): 682–707.
Messner M & Distaso MW (2008) The source cycle. Journalism Studies 9(3): 447–63.
Nahon K, Hemsley J, Walker S & Hussain M (2011) Fifteen minutes of fame: The power of blogs in the lifecycle of viral political information. Policy & Internet 3(1): 1–28.
Neumann K (2015) Reziproke effekte auf rechtsextreme. Erweiterung des modells und empirische daten. Medien & Kommunikationswissenschaft 63(2): 190–207.
Newman MEJ (2003) Mixing patterns in networks. Physical Review E 67(2): 026126.
Pürer H (2014) Publizistik- und kommunikationswissenschaft. 2. völlig überarbeitete und erweiterte Auflage. Konstanz: UVK Verlagsgesellschaft mbH.
Riebling JR (2018) The medium data problem in social science. In: Stuetzer CM, Welker M & Egger M (eds), Computational Social Science in the Age of Big Data: Concepts, Methodologies, Tools, and Applications. Köln: Herbert von Halem, 77–103.
Salton G (1979) Mathematics and information retrieval. Journal of Documentation 35(1): 1–29.
Scheufele BT (2003) Frames – framing – framing-effekte. Wiesbaden: Westdt Verl.
Scheufele BT & Scheufele DA (2010) Of spreading activation, applicability, and schemas: Conceptual distinctions and their operational implications for measuring frames and framing effects. In: D’Angelo P and Kuypers JA (eds), Doing News Framing Analysis: Empirical and Theoretical Perspectives. New York: Routledge, 110–34.
Scheufele DA (1999) Framing as a theory of media effects. Journal of Communication 49(1): 103–22.
Schmid H (1995) Improvements in part-of-speech tagging with an application to German. Proceedings of the ACL. Dublin: ACL.
Schultz VT, Jackob N, Ziegele M, Quiring O & Schemer C (2017) Erosion des vertrauens zwischen medien und publikum? Media Perspektiven 5: 246–59.
Simon AF & Jerit J (2007) Toward a theory relating political discourse, media, and public opinion. Journal of Communication 57(2): 254–71.
Soucy P & Mineau GW (2005) Beyond tfidf weighting for text categorization in the vector space model. Proceedings of International Joint Conference on Artificial Intelligence, 19: 1130–5.
Spärck Jones K (1972) A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation 28(1): 11–21.
Tumasjan A, Sprenger TO, Sandner PG & Welpe IM (2010) Predicting elections with Twitter: What 140 characters reveal about political sentiment. Proceedings of the ICWSM.
Tversky A & Kahneman D (1986) Rational choice and the framing of decisions. The Journal of Business 59(4): S251–78.
Watts DJ (2007) A twenty-first century science. Nature 445(7127): 489.
Watts DJ & Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393: 440–2.
Ziegele M, Schultz T, Jackob N, Granow V, Quiring O & Schemer C (2018) Lügenpresse-hysterie ebbt ab. Media Perspektiven 4: 150–62.

Notes

1 ‘Lügenpresse’ means ‘lying press’; we do not translate this term literally because of its special status in the German political discourse.
2 There are many subtle and not so subtle distinctions here; however, since this article is not concerned with the network properties of blogs as objects of analysis, we chose to forego a more in-depth discussion. The interested reader is referred to Nahon et al. (2011).
3 A description of the German stop words corpus can be obtained at its source: https://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/snowball/stopwords.
4 In the case of ‘uebermedien.de’, it is already part of the name, but also clearly stated as the goal of the website. This is also true for ‘neulandrebellen.de’, but to a lesser extent since they mostly seem to frame themselves as the vanguard of the new media.