Associative Framing
A unified method for measuring media frames and the media agenda
Wouter van Atteveldt1, Nel Ruigrok2, and Jan Kleinnijenhuis1
1 Department of Communication Science, Free University Amsterdam, [email protected], [email protected] 2 Department of Communication Science, University of Amsterdam, [email protected]
Abstract. Communication theories such as Framing use increasingly sophisticated models of message content and transfer. Supporting this theoretical work requires more sophisticated methods of content analysis, because the manual thematic content analysis that is often practised is too expensive and too specific to a single theory to provide the large corpora needed to test sophisticated communication models. This paper proposes Associative Framing, a Relational Content Analysis method based on the marginal and conditional reading chance of concepts. This provides a communication-theoretic interpretation of linguistic co-occurrence analysis, and can help communication research by increasing the reach of automatic analyses and by creating more generic data sets that allow different theories to be tested simultaneously. We illustrate this method with a case study on the associations of Islam and Terror in newspapers from three countries.
Keywords: Content Analysis, Computer Content Analysis, Relational Content Analysis, Framing, Agenda Setting, Co-occurrence
Introduction
Theoretical progress in Communication Science depends largely on empirical evidence to test and sharpen theories. Such empirical evidence often consists wholly or partially of media content, making Content Analysis an important technique for progress in Communication Science. The last decades have seen a sharp increase in the number and complexity of communication theories, among which Framing takes a prominent place. This increased theoretical complexity has not been fully matched by increased sophistication of Content Analysis methods. In particular, content is generally analyzed using manual Content Analysis specifically aimed at extracting information for a particular research question. Such research is often expensive and relies on one-off coding schemes, leading to relatively small data sets that are very difficult to combine. This makes it very difficult to compare the explanatory power of competing theories, hindering theoretical progress.
This article proposes Associative Framing, a probabilistic Relational Content Analysis approach. In this approach, we extract connections between concepts based on textual proximity. These concepts are relatively close to the text, making automated extraction feasible. Theoretically relevant variables, such as frames, are then defined as patterns or features over the resulting graph. By introducing this layer between the actual text analysis (nodes and edges) and the theoretical concepts (frames), it becomes easier to combine data sets and to test different theories on the same coded texts. To measure associative frames we rely on a method for extracting asymmetric association patterns based on the conditional reading chance. This method is similar to existing methods, but its probabilistic basis allows for easier extension and combination with existing work.
In taking this approach, this article makes two contributions. Firstly, we provide an interpretation of co-occurrence graphs in terms of recent theoretical work in Communication Science, decreasing the gap between theory and measurement. Secondly, we propose a probabilistic operationalization of co-occurrence, give a substantive interpretation to the edge weights of the co-occurrence graph, and suggest extensions based on more thorough linguistic analysis.
The next section discusses the communication theories that this method aims at. This is followed by a conceptual definition of Associative Framing in the third section and a concrete operationalization using a probabilistic model of co-occurrence in section 4. The fifth section gives an extended example, showing some of the information that can be extracted using this method. The final two sections offer a brief discussion of other potential uses of and extensions to this method, followed by the conclusions.
Theoretical Framework
Within the communication literature we can distinguish a number of theories dealing with the transfer of salience on different levels. The most general level looks at the transfer of salience of concepts in media messages to the concepts in the public mind. This is the core idea of Agenda Setting theory. A more specific level focuses on the transfer of salience of concept-attribute pairs. This is the core idea of both the second level of Agenda Setting and some research into Framing, such as Issue Framing. Framing research, however, also includes different views about the process of Framing. Both first and second level Agenda Setting originate from a linear model of influence, which states that there is a linear relationship between the messages sent and the reception of those messages by the public. Other researchers focusing on Framing theories look at the process differently, stating that it is more complex than a mere transfer of salience. They argue that a certain media frame can strengthen particular frames in the audience’s mind. Moreover, they argue that a single concept, for example the picture of the Omarska detention camps in Bosnia, can trigger previously established associations in people’s minds, such as associations with the Holocaust. In the next sections we will discuss these theories in more detail, finishing with a theoretical argument for associative frames as the common denominator.
Agenda Setting
The core of Agenda Setting research was stated years before the actual term was coined, when Bernard Cohen wrote that the mass media ‘may not be successful much of the time in telling people what to think, but it is stunningly successful in telling its readers what to think about’ (Cohen, 1963, p.13). In their seminal Chapel Hill study McCombs and Shaw (1972) introduced the term Agenda Setting to describe this influence after finding a nearly perfect correlation between the public agenda and issue visibility in the media. If Agenda Setting occurs, people come to believe an issue is more important after exposure to the issue through the mass media than before.
In the years since this first study, Agenda Setting has turned out to be a robust and conceptually clear theory, with numerous studies reproducing these effects and elaborating on the theory (see for example Dearing and Rogers, 1996; McCombs and Bell, 1996; Rogers et al., 1993). Dearing and Rogers (1996, p.22) formulated a three-component model of the Agenda Setting process, consisting of (a) the media agenda, which influences (b) the public agenda, which in turn may influence (c) the policy agenda. Expanding the original model with influences on the media agenda, researchers divided the theory into Agenda Building and Agenda Setting processes, with the media agenda being the dependent variable in the building phase and the audience agenda the dependent variable in the setting phase (see Scheufele, 1999). The common theoretical base underlying the large variety of Agenda Building and Agenda Setting studies is the transfer of salience, with salience interpreted as the “degree to which an issue on the agenda is perceived as relatively important” (Dearing and Rogers, 1996, p.22).
Second Level Agenda Setting
Elaborating on the original Agenda Setting hypothesis, McCombs and Shaw (1993, p.62) argue that Agenda Setting “is a theory about the transfer of salience, both the salience of objects and the salience of their attributes.” McCombs and Ghanem (2001) speak about a second level of Agenda Setting. They argue for a division between objects and their attributes: “Beyond the agenda of objects there is another aspect to consider. Each of these objects has numerous attributes, those characteristics and properties that fill out the picture of each object” (p.68). The attributes connected to the objects form the central part of this Second Level Agenda Setting, or Attribute Agenda Setting. According to the researchers ‘these attributes suggests that the media also tells us how to think about some objects’ (p.69).
Framing
During the last decades, the study of Framing has gained an important place in the field of communication research. Like the second level of Agenda Setting, Framing theory deals with the influence on how to think about objects. This becomes clear in the seminal definition of Framing by Entman (1993), who defines the concept as selecting “some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described.” (p.52)
Entman’s definition shows the multi-faceted nature and complexity of Framing research. Besides a transfer of salience it is about selection and recommendation, involving not only the communicator but also the audience. In the same line as Agenda Building and Agenda Setting, researchers distinguish a frame building process with the media as the dependent variable and a frame setting process where the audience is the dependent variable (Scheufele, 1999; De Vreese, 2002), as presented in Figure 1.
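[Figure 1, diagram not reproduced. Labels: Sender and Receiver; External Influences (Elite, Real World, …); Message, with Media Agenda and Media Frames; Audience Agenda and Audience Frames. Agenda Building and Frame Building mark the step from external influences to the media; Agenda Setting and Frame Setting mark the step from the media to the audience.]
Fig. 1: Setting and Building in the Communication Pipeline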
The many theoretical complexities attached to Framing cause Entman (1993) to complain about a lack of structure and paradigmatic unity. Starting from the arguments of Rosengren (1993) and Beniger (1993) that three paradigms infuse communication research (the constructionist, critical and cognitive approaches), D’Angelo (2002) takes a more optimistic view and sees Framing as a multiparadigmatic research program. Due to the inherent link between salience and cognition, we focus here on the cognitive perspective.
The Cognitive Approach
The construction of mental models or frames is a central part of the cognitive approach to Framing. Within this approach Framing can ‘encourage particular trains of thoughts about political phenomena’ (Price et al., 1997, p.483). Grounded in cognitive psychology, the approach uses the associative network model of human memory (Collins and Quillian, 1969), which proposes that the concepts in semantic memory are represented as nodes in a complex hierarchical network. Each concept in the network is directly linked to other related concepts. Collins and Loftus (1975) refined this model by introducing important changes regarding the processing of information in the network. They describe an automatic spreading of activation: the processing of a concept is manifested in the network as the activation of the node that represents it, and when that concept node is activated, activation continues automatically to all connected nodes. Minsky (1975) connected this view to Framing when he defined a frame as a structure containing various pieces of information. These discursive or mental structures are closely related to the description of a schema as ‘a cognitive structure that represents knowledge about a concept or type of stimulus, including its attributes and the relation among those attributes’ (Fiske and Taylor, 1991, p.98).
Framing: Definition and Process
Within frame setting research there are two critical questions: ‘what are frames?’ and ‘how are frames transferred between media and audience?’ In the following sections we will discuss some distinctions made by researchers trying to answer these questions.
Equivalency versus Emphasis Frames
Among the answers to the question of what frames are, an important distinction is made between equivalency frames and emphasis frames. ‘Equivalency frames’ present an issue in different ways with “the use of different, but logically equivalent, words or phrases” (Druckman, 2001, p.228). In experiments researchers found systematic changes in the audience’s preferences when the same problem is presented in different wordings, such as rescuing some versus sacrificing others (Tversky and Kahneman 1981; Quattrone and Tversky 1988; for a general discussion see Levin et al. 1998).
Emphasis frames, later renamed issue frames (Druckman, 2004), on the other hand highlight a particular “subset of potentially relevant considerations” (Druckman, 2001, p.228). In line with Entman’s definition, issue Framing can be defined as a process of selecting and emphasizing certain aspects of an issue on the basis of which the audience can evaluate the issue described or the protagonists associated with the issue. In our study we will focus on issue Framing rather than equivalency Framing, since we are interested in the relationship between different concepts and their attributes, rather than in the different descriptions of a certain concept. Issue frames reflect networks of concepts and attributes.
Linear versus Interactive Frame Setting
The second question has also led to a number of different hypotheses. The behavioralists consider the transfer of salience as a linear process, straight from the sender into the audience. According to critics, these accessibility models “portray the individual as rather mindless, as automatically incorporating into the final attitude whatever ideas happen to pop into mind”, that is, whatever is suggested in a mediated frame. By contrast, other researchers suggest a more complex situation in which meanings are produced and exchanged between the sender, the receiver and the larger community in which they operate. For example, Nelson et al. (1997) show that Framing effects occur through a complex psychological process in which receivers of messages consciously think about the importance of different considerations suggested by a frame. In other words, the Framing process can be regarded as an interaction between textual features and the interpreter’s social knowledge. This interaction process leads to the construction of a mental model as the resulting state of interpretation (Rhee, 1997).
According to Graber (1988), people use schematic thinking to handle information. They extract only those limited amounts of information from news stories that they consider important for incorporation into their schemata. She added that the media make major contributions to this schema formation. Besides the creation of these mental models, the Framing process can trigger an already existing mental model, or frame, within the receiver’s perception. In their studies Snow and Benford focus on social movement Framing and its effects on individual and collective action. The researchers define frames as “interpretive schemata” that organize the meanings of objects, situations, events, experiences, and sequences of action for social actors (Snow and Benford, 1992, p.137). They state that media frames and audience frames interact through ‘frame alignment’ (Snow et al., 1986) and ‘frame resonance’ (Snow and Benford, 1988).
Association: the Common Denominator of Agenda Setting and Framing
In the sections above we described both Agenda Setting and Framing research conducted over the last decades. Despite the big differences between the angles taken by the researchers, we perceive a common denominator between those studies in the use of associations, either between concepts, between concepts and attributes, or as more complex networks of concepts. In this study, therefore, we will focus on what we call ‘Associative Framing’. These frames, which extend the object-attribute pairs of second level Agenda Setting, consist of associations between objects and other objects. Taking the cognitive perspective as described above, these frames refer to the schemata of interpretation of Goffman (1974). The corresponding audience frames are seen as associative networks as described by Collins and Quillian (1969).
Analogous to Agenda Setting as a transfer of salience of concepts, this allows us to see associative frame setting as a transfer of salience of the links between concepts. Where these concepts are the attributes of other concepts, this is identical to Second Level Agenda Setting, but we believe that it is fruitful to look at the associative network as one large interconnected network rather than as the attributes of individual concepts.
Thus, this model is related to the audience frames or schemata in which individuals form strong associations between different mental concepts. The relationships between these different concepts can be triggered through outside cues, such as news messages, consistent with the second level of Agenda Setting of McCombs and Estrada (1997) and the strong media effect postulated by Graber (1988).
Although the general transfer of salience from the media to the audience has been found by numerous Agenda Setting studies, there are various hypotheses about the transfer of attribute and relation salience from media frames to audience frames. Thus, rather than hypothesizing a direct transfer of relation salience, we propose developing a framework for testing these different hypotheses in a systematic way, in which an automatic measurement of associative media frames such as proposed here is an important first step.
Associative Framing: Measuring the News
For testing hypotheses regarding media effects or media logics, it is necessary to measure the news content. For Agenda Setting research, which hypothesizes the transfer of salience of concepts, a thematic content analysis suffices (Holsti, 1969; Krippendorff, 2004). Second Level Agenda Setting and Framing theories, on the other hand, postulate more complicated patterns. It is possible to measure these frames directly using thematic content analysis, and in fact well described and tested methodologies exist, such as the one described in Semetko and Valkenburg (2000). The measurement variables in such an approach, however, are far removed from the text, making it difficult to automate such analyses. Moreover, material annotated using this method has low reusability, as a slight change in the definition of frames necessitates a reannotation. Also, the opaque nature of human annotation makes it difficult to judge the effects of coder culture and bias on the obtained results.
For these reasons, we think it would be beneficial to use Relational Content Analysis methods for Framing research (Roberts, 1997; Carley, 1997). These methods represent message contents as a graph of relations between concepts that are relatively close to the text. The target variables, such as frames, are then defined as patterns or metrics on the graph representation. This allows for the post-hoc redefinition of frames as long as they can be expressed using the same concepts and relations (nodes and edges). Additionally, this allows for easier testing of competing hypotheses, as they can be based on the same annotated material. Finally, graph-based data structures are intensively studied in Graph Theory, Knowledge Representation, and Social Network Analysis, which has led to the development of toolkits and techniques that can be used to gain more insight into relational content data (see for example Van Atteveldt, Kleinnijenhuis, and Carley, 2006).
Within Relational Content Analysis, some methods, such as Automap (Carley, 1993; Diesner and Carley, 2004) and TABARI (Schrodt and Gerner, 1994; Schrodt, 2001), rely on simple keyword and co-occurrence based information to derive this graph. Other methods, such as NET (Van Cuilenburg et al., 1988; Kleinnijenhuis et al., 1997), assume a more complicated data structure with different edge types (action, causation, affinity) and signs (association versus dissociation), which makes it more challenging to automate this method, although work is being done in that direction to include ‘subjective’ language resources (Van Atteveldt et al., 2004) and grammatical structure (Kleinnijenhuis, 2006).
We propose a new method called Associative Framing. This method uses simple occurrence and co-occurrence to derive edges between predefined nodes. In particular, it uses a probabilistic model, with the marginal reading chance of concepts as the measure of visibility and the conditional reading chance as the measure of association. The combination of concept visibility and associative links constitutes Associative Frames.
Using unspecified association as links makes the data structure proposed here relatively simple. We realize that some of the frames proposed in the literature employ more complex relations between concepts than unspecified association. However, we think that simple association is sufficient to study a large proportion of these theories, and can still be a valuable tool for other studies as an exploratory step. This simple data structure allows for the automatic analysis of large amounts of text. Furthermore, as will be discussed in the penultimate section of this article, it is easy to extend this method to more complicated structures if linguistic markers can be found, extending the reach of this method.
Associative Frames can be found both as Media Frames, in terms of linguistic co-occurrence patterns, and as Audience Frames, based on concept and attribute salience and associative memory structures such as described by Collins and Quillian (1969) and Collins and Loftus (1975). This method specifically does not assume a particular Frame Setting process. We can imagine a direct ‘hypodermic’ transfer of salience but also more complicated filtering, interaction or frame activation mechanisms. Which combination of these processes best describes media effects is a very important question, and a formal and structural frame representation method such as our proposal can help in obtaining the empirical evidence needed to answer this question.
A Probabilistic Model of Associative Framing
In Associative Framing we assume that media messages can be reduced to contexts containing atomic events which have a certain probability of occurring, and within which co-occurrence is meaningful. More concretely, we assume that there is a contextual unit, such as a sentence, paragraph, document, or newspaper, for which we can measure the occurrence of our target concepts and within which we want to know whether they co-occur.
The occurrence of concepts is measured using synonyms or keywords as indicators of the target concept, along with disambiguating conditions. An example of this is requiring the phrase “President Bush” to occur in a document as a condition for accepting the word ‘Bush’ as an indicator of the concept. This prevents articles about George H.W. Bush or Governor Jeb Bush from being mistakenly counted, while still allowing the use of just ‘Bush’ as an indicator in other sentences within the article. For example, in the following two sentences from a constructed article, we find the keyword counts below, assuming sensible keywords for the target concepts Bush, Immigration, and American Values (the latter including speaking English).
Sentence                                              Bush   Immigration   Values
Bush Continues Campaign for Immigration Reform        1      1             0
New arrivals to this country must adopt American      1      1             2
values and learn English, President Bush said
Wednesday
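The counting step with a disambiguating condition can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the keyword lists, the `requires` field and the function name `count_concepts` are assumptions chosen so that the two constructed sentences yield the counts in the table above.

```python
import re

# Hypothetical keyword lists (assumptions for illustration only).
# "requires" is an article-level condition that must hold before any
# keyword of that concept is accepted as an indicator.
CONCEPTS = {
    "Bush": {"keywords": ["bush"], "requires": "president bush"},
    "Immigration": {"keywords": ["immigration", "new arrivals"], "requires": None},
    "Values": {"keywords": ["american values", "english"], "requires": None},
}


def count_concepts(sentences, concepts=CONCEPTS):
    """Return one dict of keyword counts per sentence of a single article."""
    article_text = " ".join(sentences).lower()
    rows = []
    for sentence in sentences:
        text = sentence.lower()
        row = {}
        for name, spec in concepts.items():
            condition = spec["requires"]
            if condition is not None and condition not in article_text:
                # Disambiguating condition not met anywhere in the article.
                row[name] = 0
                continue
            # Count whole-word matches of every keyword in this sentence.
            row[name] = sum(
                len(re.findall(r"\b" + re.escape(keyword) + r"\b", text))
                for keyword in spec["keywords"]
            )
        rows.append(row)
    return rows


ARTICLE = [
    "Bush Continues Campaign for Immigration Reform",
    "New arrivals to this country must adopt American values and "
    "learn English, President Bush said Wednesday",
]

for sentence, row in zip(ARTICLE, count_concepts(ARTICLE)):
    print(sentence[:20], row)
```

Because the condition is checked against the whole article, the headline counts ‘Bush’ even though the phrase “President Bush” only appears in the second sentence.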
To transform these keyword counts to probabilities, we need a function from [0, ∞) to [0, 1]. We would wish this function to increase monotonically but sublinearly from zero towards one, and the probability of a concept with two synonyms should equal the probability of encountering either of the two synonyms if they were separate keywords. A set of functions satisfying these constraints is given in equation 1, where c and m stand for the concept and message being investigated, and the parameter 1/b is the probability of a concept encountered only once. This latter parameter is difficult to base on theoretical grounds. It could be modelled empirically, but we believe that results should be fairly stable with different settings. We would advise a value around .5 for short contextual units (sentence or paragraph), and lower (such as .25 or even .1) for longer units such as documents.
$$p(c \mid m) = 1 - \left(1 - \frac{1}{b}\right)^{\mathrm{count}(c,m)} \qquad (1)$$
Using this to assign probabilities to the sentences above, taking 1/b = 50% yields:
                      Bush             Immigration      Values
Sentence              Count  p(c|m)    Count  p(c|m)    Count  p(c|m)
Bush Continues ...    1      .5        1      .5        0      0
New arrivals ...      1      .5        1      .5        2      .75
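Equation 1 is simple enough to check numerically. The sketch below is illustrative only; the function name `reading_chance` and the argument name `single_mention_prob` (standing for 1/b) are our own labels, not part of the method's notation.

```python
def reading_chance(count, single_mention_prob=0.5):
    """Equation 1: p(c|m) = 1 - (1 - 1/b) ** count(c, m).

    single_mention_prob stands for 1/b, the probability assigned to a
    concept that is mentioned exactly once in the message.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if not 0.0 <= single_mention_prob <= 1.0:
        raise ValueError("single_mention_prob must lie in [0, 1]")
    return 1.0 - (1.0 - single_mention_prob) ** count


# Keyword counts of the two constructed sentences.
counts = {
    "Bush Continues ...": {"Bush": 1, "Immigration": 1, "Values": 0},
    "New arrivals ...": {"Bush": 1, "Immigration": 1, "Values": 2},
}

probabilities = {
    sentence: {concept: reading_chance(n) for concept, n in row.items()}
    for sentence, row in counts.items()
}
print(probabilities)

# Synonym constraint: a concept counted 1 + 2 times gets the same
# probability as seeing either of two separate keywords counted 1 and 2.
either = 1.0 - (1.0 - reading_chance(1)) * (1.0 - reading_chance(2))
print(either, reading_chance(3))
```

With 1/b = .5 the function returns 0, .5 and .75 for counts of 0, 1 and 2, matching the table, and the last two printed values coincide, which is the synonym constraint stated above.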
Associative Frames are defined for the unit of analysis, which is a set of documents, typically corresponding to a given time period and/or medium. In essence, the formula above translates this set into a term by document matrix containing probabilities as cell values. On this matrix we define the two measures comprising Associative Frames, visibility and association, as the marginal and conditional probabilities.
Visibility is the marginal probability of a concept, in other words the chance that, if a single message is received from the set of messages, that message will contain that concept. This probability is based on the chance of a concept occurring in a message and the chance of receiving that message. This latter probability could be based on message properties, such as the position of an article in a newspaper and the reach of that newspaper, but also on characteristics of the potential receiver of the message in individual level analyses. In a formula, where p(m) is the chance of receiving a message (the normalized weight of a message):
$$\mathrm{Visibility}(c) = p(c) = \sum_m p(m)\, p(c \mid m) \qquad (2)$$
The association between two concepts, called the base and target concepts, is defined as the conditional probability of a message containing the target concept given that that message contains the base concept. This corresponds to the following formula, where c_t is the target concept and c_b is the base concept.
$$\mathrm{ass}(c_b \rightarrow c_t) = \frac{\sum_m p(m)\, p(c_b \mid m)\, p(c_t \mid m)}{\mathrm{Visibility}(c_b)} \qquad (3)$$
In our two-sentence example, this leads to the associations below:
                               Association with
Base Concepts    Visibility    Bush    Immigration    Values
Bush             .5            -       .5             .19
Immigration      .5            .5      -              .19
Values           .38           .38     .38            -
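Equations 2 and 3 can be written down directly as weighted sums over messages. The following is a minimal sketch under stated assumptions: both sentences of the example receive equal weight p(m) = .5, and the function names and the list-of-dicts layout are illustrative choices rather than part of the method.

```python
def visibility(weights, probs, concept):
    """Equation 2: sum over messages of p(m) * p(c|m)."""
    return sum(w * p[concept] for w, p in zip(weights, probs))


def association(weights, probs, base, target):
    """Equation 3: joint reading chance divided by visibility of the base."""
    joint = sum(w * p[base] * p[target] for w, p in zip(weights, probs))
    base_visibility = visibility(weights, probs, base)
    if base_visibility == 0:
        raise ValueError("association is undefined for an invisible base concept")
    return joint / base_visibility


# p(c|m) for the two constructed sentences, taken from the table above.
probs = [
    {"Bush": 0.5, "Immigration": 0.5, "Values": 0.0},
    {"Bush": 0.5, "Immigration": 0.5, "Values": 0.75},
]
weights = [0.5, 0.5]  # assumption: both messages are equally likely to be read

for concept in ("Bush", "Immigration", "Values"):
    print(concept, visibility(weights, probs, concept))

print(association(weights, probs, "Bush", "Immigration"))
```

Under these assumptions the sketch gives visibilities of .5, .5 and .375 (rounded to .38 in the table) and an association from Bush to Immigration of .5. Unequal weights, for example based on newspaper reach, only require changing the `weights` list, as long as it sums to one.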
Motivation and Relation to other Methods
In the preceding section, we proposed using marginal and conditional probability to describe association patterns. This section gives a number of reasons why we think this is a good representation, and also discusses how it relates to existing association measures.
The simplest alternative to probabilities is using the raw keyword co-occurrence counts, as in Automap (Diesner and Carley, 2004). These numbers, however, are very difficult to compare between data sets and even between concepts, and also suffer from strong autocorrelation between different edges from the same node (cf. Krackhardt, 1987). Moreover, outliers such as very long documents can strongly influence these counts, necessitating a normalization using local and global weighting in studies such as Deerwester et al. (1990). The resulting normalized numbers are hard to interpret. Probabilities do not suffer from these problems, having a very clear substantive interpretation as (conditional) reading chance.
Marginal and conditional probabilities are a direct extension of simple dichotomous occurrence of concepts. This is similar to the distinction between crisp and partial set membership used in Fuzzy Logic. Because the method is a direct extension, it is valid to use deterministic concept occurrence, i.e. a concept is either present or not, while still remaining within the Associative Framing framework. In that case, visibility reduces to the weighted proportion of documents mentioning the concept, and associations are the proportion of documents containing the base concept that also contain the target. In fact, Tversky (1977) describes a similarity measure quite similar to a dichotomized version of Associative Frames.
Another advantage of using probabilities is that Statistical Natural Language Processing methods generally return a probability distribution over possible outcomes or at least a confidence estimate for the best outcome. Using a probabilistic graph representation allows the seamless integration of such qualified information.
Additionally, probability calculus is a well established field of mathematics, and many other methods are built on its foundations. Although this is beyond the scope of this article, it is conceivable that one could use Bayesian Networks for representing frames (Jensen, 2001). Also, probability calculus gives us a natural way to extend the models presented here by making the (conditional) probability models more complicated, some examples of which will be given below. Another possibility is using a generative model for the media producer, viewing media content as something produced based on an internal state of the media producer. This could be a natural way to estimate the confidence of media data and might be a useful way to model theories on media production. Although all of this would require substantial theoretical and methodological work before yielding results, building the graph representation on a probabilistic foundation makes it easier to use such established methods and might be a first step in fruitful interdisciplinary research.
Another important choice was to use an asymmetric association measure. The main argument for that is substantive: currently John Bolton only appears in the news in articles on the United Nations, making the association from Bolton to UN very strong, while the reverse association is fairly weak. Tversky (1977) also notes that the semantic distance of one concept to another is often different from its inverse; for example, Hospital is found to be more similar to Building than the other way around. This choice rules out many existing metrics such as correlation and the cosine distance often used in Information Retrieval systems.
As a final note we would like to state that our association metric is fairly similar to metrics like cosine distance, correlation, and regression. All these metrics are based on the dot product of the variable vectors with some normalization. Cosine distance and correlation both normalize on the length (standard deviation) of both vectors, while correlation is also based on mean-centered variables. Regression coefficients normalize only on the standard deviation of the target variable, making them equivalent to our metric except for the centering. A statistic often used in linguistic co-occurrence analysis, the χ² test, is a measure of the significance rather than the strength of the association (Manning and Schütze, 1999), making it less relevant for describing media frames. It could be used to test whether the frames found deviate from some prior expected distribution, although a QAP test or an explicit modeling of message production as a Bernoulli process might be more useful for this purpose.
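The effect of the asymmetry is easy to see with dichotomous occurrence data. The sketch below uses an invented toy corpus of twelve equally weighted documents (ten mention the UN, two of those also mention Bolton); these numbers are assumptions for illustration and are not taken from the case study. Cosine similarity is included only as an example of a symmetric measure.

```python
import math


def association(base, target, weights):
    """Equation 3 applied to 0/1 occurrence vectors."""
    joint = sum(w * b * t for w, b, t in zip(weights, base, target))
    base_visibility = sum(w * b for w, b in zip(weights, base))
    return joint / base_visibility


def cosine_similarity(x, y):
    """Dot product normalized by the lengths of both vectors (symmetric)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Toy corpus (invented): Bolton occurs in documents 0 and 1,
# the UN occurs in documents 0 through 9, documents 10 and 11 mention neither.
bolton = [1, 1] + [0] * 10
un = [1] * 10 + [0] * 2
weights = [1 / len(un)] * len(un)  # equal reading chance per document

print("Bolton -> UN:", association(bolton, un, weights))
print("UN -> Bolton:", association(un, bolton, weights))
print("cosine      :", cosine_similarity(bolton, un))
```

In this toy corpus the association from Bolton to UN is 1, the reverse association is about .2, and the cosine similarity takes a single value of about .45 for both directions. With 0/1 vectors and equal weights, the association is simply the proportion of documents containing the base concept that also contain the target.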
Case Study: Islamic Terrorism and Terrorist Islamists
As a case study and proof of concept of the method described above, we performed an explorative Associative Framing analysis of Islam and Terror in Dutch, British and U.S. newspapers from 2000 to 2005. This section does not attempt to give a full description of this analysis or to add substantively to the body of knowledge on newspaper coverage of this issue.3 Rather, it is meant to showcase the possibilities of the Associative Framing method.
We analyzed one ‘popular’ and one ‘quality’ newspaper in each country. For the U.S., these were USA Today and The Washington Post; for the U.K., The Sun and The Guardian; and for the Netherlands, De Telegraaf and de Volkskrant. We selected all articles mentioning Terrorism or Islam, and words related to these concepts, from January 2001 until September 2005. In total, this yielded 114,751 articles containing over half a million mentions of either Terrorism or Islam. Table 1 gives a quantitative summary of the corpus.
Table 1: Overview of the analyzed newspapers

| Newspaper | Country | Type | #articles | #hits | Sum of hits |
|---|---|---|---|---|---|
| De Telegraaf | Netherlands | ‘Popular’ | 7,025 | 12,215 | 27,613 |
| De Volkskrant | Netherlands | ‘Quality’ | 16,528 | 29,340 | 81,560 |
| The Sun | United Kingdom | ‘Popular’ | 14,499 | 21,644 | 40,962 |
| The Guardian | United Kingdom | ‘Quality’ | 21,567 | 35,920 | 95,860 |
| USA Today | United States | ‘Popular’ | 12,003 | 19,882 | 60,612 |
| The Washington Post | United States | ‘Quality’ | 43,129 | 71,813 | 206,097 |
| Total | | | 114,751 | 190,814 | 512,704 |
This corpus was analyzed by counting occurrences of a number of concepts, including Islam, Terror, Government Actors, Legislative Actors, and a number of word lists for positive and negative associations and ‘patriotic’ phrases such as ‘our great nation’ and ‘God bless America’. These keyword counts were transformed using the formula listed above with b = 0.25. As we selected articles on the first two concepts, we can only report the visibility of these concepts and their associations with each other and with the other concepts.

Figure 2 displays the unnormalized visibility of Islam and Terror in the three countries during the examined time period. The vertical scale is the total number of articles about the concept, with the year and month on the horizontal scale. In all countries, there was a steep increase in the visibility of both Terror and Islam after 9/11, and again after the attacks in Madrid on March 11, 2004 and the London bombings of July 7, 2005. In all three countries, but most noticeably in the U.S. newspapers, the visibility of both concepts was at a steadily higher level after 9/11 than before. Islam peaks at each of the three terror events, but also around the spring of 2002, a particularly violent period in Israel, and in the Netherlands after the murder of the filmmaker Theo van Gogh by a fundamentalist Muslim in November 2004.

Figure 3 shows the association patterns of Islam and Terror with the political actors and with Positive, Negative, and Patriotic words. The middle column
3 For this purpose, please see a recent conference paper (Ruigrok and Van Atteveldt, 2006) or contact the authors for a preview of a recently submitted article on this topic.
Fig. 2: Visibility of Terror and Islam. Panels: (a) U.S., (b) U.K., (c) Netherlands.
shows the pattern during the first three months after 9/11, and the left and right columns show the periods before and after that, respectively. Associations below 10% are not shown.

Both the UK and the Netherlands show an increase in the association of Islam with negativity in the period directly after 9/11, falling back to the pre-9/11 level after three months. The US started out with a high level of negative associations, although that also drops from 2002. Associations with positive words decrease in all countries. In all countries both Terror and Islam are more strongly associated with the executive than with the legislative branch, with only the US showing a pattern of associating (the fight against) Terror with legislative actors. The use of patriotic words in both Terror and Islam contexts increases strongly in the American press, falling back after the first three months but staying higher than before 9/11. This pattern is also seen to a lesser degree in the British and Dutch press.

Fig. 3: Associations of Terror and Islam. Nodes in each panel: Government, Legislative, Terror, Islam, Patriotism, Negative, Positive. Panels: (a) US, before 9/11; (b) US, 9/11 – 12/11; (c) US, after 12/11; (d) UK, before 9/11; (e) UK, 9/11 – 12/11; (f) UK, after 12/11; (g) Neth, before 9/11; (h) Neth, 9/11 – 12/11; (i) Neth, after 12/11.

Finally, Figure 4 plots the change in the association of Terror with Islam, and the other way around, over the studied time period. In both the UK and the US the association of Terror with Islam is fairly steady around 0.15 throughout the investigated period. The reverse association clearly peaks at 9/11 and stays high in the US newspapers. In the British press it also peaks, but falls steadily after that, coming to almost pre-9/11 levels right before the London bombings, after which it shoots up again. The Netherlands shows a different picture. The association of Terror with Islam is higher than in the other countries, although this is difficult to compare directly because the keywords differ (as the language is different). The change in value is easier to compare. Although the association of Islam with Terror also peaks after 9/11, it actually declines after the murder of Van Gogh, which is unexpected since a local ‘terror’ event was expected to lead to an increase in the association of Islam with Terror. The reverse association did increase, indicating that terror was mainly discussed in the context of (fundamentalist) Islam, even though Islam itself was associated with other concepts, as the debate shifted to integration, culture, and civil rights.
Fig. 4: Changes in association between Terror and Islam. Panels: (a) U.S., (b) U.K., (c) Netherlands.
Use Cases and Extensions
The results presented in the previous section give a high-level overview of the associations of a small number of concepts over a long time span. This section attempts to answer the question of what we can do with this kind of data by describing a number of use cases. Moreover, it briefly describes some possible extensions to the method within the probabilistic association framework.
Explorative Research. The method presented here can be used as a relatively quick and cheap explorative step to take before a more detailed investigation. It clearly indicates the time spans in which the ‘action’ occurs and can also be used to select countries or newspapers to study. For example, suppose one would like to investigate how Islam is portrayed in the international press. The results above suggest that the beginning of 2002 might be an interesting period to study next to the ‘obvious’ periods around the attacks, and that a comparison of the Dutch and US press captures more variety than a comparison of the US and UK press. These are not answers to research questions per se, but they can help focus research on periods where answers to such questions can be found and avoid some ‘sampling bias’ in selecting periods and newspapers to investigate.
Direct hypothesis testing. A second possibility is to use this data to answer a number of research questions quantitatively. As described in Ruigrok and Van Atteveldt (2006), chi-squared analyses can determine whether differences between periods or associations are significant. Also, time series analysis could be run to detect a ‘pack journalism’ or intra-media Agenda Setting effect, either at the visibility or at the association level.
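As an illustration of such a test, the sketch below computes Pearson's chi-squared statistic for a 2×2 table that compares how often articles about one concept also mention another concept in two periods. The counts are hypothetical and not taken from the corpus, and the function name is our own; 3.84 is the standard critical value for one degree of freedom at the 5% level.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared for the table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Hypothetical counts: articles on Islam with / without a Terror mention
period_1 = (30, 70)  # before an event
period_2 = (60, 40)  # after an event

statistic = chi_squared_2x2(*period_1, *period_2)
CRITICAL_5_PERCENT_DF1 = 3.84  # standard chi-squared critical value, df = 1

print(round(statistic, 2))
print("significant" if statistic > CRITICAL_5_PERCENT_DF1 else "not significant")
```

With identical proportions in both periods the statistic is zero, so the size of the statistic reflects how far the two periods diverge given the number of articles, not the strength of the association itself.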
Part of a Model. A third, and most interesting, possibility is to use this data as a variable in a model that includes information on audience or political attitudes and beliefs, such as survey data or a text analysis of open survey questions, political speeches, or text from Internet fora or newsgroups. This can be used to test different linear or non-linear models of media effects at the associational level.
Possible Extensions
The above sections described a simple mechanism to extract and represent Associative Frames. Although these Frames are a powerful tool for both explorative research and hypothesis testing, we realize that for certain research questions it might be necessary to measure more complex frames. This section discusses two possible extensions to the basic model.
Typed or signed edges. In the current proposal, edges are limited to a quantitative representation of the strength of association between nodes. The measurement of the evaluative content of text is difficult, but recently a lot of work has been done on this in Computational Linguistics (Van Atteveldt et al., 2004; Esuli, 2006). If these techniques are sufficiently accurate, it is quite easy to extend our model to incorporate them.

In the section above we calculated the direct association of Islam with positive and negative keywords. This effectively measures an attribute of Islam such as described by Second Level Agenda Setting. We can also use this to enrich the relation between Terror and Islam. If we determine the probability of encountering a negative word given that we encountered both concepts, we essentially capture the ‘mood’ of the association. This can be used to create typed (multiplex) edges rather than just associations. Also, we can subtract the association with negative from the association with positive, and create a signed association from a set of antonyms.
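Both ideas can be sketched in a few lines. The example below assumes that each context unit (for instance a paragraph) has already been reduced to the set of concepts found in it, and it uses simple occurrence shares instead of the reading probabilities of this article; the data and the function name are hypothetical.

```python
def cond_prob(units, given, target):
    """Share of the units containing all `given` concepts that also contain `target`."""
    matching = [u for u in units if given <= u]
    if not matching:
        return 0.0
    return sum(1 for u in matching if target in u) / len(matching)

# Hypothetical context units, each the set of concepts that occur in it
units = [
    {"islam", "terror", "negative"},
    {"islam", "terror", "negative"},
    {"islam", "terror"},
    {"islam", "positive"},
    {"islam"},
    {"terror", "negative"},
]

# Typed edge: the 'mood' of the Terror-Islam association,
# i.e. P(negative | terror and islam)
mood = cond_prob(units, {"islam", "terror"}, "negative")

# Signed edge: association with positive minus association with negative
signed = cond_prob(units, {"islam"}, "positive") - cond_prob(units, {"islam"}, "negative")

print(mood, signed)
```

In this toy data two of the three units containing both concepts are negative, and Islam is associated more with negative than with positive words, so the signed association is below zero.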
Integrating Grammatical Features. The extraction mechanism proposed in this article uses surface proximity to determine relations. Since two words need to be relatively close to express a relation between them, this is actually not that bad an indicator of relatedness. However, if the grammatical structure of sentences is available, it might be useful to base relatedness on syntactic structure rather than surface structure. This is especially true if one is interested in more specific relationship types, such as negative or causal relations, as it is quite possible for a negative word to be used in a sentence without applying to the concept under investigation.

The probabilistic model presented here can also be adapted to those circumstances. For example, if one has the syntactic tree structure of all sentences under investigation, instead of measuring what the chance is of two items co-occurring within a paragraph or 10-word window, we can measure the co-occurrence within a clause. Alternatively, we can count the number of ‘steps’ or edges in the syntactic or dependency tree and use that instead of surface word distance.
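Counting steps in a tree can be sketched as a breadth-first search. The dependency edges below are written by hand for one example sentence and are an assumption for illustration, not the output of a parser; the function name is likewise hypothetical.

```python
from collections import deque

def tree_distance(edges, start, goal):
    """Number of edges on the path between two tokens in an undirected tree."""
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, set()).add(dependent)
        neighbours.setdefault(dependent, set()).add(head)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the tokens are not connected

# "The imam strongly condemned the terrible attack"
tokens = ["The", "imam", "strongly", "condemned", "the", "terrible", "attack"]
# Hand-written (head, dependent) pairs, assumed for this illustration
edges = [
    ("condemned", "imam"), ("condemned", "strongly"), ("condemned", "attack"),
    ("imam", "The"), ("attack", "the"), ("attack", "terrible"),
]

surface = abs(tokens.index("imam") - tokens.index("terrible"))
print(surface)                                       # surface word distance
print(tree_distance(edges, "imam", "terrible"))      # steps via the verb
print(tree_distance(edges, "attack", "terrible"))    # direct modifier
```

Here the negative word is one step from "attack" but three steps from "imam", so a threshold on tree distance would attach the negative evaluation to the attack rather than to the actor condemning it.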
Conclusion
This paper presented Associative Frames, a probabilistic Relational Content Analysis method based on keyword co-occurrence. Agenda Setting, Second Level Agenda Setting, and Emphasis or Issue Framing can all be seen as theories about the transfer of association patterns or networks from the media to the receiver. By measuring the individual concepts rather than the whole frame, we are able to use computer techniques to automate this measurement. Moreover, the data obtained using this method is less dependent on the specific definition of the measured Frames, increasing the reusability of data sets and making it easier to test different theories on the same data. In this way, we have given a clear interpretation of linguistic co-occurrence in terms of current Communication Theory, which helps to bridge the gap between theoretical sophistication and measurement techniques.

We have also presented a methodology for calculating association scores as a conditional reading probability. This measure is asymmetric, conforming to substantive intuitions about the nature of association. Moreover, probability calculus is a well-understood field of mathematics, making the model easy to extend and to compare with other methods.

Finally, we gave an example content analysis of the associations between Islam and Terror in the Dutch, British and American press between 2000 and 2005, showcasing the power of this method to analyze large amounts of text. We also proposed a number of other research use cases possible with this method, and ways to extend this method to incorporate more sophisticated linguistic knowledge.
This method has some limitations, however. First, we accept that not all frames can be expressed as simple association patterns. For some frames, it will be necessary to extend the current model, for example by distinguishing between types of relations or by measuring whether the relation is positive or negative. Although these measurements are difficult linguistically, a lot of work is being done on such problems in the Computational Linguistics community, and the model can easily be extended as soon as acceptable accuracy is reached on the linguistic extraction. Also, certain Framing theories, such as Equivalency Framing, are even more difficult to fit into this model, since they are not directly based on association networks.

Another limitation is the difficulty of interpreting co-occurrence measures on keywords. An overly broad or narrow set of keywords for a concept can easily skew the results or measure something completely different from the intended concept. For this reason, it is very important to go ‘back to the text’ and qualitatively assess whether the contexts found actually express the relation that one is interested in. This makes the initial phase of creating keyword lists more labor-intensive than an ‘automatic approach’ might suggest, but once the lists are in place and well tested, there is little extra cost in analysing more documents, making this approach particularly well suited for research topics that are ongoing or cover a large number of texts. The creation and evaluation of keyword lists can be done more systematically using Keyword-in-context programs or manual coding of a subset of documents, but it is ultimately the interaction between the quantitative measurement and the qualitative control that ensures a correct interpretation of the results.

These limitations notwithstanding, this paper provides a clear communication-theoretical interpretation and probabilistic operationalization of co-occurrence. This yields a powerful and flexible method for the automatic analysis of text, which is a contribution to the measurement techniques currently available to the Communication Scientist. This will aid theory development by allowing multiple theories to be tested simultaneously on large corpora.
References
Beniger, J. (1993). Communication: embrace the subject, not the field. Journal of Communication 43(3), 18–25.
Carley, K. (1993). Coding choices for textual analysis: A comparison of content analysis and map analysis. Sociological Methodology 23, 75–126.
Carley, K. (1997). Network text analysis: The network position of concepts. In C. Roberts (Ed.), Text Analysis for the Social Sciences, pp. 79–100. Mahwah, NJ: Lawrence Erlbaum Associates.
Cohen, B. (1963). The press and foreign policy. Princeton, NJ: Princeton University Press.
Collins, A. and Loftus (1975). A spreading activation theory of semantic memory. Psychological Review 82, 407–428.
Collins, A. and M. Quillian (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8, 240–248.
D’Angelo, P. (2002). News framing as a multi-paradigmatic research program: A response to Entman. Journal of Communication 52(4), 870–888.
De Vreese, C. (2002). Framing Europe: Television News and European Integration. Amsterdam: Aksant.
Dearing, J. and E. Rogers (1996). Agenda setting. Thousand Oaks, CA: Sage.
Deerwester, S. C., S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman (1990). Indexing by latent semantic analysis. Journal of the American Society of Information Science 41(6), 391–407.
Diesner, J. and K. Carley (2004). Automap1.2 - extract, analyze, represent, and compare mental models from texts. Technical Report CMU-ISRI-04-100, Carnegie Mellon University, School of Computer Science, Institute for Software Research International.
Druckman, J. N. (2001). The implications of framing effects for citizen competence. Political Behavior September 2001, 225–256.
Druckman, J. N. (2004). Political preference formation: Competition, deliberation, and the (ir)relevance of framing effects. American Political Science Review 98(4), 671–686.
Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication 43(4), 51–58.
Esuli, A. (2006). Sentiment classification bibliography. Annotated bibliography maintained at http://liinwww.ira.uka.de/bibliography/Misc/Sentiment.html.
Fiske, S. and S. Taylor (1991). Social Cognition, 2nd Ed. New York: McGraw-Hill.
Goffman, E. (1974). Frame analysis: an essay on the organization of experience. Boston: Northeastern University Press.
Graber, D. (1988). Processing the News: How People Tame the Information Tide. Lanham, MD: University Press of America.
Holsti, O. (1969). Content Analysis for the Social Sciences and Humanities. Reading MA: Addison-Wesley.
Jensen, F. V. (2001). Bayesian Networks and Decision Graphs. Springer.
Kleinnijenhuis, J. (2006). Applications of graph theory to cognitive communication research. In K. Krippendorff and M. Bock (Eds.), The Content Analysis Reader (forthcoming). Thousand Oaks: Sage.
Kleinnijenhuis, J., J. De Ridder, and E. Rietberg (1997). Reasoning in economic discourse: an application of the network approach to the Dutch press. In C. Roberts (Ed.), Text Analysis for the Social Sciences; Methods for Drawing Statistical Inferences from Texts and Transcripts, pp. 191–207. Mahwah, New Jersey: Lawrence Erlbaum Associates.
Krackhardt, D. (1987). QAP partialling as a test of spuriousness. Social Networks 9, 171–186.
Krippendorff, K. (2004). Content Analysis: An Introduction to Its Methodology (second edition). Sage Publications.
Levin, I., S. Schneider, and G. Gaeth (1998). All frames are not created equal: A typology and critical analysis of framing effects. Organizational Behavior & Human Decision Processes 76, 149–88.
Manning, C. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
McCombs, M. and T. Bell (1996). The agenda-setting role of mass communication. In M. Salwen and D. Stacks (Eds.), An integrated approach to communication theory and research, pp. 93–110. Mahwah, NJ: Lawrence Erlbaum.
McCombs, M. and G. Estrada (1997). The news media and the pictures in our heads. In S. Iyengar and R. Reeves (Eds.), Do the media govern?, pp. 237–247. London: Sage.
McCombs, M. and S. Ghanem (2001). The convergence of agenda setting and framing. In S. Reese, O. Gandy, and A. Grant (Eds.), Framing public life, pp. 95–106. Mahwah, NJ: Lawrence Erlbaum.
McCombs, M. and D. Shaw (1993). The evolution of agenda-setting research: Twenty-five years in the marketplace of ideas. Journal of Communication 43(2), 58–67.
McCombs, M. E. and D. L. Shaw (1972). The agenda-setting function of mass media. Public Opinion Quarterly 36, 176–187.
Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston (Ed.), The Psychology of Computer Vision. New York: McGraw-Hill.
Nelson, T. E., Z. Oxley, and R. A. Clawson (1997). Toward a psychology of framing effects. Political Behavior 19(3), 221–46.
Price, V., D. Tewksbury, and E. Power (1997). Switching trains of thought. The impact of news frames on readers’ cognitive responses. Communication Research 24, 481–506.
Quattrone, G. and A. Tversky (1988). Contrasting rational and psychological analyses of political choice. American Political Science Review 82, 719–736.
Rhee, J. (1997). Strategy and issue frames in election campaign coverage: A social cognitive account of framing effects. Journal of Communication 47, 26–48.
Roberts, C. W. (Ed.) (1997). Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Texts and Transcripts. Mahwah, NJ: Lawrence Erlbaum.
Rogers, E., J. Dearing, and D. Bregman (1993). The anatomy of agenda-setting research. Journal of Communication 43(2A), 68–84.
Rosengren, K. E. (1993). From field to frog ponds. Journal of Communication 43(3), 6–17.
Ruigrok, N. and W. Van Atteveldt (2006). Global angling with a local angling: How US, British and Dutch newspapers frame global and local terrorist attacks. In Presentation at the 47th Annual Convention of the International Studies Association (ISA), 22–25 March, San Diego.
Scheufele, D. (1999). Framing as a theory of media effects. Journal of Communication 29, 103–123.
Schrodt, P. (2001). Automated coding of international event data using sparse parsing techniques. In Annual meeting of the International Studies Association, Chicago.
Schrodt, P. and D. Gerner (1994). Validity assessment of a machine-coded event data set for the Middle East, 1982-1992. American Journal of Political Science 38(3), 825–854.
Semetko, H. A. and P. M. Valkenburg (2000). Framing European politics: A content analysis of press and television news. Journal of Communication 50(2), 93–109.
Snow, D. A. and R. D. Benford (1988). Ideology, frame resonance, and participant mobilization. International Social Movement Research 1, 197–217.
Snow, D. A. and R. D. Benford (1992). Master frames and cycles of protest. In A. D. Morris and C. M. Mueller (Eds.), Frontiers in Social Movement Theory, pp. 133–155. New Haven: Yale University Press.
Snow, D. A., E. B. Rochford, S. K. Worden, and R. D. Benford (1986). Frame alignment processes, micromobilization, and movement participation. American Sociological Review 51, 464–481.
Tversky, A. (1977). Features of similarity. Psychological Review 84(4), 327–352.
Tversky, A. and D. Kahneman (1981). The framing of decisions and the psychology of choice. Science 211, 453–458.
Van Atteveldt, W., J. Kleinnijenhuis, and K. Carley (2006, 19-23 June). Rcadf: Towards a relational content analysis standard. In Presented at the International Communication Association (ICA), Dresden.
Van Atteveldt, W., D. Oegema, E. van Zijl, I. Vermeulen, and J. Kleinnijenhuis (2004). Extraction of semantic information: New models and old thesauri. In Proceedings of the RC33 Conference on Social Science Methodology, Amsterdam.
Van Cuilenburg, J., J. Kleinnijenhuis, and J. De Ridder (1988). Tekst en Betoog: naar een Computergestuurde Inhoudsanalyse van Betogende Teksten. Muiderberg (Netherlands): Coutinho.