Natural Language Processing for Framing

Noah Smith, University of Washington, [email protected]

Collaborators: David Bamman (UCB), Amber Boydstun (UCD), Dallas Card (CMU), Justin Gross (UMass), Brendan O’Connor (UMass), Philip Resnik (UMD)

May 24, 2016

These slides: http://tinyurl.com/framing-noah

Outline

- Motivation: a study in which we're using NLP
- Building a text classifier:
  1. Define the classes
  2. Annotate training examples
  3. Featurize data
     - Brief tangent: creating new features
  4. Learn to classify
  5. Evaluate the classifier
- Looking ahead

Some Terminology

Natural language processing (NLP): algorithms that do useful things (for someone) with text (or other linguistic data).

Framing is choosing "a few elements of perceived reality and assembling a narrative that highlights connections among them to promote a particular interpretation" (Entman, 1993, 2007).

Media Framing and Public Opinion

- We know that framing works . . . sometimes.
- Systematic tests of framing effects on public opinion are lacking.

When do media framing and public opinion covary?

Hypotheses

H1 (Issue Salience): The covariance between media framing of immigration and public opinion will be stronger during periods when immigration is highly salient in the media than during periods when it is not.

H2 (Frame Competition): The more diffuse media coverage of immigration is across competing frames, the weaker the covariance between media framing of the issue and public opinion about the issue will be.

Variables

- Public mood (dependent variable) from Stimson (2014) (higher is more liberal)
  [Figure: public policy mood, 1952–2012]

The text corpus:

- 13 U.S. newspapers (e.g., NYT, USA Today)
- 1980–2012 (132 quarters)
- 38,283 articles
- Annotated for tone (pro-/anti-immigration) and 14 emphasis framing "dimensions"
- Random subset of 4,154 manually annotated
- Automatic annotation of the rest
- (More about this later!)

Variables (continued)

From the text corpus (38,283 articles):

- Media tone: count(pro) – count(anti)
  [Figure: quarterly counts of pro-, neutral, and anti-immigration articles, 1980–2010]
- High salience: ≥ 350 articles published in the quarter? (binary)
- Frame competition: Shannon entropy across emphasis framing dimensions in the quarter (a computation sketch follows below)

Framing Dimensions over Time
[Figure: framing dimensions over time]

H1: Public mood ∝ Media tone × High salience
H2: Public mood ∝ –(Media tone × Frame competition)
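Before turning to the regression, here is a minimal sketch (Python with pandas; column names are illustrative, not from the original study) of how the three corpus-derived predictors could be computed per quarter from per-article tone and frame labels:

```python
import numpy as np
import pandas as pd

def quarterly_predictors(articles: pd.DataFrame) -> pd.DataFrame:
    """articles: one row per article, with columns 'quarter' (e.g. '1992Q3'),
    'tone' in {'pro', 'neutral', 'anti'}, and 'frame' (one of the 14
    emphasis framing dimensions). Column names are illustrative."""
    rows = []
    for quarter, group in articles.groupby("quarter"):
        tone_counts = group["tone"].value_counts()
        tone = tone_counts.get("pro", 0) - tone_counts.get("anti", 0)

        salient = int(len(group) >= 350)  # binary high-salience indicator

        # Frame competition: Shannon entropy of the frame distribution.
        p = group["frame"].value_counts(normalize=True).to_numpy()
        entropy = float(-(p * np.log(p)).sum())

        rows.append({"quarter": quarter, "tone": tone,
                     "salient": salient, "frame_entropy": entropy})
    return pd.DataFrame(rows)
```

Higher entropy means coverage is spread more evenly across frames (more frame competition); an entropy of 0 means a single frame dominates the quarter.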

Regression

                                        coefficient   standard error
  Public mood (lagged)                        0.83             0.05
  Media tone                                222.09           108.53
  High salience                               0.30             1.26
  Media tone × high salience (H1)             9.57             5.00
  Frame competition                         –10.06            10.86
  Media tone × frame competition (H2)       –87.41            43.60
  Constant                                   32.48            27.61

N = 132; adjusted R² = 0.759; RMSE = 3.772; coefficients significant at p < 0.05 were bolded on the original slide.
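A hedged sketch of how such a model could be fit with statsmodels (variable names follow the sketch above and are illustrative; this is not the authors' exact specification):

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_h1_h2(df: pd.DataFrame):
    """df: one row per quarter with columns 'quarter', 'mood', 'tone',
    'salient', and 'frame_entropy' (illustrative names, not the original data)."""
    df = df.sort_values("quarter").copy()
    df["mood_lag"] = df["mood"].shift(1)  # lagged dependent variable
    # The interactions correspond to H1 (tone x salience) and
    # H2 (tone x frame competition).
    return smf.ols(
        "mood ~ mood_lag + tone * salient + tone * frame_entropy",
        data=df.dropna(subset=["mood_lag"]),
    ).fit()
```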

Discussion

- Public opinion on immigration ∝ Media tone on immigration
- . . . more so when immigration is a salient issue
- . . . less so when frame competition is high
- Still to be accounted for:
  - Demographic shifts
  - Major events
  - ...

Text Classification

Mosteller and Wallace (1963) automatically inferred the authors of the disputed Federalist Papers. Many other examples:

- News: politics vs. sports vs. business vs. technology . . .
- Reviews of films, restaurants, products: positive vs. negative
- Email: spam vs. not
- What is the reading level of a piece of text?
- Will a scientific paper be cited?
- Will a piece of proposed legislation pass?

Media Frames Codebook: Framing Dimensions (Boydstun et al., 2014)

- Economic: costs, benefits, or other financial implications
- Capacity and resources: availability of physical, human, or financial resources
- Morality: religious or ethical implications
- Fairness and equality: balance or distribution of rights, responsibilities, and resources
- Legality, constitutionality and jurisprudence: rights, freedoms, and the authority of government
- Policy prescription and evaluation: discussion of specific policies aimed at addressing problems
- Crime and punishment: effectiveness and implications of laws and their enforcement
- Security and defense: threats to welfare of the individual, community, or nation
- Health and safety: health care, sanitation, and public safety
- Quality of life: threats and opportunities for the individual's health, happiness, and well-being
- Cultural identity: traditions, customs, or values of a social group in relation to a policy issue
- Public opinion: attitudes and opinions of the general public, including polling and demographics
- Political: considerations related to politics and politicians, including lobbying, elections, and attempts to sway voters
- External regulation and reputation: international reputation or foreign policy of the United States
- Other: any coherent group of frames not covered by the above categories

Media Frames Corpus (Card et al., 2015)

- Articles selected by keyword search across thirteen newspapers, 1980–2012, on three issues
- Annotated for primary framing dimension, overall tone (i.e., stance on the issue, pro-/anti-/neutral), and arbitrary spans that evoke framing dimensions:
  - 5,549 articles (immigration)
  - 6,298 articles (same-sex marriage)
  - 4,077 articles (smoking)
- https://github.com/dallascard/media_frames_corpus

Example (Denver Post, 2006)

[WHERE THE JOBS ARE]Economic [Critics of illegal immigration can make many cogent arguments to support the position that the U.S. Congress and the Colorado legislature must develop effective and well-enforced immigration policies that will restrict the number of people who migrate here legally and illegally.]Policy prescription [It's true that all forms of [immigration exert influence over our economic and cultural make-up.]Cultural identity In some ways, immigration improves our economy by adding laborers, taxpayers and consumers, and in other ways immigration detracts from our economy by increasing the number of students, health care recipients and other beneficiaries of public services.]Economic [Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss.]Economic There are rational arguments to support both sides of this debate, and it's useful and educational to hear the varying positions.

Example (Denver Post, 2006)

[WHERE THE JOBS ARE]Economic [Critics of illegal immigration can make many cogent arguments to support the position that the U.S. Congress and the Colorado legislature must develop effective and well-enforced immigration policies that will restrict the number of people who migrate here legally and illegally.]Public opinion [It's true that all forms of immigration exert influence over our economic and [cultural make-up.]Cultural identity In some ways, immigration improves our economy by adding laborers, taxpayers and consumers, and in other ways [immigration detracts from our economy by increasing the number of students, health care recipients and other beneficiaries of public services.]Capacity]Economic [Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss.]Economic There are rational arguments to support both sides of this debate, and it's useful and educational to hear the varying positions.

Interannotator Agreement (Card et al., 2015)

[Figure: Krippendorff's alpha per annotation round (three stages), for immigration, smoking, and same-sex marriage]
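As an aside, agreement numbers like these can be computed with the third-party `krippendorff` package (a hedged sketch with made-up toy labels, not corpus data):

```python
import numpy as np
import krippendorff  # pip install krippendorff

# Rows are annotators, columns are articles; values are nominal frame codes,
# with np.nan for articles an annotator did not label. Toy data only.
reliability_data = np.array([
    [1, 2, 3, 3, 2, 1, np.nan, 4],
    [1, 2, 3, 3, 2, 2, 4,      4],
])
alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.3f}")
```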

"Featurizing" Text Data

- words: protest, rally, poll, march, protester, boycott, voter
- topics (Blei et al., 2003): "protest/religion" (rally, rallies, marchers, church, los); "non-profits" (raza, advocacy, sierra, coalition)
- multiwords (Schneider and Smith, 2015): deal with, los angeles, a number of, day laborers, immigration status, new yorkers, federal officials, hunger strike, service employees international union
- frame-semantic predicates (FrameNet / Das et al., 2014): Taking sides, Emotion directed, Attack, Colonization, Discussion, Judgment communication
- . . . plus Manning et al. (2014) syntactic classes, dependencies, sentiment, and named entities; Brown et al. (1992) clusters; Wikipedia page titles (Singh et al., 2012); . . .
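A minimal sketch of turning text into a document-term matrix with scikit-learn (word n-gram features only; the richer feature types above would be concatenated as additional columns; this is not the authors' exact pipeline):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

docs = [
    "Protesters held a rally and a march against the new immigration law.",
    "Economists debate the costs and benefits of immigration for employers.",
]

# Unigram + bigram counts, lowercased; binary=True would binarize instead.
vectorizer = CountVectorizer(ngram_range=(1, 2), lowercase=True, min_df=1)
X_counts = vectorizer.fit_transform(docs)

# Optional tf-idf reweighting (one of the transformations tuned later).
X_tfidf = TfidfTransformer().fit_transform(X_counts)

print(X_counts.shape, len(vectorizer.get_feature_names_out()))
```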

Pop Quiz: Which Dimension's Classifier Was That?

Economic · Capacity and resources · Morality · Fairness and equality · Legality, constitutionality and jurisprudence · Policy prescription and evaluation · Crime and punishment · Security and defense · Health and safety · Quality of life · Cultural identity · Public opinion · Political · External regulation and reputation · Other

Framing through Personas

- Bamman et al. (2013) introduced a latent variable model for clustering mentions of entities into personas
- Preprocessing: dependency parsing, multiword analysis, named entity recognition, pronominal coreference, and lots of filtering (a sketch of the mention-extraction step follows below)
- Applied to the immigration articles (k = 50), we find "personas" for:
  - Immigrants
  - Government agencies, administrations, police/officials, courts/judges
  - Political candidates
  - Employers
  - Universities/schools
  - Protesters
  - Terrorists
  - Green cards, laws/policies, law suits
  - Countries
  - Information sources
  - Elián González
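A hedged sketch of that mention-extraction step using spaCy (not the authors' pipeline; it only gathers the verbs that govern each noun mention, the kind of input a persona model would cluster):

```python
from collections import defaultdict
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def noun_verb_contexts(texts):
    """Map each noun lemma to the verbs it appears with, split by
    grammatical role (agent vs. patient)."""
    contexts = defaultdict(lambda: {"agent": [], "patient": []})
    for doc in nlp.pipe(texts):
        for tok in doc:
            if tok.pos_ not in ("NOUN", "PROPN"):
                continue
            if tok.dep_ == "nsubj" and tok.head.pos_ == "VERB":
                contexts[tok.lemma_.lower()]["agent"].append(tok.head.lemma_)
            elif tok.dep_ in ("dobj", "nsubjpass") and tok.head.pos_ == "VERB":
                contexts[tok.lemma_.lower()]["patient"].append(tok.head.lemma_)
    return contexts

ctx = noun_verb_contexts(
    ["The governor criticized the law, and police detained several immigrants."])
print(dict(ctx))
```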

Immigrant "Personas"

- 2: deport live come detain leave hold arrest release face arrive go take flee allow tell want send get return old woman man people immigrant family
- 44: foreign skilled american hire high-tech allow bring temporary many find need import work domestic hire recruit get pay new seasonal worker company student immigrant employer
- 50: illegal criminal deport commit arrest immigrant convict convict legal commit deport serious release identify hold arrest detain violent undocumented remove immigrant alien crime people deportation

Learning Classifiers for Framing in Text

- 4,154 training documents, with annotations converted to presence/absence per framing dimension (10% held out for testing)
- Logistic regression to predict presence/absence for each dimension (see the sketch after this list)
- Bayesian optimization (Yogatama et al., 2015) with the ladder (Blum and Hardt, 2015) to tune:
  - Which features to include?
  - Minimum count
  - Binarize, tf-idf, or no transformation?
  - Downcase?
  - Paragraph and sentence "pseudodocuments," with weights (Zaidan et al., 2007)
  - Regularization strength (ℓ1 and ℓ2) for each feature type
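A minimal sketch of the per-dimension setup with scikit-learn (bag-of-words features and default regularization; the study tuned features and ℓ1/ℓ2 strengths per dimension, which is not reproduced here):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def train_dimension_classifier(texts, labels):
    """texts: list of article strings; labels: 0/1 presence of one framing
    dimension. Returns the fitted vectorizer, model, and held-out F1."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.1, random_state=0)
    vec = CountVectorizer(ngram_range=(1, 2), min_df=2, lowercase=True)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(X_train), y_train)
    preds = clf.predict(vec.transform(X_test))
    return vec, clf, f1_score(y_test, preds)

# One binary classifier per framing dimension, e.g.:
# models = {dim: train_dimension_classifier(texts, labels[dim]) for dim in dimensions}
```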

I Motivation: a study in which we’re using NLP I Building a text classifier:  1. Define the classes 2. Annotate training examples  3. Featurize data  I Brief tangent: creating new features  4. Learn to classify  5. Evaluate the classifier  I Looking ahead 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction 3. Different features selected for every dimension!

Evaluation A R K

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction 3. Different features selected for every dimension!

Evaluation A R K 1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 0.8 0.7 0.6 0.5

200 400 600 800 1000 1200

F1 as a function of the number of positive training examples 3. Different features selected for every dimension!

Evaluation A R K

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction 3. Different features selected for every dimension!

Evaluation A R K 1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction 3. Different features selected for every dimension!

Evaluation A R K

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction Evaluation A R K

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0 Better measure, F1: 67.1% ± 12.9 2. Absolute error in aggregate proportion estimation: 4.8%± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction 3. Different features selected for every dimension! Features Selected A R K
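Point 2 concerns estimating the proportion of articles evoking each dimension rather than labeling individual articles. Below is a hedged sketch of the simpler confusion-matrix correction for aggregate proportions; Hopkins and King (2010) propose a more robust nonparametric estimator, and this sketch only illustrates the basic idea of correcting classifier bias with held-out labeled data:

```python
import numpy as np

def corrected_proportions(pred_unlabeled, y_true_held, y_pred_held, k):
    """Correct aggregate class proportions for classifier error.

    pred_unlabeled: predicted labels (ints 0..k-1) on the unlabeled corpus.
    y_true_held / y_pred_held: true and predicted labels (numpy arrays) on
    held-out labeled data, used to estimate M[i, j] = P(predicted i | true j).
    """
    M = np.zeros((k, k))
    for j in range(k):
        mask = (y_true_held == j)
        if mask.any():
            for i in range(k):
                M[i, j] = np.mean(y_pred_held[mask] == i)
    observed = np.bincount(pred_unlabeled, minlength=k) / len(pred_unlabeled)
    # observed ≈ M @ true_proportions, so solve the linear system.
    est = np.linalg.solve(M, observed)
    est = np.clip(est, 0.0, 1.0)
    return est / est.sum()  # renormalize to a proper distribution
```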

Features Selected

[Table: which feature types (unigrams, bigrams, part-of-speech tags, named entities, dependencies, sentiment, frames, Brown clusters, LDA topics, and others) were selected for each of the 14 framing dimensions; a different subset is selected for every dimension.]

Looking Ahead

- Taken together, frames imply a rich landscape of perspectives. Can we map it?
- Apps for "unframing" the news and other political discourse
- Fine-grained frames?
- Frame retrieval: an expert describes a frame; retrieve instances from a corpus
- Framing as a strategic choice (Sim et al., 2015a,b)
- An inventory of the linguistic tools for framing and attention?
  - Syntax (Greene and Resnik, 2009)
  - Frame semantics (Fillmore, 1982)
  - Discourse (Grosz and Sidner, 1986)

. . . or just representation learning?

Thank you!

Sponsors: NSF, Google, UW Innovation Award

More details: Bamman et al. (2013); Boydstun et al. (2014); Card et al. (2015)

These slides: http://tinyurl.com/framing-noah

References

Bamman, D., O'Connor, B., and Smith, N. A. (2013). Learning latent personas of film characters. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.
Blum, A. and Hardt, M. (2015). The ladder: A reliable leaderboard for machine learning competitions. http://arxiv.org/abs/1502.04585.
Boydstun, A. E., Card, D., Gross, J. H., Resnik, P., and Smith, N. A. (2014). Tracking the development of media frames within and across policy issues. Presented at the American Political Science Association.
Brown, P. F., deSouza, P. V., Mercer, R. L., Pietra, V. J. D., and Lai, J. C. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4):467–479.
Card, D., Boydstun, A. E., Gross, J. H., Resnik, P., and Smith, N. A. (2015). The Media Frames Corpus: Annotations of frames across issues. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
Das, D., Chen, D., Martins, A. F. T., Schneider, N., and Smith, N. A. (2014). Frame-semantic parsing. Computational Linguistics, 40(1):9–56.
Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4):51–58.
Entman, R. M. (2007). Framing bias: Media in the distribution of power. Journal of Communication, 57:163–173.
Fillmore, C. (1982). Frame semantics. In Linguistics in the Morning Calm, pages 111–137. Hanshin.
Greene, S. and Resnik, P. (2009). More than words: Syntactic packaging and implicit sentiment. In Proceedings of HLT-NAACL.
Grosz, B. J. and Sidner, C. L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175–204.
Hopkins, D. J. and King, G. (2010). A method of automated nonparametric content analysis for social science. American Journal of Political Science, 54(1):229–247.
Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J. R., Bethard, S., and McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
Mosteller, F. and Wallace, D. L. (1963). Inference in an authorship problem: A comparative study of discrimination methods applied to the authorship of the disputed Federalist Papers. Journal of the American Statistical Association, 58(302):275–309.
Schneider, N. and Smith, N. A. (2015). A corpus and model integrating multiword expressions and supersenses. In Proceedings of NAACL.
Sim, Y., Routledge, B. R., and Smith, N. A. (2015a). A utility model of authors in the scientific community. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Sim, Y., Routledge, B. R., and Smith, N. A. (2015b). The utility of text: The case of amicus briefs and the Supreme Court. In Proceedings of the AAAI Conference on Artificial Intelligence.
Singh, S., Subramanya, A., Pereira, F., and McCallum, A. (2012). Wikilinks: A large-scale cross-document coreference corpus labeled via links to Wikipedia. Technical Report UMASS-CS-2012-015, University of Massachusetts.
Stimson, J. (2014). Public policy mood data.
Yogatama, D., Kong, L., and Smith, N. A. (2015). Bayesian optimization of text representations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Zaidan, O., Eisner, J., and Piatko, C. D. (2007). Using "annotator rationales" to improve machine learning for text categorization. In Proceedings of HLT-NAACL.

What about Deep Learning?

Deep learning refers to a set of tools for discovering continuous features, based mostly on neural networks (non-linear, parameterized functions).

Pros:
- Features that improve accuracy, when you have enough data.

Cons:
- Computational expense
- Lack of interpretability

Cf. Bamman personas, Blei topics, and Brown clusters, which offer discrete features based on probabilistic graphical models and explicit assumptions.