Proceedings of the Third Workshop on Narrative Understanding
NAACL HLT 2021

Workshop on Narrative Understanding (WNU)

Proceedings of the Third Workshop

June 11, 2021

©2021 The Association for Computational Linguistics

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]

ISBN 978-1-954085-42-8

Introduction

Welcome to the 3rd Workshop on Narrative Understanding!

This is the 3rd iteration of the workshop, which brings together an interdisciplinary group of researchers from AI, ML, NLP, Computer Vision and other related fields, as well as scholars from the humanities to discuss methods to improve automatic narrative understanding capabilities.

We are happy to present 10 papers on this topic (along with 3 non-archival papers to be presented only at the workshop). These papers take on the complex challenges presented by diverse texts in areas of film, dialogue and literature as they look to improve methods for event extraction, gender and representation bias, controllable generation, quality assessment, and other tasks related to the workshop theme.

We would like to thank everyone who submitted their work to this workshop and the program committee for their helpful feedback. We would also like to thank our invited speakers for their participation in this workshop: David Bamman, Nate Chambers, Nasrin Mostafazadeh, Nanyun Peng, Laure Thompson, and Prashant Pandey.

Elizabeth, Faeze, Lara, Mohit, Nader and Snigdha

Organizers:

Nader Akoury, University of Massachusetts Amherst
Faeze Brahman, University of California, Santa Cruz
Snigdha Chaturvedi, University of North Carolina, Chapel Hill
Elizabeth Clark, University of Washington
Mohit Iyyer, University of Massachusetts Amherst
Lara J. Martin, Georgia Institute of Technology

Program Committee:

Apoorv Agarwal, Text IQ
Antoine Bosselut, Stanford University
Jan Buys, University of Cape Town
Somnath Basu Roy Chowdhury, Microsoft, India
Saadia Gabriel, University of Washington
Seraphina Goldfarb-Tarrant, University of Edinburgh
Andrew Gordon, University of Southern California
Ari Holtzman, University of Washington
Adam Jatowt, Kyoto University
Yangfeng Ji, University of Virginia
Evgeny Kim, University of Stuttgart, Germany
Roman Klinger, University of Stuttgart, Germany
Rik Koncel-Kedziorski, University of Washington
Faisal Ladhak, Columbia University
Ashutosh Modi, Indian Institute of Technology, Kanpur, India
Pedram Hosseini, George Washington University
Mark Riedl, Georgia Tech
Melissa Roemmele, SDL Research
Mrinmaya Sachan, ETH Zurich
Maarten Sap, University of Washington
Joao Sedoc, NYU
Shashank Srivastava, UNC Chapel Hill
Katherine Thai, UMass Amherst
Shufan Wang, UMass Amherst
Chao Zhao, UNC Chapel Hill
Guanhua Zhang, Tencent AI, China

Invited Speakers:

David Bamman, University of California, Berkeley
Nate Chambers, US Naval Academy
Nasrin Mostafazadeh, Verneek
Nanyun Peng, University of California, Los Angeles
Laure Thompson, University of Massachusetts, Amherst
Prashant Pandey, Screenwriter, Bollywood

Table of Contents

Hierarchical Encoders for Modeling and Interpreting Screenplays
Gayatri Bhat, Avneesh Saluja, Melody Dye and Jan Florjanczyk

FanfictionNLP: A Text Processing Pipeline for Fanfiction
Michael Yoder, Sopan Khosla, Qinlan Shen, Aakanksha Naik, Huiming Jin, Hariharan Muralidharan and Carolyn Rosé

Learning Similarity between Movie Characters and Its Potential Implications on Understanding Human Experiences
Zhilin Wang, Weizhe Lin and Xiaodong Wu

Document-level Event Extraction with Efficient End-to-end Learning of Cross-event Dependencies
Kung-Hsiang Huang and Nanyun Peng

Gender and Representation Bias in GPT-3 Generated Stories
Li Lucy and David Bamman

Transformer-based Screenplay Summarization Using Augmented Learning Representation with Dialogue Information
Myungji Lee, Hongseok Kwon, Jaehun Shin, WonKee Lee, Baikjin Jung and Jong-Hyeok Lee

Plug-and-Blend: A Framework for Controllable Story Generation with Blended Control Codes
Zhiyu Lin and Mark Riedl

Automatic Story Generation: Challenges and Attempts
Amal Alabdulkarim, Siyan Li and Xiangyu Peng

Fabula Entropy Indexing: Objective Measures of Story Coherence
Louis Castricato, Spencer Frazier, Jonathan Balloch and Mark Riedl

Towards a Model-Theoretic View of Narratives
Louis Castricato, Stella Biderman, David Thue and Rogelio Cardona-Rivera

Workshop Program

Friday, June 11, 2021

8:45–9:00 Opening Remarks
9:00–10:30 Invited Talks
10:30–11:30 Poster Session 1
11:30–13:00 Lunch
13:00–14:00 Invited Talks
14:00–15:00 Panel Discussion
15:00–16:00 Poster Session 2

Invited Speakers and Panelists

David Bamman
Nate Chambers
Nasrin Mostafazadeh
Prashant Pandey
Nanyun Peng
Laure Thompson

Papers (Archival)

Hierarchical Encoders for Modeling and Interpreting Screenplays
Gayatri Bhat, Avneesh Saluja, Melody Dye and Jan Florjanczyk

FanfictionNLP: A Text Processing Pipeline for Fanfiction
Michael Yoder, Sopan Khosla, Qinlan Shen, Aakanksha Naik, Huiming Jin, Hariharan Muralidharan and Carolyn Rosé

Learning Similarity between Movie Characters and Its Potential Implications on Understanding Human Experiences
Zhilin Wang, Weizhe Lin and Xiaodong Wu

Document-level Event Extraction with Efficient End-to-end Learning of Cross-event Dependencies
Kung-Hsiang Huang and Nanyun Peng

Gender and Representation Bias in GPT-3 Generated Stories
Li Lucy and David Bamman

Transformer-based Screenplay Summarization Using Augmented Learning Representation with Dialogue Information
Myungji Lee, Hongseok Kwon, Jaehun Shin, WonKee Lee, Baikjin Jung and Jong-Hyeok Lee

Plug-and-Blend: A Framework for Controllable Story
Generation with Blended Control Codes
Zhiyu Lin and Mark Riedl

Automatic Story Generation: Challenges and Attempts
Amal Alabdulkarim, Siyan Li and Xiangyu Peng

Fabula Entropy Indexing: Objective Measures of Story Coherence
Louis Castricato, Spencer Frazier, Jonathan Balloch and Mark Riedl

Towards a Model-Theoretic View of Narratives
Louis Castricato, Stella Biderman, David Thue and Rogelio Cardona-Rivera

Papers (Non-Archival)

Telling Stories through Multi-User Dialogue by Modeling Character Relations
Wai Man Si, Prithviraj Ammanabrolu and Mark Riedl

Inferring the Reader: Guiding Automated Story Generation with Commonsense Reasoning
Xiangyu Peng, Siyan Li, Sarah Wiegreffe and Mark Riedl

Tell Me A Story Like I’m Five: Story Generation via Question Answering
Louis Castricato, Spencer Frazier, Jonathan Balloch, Nitya Tarakad and Mark Riedl

Hierarchical Encoders for Modeling and Interpreting Screenplays

Gayatri Bhat∗
Bloomberg
New York, NY, USA
[email protected]

Avneesh Saluja, Melody Dye, Jan Florjanczyk
Netflix
Los Angeles, CA, USA
{asaluja,mdye,jflorjanczyk}@netflix.com

Abstract

While natural language understanding of long-form documents remains an open challenge, such documents often contain structural information that can inform the design of models encoding them. Movie scripts are an example of such richly structured text – scripts are segmented into scenes, which decompose into dialogue and descriptive components. In this work, we propose a neural architecture to encode this structure, which performs robustly on two multi-label tag classification tasks without using handcrafted features. We add a layer of insight by augmenting the encoder with an unsupervised ‘interpretability’ module, which can be used to extract and visualize narrative trajectories. Though this work specifically tackles screenplays, we discuss how the underlying approach can be generalized to a range of structured documents.

1 Introduction

As natural language understanding of sentences and short documents continues to improve, interest in tackling longer-form documents such as academic papers (Ren et al., 2014; Bhagavatula et al., 2018), novels (Iyyer et al., 2016) and screenplays (Gorinski and Lapata, 2018) has been growing. Analyses of such documents can take place at multiple levels, e.g. identifying both document-level labels (such

Instead, models either rely on a) a pipeline that provides a battery of syntactic and semantic information from which to craft features (e.g., the BookNLP pipeline (Bamman et al., 2014) for literary text, graph-based features (Gorinski and Lapata, 2015) for movie scripts, or outputs from a discourse parser (Ji and Smith, 2017) for text categorization) and/or b) the linguistic intuitions of the model designer to select features relevant to the task at hand (e.g., rather than ingest the entire text, Bhagavatula et al. (2018) only consider certain sections like the title and abstract of an academic publication). While there is much to recommend these approaches, E2E neural modeling offers several key advantages: it obviates the need for auxiliary feature-generating models, minimizes the risk of error propagation, and offers improved generalization across large-scale corpora. This work explores how the inherent structure of a document class can be leveraged to facilitate an E2E approach. We focus on screenplays, investigating whether we can effectively extract key information by first segmenting them into scenes, and further exploiting the structural regularities within each scene.

With an average of >20k tokens per script in our evaluation corpus, extracting salient aspects is far from trivial. Through a series of carefully controlled experiments, we show that a structure-aware approach