Screenplay Summarization Using Latent Narrative Structure
Edinburgh Research Explorer

Citation for published version: Papalampidi, P, Keller, F, Frermann, L & Lapata, M 2020, Screenplay Summarization Using Latent Narrative Structure. in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp. 1920-1933, 2020 Annual Conference of the Association for Computational Linguistics, Virtual conference, United States, 5/07/20. <https://www.aclweb.org/anthology/2020.acl-main.174>

Document Version: Publisher's PDF, also known as Version of record

Published In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1920–1933, July 5 - 10, 2020. © 2020 Association for Computational Linguistics

Pinelopi Papalampidi¹ Frank Keller¹ Lea Frermann² Mirella Lapata¹
¹Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh
²School of Computing and Information Systems, University of Melbourne

Abstract

Most general-purpose extractive summarization models are trained on news articles, which are short and present all important information upfront. As a result, such models are biased by position and often perform a smart selection of sentences from the beginning of the document. When summarizing long narratives, which have complex structure and present information piecemeal, simple position heuristics are not sufficient. In this paper, we propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models. We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays (i.e., extract an optimal sequence of scenes). Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode and improve summarization performance over general extractive algorithms, leading to more complete and diverse summaries.

Setup: Victim: Mike Kimble, found in a Body Farm. Died 6 hours ago, unknown cause of death.
Opportunity: CSI discover cow tissue in Mike's body.
New Situation: Cross-contamination is suggested. Probable cause of death: Mike's house has been set on fire. CSI finds blood: Mike was murdered, fire was a cover up. First suspects: Mike's fiancée, Jane, and her ex-husband, Russ.
Change of Plans: CSI finds photos in Mike's house of Jane's daughter, Jodie, posing naked.
Progress: Mike is now a suspect of abusing Jodie. Russ allows CSI to examine his gun.
Point of no Return: CSI discovers that the bullet that killed Mike was made of frozen beef that melted inside him. They also find beef in Russ' gun.
Complications: Russ confesses that he knew that Mike was abusing Jodie, so he confronted and killed him.
Major Setback: Russ is given bail, since no jury would convict a protective father.
The final push: CSI discovers that the naked photos were taken on a boat, which belongs to Russ.
Climax: CSI discovers that it was Russ who was abusing his daughter based on fluids found in his sleeping bag and later killed Mike who tried to help Jodie.
Aftermath: Russ receives a mandatory life sentence.

Figure 1: Example of narrative structure for episode “Burden of Proof” from TV series Crime Scene Investigation (CSI); turning points are highlighted in color.

1 Introduction

Automatic summarization has enjoyed renewed interest in recent years thanks to the popularity of modern neural network-based approaches (Cheng and Lapata, 2016; Nallapati et al., 2016, 2017; Zheng and Lapata, 2019) and the availability of large-scale datasets containing hundreds of thousands of document–summary pairs (Sandhaus, 2008; Hermann et al., 2015; Grusky et al., 2018; Narayan et al., 2018; Fabbri et al., 2019; Liu and Lapata, 2019). Most efforts to date have concentrated on the summarization of news articles, which tend to be relatively short and formulaic, following an “inverted pyramid” structure which places the most essential, novel and interesting elements of a story in the beginning and supporting material and secondary details afterwards. The rigid structure of news articles is expedient since important passages can be identified in predictable locations (e.g., by performing a “smart selection” of sentences from the beginning of the document) and the structure itself can be explicitly taken into account in model design (e.g., by encoding the relative and absolute position of each sentence).

In this paper we are interested in summarizing longer narratives, i.e., screenplays, whose form and structure are far removed from newspaper articles. Screenplays are typically between 110 and 120 pages long (20k words); their content is broken down into scenes, which contain mostly dialogue (lines the actors speak) as well as descriptions explaining what the camera sees. Moreover, screenplays are characterized by an underlying narrative structure, a sequence of events by which a story is defined (Cutting, 2016), and by the story's characters and their roles (Propp, 1968). Contrary to news articles, the gist of the story in a screenplay is not disclosed at the start; information is often revealed piecemeal, characters evolve, and their actions might seem more or less important over the course of the narrative. From a modeling perspective, obtaining training data is particularly problematic: even if one could assemble screenplays and corresponding summaries (e.g., by mining IMDb or Wikipedia), the size of such a corpus would be at best in the range of a few hundred examples, not hundreds of thousands. Also note that genre differences might render transfer learning (Pan and Yang, 2010) difficult, e.g., a model trained on movie screenplays might not generalize to sitcoms or soap operas.

[Figure 2 diagram: a screenplay is mapped to a latent narrative structure (TP1: Introduction, TP2: Goal definition, TP3: Commitment, TP4: Setback, TP5: Ending); scenes are marked as relevant to a turning point (e.g., TP2, TP5) or irrelevant, and the relevant ones form the summary scenes of the video summary.]

Figure 2: We first identify scenes that act as turning points (i.e., key events that segment the story into sections). We next create a summary by selecting informative scenes, i.e., semantically related to turning points.

Given the above challenges, we introduce a number of assumptions to make the task feasible. Firstly, our [...] (Frermann et al., 2018) which revolves around a team of forensic investigators solving criminal cases. Such programs have a complex but well-defined structure: they open with a crime, the crime scene is examined, the victim is identified, suspects are introduced, forensic clues are gathered, suspects are investigated, and finally the case is solved.

In this work, we adapt general-purpose extractive summarization algorithms (Nallapati et al., 2017; Zheng and Lapata, 2019) to identify informative scenes in screenplays and instill in them knowledge about narrative film structure (Hauge, 2017; Cutting, 2016; Freytag, 1896). Specifically, we adopt a scheme commonly used by screenwriters as a practical guide for producing successful screenplays. According to this scheme, well-structured stories consist of six basic stages which are defined by five turning points (TPs), i.e., events which change the direction of the narrative, and determine the story's progression and basic thematic units. In Figure 1, TPs are highlighted for a CSI episode. Although the link between turning points and summarization has not been previously made, earlier work has emphasized the importance of narrative structure for summarizing books (Mihalcea and Ceylan, 2007) and social media content (Kim and Monroy-Hernández, 2015). More recently, Papalampidi et al. (2019) have shown how to identify turning points in feature-length screenplays by projecting synopsis-level annotations.

Crucially, our method does not involve manually annotating turning points in CSI episodes. Instead, we approximate narrative structure automatically by pretraining on the annotations of the TRIPOD dataset of Papalampidi et al. (2019) and employing a variant of their model. We find that narrative structure representations learned on their dataset (which was created for feature-length films) transfer well across cinematic genres and computational tasks. We propose a framework for end-to-end training in which narrative structure is treated as a latent variable for summarization. We extend the CSI dataset (Frermann et al., 2018) with