Arxiv:1908.09083V2 [Cs.CL] 8 Oct 2020 Existing Supervised Approaches Suffer from Two Sig- Get Reviews fluctuates a Lot
Total Page:16
File Type:pdf, Size:1020Kb
Multi-view Story Characterization from Movie Plot Synopses and Reviews Sudipta Kar|, Gustavo Aguilar|, Mirella Lapata♠, Thamar Solorio| | University of Houston ♠ ILCC, University of Edinburgh fskar3, [email protected], [email protected], [email protected] Abstract Plot Synopsis In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi This paper considers the problem of charac- (Gianni Russo). ... ... The film ends with Clemenza and new ca- terizing stories by inferring properties such as poregimes Rocco Lampone and Al Neri arriving and paying their re- spects to Michael. Clemenza kisses Michael's hand and greets him as theme and style using written synopses and "Don Corleone." As Kay watches, the office door is closed." reviews of movies. We experiment with a Review multi-label dataset of movie synopses and a Even if the viewer does not like mafia type of movies, he or she will tagset representing various attributes of sto- watch the entire film ... ... Its about family, loyalty, greed, relationships, and real life. This is a great mix, and the artistic style make the film ries (e.g., genre, type of events). Our pro- memorable. posed multi-view model encodes the synopses violence action murder atmospheric and reviews using hierarchical attention and revenge mafia family loyalty shows improvement over methods that only greed relationship artistic use synopses. Finally, we demonstrate how can we take advantage of such a model to ex- Figure 1: Example snippets from plot synopsis and review of tract a complementary set of story-attributes The Godfather and tags that can be generated from these. from reviews without direct supervision. We have made our dataset and source code pub- licly available at https://ritual.uh.edu/ tions, loyalty, greed, and mafia, whereas the gold multiview-tag-2020. standard tags from the plot are violence, murder, atmospheric, action, and revenge. In this paper, 1 Introduction we show that such information in reviews can sig- nificantly strengthen a supervised synopses to tag A high-level description of stories represented by prediction system, hence alleviating the first limita- a tagset can assist consumers of story-based me- tion. To address the second limitation, we propose dia (e.g., movies, books) during the selection pro- to also rely on the content provided by the reviews. cess. Although collecting tags from users is time- We extract new tags from reviews and thus comple- consuming and often suffers from coverage issues ment the predefined tagset. (Katakis et al., 2008), NLP techniques like those in Kar et al.(2018b) and Gorinski and Lapata(2018) A potential criticism of an approach that relies on can be employed to generate tags automatically user reviews is that it is not practical to wait for user from written narratives such as synopses. However, reviews to accumulate. The speed at which movies arXiv:1908.09083v2 [cs.CL] 8 Oct 2020 existing supervised approaches suffer from two sig- get reviews fluctuates a lot. Therefore, we pro- nificant weaknesses. Firstly, the accuracy of the pose a system that learns to predict tags by jointly extracted tags is subject to the quality of the syn- modeling the movie synopsis and its reviews, when opses. Secondly, the tagset is predefined by what the reviews are available. But if a movie has not was present in the training and development sets accumulated reviews yet, our system can still pre- and thus is brittle; story attributes are unbounded in dict tags from a predefined tagset using only the principle and grow with the underlying vocabulary. synopsis without any configuration change. To address the weaknesses presented above, we Tag extraction from reviews can be modeled as propose to exploit user reviews. We have found a supervised aspect extraction problem (Liu, 2012) that movie reviews often discuss many aspects of that requires a considerably large amount of anno- the story. For example, in Figure1, a reviewer tated tags in the reviews. To get rid of this anno- writes that The Godfather is about family, rela- tation burden, we formulate the problem from the perspective of Multiple Instance Learning (MIL; Train Val Test Instances 9; 746 2; 437 3; 046 Keeler and Rumelhart, 1992). As a result, our Tags per instance 3 3 3 model learns to spot story attributes in reviews in Reviews per movie 72 74 72 S Sentence per document 50 53 51 a weakly supervised fashion and does not expect S direct tag level supervision. Note that these com- Words per sentence 21 21 21 R Sentence per document 117 116 116 plementary open-vocabulary tags extracted from R Words per sentence 27 27 27 the reviews are separate from the predefined tagset, and we can only generate this set when a movie has Table 1: Statistics of the dataset. S denotes synopses and R denotes review summaries. reviews. As we show in Section 6.1, tags generated by our system can quickly describe a story helping attributes of movie storylines like mood, plot type, users decide whether to watch a movie or not. and possible feeling of consumers in a supervised Our contributions in this work can be summa- fashion where the number of predictable categories rized as follows: is more extensive and more comprehensive com- pared to genre classification. Even though these • We collect ≈1.9M user reviews to enrich an systems can retrieve comparatively larger sets of existing dataset of movie plot synopses and story attributes, the predictable attributes are lim- tags. ited in a closed group of tags. In contrast, in real life, these attributes can be unlimited. • We propose a multi-view multi-label tag pre- diction system that learns to predict relevant Movie Review Mining There is a subtle distinc- tags from a predefined tagset by exploiting tion between the reviews of typical material prod- the synopsis and the reviews of a movie when ucts (e.g. phone, TV, furniture) and story-based available. We show that utilizing reviews can items (e.g. literature, film, blog). In contrast to provide ≈4% increase in F1 over a system the usual aspect based opinions (e.g. battery, reso- using only synopses to predict tags. lution, color), reviews of story-based items often contain end users’ feelings, important events of • We demonstrate a technique to extract open- stories, or genre related information, which are ab- vocabulary descriptive tags (i.e., not part of stract in nature (e.g. heart-warming, slasher, melo- the predefined tagset) from reviews using our dramatic) and do not have a very specific target trained model. While review-mining is typi- aspect. Extraction of such opinions about stories cally approached as supervised learning, we has been approached by previous work using re- push this task to an unsupervised direction to views of movies (Zhuang et al., 2006; Li et al., avoid the annotation burden. 2010) and books (Lin et al., 2013). Such attempts are broadly divided into two categories. The first We verify our proposed method against multiple category deals with spotting words or phrases (ex- competitive baselines and conduct a human evalua- cellent, fantastic, boring) used by people to express tion to confirm our tags’ effectiveness for a set of how they feel about the story. And the second cate- movies. gory focuses on extracting important opinionated 2 Background sentences from reviews and generating a summary. In our work, while the primary task is to retrieve Prior art related to this paper’s work includes relevant tags from a pre-defined tagset by super- story analysis of movies and mining opinions from vised learning, our model provides the ability to movie reviews. In this section, we briefly discuss mine story aspects from reviews without any direct these lines of work. supervision. Story Analysis of Movies Over the years, high- 3 Dataset level story characterization approaches evolved around the problem of identifying genres (Biber, Our starting data set is the MPST corpus (Kar et al., 1992; Kessler et al., 1997; Petrenz, 2012; Worsham 2018a) which contains approximately 15K movies 1 and Kalita, 2018). Genre information is helpful but and a set of tags assigned by IMDB and Movie- 2 not very expressive most of the time as it is a broad Lens users. The tagset contains 71 labels that way to categorize items. Recent work (Gorinski 1http://imdb.com and Lapata, 2018; Kar et al., 2018b) retrieves other 2http://movielens.org are representative of story related attributes (e.g., sentences and words are more helpful for identify- thought-provoking, inspiring, violence) and the cor- ing relevant tags from the synopsis. With this in pus is free from any metadata (e.g., cast, release mind, we adapt the hierarchical encoding technique year). of Yang et al.(2016) that learns to weight impor- We extended the dataset by collecting up to 100 tant words and sentences and use this information most helpful reviews per movie from IMDB. Out to create a high-level document representation. Ad- of the 15K movies in MPST, we did not find any ditionally, to efficiently capture various important reviews for 285 films. The collected reviews often story aspects in long synopses and reviews, we narrate the plot summary and describe opinions model our task from the perspective of Multiple about movies. We noticed that reviews can be very Instance Learning (MIL). long and sometimes contain repetitive plot sum- We assume that each synopsis and review is a maries and opinions. Some reviews can be even bag of instances (i.e., sentences in our task), where uninformative about the story type. Moreover, the labels are assigned at the bag level. In such cases, a number of reviews for movies varies greatly, creat- prediction is made for the bag by either learning to ing a challenge for modeling them computationally.