Telling Stories with : An Empirical Analysis of in

Jon Gillick David Bamman School of Information School of Information University of California, Berkeley University of California, Berkeley [email protected] [email protected]

Abstract (Guha et al., 2015; Kociskˇ y` et al., 2017), natural language understanding (Frermann et al., 2017), Soundtracks play an important role in carry- ing the story of a film. In this work, we col- summarization (Gorinski and Lapata, 2015) and lect a corpus of movies and shows image captioning (Zhu et al., 2015; Rohrbach matched with subtitles and soundtracks and et al., 2015, 2017; Tapaswi et al., 2015), the analyze the relationship between story, , modalities examined are almost exclusively lim- and reception. We look at the con- ited to text and image. In this work, we present tent of a film through the lens of its latent top- a new perspective on multimodal storytelling by ics and at the content of a song through de- focusing on a so-far neglected aspect of narrative: scriptors of its musical attributes. In two ex- the role of music. periments, we find first that individual topics are strongly associated with musical attributes, We focus specifically on the ways in which 1 and second, that musical attributes of sound- soundtracks contribute to films, presenting a first tracks are predictive of film ratings, even after look from a computational modeling perspective controlling for topic and genre. into soundtracks as storytelling devices. By devel- oping models that connect films with musical pa- 1 Introduction rameters of soundtracks, we can gain insight into The medium of film is often taken to be a canon- musical choices both past and future. While a ical example of narrative multimodality: it com- great film score is in part determined by how well bines the written narrative of dialogue with the vi- it fits with the context of the story (Byrne, 2012), sual narrative of its imagery. While words bear we are also interested in uncovering musical as- the burden alone for creating compelling charac- pects that, in general, work better in support of a ters, scenes, and plot in textual narratives like lit- film. erary , in film, this responsibility is shared To move toward understanding both what by each of the contributors, including the screen- makes a film fit with a particular kind of song and , director, music supervisor, special effects what musical aspects can broadly be effective in engineers, and many others. Working together to the service of telling a story, we make the follow- support the overall story tends to make for more ing contributions: successful component parts; the Academy Award winner for Best Picture, for example, often also 1. We present a dataset of 41,143 films paired collects many other awards and nominations— with their soundtracks. Metadata for the films including acting, cinemotography, and sound de- is drawn from IMDB, linked to subtitle infor- sign. mation from OpenSubtitles2016 data (Lison While film has recently been studied as a tar- and Tiedemann, 2016), and data get in natural language processing and computer is linked to structured audio information from vision for such tasks as characterizing gender rep- Spotify. resentation in dialogue (Ramakrishna et al., 2015; 2. We present empirical results demonstrating Agarwal et al., 2015), inferring types the relationship between audio qualities of from plot summaries (Bamman et al., 2013), mea- the soundtrack and viewers’ responses to suring the memorability of phrasing (Danescu- 1We use the word film to refer to both movies and televi- Niculescu-Mizil et al., 2012), question answering sion shows interchangeably.

33 Proceedings of the First Workshop on Storytelling, pages 33–42 New Orleans, Louisiana, June 5, 2018. c 2018 Association for Computational Linguistics the films they appear in; soundtracks with dard for cinematic scoring practices and have been more instrumental or acoustic generate extensively analyzed in the film music literature higher ratings; “danceable” songs lower the (Slowik, 2012). Following this period, with the average ratings for films they appear in. rise of rock and roll, popular music began to make 3. We present empirical results demonstrating its way into film soundtracks in addition to the the relationship between the topics that make scores written specifically for the movie. As Rod- a film script and the audio qualities of man(2006) points out, this turn coincided with di- the soundtrack. with settings in high rectors seeing the potential for songs to contribute school or college, for example, tend to have to the narrative meaning of the film: electric instrumentation and singing; sound- tracks with faster tempos appear both in films In The Blackboard Jungle, Bill Haley’s about zombies and vampires and in films in rock and roll anthem, ‘Rock Around the which the word dude appears frequently. Clock,’ was used in the opening cred- its, not only to capture the attention of 2 The Narrative Role of Music in Film the teenage audience, but also to sig- nify the rebellious energy of teenagers in The first films appeared around 1890, before the the 1950s. . . . The Graduate relied upon development of technology that enabled synchro- the music and poetry of Simon and Gar- nization of picture with sound (Buhler et al., funkel to portray the alienation of Amer- 2010). While silent films featured no talking or ican youth of the 1960s. Easy Rider music in the film itself, they were often accom- took a more aggressive countercultural panied by music during live performances in the- stance by using the of Hoyt atres. Rather than playing set scores, these live ac- Axton, Steppenwolf, The Byrds, and companiments were largely improvised; practical Jimi Hendrix to portray the youth rebel- catalogues for such performances describe the mu- lion in American society, complete with sical elements appropriate for emotions and narra- communes, long hair and drugs (Rod- tive situations in the film (Becce, 1919; Erdmann man, 2006, 123) et al., 1927). For example, Lang and West(1920) note that a string accompaniment with tremolo In recent years, the boundaries between popu- (trembling) effect is appropriate for “suspense and lar music and film music in the traditional sense impending disaster”; an organ tone with heavy have become increasingly blurred, pushed for- pedal is appropriate for “church scenes” and for ward especially by more affordable music produc- generally connoting “impressive dignity”; flutes tion technology including synthesizers and pre- are fitting for conveying “happiness,” “spring- recorded samples that allow a broad range of time” or “sunshine.” to use sounds previously reserved for With the rise of talkies in the late 1920’s those with access to a full (Pinch et al., (Slowik, 2012), music could be incorporated di- 2009). Though pioneers like rectly into the production of the film, and was of- Wendy Carlos have been composing for film since ten composed specifically for it; Gorbman(1987) the late 1960’s (Pinch et al., 2009), pop and elec- describes that in the classical model of film pro- tronic have only gradually been recog- duction, scored music is “not meant to be heard nized as film composers in their own right, with consciously,” primarily acts as a signifier of emo- Daft Punk’s original score for Tron: Legacy in tion, and provides referential and narrative cues, 2010 marking a breakthrough into the mainstream such as establishing the setting or character. The (Anderson, 2012). use of Wagnerian —the repeated associ- ation of a musical phrase with a film element, 3 Data such as a character—is common in original scores, In order to begin exploring the relationship be- especially in those for films (Prendergast, tween films and their soundtracks, we gather data 1992). from several different sources. First, we draw Works from the “Golden Age” of film mu- on the OpenSubtitles2016 data of parallel texts sic (the period between 1935–1950, shortly af- for film subtitles (Lison and Tiedemann, 2016); ter the rise of synchronized sound) set the stan- this dataset includes scripts for a wide variety of

34 movies and episodes of television shows (106,609 of Golden Slumbers by Jennifer Hudson (from the total in English) and contains publicly available movie Sing) were not in Spotify’s catalogue, it subtitle data. Each film in the OpenSubtitles2016 would match the performance by The Beatles on data is paired with its unique IMDB identifier; us- . ing this information, we extract IMDB metadata Spotify provides a number of extracted audio for the film, including title, year of release, av- features for each song; from a set of 13 we chose erage user rating (a real number from 0-10), and 5 that we hypothesized would be predictive of genre (a set of 28 categories, ranging from drama viewer preferences and whose descriptors are also and comedy to war and film-noir). interpretable enough to enable discussion. Those Most importantly, we also identify soundtrack that we include in our analysis are the follow- information on IMDB using this identifier; sound- ing, with descriptions drawn from Spotify’s Track tracks are listed on IMDB in the same form as they API:3 appear in the movie/television credits (generally also in the order of appearance of the song). A Mode. “Mode indicates the modality (major • typical example is the following: or minor) of a track, the type of scale from which its melodic content is derived. Major Golden Slumbers is represented by 1 and minor is 0.” Written by John Lennon and Paul McCartney Tempo. “The overall estimated tempo of a • Performed by Jennifer Hudson track in beats per minute (BPM). In musical Jennifer Hudson appears courtesy of Epic terminology, tempo is the speed or pace of a Records given piece and derives directly from the av- Produced by Harvey Mason Jr. erage beat duration.” In the raw data, these range from 36 to 240; we divide by the max- This structured format is very consistent across imum value of 240 to give a range between films (owing to the codification of the appear- 0.15 and 1. ance of this information in a film’s , Danceability. “Danceability describes how which is thereby preserved in the user transcrip- • tion on IMDB2). For each song in a soundtrack suitable a track is for dancing based on a for a film, we extract the title, performers and writ- combination of musical elements including ers through regular expressions (which are precise tempo, rhythm stability, beat strength, and given the structured format). overall regularity. A value of 0.0 is least We then identify target candidate matches for danceable and 1.0 is most danceable.” a source soundtrack song by querying the public Instrumentalness. “Instrumentalness pre- • Spotify API for all target songs in the Spotify cat- dicts whether a track contains no vocals. alogue with the same title as the source song in ‘Ooh’ and ‘aah’ sounds are treated as instru- the IMDB soundtrack. The names of performers mental in this context. Rap or spoken word are not standardized across datasets (e.g., IMDB tracks are clearly ‘vocal.’ The closer the in- may list an artist as The Velvet Underground, while strumentalness value is to 1.0, the greater Spotify may list the same performance as The Vel- likelihood the track contains no vocal con- vet Underground and Nico). To account for this, tent. Values above 0.5 are intended to rep- we identify exact matches between songs as those resent instrumental tracks, but confidence is that share the same title and where the longest higher as the value approaches 1.0.” common substring between the source and target Acousticness. “A confidence measure from • performers spans at least 75% the length of ei- 0.0 to 1.0 of whether the track is acoustic. 1.0 ther entity; if no exact match is found, we iden- represents high confidence the track is acous- tify the best secondary match as the target song tic.” with the highest Spotify popularity among target candidates with the same title as the source. In The dataset totals 189,340 songs (96,526 the example above, if this particular performance unique) from 41,143 movies/television shows, 2https://help.imdb.com/article/ contribution/titles/soundtracks/ 3https://developer.spotify.com/ GKD97LHE9TQ49CZ7 web-api/get-audio-features/

35 0.60 0.4 0.52

0.55 0.50 0.3

0.48 0.2 0.50 0.46

0.1 0.44 1950 1970 1990 2010 1950 1970 1990 2010 1950 1970 1990 2010

(a) Average danceability (b) Average tempo (c) Average instrumentalness

0.7 0.8


0.7 0.5

0.4 0.6

0.3 1950 1970 1990 2010 1950 1970 1990 2010

(d) Average acousticness (e) Average mode (major = 1)

Figure 1: Change in audio features over time, 1950–2015. Films and TV shows in the late 1980s peaked for danceable soundtracks, with electric instrumentation and singing. Each plot displays the average value for that feature, with 95% bootstrap confidence intervals. along with a paired script for each movie/show. 4 Analysis Figures1 and2 provide summary statistics of this dataset (using only the metadata and audio fea- We present two analyses here shedding some light tures) and begin to demonstrate the potential of on the role that music plays in the narrative form this data. Figure1 illustrates the change in the of films, demonstrating the relationship between average value of each feature between 1950–2015. fine-grained topics in a film’s script and specific audio features described in 3 above; and mea- Soundtracks featuring acoustic songs naturally de- § cline over this time period with the rise of elec- sure the impact of audio features in the sound- tric instruments; as time progresses, soundtracks track on the reception to the storytelling in the feature quicker tempos and include more songs in form of user reviews, attempting to control for the minor keys. The 1980s in particular are peaks for topical and generic influence of the script using danceable soundtracks, with electric instrumenta- topically-coarsened exact matching in causal in- tion and voice, while the 1990s appear to react ference (Roberts et al., 2016). against this dominance by featuring songs with comparatively lower danceability, higher acoustic 4.1 Topic analysis of audio features instrumentation, and less singing. While we can expect to see trends in the music em- ployed in films over time and across genres, these Figure2, in contrast, displays variations in se- surface descriptors do not tell us about the actual lected audio features across genres. Movies and contents of the film. What kind of stories have television shows tagged by IMDB users with genre soundtracks in minor keys? Is there a relationship labels of “war” and “” tend to have songs between the content of a movie or that are slow and in major keys, whereas game and the tempo of its soundtrack? We investigate shows and adult films more often have faster songs this by regressing the content of the script against in minor keys. We can also see from figure2 that each audio feature; rather than representing the different audio characteristics can have different script by the individual words it contains, we seek amounts of variation across genres; mode varies instead to uncover the relationship between broad more with genre than tempo does. thematic topics implicit in the script and those fea-

36 (a) Top and bottom 5 film genres with the fastest and slowest soundtracks. Blue indicates faster tempos and red indicates slower tempos.

(b) Top and bottom 5 film genres whose soundtracks are most major and most minor. Blue indicates major and red indicates minor.

Figure 2: Top and bottom 5 film genres in terms of average tempo and mode. Heights of the bars represent percentage difference from the mean across the entire dataset. Actual values are at the tops of the bars. tures. the relationship between the topic representations To do so, we identify topics in all 41,143 scripts using binary logistic regression. using LDA (Blei et al., 2003), modeling each Table1 illustrates the five strongest positive and script as a distribution over 50 inferred topics, re- negative topic predictors for each audio feature. moving stopwords and names. While the latent topics do not have defined cate- We model the impact of individual topics on gories, we can extract salient aspects of the stories audio features by representing each document as based on the words in the most prevalent topics. its 50-dimensional distribution of topic propor- The words describing the topics most and least as- tions; in order to place both very frequent and sociated with an audio feature can give us some infrequent topics on the same scale, we normal- insight into how songs are used in soundtracks. ize each topic to a standard normal across the dataset. For each of real-valued au- Mode. Major versus minor is typically one • dio features y = danceability, acousticness, of the most stark musical contrasts, with the { instrumentalness, and tempo , we regress the major key being characteristically associated } relationship between the document-topic repre- with joy and excitement, and the minor key sentations in turn using OLS; for the binary-valued being associated with melancholy (Hevner, mode variable (major vs. minor key), we model 1935). We see songs in major being used in

37 0.141 captain ship sir 0.044 mr. sir mrs. 0.017 baby yo y’all 0.074 town horse sheriff 0.028 boy huh big 0.010 woman married sex 0.055 sir colonel planet 0.020 christmas la aa 0.008 sir brother heart 0.050 boy huh big 0.019 sir dear majesty 0.007 dude cool whoa 0.050 mr. sir mrs. 0.019 leave understand father 0.006 spanish el la

-0.045 sir brother heart -0.017 um work fine -0.006 father lord church -0.052 leave understand father -0.018 kill dead blood -0.007 remember feel dead -0.054 gibbs mcgee boss -0.020 fuck shit fucking -0.008 sir dear majesty -0.066 agent security phone -0.021 school class college -0.011 captain ship sir -0.083 baby yo y’all -0.024 dude cool whoa -0.015 sir colonel planet (a) mode (major/minor) (b) acousticness (c) danceability

0.003 dude cool whoa 0.025 sir colonel planet 0.003 sir colonel planet 0.020 mr. sir mrs. 0.002 music show sing 0.017 years world work 0.002 um work fine 0.017 captain ship sir 0.002 kill dead blood 0.016 agent security phone

-0.002 baby yo y’all -0.011 school class college -0.002 remember feel dead -0.012 baby yo y’all -0.002 sir dear majesty -0.013 woman married sex -0.004 mr. sir mrs. -0.014 sir brother heart -0.005 captain ship sir -0.015 music show sing (d) tempo (e) instrumentalness

Table 1: Script topics predictive of audio features. For each feature, the 5 topics most predictive of high feature values (1-5) and the 5 topics most predictive of low feature values (46-50) are shown. Topics are displayed as the top three most frequent words within them. Coefficients for mode (as a categorical variable) are for binary logistic regression; those for all other features are for linear regression.

films with polite greetings, those with sher- “baby yo”—dominant in movies like Mal- iffs and horses, and with ship captains. Mi- ibu’s Most Wanted (2003), Menace II Society nor songs appear more often in stories involv- (1993) Hustle & Flow (2005)—and “women ing separation from parents, FBI agents, or married”—dominant in episodes of viruses. The strongest topic associated with and Sex and the City. major key is “captain ship”; productions Tempo. Musical tempo is a measure of pace; • mostly strongly associated with this topic in- the strongest topic associated with fast pace is clude episodes from the TV show : the “dude cool” topic, include episodes from The Next Generation. the TV show Workaholics and The Simpsons. Acousticness. The acousticness of a sound- Perhaps unsurprisingly, the mannered “mr. • track captures the degree of electric instru- sir mrs.” topic is associated with a slow mentation; as figure 4(d) shows, acoustic- tempo. ness shows the greatest decline over time Instrumentalness. Instrumentalness mea- • (corresponding to the rise of electric instru- sures the degree to which a song is en- ments); we see this also reflected topically tirely instrumental (i.e., devoid of vocals like here, with the topic most strongly associ- singing), such as . The “mr. ated with acoustic soundtracks being “mr. sir sir mrs.” topic again rates highly along this mrs.”; this topic tends to appear in period dimension (presumably corresponding with pieces and older films, such as Arsenic and the use of classical music in these films); Old Lace (1944). also highly ranking is the “sir colonel” topic, Danceability. The danceability of a song is which is primarily a subgenre of science fic- • the degree to which is it suitable for danc- tion, including episodes from the TV show ing. The topics most strongly associated with and the movie : Episode consistently danceable soundtracks including III: Revenge of the Sith (2005).

38 4.2 Impact on ratings causal claims but rather to provide a stricter crite- rion against which to assess the significance of our While the analysis above examines the internal re- results. lationship between a film’s soundtrack and its nar- Here, let us define the “treatment variable” to rative, we can also explore the relationship be- be the variable (such as “acousticness”) whose re- tween the soundtrack and a film’s reception: how lationship with rating we are seeking to establish. do respond to movies with fast sound- The original value for this variable is real; we bi- tracks, to acoustic soundtracks, or to soundtracks narize it into two treatment conditions (0 and 1) that are predominantly classical (i.e., with high by thresholding at 0.5 (all values above this limit instrumentalness)? We measure response in this are set to 1; otherwise 0). To test the relation- work by the average user rating for the film on ship between audio features and user ratings in this IMDB (a real value from 0-10). procedure, we place each data point in a stratum One confound with this kind of analysis is the defined by the values of its other covariates; we complication with the content of the script; as 4.1 § coarsen the values of each covariate into a binary demonstrates, some topics are clearly associated value: for all numeric audio features, we binarize with audio features like “acousticness,” so if we at a threshold of 0.5; for topic distributions, we identify a relationship between acousticness and coarsen by selecting the argmax topic as the single a film’s rating, that might simply be a relation- binary topic value. For each stratum with at least ship between the underlying topic (e.g., “dude,” 5 data points in each treatment condition, we sam- “cool,” “whoa”) and the rating, rather than acous- ple data points to reflect the overall distribution of ticness in itself. the treatment variable in the data; any data points In order to account for this, we employ meth- in strata for which there are fewer than 5 points ods from causal inference in observational studies, from each condition are excluded from analy- drawing on the methods of coarsened exact match- sis. This, across all strata, defines our matched ing using topic distributions developed by Roberts data for analysis. We carry out this process once et al.(2016). Conventional methods for exact for each treatment variable mode, danceability, { matching aim to identify the latent causal exper- acousticness, instrumentalness, and tempo . iment lurking within observational data by elimi- } nating all sources of variation in covariates except Coefficient Audio feature for the variable of interest, identifying a subset of 0.121∗ Acousticness the original data in which covariates are balanced 0.117∗ Instrumentalness between treatment conditions; in our case, if 100 0.031 Tempo films have high tempo, 100 have low tempo and 0.024∗ Mode the 200 films are identical in every other dimen- 0.103∗ Danceability sion, then if tempo has a significant correlation − with a film’s rating, we can interpret that signif- Table 2: Impact of audio features on IMDB aver- icance causally (since there is no other source of age user rating. Features marked ∗ are significant variation to explain the relationship). at p 0.001. ≤ True causal inference is dependent on accurate model specification (e.g., its assumptions fail if an Table2 presents the results of this analysis: important explanatory covariate is omitted). In all audio features of the soundtrack except tempo our case, we are seeking to model the relation- have a significant (if small) impact on the average ship between audio features of the soundtrack and user rating for the film they appear with. A highly IMDB reviewers’ average rating for a film, and in- danceable soundtrack would lower the score of a clude features representing the content of a film film from 9.0 to a 8.897; adding an acoustic sound- through a.) its topic distribution and b.) explicit track would raise it to 9.121; and adding an instru- genre categories from IMDB (a binary value for mental soundtrack with no vocals would raise it to each of the 28 genres). We know that this model 9.117. Our experiments suggest that certain musi- is mis-specified—surely other factors of a film’s cal aspects might be generally more effective than content impacting its rating may also be corre- others in the context of a film score, and that these lated with audio features—but in using the ma- attributes significantly shape a viewer’s reactions chinery of causal inference, we seek not to make to the overall film.

39 5 Previous Work point in the opposite direction, indicating a de- crease of 0.1 stars. Because the ability to forecast box office success is We hope that one of the primary beneficiaries of practical interest for who want to decide of the line of work introduced here will be mu- which scripts to “Greenlight” (Eliashberg et al., sic supervisors, whose job involves choosing ex- 2007; Joshi et al., 2010) or where to invest mar- isting music to license or hiring composers to cre- keting dollars (Mestyan´ et al., 2013), a number of ate original scores. Understanding the connections previous studies look at predicting movie success between the different modalities that contribute to by measuring box office revenue or viewer ratings. a story can be useful for understanding the history Eliashberg et al.(2007) used linear regression on of film scoring and music licensing as well as for metadata and textual features drawn from “spoil- making decisions during the production process. ers”, detailed summaries written by moviegoers, Though traditionally the music supervisor plays a to predict return on investment in movie-making. well-defined role on a film, in contemporary prac- Joshi et al.(2010) used review text from critics tice many people contribute to music supervision to predict revenues, finding n-gram and depen- throughout the production process for all kinds of dency relation features to informative complement media, from movies and television to advertising, to metadata-based features. Jain(2013) used Twit- social media, and games. ter sentiment to predict box office revenue, further There are several directions of future research classifying movies as either a Hit, a Flop, or Aver- that are worth further pursuit. First, while we age. Oghina et al.(2012) used features computed have shown that strong relationships exist between from Twitter and YouTube comments to predict films and their soundtracks as a whole and that a IMDB ratings. soundtrack is predictive of user ratings, this rela- Though a number of previous works have at- tionship only obtains over the entirety of the script tempted to predict film performance from text and the entirety of the soundtrack; a more fine- and metadata, little attention has been paid to the grained model would anchor occurrences of indi- role of the soundtrack in a movie’s success. Xu vidual songs at specific moments in the temporal and Goonawardene(2014) did consider sound- narrative of the script. While our data does not tracks, finding the volume of internet searches for directly indicate when a song occurs, latent vari- a movie’s soundtrack preceding a release to be pre- able modeling over the scripts and soundtracks in dictive of box office revenue. This work, however, our collection may provide a reasonable path for- only considers the popularity of the soundtrack ward. Second, while our work here has focused as a surface feature; it does not directly measure on descriptive analysis of this new data, a poten- whether the musical characteristics of the songs tially powerful application is soundtrack genera- in the soundtracks are themselves predictive. tion: creating a new soundtrack for a film given 6 Conclusion the input of a script. This application has the po- tential to be useful for music supervisors, by sug- In this work, we introduce a new dataset of films, gesting candidate songs that fit the narrative of a subtitles, and soundtracks along with two empir- given script in production. ical analyses, the first demonstrating the connec- Music is a vital storytelling component of tions between the contents of a story, (as mea- multimodal narratives such as film, television sured by the topics in its script) and the musical and theatre, and we hope to drive further work features that make up its soundtrack, and the sec- in this area. Data and code to support this ond identifying musical aspects that are associated work can be found at https://github.com/ with better user ratings on IMDB. Soundtracks us- jrgillick/music_supervisor. ing acoustic instruments, as measured by Spotify’s “acousticness” descriptor, and those with instru- 7 Acknowledgments ments but no vocals, as measured by the “instru- Many thanks to the anonymous reviewers for their mentalness” descriptor, are each linked with more helpful feedback. The research reported in this ar- than a 0.11 increase in ratings on a 10-star scale, ticle was supported by a UC Berkeley Fellowship even when controlling for other musical dimen- for Graduate Study to J.G. sions, topic, and genre through Coarsened Exact Matching. Soundtracks that are more “danceable”

40 References the North American Chapter of the Association for Computational Linguistics: Human Language Tech- Apoorv Agarwal, Jiehan Zheng, Shruti Kamath, Sri- nologies. pages 1066–1076. ramkumar Balasubramanian, and Shirin Ann Dey. 2015. Key female characters in film have more to Tanaya Guha, Che-Wei Huang, Naveen Kumar, Yan talk about besides men: Automating the bechdel Zhu, and Shrikanth S Narayanan. 2015. Gender test. In Proceedings of the 2015 Conference of representation in cinematic content: A multimodal the North American Chapter of the Association for approach. In Proceedings of the 2015 ACM on In- Computational Linguistics: Human Language Tech- ternational Conference on Multimodal Interaction. nologies. Association for Computational Linguis- ACM, pages 31–34. tics, Denver, Colorado, pages 830–840. http:// www.aclweb.org/anthology/N15-1084. Kate Hevner. 1935. The affective character of the ma- jor and minor modes in music. The American Jour- Stacey Anderson. 2012. Electronic dance music goes nal of Psychology 47(1):103–118. http://www. hollywood. The Times . jstor.org/stable/1416710. David Bamman, Brendan O’Connor, and Noah A. Vasu Jain. 2013. Prediction of movie success us- Smith. 2013. Learning latent personas of film char- ing sentiment analysis of tweets. The International acters. In Proceedings of the 51st Annual Meet- Journal of Soft Computing and Software Engineer- ing of the Association for Computational Linguistics ing 3(3):308–313. (Volume 1: Long Papers). Association for Computa- tional Linguistics, Sofia, Bulgaria, pages 352–361. Mahesh Joshi, Dipanjan Das, Kevin Gimpel, and Noah A Smith. 2010. Movie reviews and rev- Giuseppe Becce, editor. 1919. Kinothek. Neue Film- enues: An experiment in text regression. In Human musik. Schlesinger’sche Buchhandlung. Language Technologies: The 2010 Annual Confer- ence of the North American Chapter of the Associa- David M. Blei, Andrew Ng, and Michael Jordan. 2003. tion for Computational Linguistics. Association for Latent dirichlet allocation. JMLR 3:993–1022. Computational Linguistics, pages 293–296. James Buhler, David Neumeyer, and Rob Deemer. Toma´sˇ Kociskˇ y,` Jonathan Schwarz, Phil Blunsom, 2010. Hearing the movies: music and sound in film Chris Dyer, Karl Moritz Hermann, Gabor´ Melis, history. Oxford University Press New York. and Edward Grefenstette. 2017. The narrativeqa reading comprehension challenge. arXiv preprint . 2012. How music works. Three Rivers arXiv:1712.07040 . Press. Edith Lang and George West. 1920. Musical accom- Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon paniment of moving pictures; a practical manual for Kleinberg, and Lillian Lee. 2012. You had me at pianists and organists and an exposition of the prin- hello: How phrasing affects memorability. In Pro- ciples underlying the musical interpretation of mov- ceedings of the 50th Annual Meeting of the Associ- ing pictures, by. The Boston Music Co., Boston. ation for Computational Linguistics: Long Papers - Volume 1. Association for Computational Linguis- Pierre Lison and Jrg Tiedemann. 2016. Opensub- tics, Stroudsburg, PA, USA, ACL ’12, pages 892– titles2016: Extracting large parallel corpora from 901. http://dl.acm.org/citation.cfm? movie and tv subtitles. In Nicoletta Calzolari (Con- id=2390524.2390647. ference Chair), Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Jehoshua Eliashberg, Sam K Hui, and Z John Zhang. Joseph Mariani, Helene Mazo, Asuncion Moreno, 2007. From story line to box office: A new approach Jan Odijk, and Stelios Piperidis, editors, Proceed- for green-lighting movie scripts. Management Sci- ings of the Tenth International Conference on Lan- ence 53(6):881–893. guage Resources and Evaluation (LREC 2016). Eu- ropean Language Resources Association (ELRA), Hans Erdmann, Giuseppe Becce, and Ludwig Brav. Paris, . 1927. Allgemeines Handbuch der Film-Musik. Schlesinger. Marton´ Mestyan,´ Taha Yasseri, and Janos´ Kertesz.´ 2013. Early prediction of movie box office suc- Lea Frermann, Shay B Cohen, and Mirella Lapata. cess based on wikipedia activity big data. PloS one 2017. Whodunnit? crime drama as a case for 8(8):e71226. natural language understanding. arXiv preprint arXiv:1710.11601 . Andrei Oghina, Mathias Breuss, Manos Tsagkias, and Maarten de Rijke. 2012. Predicting imdb movie rat- Claudia Gorbman. 1987. Unheard melodies: Narrative ings using social media. In European Conference on film music. Indiana University Press. Information Retrieval. Springer, pages 503–507. Philip John Gorinski and Mirella Lapata. 2015. Movie Trevor J Pinch, Frank Trocco, and TJ Pinch. 2009. script summarization as graph-based scene extrac- Analog days: The invention and impact of the Moog tion. In Proceedings of the 2015 Conference of synthesizer. Harvard University Press.

41 Roy M Prendergast. 1992. Film music: a neglected art: a critical study of music in films. WW Norton & Company. Anil Ramakrishna, Nikolaos Malandrakis, Elizabeth Staruk, and Shrikanth Narayanan. 2015. A quan- titative analysis of gender differences in movies us- ing psycholinguistic normatives. In Proceedings of the 2015 Conference on Empirical Methods in Nat- ural Language Processing. Association for Compu- tational Linguistics, Lisbon, Portugal, pages 1996– 2001. https://aclweb.org/anthology/ D/D15/D15-1234. Margaret E. Roberts, Brandon Stewart, and R. Nielsen. 2016. Matching methods for high-dimensional data with applications to text. Working paper. Ronald Rodman. 2006. The popular song as leitmotif in 1990s film. Changing Tunes: The Use of Pre- existing Music in Film pages 119–136. Anna Rohrbach, Marcus Rohrbach, Niket Tandon, and Bernt Schiele. 2015. A dataset for movie descrip- tion. CVPR . Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, Christopher Pal, Hugo Larochelle, Aaron Courville, and Bernt Schiele. 2017. Movie description. International Journal of Computer Vi- sion 123(1):94–120. https://doi.org/10. 1007/s11263-016-0987-1. Michael James Slowik. 2012. Hollywood film music in the early sound era, 1926-1934 . Makarand Tapaswi, Martin Bauml, and Rainer Stiefel- hagen. 2015. Book2movie: Aligning scenes with chapters. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Haifeng Xu and Nadee Goonawardene. 2014. Does movie soundtrack matter? the role of soundtrack in predicting movie revenue. In PACIS. page 350. Yukun Zhu, Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Aligning and movies: Towards story-like visual explanations by watching movies and reading books. ICCV .